The near-RT RIC (RIC for short) Self-Health-Check flow fulfills the requirement that all systems need to monitor their own health – internal subsystems, hosted software, and external interfaces. 

Internal Self-Check - At configurable intervals, the RIC is to trigger Health-Check requests to its internal common platform modules and hosted xAPPs.  Each platform module and each xAPP is required to support Health-Check requests and to perform a self-check.

Platform Modules and xAPPs

The specific Health-Check validations are as follows (see the flow diagram in Figure 1):

Alarms and Notifications - Based on Health-Check results, the RIC is required to maintain a list of anomaly conditions - alarms and alerts - that represent the state of its health.  Alarm/Alert conditions are to be raised and sent as notifications.


Self-Check of Platform Modules and xAPPs

The RIC is responsible for checking the health of the RIC Platform modules and the xAPP instances hosted on the RIC:

  • Health of RIC platform modules: ability to internally initiate a Health-check on each of the common platform modules within the RIC (e.g., logging, tracing, conflict manager, O1 Termination, A1 Mediator, E2 Termination, E2 Manager, xAPP Manager, Subscription Manager, etc.), store the results, and declare alarm/alert conditions [5 in Figure 1].
  • Internal self-checks are to be done at default intervals.  Intervals are to be configurable during run-time.  
  • Each platform module is required to support a health-check request.  Initially, a module may simply need to send a response message to indicate that connectivity is still up and the messaging pathway is still operational.  (In later releases, additional diagnostics may be needed to ensure RIC lifecycle management is robust and carrier-grade.)
  • Self-check results on platform modules are to be logged.
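The self-check requirements above (a default interval, run-time reconfiguration, one request per module, logged results) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not part of the RIC platform: the `SelfCheckScheduler` class, the 30-second default, and the idea that each module is represented by a callable returning `True` when healthy. The caller supplies the current time, so the sketch makes no assumption about how the real timer is driven.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger("ric.healthcheck")  # hypothetical logger name


class SelfCheckScheduler:
    """Runs a health-check on every registered module once per interval."""

    def __init__(self, interval_s: float = 30.0) -> None:
        # 30 s is an assumed default; the source does not specify one.
        self.interval_s = interval_s
        self._checks: Dict[str, Callable[[], bool]] = {}
        self._last_run: Optional[float] = None

    def register(self, name: str, check: Callable[[], bool]) -> None:
        """Register a module and the callable that performs its health-check."""
        self._checks[name] = check

    def set_interval(self, interval_s: float) -> None:
        """Change the interval at run-time."""
        if interval_s <= 0:
            raise ValueError("interval must be positive")
        self.interval_s = interval_s

    def run_if_due(self, now: float) -> Optional[Dict[str, bool]]:
        """Run all checks if the interval has elapsed; return results or None."""
        if self._last_run is not None and now - self._last_run < self.interval_s:
            return None
        self._last_run = now
        results: Dict[str, bool] = {}
        for name, check in self._checks.items():
            try:
                ok = bool(check())
            except Exception:
                # A check that raises is treated as a failed check.
                ok = False
            results[name] = ok
            logger.info("self-check %s: %s", name, "OK" if ok else "FAILED")
        return results
```

A caller would invoke `run_if_due(time.monotonic())` from whatever loop or timer the platform provides; failed entries in the returned dictionary are the candidates for alarm/alert mapping.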


Alarms, Clearings and Notifications

  • Anomaly conditions can be encountered as part of the self-check process or during normal operation (e.g., cannot send message to another module via RMR).  
  • For each anomaly condition, the RIC and/or that RIC platform module needs to determine the severity and whether it is mappable to an alarm type.  
  • Alarms found in either case (self-check or normal operation) require a notification to be sent immediately via the O1 VES interface.
  • Alarms are to be stored and captured as part of the alarm list of the RIC.
  • Ability for each common platform module in the RIC to perform a self-check [6]
  • Implementation Option[1]: The self-check can potentially leverage Kubernetes Liveness and Readiness probes. Liveness probes can be configured to execute a command, issue an HTTP GET, or open a TCP socket against the container/pod.   Readiness probes can be configured to ensure the pod is ready before allowing it to handle traffic.  To further check a module’s (pod’s) ability to communicate with other modules over RMR (RIC Message Router), each module could subscribe to its own topic, regularly send a hello-world message to itself, and ensure it can send and receive messages.
  • Any alarm/alert conditions or clearing of alarms/alerts are sent immediately via the O1 VES interface. [7-8]
  • Health of xAPPs
    • Ability of RIC to invoke Health-check requests to each of the xAPP instances deployed on the RIC [9]
    • Ability of each xAPP to perform Health-checks on itself and respond back to the RIC [10]
      • Implementation Option: See Implementation Option above for platform modules.
    • Any alarm/alert conditions or clearing of alarms/alerts are sent immediately via the O1 VES interface. [7-8]
    • Normalize alarm conditions
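The alarm handling described in this list (keep an alarm list, notify immediately when an alarm is raised or cleared) can be sketched as follows. This is an illustration under stated assumptions, not the RIC's alarm implementation: `AlarmList`, `Notification`, the severity strings, and the condition name used in the usage note are all hypothetical, and the `send` callable merely stands in for whatever component forwards notifications over the O1 VES interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Notification:
    source: str     # platform module or xAPP that owns the condition
    condition: str  # hypothetical alarm type identifier
    severity: str   # e.g. "MAJOR", or "CLEARED" for a clearing


class AlarmList:
    """Tracks active alarms and emits a notification on every state change."""

    def __init__(self, send: Callable[[Notification], None]) -> None:
        self._send = send  # stand-in for the O1 VES sender (assumption)
        self._active: Dict[Tuple[str, str], str] = {}

    def raise_alarm(self, source: str, condition: str, severity: str) -> None:
        key = (source, condition)
        if self._active.get(key) == severity:
            return  # already active at this severity: no duplicate notification
        self._active[key] = severity
        self._send(Notification(source, condition, severity))

    def clear_alarm(self, source: str, condition: str) -> None:
        # Only notify if the alarm was actually active.
        if self._active.pop((source, condition), None) is not None:
            self._send(Notification(source, condition, "CLEARED"))

    def active(self) -> List[Notification]:
        """Return the current alarm list, e.g. to answer a query."""
        return [Notification(s, c, sev) for (s, c), sev in self._active.items()]
```

For example, `AlarmList(print).raise_alarm("e2-term", "RMR_SEND_FAILED", "MAJOR")` prints one notification; repeating the same call prints nothing, and a later `clear_alarm` prints the clearing. The same object serves both anomalies found by a self-check and anomalies met during normal operation.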

External Interfaces

For external interfaces, the RIC is responsible for checking its interface functions: the O1 Termination, A1 Mediator, and E2 Termination modules.

In addition, heartbeats or keep-alive signals over O1 are verified by the NB clients invoking the O1.  

The RIC also checks that heartbeat messages arrive from RAN resources over the E2 interface.
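One simple way to verify heartbeats is to record the arrival time of the last heartbeat per peer and flag any peer that has been silent longer than a timeout. The sketch below illustrates that idea only; `HeartbeatMonitor`, the timeout value, and the peer names are hypothetical, and how heartbeats are actually received over E2 is outside its scope.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Flags peers whose last heartbeat is older than a timeout."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s  # assumed to be configurable
        self._last_seen: Dict[str, float] = {}

    def beat(self, peer: str, now: float) -> None:
        """Record that a heartbeat from `peer` arrived at time `now`."""
        self._last_seen[peer] = now

    def overdue(self, now: float) -> List[str]:
        """Return, in sorted order, the peers that have exceeded the timeout."""
        return sorted(
            peer
            for peer, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

Each peer returned by `overdue` is an anomaly condition that could be mapped to an alarm; a later heartbeat from the same peer would be the trigger for clearing it.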

...

To support this flow, a new Health-Check functional block within the RIC is being proposed. It can be implemented as a separate software module, as a distributed function across one or more existing modules, and/or as existing capabilities already available from the underlying container infrastructure, such as Kubernetes' container/pod lifecycle management.  The Health-Check functional block has to perform the following:

  • Perform health-checks on the underlying common RIC platform functions/modules and on xAPP instances hosted on the RIC (self-checks at configured intervals and on-demand requests)
  • Map failures and anomalies to alarms and alerts
  • Send out notifications for alarms and alerts
  • Determine the state of the RIC based on alarms and alerts
  • Store health-check results for queries
  • Clear alarms and alerts when conditions clear
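As an illustration of "determine the state of the RIC based on alarms and alerts", one possible rule is to let the worst active severity decide the overall state. The sketch below assumes this rule; the severity names, their ranking, and the three state labels are hypothetical and are not defined by this page.

```python
from typing import Iterable

# Assumed severity ranking; higher means worse.
SEVERITY_RANK = {"WARNING": 1, "MINOR": 2, "MAJOR": 3, "CRITICAL": 4}


def ric_state(active_severities: Iterable[str]) -> str:
    """Derive an overall RIC state from the severities of active alarms."""
    worst = max((SEVERITY_RANK[s] for s in active_severities), default=0)
    if worst == 0:
        return "HEALTHY"    # no active alarms or alerts
    if worst <= SEVERITY_RANK["MINOR"]:
        return "DEGRADED"   # only warnings or minor alarms
    return "UNHEALTHY"      # at least one major or critical alarm
```

With this rule, clearing the last major or critical alarm moves the reported state back to degraded or healthy without any extra bookkeeping, because the state is recomputed from the alarm list each time.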

...

Figure 1 below shows the flow of RIC Self-Checks: regular heartbeats over O1 and A1, and the Health-Check Module initiating health-check requests within the RIC to assess its overall health and issuing alarms/alerts as appropriate, based on the health results.


PlantUML Macro (border: 3, align: center, title: RIC Self-Check)
@startuml
Autonumber
Skinparam sequenceArrowThickness 2
skinparam ParticipantPadding 5
skinparam BoxPadding 10

Box SMO #gold
    Participant SMO_O1 as "O1" <<OAM>>
    Participant RPGE as "Non-RT RIC" <<NONRTRIC>>
End box

Box "O-RAN RIC" #lightpink
    Participant A1TERM as "A1 MED" <<RIC>>
    Participant O1TERM as "O1 TERM" <<RIC>>
    Participant HC as "HealthCk Function" <<RIC>>
    Participant MOD as "Platform Modules" <<RIC>>
    Participant xAPP as "xAPPs" <<RICAPP>>
    Participant E2SIM as "E2 Node" <<SIM>>
End box

== Alarms/Alerts from individual RIC Platform Modules and xAPPs ==
Note over RPGE,HC
Note: Apart from Health-Checks, a platform module or xAPP may generate an alarm/alert (or
its clearing) when it encounters a failure/error (e.g., failure to reach another module)
End Note

MOD -> O1TERM : Platform Module Alarm/Alert/Clear
O1TERM -> SMO_O1 : <<O1VES>> Alarm/Alert or Clear
xAPP -> O1TERM : xAPP Alarm/Alert/Clear
O1TERM -> SMO_O1 : <<O1VES>> Alarm/Alert or Clear

== RIC Self-Checks @ Regular Intervals ==

Note over A1TERM,HC #lightsalmon
 Support HealthCheck Telemetry (FM, Heartbeat, PM)
End note

Note over HC 
 RIC Self-Checks Initiated
End note

Group loop for each Platform Module
HC -> MOD : Perform HealthCheck
Note Left 
 Support Platform Module HealthCheck
End Note
MOD -> HC : HealthCheck Status
End

HC -> O1TERM : Platform Module Alarm/Alert/Clear
O1TERM -> SMO_O1 : <<O1VES>> Alarm/Alert or Clear


Group loop for each xAPP instance deployed
HC -> xAPP : Perform HealthCheck
xAPP -> HC : HealthCheck Status
Note Left #lightsalmon
 Support xAPP HealthCheck
End note
End

HC -> O1TERM : xAPP Alarm/Alert/Clear
O1TERM -> SMO_O1 : <<O1VES>> Alarm/Alert or Clear

HC -> E2SIM : <<E2>> Generate Report
Note Left #lightsalmon : Support E2 Test Message Processing
E2SIM -> HC : <<E2>> REPORT
HC -> HC :
Note Right : Calculate report size and response times as PM measures

HC -> O1TERM : E2 Alarm/Alert/Clear
O1TERM -> SMO_O1 : <<O1VES>> Alarm/Alert or Clear

HC -> O1TERM : Store all HC results in yang model
Note Left #lightsalmon : Support Alarm Retrieval from SMO, Dashboard



Alt If HC is performed on-demand, make file available to client
Note over RPGE : Publish Results
HC -> O1TERM : Performance File Available
O1TERM -> SMO_O1 : <<O1VES>> HealthCheck Performance File Available
SMO_O1 -> O1TERM : <<O1>>Get PM Report File 
SMO_O1 -> SMO_O1 : Make PM File Available for Sharing

End


@enduml

...

[1] Implementation options are suggested at the use case level, to be further refined/finalized during the user stories phase.

...