Related Jira

OAM-1



OAM interface specification

The figure below shows the generic principles of reading and writing operations via TLS/NetConf/YANG for configuration management.

The subject of this discussion is implementing a mechanism which ensures that the NetConf session is closed after the NetConf operations have completed.

Figure: "O-RAN-SC O1-Interface: Terminate NetConf Session" (sequence diagram, PlantUML, Apache 2.0, o-ran-sc.org, 2019-07-27). The Management-Service (MnS) Consumer (OAM Controller, NetConf client) and the Management-Service (MnS) Provider (ManagedElement, NetConf server) interact as follows (simplified):

[01] Establish NetConf session over tcp/tls/netconf/830 (hello exchange, ...)
[02] NETCONF <rpc><edit-config> or other operations
[03] NETCONF <rpc-reply><OK> or <rpc-reply><rpc-error><error-xxx>
[04] Terminate NetConf session
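To make the message sequence above concrete, the following is a minimal sketch (using only the Python standard library; the helper names are illustrative, not part of any NetConf library) of how the client-side RPC envelopes for steps [02] and [04] are built per RFC 6241:

```python
import xml.etree.ElementTree as ET

# NetConf base namespace defined in RFC 6241.
NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def make_rpc(message_id: str, operation: ET.Element) -> bytes:
    """Wrap a NetConf operation in an <rpc> envelope with a message-id."""
    rpc = ET.Element(f"{{{NC_NS}}}rpc", {"message-id": message_id})
    rpc.append(operation)
    return ET.tostring(rpc)


def close_session() -> ET.Element:
    """Step [04]: the <close-session> operation that gracefully
    terminates the NetConf session."""
    return ET.Element(f"{{{NC_NS}}}close-session")


msg = make_rpc("101", close_session())
```

A real client would of course send this over the established TLS/SSH transport and wait for the matching <rpc-reply> before dropping the connection.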

Typical NetConf client behavior

Usually, NetConf clients do not terminate the NetConf session automatically. With respect to ONAP-CCSDK/OpenDaylight: once ODL is made aware of a NetConf server by creating a mountpoint in odl-netconf-topology, it has an automated mechanism that keeps trying to connect to that NetConf server. In case the NetConf session is lost, it will automatically reconnect.

This mechanism needs to be "disabled", and the NetConf session termination request must be implemented.
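The behavioral difference can be sketched as follows (a toy model, not actual ODL code; all class and method names are hypothetical):

```python
class NetconfSessionManager:
    """Toy model of the two connection policies discussed above:
    'permanent' mimics today's ODL behavior (auto-reconnect on loss);
    non-permanent closes the session once a transaction is done."""

    def __init__(self, permanent: bool = True):
        self.permanent = permanent
        self.connected = False

    def connect(self):
        # Stands in for the TCP/TLS setup and NetConf <hello> exchange.
        self.connected = True

    def on_session_lost(self):
        self.connected = False
        if self.permanent:
            # Today's ODL behavior: automatically re-establish.
            self.connect()

    def run_transaction(self):
        if not self.connected:
            self.connect()
        # ... <lock> / <edit-config> / <unlock> would happen here ...
        if not self.permanent:
            # New O1 behavior: send <close-session> after the transaction.
            self.connected = False
```

The key point is that both the auto-reconnect and the post-transaction disconnect live in the same component, which is why a single flag can switch between them.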

Proposed solution

Add an optional attribute called 'permanent-connection' to the grouping 'netconf-node-connection-parameters' in netconf-node-topology.yang, with type boolean and default value true.

This way, applications controlling the NetConf connection can distinguish between the currently existing behavior and the new O1-interface behavior. Existing implementations are not affected, because the new attribute is optional and its default value corresponds to the current behavior.
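In pseudocode terms (the dictionary access is purely illustrative, not ODL's actual configuration handling), a client honoring the proposed default would evaluate the leaf like this:

```python
def is_permanent(connection_params: dict) -> bool:
    """Return the effective value of the optional leaf: an absent
    'permanent-connection' means true, i.e. today's behavior."""
    return connection_params.get("permanent-connection", True)


# Only an explicit opt-out selects the new O1 behavior:
legacy = {}                               # old payload, leaf absent
o1 = {"permanent-connection": False}      # new payload, explicit opt-out
```

This is what makes the change backward compatible: payloads written before the leaf existed evaluate exactly as they do today.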


Proposed modification of netconf-node-topology.yang:


Proposed change
 ...
 149         }
 new
 new         leaf permanent-connection {
 new             config true;
 new             type boolean;
 new             default true;
 new             description
 new                 "If false, the connector disconnects by sending <close-session>
 new                  after a transaction is completed - usually after the response
 new                  to <unlock>.";
 new         }
 150 
 151         leaf connection-timeout-millis {
 152             description "Specifies timeout in milliseconds after which connection must be established.";
 ...
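For illustration, a RESTCONF payload creating a mountpoint could then carry the new leaf. This is a sketch only: apart from the new leaf, the element names follow the existing odl-netconf-topology model, and the exact namespaces should be taken from the installed models; the node-id and host values are made up.

```xml
<node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
  <node-id>o-ru-1</node-id>
  <host xmlns="urn:opendaylight:netconf-node-topology">10.20.30.40</host>
  <port xmlns="urn:opendaylight:netconf-node-topology">830</port>
  <!-- credentials and other connection parameters omitted -->
  <permanent-connection
      xmlns="urn:opendaylight:netconf-node-topology">false</permanent-connection>
</node>
```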



8 Comments

  1. What method is proposed to restart the session?

    Applications doing this with a custom NetConf adapter in ODL basically end up creating and deleting the connection. The permanent-connection flag implies there is a matching function to have ODL restart the connection; otherwise the requesting client could simply delete the ODL connection after it is no longer needed, without any changes to ODL.

    1. Brian, not sure if I got your point. Are you thinking about a software bundle between odl-netconf (client) and the NetConf server of the network functions? If so, it would act from the ODL point of view as a NetConf server and would terminate the SSH session - right? In any case, from the point of view of the NetConf server running on the network function, the whole NetConf hello-message process must be repeated each time a new connection is created. Am I right? Please see Architecture discussion for Non-Permanent NetConf Sessions

  2. What I do not understand is why does O-RAN feel the need to terminate the NETCONF session every time it thinks it is done with the configuration? How does it know when the next configuration might be requested? And is it more efficient for ODL or any other NETCONF client to terminate and re-establish the connection every time a configuration change needs to be made?

    1. In O-RAN, the RAN architecture becomes disaggregated. Assuming a hybrid model down to the O-RU, the number of RAN elements will increase by at least an additional order of magnitude. This would then be millions of devices in a large network. The number of servers required just to maintain millions of SSH sessions is a significant overhead. It is not expected that there will be continuous configuration changes, as this would destabilize the network. Fine-grained tuning is expected to occur through the near-RT RIC, with temporary adjustments over the E2 interface. Therefore it is not really cost effective to have the servers required to maintain permanent NETCONF sessions to millions of devices when there really is nothing for the controller to do with the device.

      1. In that case, why not let the underlying protocol do what it does best? You do have idle timeout in TCP that will detect that there is no activity on a NETCONF session, and close the connection. This assumes of course that a keepalive timer has not been set on the TCP connection, which can be done as a policy. But in either case, the semantics of the underlying protocol deciding when a connection is inactive, how many resources are in use and can be closed, is better than a NETCONF client trying to guess when the next configuration change might happen. An <unlock> is a good indication of when the particular transaction is done, but not an indicator of whether the NETCONF client is done with making all the changes it needs to do.
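The keepalive policy alluded to in this comment is configured per socket. A minimal Python sketch (the timer values are arbitrary examples, and the fine-grained probe knobs are platform-specific, shown here for Linux):

```python
import socket

# Let the transport detect a dead peer instead of guessing at the
# application layer when the next config change might come.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Probe timing is OS-specific; these options exist on Linux.
if hasattr(socket, "TCP_KEEPIDLE"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 120)  # idle secs before probing
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30)  # secs between probes
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 4)     # failed probes before drop

keepalive_enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
s.close()
```

Note this only detects a dead connection; it does not by itself close an idle-but-healthy session, which is the resource concern raised above.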

        1. Agreed that 1 connection = 1 transaction would be a terrible way to implement this kind of feature.

          The "use what TCP provides us" idea is sound; however, the current behavior is an automatically re-establishing connection. Allowing connections to simply time out would keep the impact on the current behavior minimal, to be sure.

      2. There are some very large assumptions being made here – specifically regarding the number of servers required to maintain SSH sessions.  No one has yet produced this data, done the tuning, done it at a reasonably representative scale, let alone compared it to the massive cost of re-establishing that SSH+Netconf handshake every time a service needs to <get> or <get-config> from the million-plus elements.

        Lumina Networks will be performing this data collection over the next few months, making it public, and providing tuning and recommendations.  Then we can have a proper discussion based on the data, and determine not only the controller-to-element ratio, but specific costs for such transactions (at scale) for comparison.   In my significant experience with massively-scaled Netconf, "tuning is everything." 

        The 2nd part of the conversation is more interesting, re: how often we'll actually be sending transactions over the netconf connection.  This needs to include both "sunny day" and "rainy day" scenarios.


        1. Precisely. That was going to be my next question.

          There is a cost that is associated with closing and establishing a new connection every time a config change needs to be pushed. Have we evaluated that cost vs. keeping the NETCONF session up all the time (or at least till TCP times it out)?

          Reminds me of HTTP(S) 1.0 to 1.1 where we went from carrying one payload per connection to persistent connections.
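One way to frame that cost evaluation, with every number an explicitly made-up placeholder until real measurements (such as those proposed above) exist:

```python
# Back-of-envelope comparison of the two policies. All constants are
# purely hypothetical placeholders, not measured values.
handshake_cost_ms = 250.0          # TCP + SSH/TLS + NetConf <hello>, assumed
rpc_cost_ms = 20.0                 # one <edit-config>/<rpc-reply>, assumed
keepalive_cost_ms_per_hour = 5.0   # amortized probe traffic, assumed


def non_permanent_cost(transactions_per_hour: float) -> float:
    """Re-establish the session for every transaction."""
    return transactions_per_hour * (handshake_cost_ms + rpc_cost_ms)


def permanent_cost(transactions_per_hour: float) -> float:
    """Keep the session up; pay only the RPC plus keepalive overhead."""
    return transactions_per_hour * rpc_cost_ms + keepalive_cost_ms_per_hour
```

Under these assumptions, tearing the session down wins only when transactions are rare, which is exactly the crossover the measurements would need to locate.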