1 Reply Latest reply on Aug 22, 2018 3:07 AM by Thirunavukkarasu Selvaraj

    Shared/Unique Config and Monitoring

    Bob Van Valzah New Member

      We want to maintain a single config file shared across several applications, yet we also want to have different parameters for some apps than others. I have a foggy memory that the sanctioned way to do this is for every UM app to set an application name before context creation? Then pattern matching in the config file can tune settings to taste, right?

       

      Please refresh my memory of the call to set the application name.

       

      Is there any other mechanism to select portions of a centralized config file on a per-application basis?

       

      I think Henry Wong mentioned that there were two different receiver NAK generation stats--one for NAKs sent the first time and one for "ReNAKs" that had to be repeated. I might've misunderstood. What are those stats?

       

           Thanks,

       

      Bob

        • 1. Re: Shared/Unique Config and Monitoring
          Thirunavukkarasu Selvaraj Active Member

          Hi Bob,

           

          Long time. Hope you're doing well.

           

          XML Config:

           

          Yes, the application name should be set before any other UM objects are created. This lets the library load the appropriate config parameters for the different objects (context/source/receiver/event queue). There are a few ways you could set the application name:

          • API: The application can call lbm_config_xml_file() or lbm_config_xml_string() to load the XML config. Both functions take the application name as their second argument.
          • Environment variables: You could set LBM_XML_CONFIG_FILENAME (the location of the XML config file) and LBM_XML_CONFIG_APPNAME (the application name).
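
For the environment-variable route, here is a minimal sketch of what a small launcher could do before starting a UM application. Only the two variable names come from the list above; the helper name set_um_xml_env and the file and application names in the usage note are hypothetical, and I'm assuming the variables are exported before the UM process starts so that it inherits them.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the UM API): exports the two variables
 * named above so that a UM application started afterwards inherits them.
 * Returns 0 on success, -1 on bad input or if setenv() fails. */
int set_um_xml_env(const char *xml_file, const char *app_name)
{
    if (xml_file == NULL || app_name == NULL ||
        xml_file[0] == '\0' || app_name[0] == '\0') {
        return -1;
    }
    if (setenv("LBM_XML_CONFIG_FILENAME", xml_file, 1) != 0) {
        return -1;
    }
    if (setenv("LBM_XML_CONFIG_APPNAME", app_name, 1) != 0) {
        return -1;
    }
    /* Sanity check: both variables are now visible in this process. */
    assert(getenv("LBM_XML_CONFIG_FILENAME") != NULL);
    assert(getenv("LBM_XML_CONFIG_APPNAME") != NULL);
    return 0;
}
```

A launcher might call set_um_xml_env("um_shared.xml", "order_gateway") and then exec the real application, so each app gets the same file but its own application name.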

          All our UM documents are available online now. Here's the link to the UM 6.11 Configuration Guide. Please refer to section 1.4, XML Configuration Files.

           

          From here you could create different templates and apply them to the applications. Templates could be applied at multiple levels - application, context, sources, topics, receivers/wildcard receivers, event queues, etc. You could also apply multiple templates to the same object.

           

          NAK Stats:

           

          I believe Henry might be talking about the stats below:

          • nak_tx_min: Minimum number of times per lost message that a receiver transport transmitted a NAK, i.e., the lowest value collected so far. A value greater than 1 indicates a chronically lossy network.
          • nak_tx_mean: Mean number of times per lost message that a receiver transport transmitted a NAK. Ideally this should be at or near 1. A higher value indicates a lossy network. This is an exponentially weighted moving average (weighted to more recent) for accumulated NAKs per lost message.
          • nak_tx_max: Maximum number of times per lost message that a receiver transport transmitted a NAK, i.e., the highest value collected so far. A value higher than 1 suggests that there may have been some unrecoverable loss on the network during the sample period. A significantly high value (compared to the mean number) implies an isolated incident.
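
To make the rules of thumb above concrete, here is a rough sketch that turns the three counters into flags. The struct is a hypothetical stand-in that only borrows the field names (it is not the UM stats structure), and spike_factor is my own assumed threshold for what counts as "significantly high" compared to the mean.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three counters described above. */
typedef struct {
    unsigned long nak_tx_min;
    unsigned long nak_tx_mean;
    unsigned long nak_tx_max;
} nak_tx_sample;

enum {
    NAK_CHRONIC_LOSS   = 1 << 0, /* nak_tx_min  > 1 */
    NAK_LOSSY_NETWORK  = 1 << 1, /* nak_tx_mean > 1 */
    NAK_REPEATED_NAKS  = 1 << 2, /* nak_tx_max  > 1 */
    NAK_ISOLATED_SPIKE = 1 << 3  /* max far above mean (assumed threshold) */
};

/* Applies the heuristics from the list above and returns a bitmask. */
unsigned classify_nak_tx(const nak_tx_sample *s, unsigned long spike_factor)
{
    unsigned flags = 0;
    assert(s != NULL);
    if (s->nak_tx_min > 1) {
        flags |= NAK_CHRONIC_LOSS;
    }
    if (s->nak_tx_mean > 1) {
        flags |= NAK_LOSSY_NETWORK;
    }
    if (s->nak_tx_max > 1) {
        flags |= NAK_REPEATED_NAKS;
        /* A max well above the mean points at an isolated incident. */
        if (s->nak_tx_mean > 0 &&
            s->nak_tx_max >= spike_factor * s->nak_tx_mean) {
            flags |= NAK_ISOLATED_SPIKE;
        }
    }
    return flags;
}
```

With a spike_factor of 4, a sample of min 1, mean 1, max 8 would come back as repeated NAKs plus an isolated spike, while min 2, mean 3, max 4 would come back as chronic loss on a lossy network with no spike.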

           

          There are a few other stats useful for identifying crybaby receiver(s) clogging retransmissions (RX):

          • ncfs_rx_delay: Number of NCFs received with reason code "rx_delay". When a source transport's retransmit rate limiter prevents it from immediately retransmitting any more lost datagrams, it responds to a NAK by sending an "NCF rx_delay", then queues the retransmission for a later send. The receiver transport should wait for the retransmission and not immediately send another NAK. If this count is high, one or more crybaby receiver transports may be clogging the source transport's retransmit queue.
          • ncfs_shed: Number of NCFs received with reason code "shed". When a source transport's retransmit queue and rate limiter are both at maximum, it responds to a NAK by sending an "NCF shed", and does not retransmit. The receiver transport should wait, then send another NAK. If this count is high, one or more crybaby receiver transports may be clogging the source transport's retransmit queue.
          • nak_stm_min: Minimum time (in milliseconds), i.e., the shortest time recorded so far for a lost message to be recovered. If this time is greater than configuration option transport_lbtrm_nak_backoff_interval, it may be taking multiple NAKs to initiate retransmissions, indicating a lossy network.
          • nak_stm_mean: Mean time (in milliseconds) in which loss recovery was accomplished. This is an exponentially weighted moving average (weighted to more recent) for accumulated measured recovery times. Ideally this field should be as close to your minimum recovery time (nak_stm_min, above) as possible. High mean recovery times indicate a lossy network.
          • duplicate_data: Number of duplicate LBT-RM datagrams received. A large number can indicate a lossy network, primarily due to other receiver transports requesting retransmissions that this receiver transport has already successfully received. Such duplicates require extra effort for filtering, and this should be investigated.
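
A similar sketch for two of the checks above: whether the NCF counters look high, and whether the minimum recovery time exceeds the NAK backoff interval. Again the struct is a hypothetical stand-in using the stat names, ncf_limit is an assumed threshold you would have to pick for your own network, and backoff_interval_ms is whatever you have configured for transport_lbtrm_nak_backoff_interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in holding a few of the counters described above. */
typedef struct {
    unsigned long ncfs_rx_delay; /* NCFs received with reason "rx_delay" */
    unsigned long ncfs_shed;     /* NCFs received with reason "shed" */
    unsigned long nak_stm_min;   /* shortest recovery time, milliseconds */
} rx_health_sample;

/* True if either NCF counter exceeds the caller's (assumed) limit,
 * which may mean crybaby receivers are clogging the retransmit queue. */
bool crybaby_suspected(const rx_health_sample *s, unsigned long ncf_limit)
{
    assert(s != NULL);
    return s->ncfs_rx_delay > ncf_limit || s->ncfs_shed > ncf_limit;
}

/* True if even the fastest recovery took longer than the NAK backoff
 * interval, i.e. it may be taking multiple NAKs to get a retransmission. */
bool recovery_needs_multiple_naks(const rx_health_sample *s,
                                  unsigned long backoff_interval_ms)
{
    assert(s != NULL);
    return s->nak_stm_min > backoff_interval_ms;
}
```

Since the counters accumulate over the sample period, comparing them against a fixed limit only makes sense per sampling interval; that part is left to the caller here.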

          I hope this is what you're looking for.

           

          Thanks!

           

          Kind regards,

          Thiru
