NVIDIA/nv-bmc-shmem


Description: Shared memory IPC on BMC

Language: C++

License: Apache-2.0

Stars: 15

Forks: 6

Open issues: 0

Created: 2024-04-18T15:54:14Z

Pushed: 2026-06-13T15:49:51Z

Default branch: develop

Fork: no

Archived: no

README:

Library for Shared Memory based Sensor Aggregation

Authors: Chinmay Shripad Hegde, Rohit Pai

Created: 2023-12-27

Problem Description

In the existing architecture, bmcweb makes ObjectManager-based D-Bus calls and processes every property in the response to filter out and prepare the properties required for MRD URIs. This approach has performance impacts, such as a higher percentage of PID loop outliers and a maximum MRD TAT that crosses the SLAs.

Background and References

The shared memory sensor repository acts as a cache for bmcweb, in which all the sensor data and metric values required to prepare the response to MRD requests are readily available. The repository is periodically updated by the sensor aggregator library, which gets the data from individual sensor producers. Sensor producers use the shared memory APIs to insert and update the MRD values. This significantly reduces the processing time of MRD requests in bmcweb, which helps to meet the SLAs.

References

| Content | Link |
| --- | --- |
| Redfish Telemetry Performance | https://docs.google.com/presentation/d/1Srm3HF2QK7uQGdDESiIBHUAMPdG0nOXfHvY94QH18uo/edit#slide=id.g25866484049_0_0 |
| TAL Feasibility Study and Recommendation for Umbriel | https://docs.google.com/document/d/1UanEEKkR5ffHC1_QCb55f9HONJ3zH9e8ocVhRK0gc6Q/edit#heading=h.vvyjokuu77rq |
| Requirements | https://docs.google.com/document/d/1W--n4Mu_twg8rsMzq8JtJmyqOaWnEVHluvHbr0BVvEE/edit |
| Telemetry SLAs | https://docs.google.com/document/d/1nP8F1Xc6kB_qU5xSr0NDQ5lQL1bTkEux0O3Rrmg8PUw/edit#heading=h.1jwg24jxwkrr |

Requirements

External API requirements

  • API for data producers to update telemetry objects in shared memory.
  • API for clients such as bmcweb to read telemetry objects for a given metric namespace.

Shared Memory internal requirements:

  • The Shmem library must support creation of shared memory segments which can be used as fast IPC between multiple processes.
  • The Shmem library must abstract the internal memory layout required to save the objects.
  • The Shmem library must support unified object access APIs.
  • The Shmem library must support efficient access to a single object.
  • The Shmem library APIs must ensure data integrity and protection in a multi-process environment.
  • The Shmem library must provide a CLI tool which can be invoked from the shell to read, update and erase objects in the shared memory segments.

Proposed Design

                    ┌─────────────────────────────────┐
                    │                                 │
┌────────┐          │  ┌──────────────────────────┐   │
│        │          │  │                          │   │
│ Client ├──────────┼──► getAllMRDValues          │   │
│        │          │  │                          │   │
└────────┘          │  └────────────▲─────────────┘   │
                    │               │                 │             ┌─────────────────┐
                    │               └──────Read───────┼─────────────►                 │
                    │               ┌──────Write──────┼─────────────►  Shared Memory  │
                    │               │                 │             └─────────────────┘
                    │  ┌────────────┴─────────────┐   │
┌────────────┐      │  │ - namespaceInit          │   │
│            │      │  │ - updateTelemetry        │   │
│Producer - 1├───┬──┼──►                          │   │
│            │   │  │  └──────────────────────────┘   │
└────────────┘   │  │                                 │
                 │  │      Shared Memory Library      │
┌────────────┐   │  │                                 │
│            │   │  └─────────────────────────────────┘
│Producer - 2├───┘
│            │
└────────────┘

Sensor Producer Workflow

Producers use the sensor aggregator library APIs to:

  • Initialize the producer name
  • Create sensor objects in the shared memory
  • Update the Value and/or Timestamp values

Bmcweb Workflow

  • The sensor aggregator library provides an API to get all objects from an MRD namespace.
  • Based on the MRD requested in the URI, the library creates a list of shared memory namespaces whose metric type matches the metric type requested in the URI. It then reads all objects from each of these shared memory regions and aggregates them.
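The producer and bmcweb workflows above can be sketched as follows. This is a minimal illustration, not the library's implementation: a `std::map` stands in for the shared memory segments, and the names `SensorValue`, `Repository`, `updateObject` and `readAll` are hypothetical. It also assumes that a shared memory namespace name is the metric type followed by an underscore and a suffix (as in `MemoryMetrics_0`), so matching is done on that prefix.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one telemetry object held in shared memory.
struct SensorValue
{
    std::string value;
    std::uint64_t timestamp;
};

// Hypothetical in-process model: namespace name -> (object key -> value).
// The real library stores these objects in shared memory segments.
using Segment = std::map<std::string, SensorValue>;
using Repository = std::map<std::string, Segment>;

// Producer side: creates the object on first use, overwrites it afterwards.
void updateObject(Repository& repo, const std::string& shmNamespace,
                  const std::string& key, const std::string& value,
                  std::uint64_t timestamp)
{
    repo[shmNamespace][key] = SensorValue{value, timestamp};
}

// Client side: reads every object from each namespace whose name starts
// with "<metricType>_" (assumed naming scheme) and aggregates the results.
std::vector<std::pair<std::string, SensorValue>>
    readAll(const Repository& repo, const std::string& metricType)
{
    const std::string prefix = metricType + "_";
    std::vector<std::pair<std::string, SensorValue>> result;
    for (const auto& segment : repo)
    {
        if (segment.first.compare(0, prefix.size(), prefix) != 0)
        {
            continue; // namespace belongs to a different metric type
        }
        for (const auto& object : segment.second)
        {
            result.emplace_back(object.first, object.second);
        }
    }
    return result;
}
```

With two producers writing to `MemoryMetrics_0` and `MemoryMetrics_1`, a call to `readAll(repo, "MemoryMetrics")` returns the objects of both namespaces in one list, while objects in `NVSwitchMetrics_0` are skipped.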

Configuration json for metric property mapping

A configuration file maps each MRD namespace to the shared memory namespaces exposed by the sensor provider services. The SensorNamespaces key contains one rule per sensor namespace; each rule is a combination of Namespace, ObjectpathKeywords and PropertyList.

Note: By default, the shm_namespace_config.json file in the configurations directory is used. Override this file in your platform recipe file as required.

If you need to add a new property to an existing metric:

  • Identify the namespace
  • Verify the object path, and add a new object path if necessary
  • Update the property in the identified section

If you need to add a new namespace:

  • Update the namespace and size in the [shm_mapping.json](#shared-memory-mapping-json) file
  • Add a new entry in the shm_namespace_config.json file with the namespace, object path keyword and property details

An example of this configuration file is given below.

{
  "SensorNamespaces": [
    {
      "Namespace": "PlatformEnvironmentMetrics",
      "ObjectpathKeywords": "sensors/temperature",
      "PropertyList": ["Value"]
    },
    {
      "Namespace": "MemoryMetrics",
      "ObjectpathKeywords": "memory",
      "PropertyList": [
        "MemoryConfiguredSpeedInMhz",
        "Utilization",
        "ceCount",
        "ueCount",
        "RowRemappingFailureState",
        "ceRowRemappingCount",
        "ueRowRemappingCount"
      ]
    }
  ]
}
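One way to read these rules is sketched below. It assumes that a rule applies when the object path contains the ObjectpathKeywords string and the property name appears in PropertyList; the README does not spell out the exact matching semantics, so treat the substring match as an assumption. The `NamespaceRule` struct, the `matchNamespace` function and the object paths in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical in-memory form of one entry of "SensorNamespaces".
struct NamespaceRule
{
    std::string nameSpace;                 // "Namespace"
    std::string objectPathKeyword;         // "ObjectpathKeywords"
    std::vector<std::string> propertyList; // "PropertyList"
};

// Returns the namespace of the first rule that accepts the given object
// path and property, or an empty string when no rule matches.
// Assumption: the keyword is matched as a substring of the object path.
std::string matchNamespace(const std::vector<NamespaceRule>& rules,
                           const std::string& objectPath,
                           const std::string& property)
{
    for (const auto& rule : rules)
    {
        const bool pathMatches =
            objectPath.find(rule.objectPathKeyword) != std::string::npos;
        const bool propertyListed =
            std::find(rule.propertyList.begin(), rule.propertyList.end(),
                      property) != rule.propertyList.end();
        if (pathMatches && propertyListed)
        {
            return rule.nameSpace;
        }
    }
    return "";
}

// Rules taken from the example configuration (property list shortened).
std::vector<NamespaceRule> exampleRules()
{
    return {
        {"PlatformEnvironmentMetrics", "sensors/temperature", {"Value"}},
        {"MemoryMetrics", "memory", {"Utilization", "ceCount", "ueCount"}},
    };
}
```

Under these assumptions, a made-up path such as `/xyz/openbmc_project/sensors/temperature/TEMP_0` with property `Value` maps to `PlatformEnvironmentMetrics`, while the same path with property `Utilization` matches no rule.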

Shared memory mapping json

This file lists the producers that update each sensor namespace. For each namespace:

  • The Producers key contains the list of processes which update shared memory for the given namespace.
  • The shared memory size is specified in bytes in the SizeInBytes field.

Note: By default, the shm_mapping.json file in the configurations directory is used. Override this file in your platform recipe file as required.

SizeInBytes is currently defined as shown below. The calculation is based on the size used on the HGX platform for each of the metric namespaces.

The used size is calculated using the managed_shared_memory.get_free_memory() API. The used size is then adjusted to a nearby power of two in bytes.

| Namespace | Size on system (KB) | Allocated size |
| --- | --- | --- |
| MemoryMetrics_0 | 46.1328125 | 128 KB |
| NVSwitchMetrics_0 | 40.4453125 | 128 KB |
| NVSwitchPortMetrics_0 | 894.4765625 | 1024 KB |
| PlatformEnvironmentMetrics_0 | 55.921875...
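The sizing arithmetic can be sketched as follows. This assumes the used size is the segment size minus the value reported by get_free_memory(), and rounds it up to the next power of two; both helper names are hypothetical. Note that some allocations in the table (for example 128 KB for about 46 KB used) are larger than the next power of two, so the result of this sketch is better read as a lower bound than as the exact rule used for the table.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: used bytes = total segment size - free bytes, where the free
// bytes come from managed_shared_memory::get_free_memory().
std::size_t usedBytes(std::size_t segmentSize, std::size_t freeBytes)
{
    return segmentSize - freeBytes;
}

// Rounds a byte count up to the next power of two. Values that are already
// a power of two are returned unchanged. The loop stops before the shift
// could overflow std::size_t.
std::size_t roundUpToPowerOfTwo(std::size_t bytes)
{
    std::size_t power = 1;
    while (power < bytes && power <= SIZE_MAX / 2)
    {
        power <<= 1;
    }
    return power;
}
```

For NVSwitchPortMetrics_0, 894.4765625 KB is 915944 bytes, which rounds up to 1048576 bytes (1024 KB) and agrees with the table. For MemoryMetrics_0, 46.1328125 KB is 47240 bytes, which rounds up to 65536 bytes (64 KB), below the 128 KB actually allocated.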
