NVIDIA/nv-bmc-shmem
C++
Description: Shared memory IPC on BMC
Language: C++
License: Apache-2.0
Stars: 15
Forks: 6
Open issues: 0
Created: 2024-04-18T15:54:14Z
Pushed: 2026-06-13T15:49:51Z
Default branch: develop
Fork: no
Archived: no
README:
Library for Shared Memory based Sensor Aggregation
Authors: Chinmay Shripad Hegde, Rohit Pai
Created: 2023-12-27
Problem Description
In the existing architecture, bmcweb makes ObjectManager-based D-Bus calls and processes every property in the response to filter out and prepare the properties required for MRD URIs. The performance impact of this approach includes a higher percentage of PID loop outliers and a maximum MRD turnaround time (TAT) that exceeds the SLAs.
Background and References
The shared memory sensor repository acts as a cache for bmcweb: all the sensor data and metric values needed to build the response to an MRD request are readily available in it. The repository is periodically updated by the sensor aggregator library, which gets the data from individual sensor producers. Sensor producers use the shared memory APIs to insert and update MRD values. This significantly reduces the time bmcweb spends processing MRD requests, which helps meet the SLAs.
References
| Content | Link |
| --- | --- |
| Redfish Telemetry Performance | https://docs.google.com/presentation/d/1Srm3HF2QK7uQGdDESiIBHUAMPdG0nOXfHvY94QH18uo/edit#slide=id.g25866484049_0_0 |
| TAL Feasibility Study and Recommendation for Umbriel | https://docs.google.com/document/d/1UanEEKkR5ffHC1_QCb55f9HONJ3zH9e8ocVhRK0gc6Q/edit#heading=h.vvyjokuu77rq |
| Requirements | https://docs.google.com/document/d/1W--n4Mu_twg8rsMzq8JtJmyqOaWnEVHluvHbr0BVvEE/edit |
| Telemetry SLAs | https://docs.google.com/document/d/1nP8F1Xc6kB_qU5xSr0NDQ5lQL1bTkEux0O3Rrmg8PUw/edit#heading=h.1jwg24jxwkrr |
Requirements
External API requirements
- API for data producers to update telemetry objects in shared memory.
- API for clients such as bmcweb to read telemetry objects for a given metric
namespace.
Shared Memory internal requirements:
- The Shmem library must support creation of shared memory segments which can be
used as fast IPC between multiple processes.
- The Shmem library must abstract the internal memory layout required to save
the objects.
- The Shmem library must support unified object access APIs.
- The Shmem library must support efficient access to a single object.
- The Shmem library APIs must ensure data integrity and protection in a
multi-process environment.
- The Shmem library must provide a CLI tool which can be invoked from the shell to
read, update and erase objects in the shared memory segments.
Proposed Design
                    ┌─────────────────────────────────┐
                    │                                 │
┌────────┐          │  ┌──────────────────────────┐   │
│        │          │  │                          │   │
│ Client ├──────────┼──►     getAllMRDValues      │   │
│        │          │  │                          │   │
└────────┘          │  └────────────▲─────────────┘   │
                    │               │                 │             ┌─────────────────┐
                    │               └──────Read───────┼─────────────►                 │
                    │               ┌──────Write──────┼─────────────►  Shared Memory  │
                    │               │                 │             └─────────────────┘
                    │  ┌────────────┴─────────────┐   │
┌────────────┐      │  │ - namespaceInit          │   │
│            │      │  │ - updateTelemetry        │   │
│Producer - 1├───┬──┼──►                          │   │
│            │   │  │  └──────────────────────────┘   │
└────────────┘   │  │                                 │
                 │  │      Shared Memory Library      │
┌────────────┐   │  │                                 │
│            │   │  └─────────────────────────────────┘
│Producer - 2├───┘
│            │
└────────────┘
Sensor Producer Workflow
Producers use the sensor aggregator library APIs to:
- Initialize the producer name
- Create sensor objects in shared memory
- Update the Value and/or Timestamp values
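The three producer steps above can be sketched with an in-process stand-in. The names `namespaceInit` and `updateTelemetry` appear in the design diagram, but the signatures, the `MockNamespace` class and the `SensorObject` struct below are assumptions made for illustration; a `std::map` takes the place of the real shared memory segment.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record for one sensor object. The real library stores
// such objects in a shared memory segment, not in process-local memory.
struct SensorObject {
    std::string value;
    std::uint64_t timestampMs = 0;
};

// In-process mock of a single shared memory namespace (illustration only).
class MockNamespace {
  public:
    // Step 1: register the producer name (assumed signature).
    void namespaceInit(const std::string& producerName) {
        producer_ = producerName;
    }

    // Steps 2 and 3: create the object on first use, then update its
    // value and timestamp. Returns true when a new object was created.
    bool updateTelemetry(const std::string& key, const std::string& value,
                         std::uint64_t timestampMs) {
        assert(!producer_.empty() && "call namespaceInit first");
        auto [it, inserted] = objects_.try_emplace(key);
        it->second.value = value;
        it->second.timestampMs = timestampMs;
        return inserted;
    }

    const std::map<std::string, SensorObject>& objects() const {
        return objects_;
    }

  private:
    std::string producer_;
    std::map<std::string, SensorObject> objects_;
};
```

A producer would call `namespaceInit` once at startup and `updateTelemetry` on every sensor refresh; the first call for a given key creates the object and later calls only overwrite the value and timestamp.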
Bmcweb Workflow
- The sensor aggregator library provides an API to get all objects from an MRD
namespace.
- Based on the MRD requested in the URI, the library builds a list of the shared
memory namespaces whose metric type matches the requested one. It then reads all objects from each of those shared memory regions and aggregates them.
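The read path described above can be sketched as a select-then-merge loop. This sketch assumes that shared memory namespaces are named `<MetricType>_<suffix>` (for example `MemoryMetrics_0`); the type aliases and the functions `matchesMetricType` and `aggregateForMrd` are hypothetical, and plain maps stand in for the shared memory regions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Objects = std::map<std::string, std::string>;  // object key -> value
using Segments = std::map<std::string, Objects>;     // namespace -> objects

// True when shmNamespace looks like "<metricType>_<suffix>".
// The naming convention is an assumption of this sketch.
bool matchesMetricType(const std::string& shmNamespace,
                       const std::string& metricType) {
    const std::string prefix = metricType + "_";
    return shmNamespace.size() > prefix.size() &&
           shmNamespace.compare(0, prefix.size(), prefix) == 0;
}

// Collect every object from every namespace whose metric type matches.
std::vector<std::pair<std::string, std::string>>
aggregateForMrd(const Segments& segments, const std::string& metricType) {
    std::vector<std::pair<std::string, std::string>> result;
    for (const auto& [name, objects] : segments) {
        if (!matchesMetricType(name, metricType)) {
            continue;  // different metric type, skip this region
        }
        result.insert(result.end(), objects.begin(), objects.end());
    }
    return result;
}
```

Matching on the full `<MetricType>_` prefix keeps `NVSwitchMetrics` from also picking up `NVSwitchPortMetrics_0`.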
Configuration json for metric property mapping
A configuration file maps each MRD namespace to the shared memory namespaces exposed by the sensor provider services. The SensorNamespaces key contains one rule per sensor namespace; each rule is a combination of Namespace, ObjectpathKeywords and PropertyList.
Note: By default, the shm_namespace_config.json file in the configurations directory is used. Override this file in your platform recipe file as required.
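One plausible reading of these rules is sketched below: a property update belongs to a namespace when its D-Bus object path contains the rule's ObjectpathKeywords string and the property name is in the rule's PropertyList. The matching semantics, the `NamespaceRule` struct and the `findNamespace` function are assumptions for illustration, not the library's confirmed behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// In-memory form of one entry under "SensorNamespaces".
struct NamespaceRule {
    std::string nameSpace;                  // "Namespace"
    std::string objectPathKeyword;          // "ObjectpathKeywords"
    std::vector<std::string> propertyList;  // "PropertyList"
};

// Return the namespace of the first rule that matches both the object
// path (substring match, assumed) and the property name (exact match).
std::optional<std::string> findNamespace(
    const std::vector<NamespaceRule>& rules, const std::string& objectPath,
    const std::string& property) {
    for (const auto& rule : rules) {
        const bool pathMatches =
            objectPath.find(rule.objectPathKeyword) != std::string::npos;
        const bool propertyListed =
            std::find(rule.propertyList.begin(), rule.propertyList.end(),
                      property) != rule.propertyList.end();
        if (pathMatches && propertyListed) {
            return rule.nameSpace;
        }
    }
    return std::nullopt;  // no rule covers this path/property pair
}
```

Under this reading, a property that is not listed for a matching object path is simply not placed in any namespace, which is why adding a property to an existing metric requires a configuration change.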
To add a new property to an existing metric:
- Identify the namespace
- Verify the object path, and add a new object path if necessary
- Update the property list in the identified section
To add a new namespace:
- Update the namespace and size in the [shm_mapping.json](#shared-memory-mapping-json)
file
- Add a new entry to the
shm_namespace_config.json file with the namespace, object
path keyword and property details
An example of this configuration file is given below.
{
  "SensorNamespaces": [
    {
      "Namespace": "PlatformEnvironmentMetrics",
      "ObjectpathKeywords": "sensors/temperature",
      "PropertyList": ["Value"]
    },
    {
      "Namespace": "MemoryMetrics",
      "ObjectpathKeywords": "memory",
      "PropertyList": [
        "MemoryConfiguredSpeedInMhz",
        "Utilization",
        "ceCount",
        "ueCount",
        "RowRemappingFailureState",
        "ceRowRemappingCount",
        "ueRowRemappingCount"
      ]
    }
  ]
}
Shared memory mapping json
This file lists the producers that update each sensor namespace. For each namespace, the
Producers key contains the list of processes that update shared memory for
that namespace.
- The shared memory size must be specified in bytes in the
SizeInBytes field.
Note: By default, the shm_mapping.json file in the configurations directory is used. Override this file in your platform recipe file as required.
SizeInBytes is currently defined as shown below. The calculation is based on the size used on the HGX platform for each of the metric namespaces.
Used size is calculated with the managed_shared_memory.get_free_memory() API, then adjusted up to a nearby power of two in bytes.
| Namespace | Size in system (KB) | Allocated size |
| --- | --- | --- |
| MemoryMetrics_0 | 46.1328125 | 128 KB |
| NVSwitchMetrics_0 | 40.4453125 | 128 KB |
| NVSwitchPortMetrics_0 | 894.4765625 | 1024 KB |
| PlatformEnvironmentMetrics_0 | 55.921875...
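The sizing arithmetic can be sketched as follows, assuming used size is the segment's total size minus its free memory and that the result is rounded up to the next power of two. The helpers `usedBytes` and `roundUpToPowerOfTwo` are hypothetical. Note that the table allocates 128 KB for namespaces that use about 40 to 56 KB, so the real allocations appear to include more headroom than this minimal rounding gives (64 KB for those inputs).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Used size of a segment. In a real program the two inputs would come
// from the segment's total size and its reported free memory.
std::size_t usedBytes(std::size_t totalBytes, std::size_t freeBytes) {
    assert(freeBytes <= totalBytes);
    return totalBytes - freeBytes;
}

// Smallest power of two that is greater than or equal to n.
std::size_t roundUpToPowerOfTwo(std::size_t n) {
    // Guard against overflow of the shift below.
    assert(n <= std::numeric_limits<std::size_t>::max() / 2 + 1);
    std::size_t power = 1;
    while (power < n) {
        power <<= 1;
    }
    return power;
}
```

For example, the 894.4765625 KB used by NVSwitchPortMetrics_0 is 915944 bytes, which rounds up to 1048576 bytes, matching the 1024 KB allocation in the table.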