coreweave/perftest
forked from linux-rdma/perftest
Description: Infiniband Verbs Performance Tests
Language: C
License: NOASSERTION
Stars: 1
Forks: 0
Open issues: 1
Created: 2023-06-12T07:28:00Z
Pushed: 2025-08-21T15:00:51Z
Default branch: master
Fork: yes
Parent repository: linux-rdma/perftest
Archived: no
README: Open Fabrics Enterprise Distribution (OFED) Performance Tests README
===============================================================================
Table of Contents
===============================================================================
1. Overview
2. Installation
3. Notes on Testing Methodology
4. Test Descriptions
5. Running Tests
6. Known Issues
===============================================================================
1. Overview
===============================================================================
This is a collection of tests written over uverbs intended for use as a performance micro-benchmark. The tests may be used for HW or SW tuning as well as for functional testing.
The collection contains a set of bandwidth and latency benchmarks such as:
- Send - ib_send_bw and ib_send_lat
- RDMA Read - ib_read_bw and ib_read_lat
- RDMA Write - ib_write_bw and ib_write_lat
- RDMA Atomic - ib_atomic_bw and ib_atomic_lat
- Native Ethernet (when working with MOFED2) - raw_ethernet_bw, raw_ethernet_lat
Please post results/observations to the openib-general mailing list. See "Contact Us" at http://openib.org/mailman/listinfo/openib-general and http://www.openib.org.
===============================================================================
2. Installation
===============================================================================
-After cloning the repository, a perftest directory should appear in your current directory
-Cloning example: git clone https://github.com/linux-rdma/perftest.git
-After cloning, run these commands:
-cd perftest/
-./autogen.sh
-./configure
 Note: To install into a specific directory, pass the optional --prefix= flag followed by the directory path.
-make
-make install
-All of the tests will appear in the perftest directory and in the install directory.
===============================================================================
3. Notes on Testing Methodology
===============================================================================
- The benchmarks use the CPU cycle counter to get time stamps without a context
switch. Some CPU architectures (e.g., Intel's 80486 or older PPC) do not have such a capability.
- The latency benchmarks measure round-trip time but report half of that as one-way
latency. This means that the results may not be accurate for asymmetrical configurations.
- On all unidirectional bandwidth benchmarks, the client measures the bandwidth.
On bidirectional bandwidth benchmarks, each side measures the bandwidth of the traffic it initiates. At the end of the measurement period, the server reports its result to the client, which combines the two results.
- Latency tests report minimum, median and maximum latency results.
The median latency is typically less sensitive to high latency variations, compared to average latency measurement. Typically, the first value measured is the maximum value, due to warmup effects.
- Long sampling periods have very limited impact on measurement accuracy.
The default value of 1000 iterations is usually sufficient. Note that the program keeps data structures with a memory footprint proportional to the number of iterations. Setting a very high number of iterations may have a negative impact on the measured performance that is not related to the devices under test. If a high number of iterations is strictly necessary, it is recommended to use the -N flag (No Peak).
- Bandwidth benchmarks may be run for a number of iterations, or for a fixed duration.
Use the -D flag to instruct the test to run for the specified number of seconds. The --run_infinitely flag instructs the program to run until interrupted by the user, and print the measured bandwidth every 5 seconds.
- The "-H" option in latency benchmarks dumps a histogram of the results.
See xgraph, ygraph, r-base (http://www.r-project.org/), PSPP, or other statistical analysis programs.
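The latency notes above (one-way latency reported as half of the round trip, the minimum/median/maximum summary, and the histogram) can be illustrated with a minimal sketch. It is not the perftest implementation: the names one_way_usec, summarize, fill_histogram and struct latency_summary are hypothetical, the samples are assumed to be already converted to microseconds, and the fixed-width binning is only one possible way to build a histogram.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical summary type for this sketch; not a perftest structure. */
struct latency_summary {
    double min_usec;
    double median_usec;
    double max_usec;
};

/* One-way latency estimate: half of the measured round-trip time.
 * This is only accurate when both directions take the same time. */
double one_way_usec(double round_trip_usec)
{
    return round_trip_usec / 2.0;
}

/* qsort comparator for doubles in ascending order. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sorts the samples in place and returns min, median and max.
 * Requires n > 0. */
struct latency_summary summarize(double *samples, size_t n)
{
    struct latency_summary s;

    qsort(samples, n, sizeof(samples[0]), compare_doubles);
    s.min_usec = samples[0];
    s.max_usec = samples[n - 1];
    s.median_usec = (n % 2) ? samples[n / 2]
                            : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
    return s;
}

/* Counts samples into nbins fixed-width bins starting at 0.
 * Values past the last bin are clamped into it so outliers are still
 * counted; negative values go into the first bin. Requires nbins > 0. */
void fill_histogram(const double *samples, size_t n, double bin_width_usec,
                    unsigned *bins, size_t nbins)
{
    size_t i;

    for (i = 0; i < nbins; i++)
        bins[i] = 0;
    for (i = 0; i < n; i++) {
        double idx = samples[i] / bin_width_usec;
        size_t bin = 0;

        if (idx >= (double)nbins)
            bin = nbins - 1;
        else if (idx > 0.0)
            bin = (size_t)idx;
        bins[bin]++;
    }
}
```

For one-way samples of 9, 1, 2, 1.5 and 1 usec, where the first value is a warmup outlier, the mean is 2.9 usec while the median is 1.5 usec, which shows why the median is the less sensitive figure.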
*** IMPORTANT NOTE: When running the benchmarks over an Infiniband fabric, a Subnet Manager must run on the switch or on one of the nodes in your fabric, prior to starting the benchmarks.
Architectures tested: i686, x86_64, ia64
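The bandwidth notes in this section (each side measures the traffic it initiates, and the client combines both results in bidirectional runs) can be sketched as follows. This is an illustration under stated assumptions, not the perftest code: the function names are hypothetical, the unit is assumed to be 2^20 bytes per second, and "combining" is assumed to mean adding the two per-side figures.

```c
#include <stdint.h>

/* Bandwidth in MiB/s for the traffic one side initiated.
 * Assumption for this sketch: 1 MiB = 2^20 bytes. The same formula
 * applies whether the run was bounded by iterations or by duration. */
double bandwidth_mib_per_sec(uint64_t bytes_sent, double elapsed_sec)
{
    return ((double)bytes_sent / (1024.0 * 1024.0)) / elapsed_sec;
}

/* Bidirectional result as the client might combine it.
 * Assumption: the two per-side figures are simply added. */
double combined_bidirectional_bw(double client_bw, double server_bw)
{
    return client_bw + server_bw;
}
```

For example, a client that sent 2 MiB in one second and a server that sent 5 MiB in two seconds would yield 2.0 and 2.5 MiB/s, for a combined figure of 4.5 MiB/s.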
===============================================================================
4. Benchmarks Description
===============================================================================
The benchmarks generate a synthetic stream of operations, which is very useful for hardware and software benchmarking and analysis. The benchmarks are not designed to emulate any real application traffic. Real application traffic may be affected by many parameters, and hence might not be predictable based only on the results of those benchmarks.
ib_send_lat     latency test with send transactions
ib_send_bw      bandwidth test with send transactions
ib_write_lat    latency test with RDMA write transactions
ib_write_bw     bandwidth test with RDMA write transactions
ib_read_lat     latency test with RDMA read transactions
ib_read_bw      bandwidth test with RDMA read transactions
ib_atomic_lat   latency test with atomic transactions
ib_atomic_bw    bandwidth test with atomic transactions

Raw Ethernet interface benchmarks:
raw_ethernet_send_lat   latency test over raw Ethernet interface
raw_ethernet_send_bw    bandwidth test over raw Ethernet interface
===============================================================================
5. Running Tests
===============================================================================
Prerequisites:
    kernel 2.6
    (kernel module) matches libibverbs
    (kernel module) matches librdmacm
    (kernel module) matches libibumad
    (kernel module) matches libmath (lm)
    (linux kernel module) matches pciutils (lpci).
Server: ./
Client: ./

    o The server address is an IPv4 or IPv6 address. You can use the IPoIB address if IPoIB is configured.
    o --help lists the available options.
*** IMPORTANT NOTE: The SAME OPTIONS must be passed to both server and client.
Common Options to all tests:
----------------------------
  -h, --help            Display this help message screen
  -p, --port=           Listen on/connect to port (default: 18515)
  -R, --rdma_cm         Connect QPs with rdma_cm and run test on those QPs
  -z, --comm_rdma_cm    Communicate with rdma_cm module to exchange…