Release: NVIDIA, published Apr 14, 2026

NVIDIA/cudaqx 0.6.0

0.6.0

Repository: NVIDIA/cudaqx

Tag: 0.6.0

Published: 2026-04-14T14:24:45Z

Prerelease: no

Release notes:

CUDA-Q QEC 0.6.0 and CUDA-Q Solvers 0.6.0

This is a combined release of CUDA-Q QEC and CUDA-Q Solvers, both at version 0.6.0.

This is the first CUDA-Q QEC release that builds example decoder applications on top of CUDA-Q Realtime [blog]. CUDA-Q QEC 0.6 ships with two new real-time-capable decoder pipelines: the RelayBP belief-propagation decoder for qLDPC codes and an NVIDIA Ising convolutional neural network (CNN) pre-decoder paired with a global decoder (PyMatching) for the surface code. These pipelines enable quantum vendors and QEC researchers to deploy real-time GPU decoding for two popular code families via NVQLink.
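To make the "pre-decoder paired with a global decoder" idea concrete, here is a minimal toy sketch of the two-stage structure. It is an illustration under loud assumptions, not the shipped pipeline: it uses a classical repetition code instead of the surface code, a hand-written local rule in place of the CNN pre-decoder, and an exhaustive two-candidate search in place of PyMatching. All function names are hypothetical and do not come from the CUDA-Q QEC API.

```python
from typing import List, Tuple


def syndrome(error: List[int]) -> List[int]:
    """Repetition-code checks: s[i] is the parity of bits i and i+1."""
    return [error[i] ^ error[i + 1] for i in range(len(error) - 1)]


def local_predecoder(s: List[int]) -> Tuple[List[int], List[int]]:
    """Toy local stage (a fixed rule standing in for a learned model).

    Two adjacent lit checks are explained by flipping the bit they share.
    Returns (partial correction, residual syndrome).
    """
    residual = list(s)
    correction = [0] * (len(s) + 1)
    for i in range(len(residual) - 1):
        if residual[i] and residual[i + 1]:
            correction[i + 1] ^= 1  # bit i+1 touches checks i and i+1
            residual[i] = residual[i + 1] = 0
    return correction, residual


def global_decoder(residual: List[int]) -> List[int]:
    """Toy global stage: minimum-weight error consistent with the residual."""
    candidate = [0]
    for bit in residual:
        candidate.append(candidate[-1] ^ bit)
    flipped = [b ^ 1 for b in candidate]  # the only other consistent error
    return candidate if sum(candidate) <= sum(flipped) else flipped


def decode(s: List[int]) -> List[int]:
    """Run the local stage first, then hand the leftover syndrome to the global stage."""
    pre, residual = local_predecoder(s)
    rest = global_decoder(residual)
    return [a ^ b for a, b in zip(pre, rest)]


error = [0, 1, 0, 0, 0, 1, 1]
correction = decode(syndrome(error))
```

The point of the split is that the local stage removes the easy, isolated defects cheaply, so the global stage only sees a sparser residual syndrome. The combined correction always reproduces the measured syndrome, because each local flip clears exactly the checks it explains.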

Additionally, this release of CUDA-Q QEC contains speed improvements for our GPU-accelerated RelayBP decoder: up to 19X relative to 0.5, as detailed in the nv-qldpc-decoder timings below.

For CUDA-Q Solvers 0.6.0, support was added for a new UpCCGSD ansatz solver and a Coupled Exchange Operator (CEO) pool.

Please check out the docs and examples for how to get started using the CUDA-QX libraries!

_Note: CUDA-Q QEC 0.6.0 and CUDA-Q Solvers 0.6.0 both depend on CUDA-Q 0.14. For CUDA-Q Realtime usage (experimental), you need to use CUDA-Q 0.14.1._
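The version rule in the note can be expressed as a small check. This is a sketch, not part of CUDA-QX: the helper names are hypothetical, and it assumes that "depend on CUDA-Q 0.14" means any 0.14.x release while Realtime usage needs exactly 0.14.1.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.14.1' into (0, 14, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def cudaq_ok(installed: str, realtime: bool = False) -> bool:
    """Check an installed CUDA-Q version string against the 0.6.0 release note.

    Assumption: base usage accepts any 0.14.x; Realtime requires 0.14.1 exactly.
    """
    version = parse_version(installed)
    if realtime:
        return version[:3] == (0, 14, 1)
    return version[:2] == (0, 14)
```

The installed version string could be obtained with `importlib.metadata.version(...)`. The distribution name to pass is not stated in these release notes, so check it against your own environment.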

Features and Enhancements (QEC) 🎉

  • Sliding window optimize by @cketcham2333 in https://github.com/NVIDIA/cudaqx/pull/343
  • feat(qec): add trt_decoder_config for real-time decoding by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/384
  • Add trt cudagraph by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/369
  • Add decode batch by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/383
  • Create PyMatching decoder plugin by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/396
  • Add optional O parameter to PyMatching plugin by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/449
  • Update trt_decoder to support uint8 data types for I/O by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/455
  • Add graph capture functions to common decoder interface by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/475
  • Follow-up to #475 - additional decoder interface updates by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/478
  • Add Hololink QLDPC graph decode bridge and CI test by @cketcham2333 in https://github.com/NVIDIA/cudaqx/pull/481
  • Add realtime AI decoder / predecoder infrastructure (GPU + Host) w/ host dispatcher by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/457
  • Add FPGA-based test application for realtime predecoder by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/490

nv-qldpc-decoder Updates (Closed Source)

  • Implemented new graph capture interface functions to run RelayBP with CUDA-Q Realtime
  • Added a new repeatable configuration option that enables bit-for-bit repeatable results when running back-to-back on the same system
  • Significant RelayBP optimizations for both fp32 and fp64. The timings below show speedups relative to 0.5 for some well-known bivariate bicycle codes on B200. All reported speedups are for non-batched, serial execution mode.

| Case Name (n_k_d) | Variant | Total Speedup (Ratio) |
|-----------|-------------------|---------------|
| 72_12_6 | fp32 | 3.44 |
| 72_12_6 | fp64 | 2.74 |
| 144_12_12 | fp32 | 5.50 |
| 144_12_12 | fp64 | 4.51 |
| 288_12_18 | fp32 | 19.06 |
| 288_12_18 | fp64 | 13.16 |
| Average | | 8.07 |
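As a quick sanity check on the table, the "Average" row appears to be the plain arithmetic mean of the six speedup ratios, and the headline "up to 19X" figure corresponds to the 288_12_18 fp32 case. The sketch below recomputes both from the numbers above; the variable names are illustrative only.

```python
# Speedup ratios relative to 0.5, copied from the table above.
speedups = {
    ("72_12_6", "fp32"): 3.44,
    ("72_12_6", "fp64"): 2.74,
    ("144_12_12", "fp32"): 5.50,
    ("144_12_12", "fp64"): 4.51,
    ("288_12_18", "fp32"): 19.06,
    ("288_12_18", "fp64"): 13.16,
}

# Arithmetic mean over all six (case, variant) rows.
average = sum(speedups.values()) / len(speedups)

# The row with the largest speedup.
best_case = max(speedups, key=speedups.get)

print(f"average speedup: {average:.2f}")
print(f"best case: {best_case[0]} ({best_case[1]}), {speedups[best_case]:.2f}x")
```

Because this is an arithmetic mean of ratios, the largest code dominates it. The smaller codes see speedups closer to 3x to 5x.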

Bug Fixes (QEC) 🐛

  • Coverity fixes by @cketcham2333 in https://github.com/NVIDIA/cudaqx/pull/395
  • Fix OOB r/w issues by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/464
  • Add onnxscript to trt_decoder optional dependency by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/407
  • Bug fix and add test cases by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/424
  • Fix pytorch AcceleratorError root-caused by QEC by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/472

Features and Enhancements (Solvers) 🎉

  • UpCCGSD ansatz solver by @rr637 in https://github.com/NVIDIA/cudaqx/pull/372
  • Add Coupled Exchange Operator (CEO) pool by @jgonthier in https://github.com/NVIDIA/cudaqx/pull/387

Bug Fixes (Solvers) 🐛

  • Fix OOB r/w issues by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/464
  • Fix the gradient evaluation bugs and add test cases by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/434
  • Fix optimiser forwarding bug and other minor bugs and add test cases by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/435
  • Fix BK transformation | Fix JW parity Z-chain | Add test cases by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/460
  • Fix GQE invalid CUDA handle / AcceleratorError when moving model to GPU by @vedika-saravanan in https://github.com/NVIDIA/cudaqx/pull/473
  • Fix mixer forwarding | Fix MPI implementation | Add test cases by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/452
  • Add error message for missing system dependencies by @vedika-saravanan in https://github.com/NVIDIA/cudaqx/pull/467
  • Fix qubit indices bug and add test case by @kaiqiy-nv in https://github.com/NVIDIA/cudaqx/pull/469
  • GQE: PyTorch GPU compatibility check, exit/skip test on mismatch by @vedika-saravanan in https://github.com/NVIDIA/cudaqx/pull/494

Documentation ✏️

  • Update documentation for relay-bp by @melody-ren in https://github.com/NVIDIA/cudaqx/pull/346
  • [Docs] Add uccgsd to doc by @marwafar in https://github.com/NVIDIA/cudaqx/pull/340
  • [Docs] update gen_ham with UHF by @marwafar in https://github.com/NVIDIA/cudaqx/pull/339
  • [docs] Update nv-qldpc-decoder docs to describe the new proc_float option by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/288
  • Incorporate Sliding Window Decoder docs by @bmhowe23 in https://github.com/NVIDIA/cudaqx/pull/359
  • Add docs for realtime decoding by @kvmto in https://github.com/NVIDIA/cudaqx/pull/345
  • Add docs for AI decoder training with PyTorch by @wsttiger in https://github.com/NVIDIA/cudaqx/pull/344
  • fix typo in calling operator pool with uccsd in doc by @marwafar in https://github.com/NVIDIA/cudaqx/pull/366
  • Add requirement for memory BP methods in docs by @melody-ren in https://github.com/NVIDIA/cudaqx/pull/376
  • Added trt_decoder docs for Python and C++ by @wsttiger in…
