Repo · Microsoft · published Mar 27, 2024

microsoft/graphrag

Python

Description: A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python

License: MIT

Stars: 33637

Forks: 3560

Open issues: 141

Created: 2024-03-27T17:57:52Z

Pushed: 2026-06-05T23:46:49Z

Default branch: main

Fork: no

Archived: no

README:

GraphRAG

👉 Microsoft Research Blog Post

👉 Read the docs

👉 GraphRAG Arxiv

Overview

The GraphRAG project is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.

To learn more about GraphRAG and how it can be used to enhance your LLM's ability to reason about your private data, please visit the Microsoft Research Blog Post.

Quickstart

To get started with the GraphRAG system, we recommend trying the command-line quickstart.

Repository Guidance

This repository presents a methodology for using knowledge graph memory structures to enhance LLM outputs. Please note that the provided code serves as a demonstration and is not an officially supported Microsoft offering.

⚠️ *Warning: GraphRAG indexing can be an expensive operation. Please read all of the documentation to understand the process and costs involved, and start small.*

Diving Deeper

  • To learn about our contribution guidelines, see [CONTRIBUTING.md](./CONTRIBUTING.md)
  • To start developing _GraphRAG_, see [DEVELOPING.md](./DEVELOPING.md)
  • Join the conversation and provide feedback in the GitHub Discussions tab!

Prompt Tuning

Using _GraphRAG_ with your data out of the box may not yield the best possible results. We strongly recommend fine-tuning your prompts by following the Prompt Tuning Guide in our documentation.

Versioning

Please see the [breaking changes](./breaking-changes.md) document for notes on our approach to versioning the project.

*Always run graphrag init --root [path] --force between minor version bumps to ensure you have the latest config format. Run the provided migration notebook between major version bumps if you want to avoid re-indexing prior datasets. Note that running graphrag init with --force will overwrite your configuration and prompts, so back them up first if necessary.*
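The backup advice above can be sketched as a small Python helper. This is a minimal sketch, not part of GraphRAG: the function names `backup_project` and `reinit` are hypothetical, and the artifact names `settings.yaml` and `prompts` are assumptions about what lives in your project root, so adjust them to match your setup. Only the `graphrag init --root [path] --force` command itself comes from the note above.

```python
import shutil
import subprocess
from datetime import datetime
from pathlib import Path

# Assumed names of the config file and prompt folder in the project root.
# These are illustrative; change them to match your own project layout.
ASSUMED_ARTIFACTS = ("settings.yaml", "prompts")


def backup_project(root: Path, artifacts=ASSUMED_ARTIFACTS) -> Path:
    """Copy the listed files/folders under `root` into a timestamped backup folder."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup_dir = root / f"backup-{stamp}"
    backup_dir.mkdir(parents=True, exist_ok=False)
    for name in artifacts:
        src = root / name
        if src.is_dir():
            shutil.copytree(src, backup_dir / name)
        elif src.is_file():
            shutil.copy2(src, backup_dir / name)
        # Artifacts that do not exist are skipped.
    return backup_dir


def reinit(root: Path) -> None:
    """Back up config and prompts, then run the init command quoted in the README."""
    backup_dir = backup_project(root)
    print(f"Backed up to {backup_dir}")
    subprocess.run(
        ["graphrag", "init", "--root", str(root), "--force"],
        check=True,  # raise if the CLI exits with a non-zero status
    )
```

Calling `reinit(Path("./my_project"))` (a placeholder path) would copy the assumed artifacts into `./my_project/backup-<timestamp>/` before the forced re-initialization overwrites them. Taking the backup first means a failed or unwanted init can be undone by copying the files back.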

Responsible AI FAQ

See [RAI_TRANSPARENCY.md](./RAI_TRANSPARENCY.md)

  • [What is GraphRAG?](./RAI_TRANSPARENCY.md#what-is-graphrag)
  • [What can GraphRAG do?](./RAI_TRANSPARENCY.md#what-can-graphrag-do)
  • [What are GraphRAG’s intended use(s)?](./RAI_TRANSPARENCY.md#what-are-graphrags-intended-uses)
  • [How was GraphRAG evaluated? What metrics are used to measure performance?](./RAI_TRANSPARENCY.md#how-was-graphrag-evaluated-what-metrics-are-used-to-measure-performance)
  • [What are the limitations of GraphRAG? How can users minimize the impact of GraphRAG’s limitations when using the system?](./RAI_TRANSPARENCY.md#what-are-the-limitations-of-graphrag-how-can-users-minimize-the-impact-of-graphrags-limitations-when-using-the-system)
  • [What operational factors and settings allow for effective and responsible use of GraphRAG?](./RAI_TRANSPARENCY.md#what-operational-factors-and-settings-allow-for-effective-and-responsible-use-of-graphrag)

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Privacy

Microsoft Privacy Statement
