Repo · Microsoft · published May 30, 2024

microsoft/RadFact

Python

Description: A metric suite leveraging the logical inference capabilities of LLMs, for radiology report generation both with and without grounding

Language: Python

License: MIT

Stars: 100

Forks: 13

Open issues: 5

Created: 2024-05-30T14:49:08Z

Pushed: 2026-06-20T04:12:01Z

Default branch: main

Fork: no

Archived: no

README:

RadFact: An LLM-based Evaluation Metric for AI-generated Radiology Reporting

RadFact is a framework for the evaluation of model-generated radiology reports given a ground-truth report, with or without grounding. Leveraging the logical inference capabilities of large language models, RadFact is not a single number but a _suite_ of metrics, capturing aspects of precision and recall at text-only and text-and-grounding levels.
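
As a rough illustration of the text-only precision and recall idea, the sketch below assumes that every phrase has already received an entailment verdict from the LLM. The `PhraseVerdict` class, the `entailed_fraction` helper, and the example phrases are hypothetical and are not part of the radfact package; the grounding-level metrics are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhraseVerdict:
    """One phrase plus a hypothetical entailment verdict (not a radfact type)."""

    phrase: str
    entailed: bool


def entailed_fraction(verdicts: list[PhraseVerdict]) -> float:
    """Fraction of phrases judged entailed; 0.0 for an empty list (a convention chosen here)."""
    if not verdicts:
        return 0.0
    return sum(v.entailed for v in verdicts) / len(verdicts)


# Generated phrases checked against the ground-truth report (precision direction).
generated = [
    PhraseVerdict("Small left pleural effusion.", True),
    PhraseVerdict("Right lower lobe consolidation.", False),
]

# Ground-truth phrases checked against the generated report (recall direction).
reference = [
    PhraseVerdict("Small left pleural effusion.", True),
    PhraseVerdict("No pneumothorax.", False),
    PhraseVerdict("Heart size is normal.", True),
    PhraseVerdict("Lungs otherwise clear.", True),
]

precision = entailed_fraction(generated)
recall = entailed_fraction(reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Reporting the two directions separately, rather than folding them into one score, is what makes the result a suite and not a single number.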

RadFact was introduced in MAIRA-2: Grounded Radiology Report Generation. Here we provide an open-source implementation of the metric to facilitate its use and development. The RadFact metric currently supports both cxr and ct report types.

Table of Contents

  • [Getting Started](#getting-started)
  • [Installation](#installation)
  • [Endpoint (LLM) setup](#endpoint-llm-setup)
  • [Endpoint authentication](#endpoint-authentication)
  • [Set up endpoint config(s)](#set-up-endpoint-configs)
  • [Confirm entailment verification is working](#confirm-entailment-verification-is-working)
  • [LLMEngine for parallel processing](#llmengine-for-parallel-processing)
  • [Running RadFact](#running-radfact)
  • [Split reports into phrases](#split-reports-into-phrases)
  • [What is RadFact?](#what-is-radfact)
  • [Citation](#citation)
  • [Links](#links)
  • [Disclaimer](#disclaimer)
  • [Contributing](#contributing)
  • [Trademarks](#trademarks)

Getting Started

Installation

To run RadFact, clone this repository and run the following command from the repository root:

pip install .

This will install the radfact package and all its dependencies.

Alternatively, we provide a Makefile to set up a conda environment with all the dependencies. You can create the environment with:

make miniconda
make mamba
make env
conda activate radfact

The first step installs miniconda, the second installs mamba for fast dependency resolution, and the third creates a conda environment called radfact with all the dependencies. The third step also installs the radfact package in editable mode by default via the setup_packages_with_deps recipe (see [Makefile](Makefile#L28)). Finally, activate the environment to run RadFact. This route is highly recommended if you intend to [contribute to the project](#contributing).
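
Whichever route you choose, a quick check that the package is visible to the active interpreter can save time later. This is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper, and it assumes the import name matches the package name radfact.

```python
from importlib.util import find_spec


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Assumption: the package is importable under the name "radfact".
print("radfact importable:", is_importable("radfact"))
```

If this prints False inside the conda environment, the environment is probably not activated.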

Endpoint (LLM) setup

To use RadFact, you need access to a large language model. You need to first set up the endpoints with authentication, and then confirm they are behaving as expected using our test script.

The LLM should be available as an API endpoint and be supported by langchain (version 0.1.4). We support two types of models: AzureChatOpenAI and ChatOpenAI models. The former is suitable for GPT models available on Azure, while the latter is suitable for custom deployed models like Llama-3 in Azure.

Endpoint authentication

We support the following authentication methods:

  • API Key environment variable: Set the API_KEY environment variable to the API key of the endpoint. We use API_KEY as the default environment variable name. If you use a different name, you can specify it in the endpoint config via api_key_env_var_name. This is especially useful when using multiple endpoints with different API keys.
  • API Key from an Azure Key Vault: Retrieve the API key from the default Azure Key Vault of an AzureML workspace. This requires:

    1. Adding an AzureML workspace configuration file config.json in the root directory of the project. This config should have the keys subscription_id, resource_group, and workspace_name. It can be downloaded from the AzureML workspace via the portal. This file is listed in .gitignore to avoid accidental commits. Make sure to save the file in the root directory of the project under the name config.json, as expected by the [endpoint](src/radfact/llm_utils/endpoint.py#L38) class.
    2. Specifying the keyvault_secret_name in the endpoint config.

  • Token-based authentication via Azure token provider: This relies on token-based authorization. If none of the above methods are set up, we fall back to generating an Azure token provider assuming you have the right Azure credentials set up. The token provider is set to the azure_ad_token_provider parameter of an AzureChatOpenAI model allowing automatic token refresh. This is only supported for AzureChatOpenAI models.
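
The fallback between these methods can be pictured with the sketch below. It is not the logic in endpoint.py: `AuthMethod` and `choose_auth_method` are hypothetical names, and checking the environment variable before the Key Vault secret is an assumption, since the text above only states that the token provider is the last resort and is limited to AzureChatOpenAI models.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class AuthMethod(Enum):
    API_KEY_ENV_VAR = "api_key_env_var"
    KEY_VAULT = "key_vault"
    AZURE_TOKEN_PROVIDER = "azure_token_provider"


def choose_auth_method(
    endpoint_type: str,
    api_key_env_var_name: str = "",
    keyvault_secret_name: str = "",
    env: Optional[Mapping[str, str]] = None,
) -> AuthMethod:
    """Pick an authentication method for one endpoint config (illustrative only)."""
    env = os.environ if env is None else env
    # An empty name means "use the default variable", API_KEY.
    var_name = api_key_env_var_name or "API_KEY"
    if env.get(var_name):
        return AuthMethod.API_KEY_ENV_VAR
    # Assumed order: the Key Vault secret is tried after the environment variable.
    if keyvault_secret_name:
        return AuthMethod.KEY_VAULT
    # The token provider is only supported for AzureChatOpenAI models.
    if endpoint_type == "AZURE_CHAT_OPENAI":
        return AuthMethod.AZURE_TOKEN_PROVIDER
    raise ValueError(f"No usable authentication method for endpoint type {endpoint_type!r}")


# Two endpoints with different keys, as suggested for multi-endpoint setups.
demo_env = {"GPT_KEY": "placeholder", "LLAMA_KEY": "placeholder"}
print(choose_auth_method("AZURE_CHAT_OPENAI", api_key_env_var_name="GPT_KEY", env=demo_env))
print(choose_auth_method("CHAT_OPENAI", api_key_env_var_name="LLAMA_KEY", env=demo_env))
```

Passing the environment as a mapping keeps the function easy to test without touching real credentials.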

To learn more about how we integrate the endpoints within RadFact, please refer to the LLMAPIArguments class in [arguments.py](src/radfact/llm_utils/engine/arguments.py), which consumes an endpoint object of the Endpoint class in [endpoint.py](src/radfact/llm_utils/endpoint.py).

Set up endpoint config(s)

We use hydra for config management. The endpoint configs are in the path: [configs/endpoints](configs/endpoints).

This is an example of a config file:

ENDPOINT_EXAMPLE:
  type: "CHAT_OPENAI"
  url: ""
  deployment_name: "llama3-70b"
  api_key_env_var_name: ""
  keyvault_secret_name: ""
  speed_factor: 1.0
  num_parallel_processes: 10

  • There are two endpoint types, type: "CHAT_OPENAI" and type: "AZURE_CHAT_OPENAI", depending on the model endpoint used. For GPT models available on Azure, use type: "AZURE_CHAT_OPENAI". For custom deployed models like Llama-3 on Azure, use type: "CHAT_OPENAI".
  • You will need to update the url and likely deployment_name fields with the appropriate values.
  • keyvault_secret_name is optional and not required if you set the API key via an environment variable. Update api_key_env_var_name if you use a different environment variable name for the API key than the default "API_KEY". When using multiple endpoints, specify a different api_key_env_var_name for each endpoint.
  • The option speed_factor is used when more than one endpoint is available. It specifies the relative speed of the endpoint compared to the others, and the data is sharded across the endpoints in proportion to these values.
  • The option num_parallel_processes is used to...
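
To make the speed_factor option concrete, here is one way proportional sharding could work. This is a sketch under stated assumptions, not the LLMEngine implementation: `shard_sizes` is a hypothetical helper, the endpoint names are made up, and the largest-remainder rounding is a choice made for this example.

```python
def shard_sizes(num_items: int, speed_factors: dict[str, float]) -> dict[str, int]:
    """Split num_items across endpoints in proportion to their speed_factor."""
    if num_items < 0:
        raise ValueError("num_items must be non-negative")
    if not speed_factors or any(f <= 0 for f in speed_factors.values()):
        raise ValueError("every endpoint needs a positive speed_factor")

    total = sum(speed_factors.values())
    # Exact (fractional) share of the data for each endpoint.
    exact = {name: num_items * f / total for name, f in speed_factors.items()}
    # Start from the rounded-down shares.
    sizes = {name: int(share) for name, share in exact.items()}
    # Hand the leftover items to the endpoints with the largest remainders.
    leftover = num_items - sum(sizes.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - sizes[n], reverse=True)
    for name in by_remainder[:leftover]:
        sizes[name] += 1
    return sizes


# Hypothetical endpoints: the second is assumed to be three times faster.
print(shard_sizes(100, {"ENDPOINT_SLOW": 1.0, "ENDPOINT_FAST": 3.0}))
```

With equal speed_factor values the split is even, so the default of 1.0 needs no tuning when endpoints are comparable.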

Excerpt shown — open the source for the full document.
