amazon-science/toward-clinical-coding-verification-adaptation
Language: Python
License: NOASSERTION
Stars: 8
Forks: 1
Open issues: 0
Created: 2025-10-14T06:57:54Z
Pushed: 2025-10-14T07:20:39Z
Default branch: main
Fork: no
Archived: no
README:
Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation
This repository contains ICD-10-CM annotations for the paper: "Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation", EMNLP 2025, Industry Track
Dataset
This repository contains only the double expert-annotated ICD-10-CM annotations used in the paper. To derive the full training and testing data, including the corresponding notes, follow the steps below:
1. Download https://github.com/wyim/aci-bench to ACI_BENCH_PATH
2. Run
   python merge_aci_annotations.py --aci_data_dir ${ACI_BENCH_PATH}/data/challenge_data --annotation_dir annotation --output_dir merged_data
3. In merged_data, you should find:
JSONL files with merged data:
train.jsonl (67 records)
valid.jsonl (20 records)
test.jsonl (120 records, combines all test files)
Each record contains the dialogue, the clinical note, and the associated ICD-10 codes.
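As a quick sanity check after merging, the sketch below reads the three JSONL files and compares their record counts with the numbers listed above. It is a minimal illustration, not part of this repository: the helper names load_jsonl and summarize_splits are hypothetical, and it assumes each non-empty line is a single JSON object. Field names are discovered from the data rather than assumed, since the exact keys are not documented here.

```python
import json
from pathlib import Path

# Record counts stated in this README for each split.
EXPECTED_COUNTS = {"train": 67, "valid": 20, "test": 120}


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize_splits(merged_dir, expected=EXPECTED_COUNTS):
    """Return {split: (record_count, sorted_field_names, count_matches)}.

    Assumes (hypothetically) that every record is a JSON object, so its
    keys can be collected as field names.
    """
    summary = {}
    for split, expected_count in expected.items():
        records = load_jsonl(Path(merged_dir) / f"{split}.jsonl")
        # Collect the union of keys seen across all records in the split.
        fields = sorted({key for record in records for key in record})
        summary[split] = (len(records), fields, len(records) == expected_count)
    return summary
```

Calling summarize_splits("merged_data") after step 2 returns one entry per split; a False in the last position means the record count differs from the one listed above, which would suggest the merge did not see all ACI-Bench files.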
Citation
If you find this data useful or use it for research and development, please cite:
@inproceedings{toward-reliable-clinical-coding-verification-adaptation,
title = "Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation",
author = "Yuan, Zhangdie and
Shing, Han-Chin and
Strong, Mitch and
Shivade, Chaitanya",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track",
publisher = "Association for Computational Linguistics",
}
License
This library is licensed under the CC-BY-NC-4.0 License.
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: CC-BY-NC-4.0
Notability
3.0/10. Low stars, routine research repo.