Repo | Amazon (Nova) | published Oct 27, 2025

amazon-science/post-training-industry-specialized-small-reasoning-models

License: MIT-0

Stars: 2

Forks: 0

Open issues: 0

Created: 2025-10-27T23:19:18Z

Pushed: 2025-10-28T05:28:16Z

Default branch: main

Fork: no

Archived: no

README:

Dataset for Efficient Post-Training for Industry-Specialized Reasoning in Small Language Models

Overview

This repository contains the dataset used in our research on efficient post-training techniques for enhancing industry-specialized reasoning capabilities in small language models.

Dataset Summary

This is a reasoning-trace dataset built from the 6251 samples in the FinQA training set. It contains 187506 samples in total. Reasoning traces were generated with three methods: 62501 samples carry traces generated by DeepSeek-R1, 62504 samples carry traces generated by DeepSeek-R1 Distill Qwen 1.5B, and the remaining 62501 samples carry DeepSeek-R1 traces summarized by DeepSeek-R1. For each trace generation method, inference was run 10 times per FinQA sample, and some results from failed inferences were removed.
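
The counts above can be sanity-checked with a little arithmetic. The sketch below uses only numbers stated in this summary; the variable names are illustrative, and it assumes that every row missing from the theoretical maximum corresponds to a removed inference result.

```python
# Figures taken from the Dataset Summary above.
FINQA_TRAIN_SAMPLES = 6251   # FinQA training samples
RUNS_PER_SAMPLE = 10         # inferences per sample per method
NUM_METHODS = 3              # trace generation methods
TOTAL_ROWS = 187506          # rows in this dataset

# Upper bound on rows if no inference had been removed.
max_rows_per_method = FINQA_TRAIN_SAMPLES * RUNS_PER_SAMPLE
max_rows_total = max_rows_per_method * NUM_METHODS

# Rows missing relative to the upper bound (assumed to be removed results).
removed_total = max_rows_total - TOTAL_ROWS
removed_r1 = max_rows_per_method - 62501    # DeepSeek-R1 traces
removed_qwen = max_rows_per_method - 62504  # DeepSeek-R1 Distill Qwen 1.5B traces

print(max_rows_per_method, max_rows_total, removed_total, removed_r1, removed_qwen)
```

Under these assumptions, no single method can contribute more than 62510 rows, and only a handful of rows were dropped overall.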

Data Fields

  • example_id (string): Unique ID for each sample in this dataset. (0, 1, ..., 187505.)
  • sample_id (string): ID for distinguishing up to 10 samples generated from a given FinQA sample by a given trace generation method. (0, 1, ..., 9.)
  • id (string): This column is identical to the id column in the original FinQA dataset. This column can be used to associate a sample in our dataset with FinQA's.
  • question (string): Question statement prepared by combining the original question with its context information in FinQA.
  • answer (string): Original answer in FinQA.
  • llm_answer (string): Answer generated by DeepSeek-R1 or DeepSeek-R1 Distill Qwen 1.5B.
  • llm_reasoning (string): Reasoning trace generated by DeepSeek-R1 or DeepSeek-R1 Distill Qwen 1.5B (including traces summarized by DeepSeek-R1).
  • is_correct (bool): Whether llm_answer is correct. (True or False.) Note that FinQA includes 48 samples with an empty string in the answer column. We set is_correct = None for such samples.
  • model (string): Trace generation model. (deepseek_r1 or qwen_1_5b.)
  • is_summarized (bool): Whether the trace is a summarized one. (True or False.)
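
As a minimal sketch of how these fields might be used together, the example below filters rows by trace generation method and computes per-question accuracy. It rests on several assumptions: the README does not state the on-disk file format, so the rows are built in memory; the rows, the `finqa-A`/`finqa-B` ids, and the helper names `make_row`, `select_traces`, and `accuracy_by_question` are all hypothetical placeholders, not real data or part of the repository; and summarized traces are assumed to carry model = deepseek_r1 with is_summarized = True.

```python
from collections import defaultdict
from typing import Dict, List, Optional, TypedDict


class TraceRow(TypedDict):
    """One row, mirroring the fields listed above."""
    example_id: str
    sample_id: str
    id: str
    question: str
    answer: str
    llm_answer: str
    llm_reasoning: str
    is_correct: Optional[bool]  # None when FinQA's answer is empty
    model: str
    is_summarized: bool


def make_row(example_id: str, sample_id: str, finqa_id: str,
             is_correct: Optional[bool], model: str,
             is_summarized: bool) -> TraceRow:
    # Hypothetical helper: text fields are filled with placeholders.
    return TraceRow(example_id=example_id, sample_id=sample_id, id=finqa_id,
                    question="<question>", answer="<answer>",
                    llm_answer="<llm answer>", llm_reasoning="<trace>",
                    is_correct=is_correct, model=model,
                    is_summarized=is_summarized)


# Placeholder rows for illustration only; not taken from the dataset.
rows: List[TraceRow] = [
    make_row("0", "0", "finqa-A", True, "deepseek_r1", False),
    make_row("1", "1", "finqa-A", False, "deepseek_r1", False),
    make_row("2", "0", "finqa-A", True, "qwen_1_5b", False),
    make_row("3", "0", "finqa-A", True, "deepseek_r1", True),
    make_row("4", "0", "finqa-B", None, "deepseek_r1", False),
]


def select_traces(rows: List[TraceRow], model: str, is_summarized: bool,
                  only_correct: bool = True) -> List[TraceRow]:
    """Keep rows from one trace generation method, optionally only correct ones."""
    return [
        r for r in rows
        if r["model"] == model
        and r["is_summarized"] == is_summarized
        and (not only_correct or r["is_correct"] is True)
    ]


def accuracy_by_question(rows: List[TraceRow]) -> Dict[str, float]:
    """Fraction of correct answers per FinQA id, skipping is_correct = None."""
    counts = defaultdict(lambda: [0, 0])  # id -> [correct, total]
    for r in rows:
        if r["is_correct"] is None:
            continue
        counts[r["id"]][0] += int(r["is_correct"])
        counts[r["id"]][1] += 1
    return {k: correct / total for k, (correct, total) in counts.items()}


r1_correct = select_traces(rows, model="deepseek_r1", is_summarized=False)
print([r["example_id"] for r in r1_correct])
print(accuracy_by_question(rows))
```

Comparing with `is True` rather than relying on truthiness keeps rows with is_correct = None out of the "correct" subset, and the accuracy helper skips those rows explicitly so they do not count as wrong answers.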

Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Citation

@article{cai2025efficientindustryspecializedft,
title={Efficient Post-Training for Industry-Specialized Reasoning in Small Language Models},
author={Bill Cai and Sheldon Liu and Tatsuo Azeyanagi and Tomal Deb},
year={2025}
}

@article{chen2021finqa,
title={FinQA: A Dataset of Numerical Reasoning over Financial Data},
author={Chen, Zhiyu and Chen, Wenhu and Smiley, Charese and Shah, Sameena and Borova, Iana and Langdon, Dylan and Moussa, Reema and Beane, Matt and Huang, Ting-Hao and Routledge, Bryan and Wang, William Yang},
journal={Proceedings of EMNLP 2021},
year={2021}
}

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
author={DeepSeek-AI},
year={2025},
eprint={2501.12948},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.12948},
}
