microsoft/SchGen
Python
Captured source
source ↗microsoft/SchGen
Language: Python
License: MIT
Stars: 10
Forks: 0
Open issues: 10
Created: 2025-10-31T13:03:38Z
Pushed: 2026-06-05T23:42:51Z
Default branch: main
Fork: no
Archived: no
README:
SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations
SchGen is a domain-specialized large language model and dataset framework for automated PCB schematic generation from natural-language descriptions. It introduces a scalable schematic code representation and a PCB schematic dataset collected from real-world open-source hardware designs, enabling supervised training and evaluation of LLMs for schematic synthesis.
Key Features
- Natural-language-to-PCB schematic generation
- Semantic-grounded code representation
- Editable KiCad schematic generation
- Agentic sketch pipeline for dataset construction
- LoRA fine-tuning pipeline for GPT-oss models

The dataset is available at: microsoft/SchGen_dataset.
The model is available at: microsoft/SchGen.
To cite this project and corresponding paper, please use the following bib item:
@misc{luo2026schgenpcbschematicgeneration,
title={SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations},
author={Qinpei Luo and Ruichun Ma and Xinyu Zhang and Lili Qiu},
year={2026},
eprint={2605.30345},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2605.30345},
}Get Started
Prerequisites
1. LLM Model Access
OpenRouter: In the `./config.py, replace the variable of openrouter_api_key` with your own API key.
2. Python Environment
Instructions
(1) Set up a python virtual environment (Python 3.10 and Conda suggested) for the project. You can refer to the Tutorial.
(2) Enter your virtual environment and install python packages with:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install -r ./requirements.txt
(3) Set up project path environment variable under the virtual environment
conda env config vars set PROJECT_PATH={YOUR_PROJECT_PATH} && conda deactivate && conda activate {YOUR_CONDA_ENV}
(4) Set up GPT fine tuning, follow Blog here.
(5) The path of Python interpreters used by KiCad on different systems are specified in `./config.py`. The configurations are based on normal default settings for each OS, but may need to be adjusted based on the user's specific installation paths.
TL;DR All commands to run for setting up the environment below
# 1) Create and activate the env
conda create -n [YOUR_CONDA_ENV] python=3.10 -y
conda activate [YOUR_CONDA_ENV]
# 2) Install deps
pip install --upgrade pip
pip install -r requirements.txt
# 3) Set PROJECT_PATH of Conda environment
conda env config vars set PROJECT_PATH={YOUR_PROJECT_PATH} && conda deactivate && conda activate {YOUR_CONDA_ENV}
# 4) For GPT fine tuning
pip install "trl>=0.20.0" "peft>=0.17.0" "transformers>=4.55.0" trackio
pip install -U flash-attn
#Optional: login hugging face
from huggingface_hub import notebook_login
notebook_login()3. KiCad v8 installation
Instructions
Install version 8.0.9 from Github KiCad releases
Direct installer download link here: Windows Mac
To install kicad v8 on ubuntu
sudo add-apt-repository --yes ppa:kicad/kicad-8.0-releases sudo apt update sudo apt install --install-recommends kicad
Tests
To test whether you have set up environment correctly:
1. Run ./modules/utils/llm_interface.py to test the LLM model access 2. Run ./modules/kicad_sch_interface.py to test python-based KiCad schematic editing test.
These scripts have a main function implemented for testing purposes.
KiCad Usage
1. Open KiCad project by clicking the project file. For example: ./KiCAD_Project/example_project.kicad_pro
2. You will see a KiCad main project window showing up. In the window, click KiCad schematic file to view current schematic in a separate window. For example: example_project.kicad_sch
3. SchGen relies on KiCad's bundled Python environment for PCB manipulation. Default setting is specified for different systems in `./config.py`, however, you may need to check on them and make necessary changes.
Step by Step Guide
All of the following commands are executed under the `PROJECT_PATH` as you specify.
1. Dataset Construction

Dataset
The training dataset is constructed from open-source PCB designs and reference schematics, primarily based on SparkFun resources released under CC BY-SA 4.0 licenses.
The dataset includes:
- KiCad schematics
- semantic code representations
- synthesized user requests
- chain-of-thought reasoning traces
1.1 Prepare Symbol Context
Prepare symbol and footprint information from `.kicad_sym` files from KiCad with the following commands:
mkdir export python ./modules/utils/kicad_scan_lib.py
You should see two files of `organized_fp.json and organized_lib.json under the folder ./export`.
1.2 Agentic Sketch
Execute the following command to sketch the schematic based on the user request and image source.
python ./dataset_construction/agentic_sketch.py --model {MODEL_NAME} --save_path ./dataset_construction/sch_sketch --schematic_name {SCHEMATIC_NAME} --sch_request "{USER_REQUEST}" --img_ref_path {IMAGE_REFERENCE}1.3 Human Alignment
The sketch of KiCad schematic is avaiable at `./dataset_construction/sch_sketch/{schematic_name}`, the user can compare it with the reference image to ensure their alignment.
1.4 Code Conversion
Run the following command to convert the KiCad schematic to corresponding Python code with assigned representation level. Use the lightweight CLI of dataset_construction/kicad_read_sch.py. Short flags make the command concise:
python ./dataset_construction/kicad_read_sch.py \ -m \ -s \ -r
Notes:
- The output file is written next to the schematic file and named…
Excerpt shown — open the source for the full document.
Notability
notability 1.0/10Low stars, trivial new repo