microsoft/MegaDetector-Classifier
Description: MegaDetector-Classifier is the Microsoft open-source AI for camera-trap species classification. Fine-tuning and inference for wildlife identification, running downstream of MegaDetector animal detection. Maintained by Microsoft AI for Good Lab. Part of the PyTorch-Wildlife ecosystem.
Language: Python
License: MIT
Stars: 1
Forks: 0
Open issues: 2
Created: 2026-05-15T19:44:18Z
Pushed: 2026-05-18T20:01:23Z
Default branch: main
Fork: no
Archived: no
README:
MegaDetector-Classifier
Microsoft AI for Good Lab's open-source classification fine-tuning tool — train custom species classifiers on your own camera-trap datasets and deploy them through PyTorch-Wildlife.
MegaDetector-Classifier is part of the microsoft/Biodiversity ecosystem and is powered by the PyTorch-Wildlife framework. It is free, open-source, and available under the MIT license.
Part of the Biodiversity Ecosystem
MegaDetector-Classifier is one tool in a larger open-source ecosystem from the Microsoft AI for Good Lab. Each project lives in its own repository, with the microsoft/Biodiversity umbrella tying them together.
| Repository | Description |
|---|---|
| microsoft/Biodiversity | The umbrella repository: documentation hub for the AI for Good Lab's biodiversity work |
| microsoft/MegaDetector | Animal, human, and vehicle detection for camera-trap images |
| microsoft/PytorchWildlife | The collaborative deep learning framework that hosts MegaDetector, species classifiers, and demo notebooks |
| microsoft/MegaDetector-Acoustic | Bioacoustic AI for audio-based wildlife detection and classification |
| microsoft/MegaDetector-Overhead | Wildlife detection in aerial and drone imagery |
| microsoft/MegaDetector-Sonar | Sonar-based wildlife detection for aquatic monitoring |
| microsoft/MegaDetector-Classifier | This repo: classification fine-tuning for camera-trap species identification |
| microsoft/SPARROW | Solar-Powered Acoustic and Remote Recording Observation Watch, an AI-enabled edge device for field recording |
Overview
MegaDetector-Classifier is a training toolkit for fine-tuning ResNet-based species classifiers on custom camera-trap image datasets. The output weights integrate directly with the PyTorch-Wildlife framework, making it straightforward to deploy a classifier trained on your own data.
Key capabilities:
- ResNet-18 and ResNet-50 classifier training using PyTorch Lightning
- Three data-splitting strategies designed for camera-trap realities: random, location-based, and sequence-based
- YAML-based configuration — no code changes required for most use cases
- Demo data included for immediate testing without your own dataset
Designed for:
- Conservation practitioners adapting existing classifiers to new geographic regions
- Researchers adding new species to the PyTorch-Wildlife model zoo
- Projects running MegaDetector detection upstream and needing a matched classifier downstream
Installation
Using pip
git clone https://github.com/microsoft/MegaDetector-Classifier
cd MegaDetector-Classifier
pip install -r requirements.txt
Using conda
git clone https://github.com/microsoft/MegaDetector-Classifier
cd MegaDetector-Classifier
conda env create -f environment.yaml
conda activate PT_Finetuning
Requirements: Python 3.9+
Quick Start
1. Configure configs/config.yaml: set dataset_root, annotation_dir, num_classes, and split_type
2. Run training:
python main.py
Output weights are saved to the weights/ directory and can be loaded directly into PyTorch-Wildlife.
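As a rough illustration of the four settings named in step 1, the sketch below checks a configuration dictionary before training starts. The key names come from the step above and the allowed split_type values from the Data Splitting table. The `check_config` helper and the example values are assumptions made for illustration; they are not part of the repository, and the real config.yaml may contain more fields.

```python
# Hypothetical pre-flight check; not part of MegaDetector-Classifier.
REQUIRED_KEYS = ("dataset_root", "annotation_dir", "num_classes", "split_type")
SPLIT_TYPES = ("random", "location", "sequence")


def check_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means the four settings look usable."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    if "num_classes" in cfg:
        n = cfg["num_classes"]
        if not isinstance(n, int) or n <= 0:
            problems.append("num_classes must be a positive integer")
    if "split_type" in cfg and cfg["split_type"] not in SPLIT_TYPES:
        problems.append(f"split_type must be one of {SPLIT_TYPES}")
    return problems


# Placeholder values, standing in for what configs/config.yaml might hold.
example_cfg = {
    "dataset_root": "./data/imgs",
    "annotation_dir": "./data",
    "num_classes": 3,
    "split_type": "sequence",
}
print(check_config(example_cfg))
```

Running a check like this before `python main.py` catches a mistyped split_type early instead of partway through data loading.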
Data Preparation
Data Structure
Images should be stored in a single flat directory (no nested subdirectories). An annotations CSV file, placed outside the images directory (annotation_example.csv in the layout below), maps each image to its class:
MegaDetector-Classifier/
├── data/
│   ├── imgs/                    # All images stored here (flat)
│   └── annotation_example.csv   # Annotations file
└── configs/config.yaml
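The flat-directory requirement can be checked before training. The following is a minimal sketch using only the standard library; `nested_entries` is a hypothetical helper, not a function provided by this repository, and the demo runs on a temporary directory rather than real data.

```python
import tempfile
from pathlib import Path


def nested_entries(image_dir):
    """Return the names of any subdirectories found directly inside image_dir."""
    return sorted(p.name for p in Path(image_dir).iterdir() if p.is_dir())


# Self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    imgs = Path(tmp) / "imgs"
    imgs.mkdir()
    (imgs / "leopard_001.jpg").touch()
    flat_result = nested_entries(imgs)    # layout is flat at this point

    (imgs / "site_A").mkdir()             # introduce a nested folder
    nested_result = nested_entries(imgs)

print(flat_result)
print(nested_result)
```

An empty list means the layout matches the structure shown above; any names returned are folders whose images would need to be moved up into imgs/.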
Annotation File Format
The CSV must contain three columns:
| Column | Description | Example |
|---|---|---|
| path | Relative path to the image | imgs/leopard_001.jpg |
| classification | Integer class ID | 0 |
| label | Human-readable class name | leopard |
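A short sketch of how an annotation file in this format might be validated before training. The three column names come from the table above. The `validate_annotations` helper, the second sample row, and the rule that each integer ID maps to exactly one label are assumptions for illustration; the repository's own loader may check different things.

```python
import csv
import io

REQUIRED_COLUMNS = ("path", "classification", "label")


def validate_annotations(csv_text):
    """Parse annotation CSV text and return its rows; raise ValueError if malformed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    label_for_id = {}
    for row in rows:
        class_id = int(row["classification"])  # ValueError if not an integer
        # Assumed rule: one class ID corresponds to one label.
        previous = label_for_id.setdefault(class_id, row["label"])
        if previous != row["label"]:
            raise ValueError(
                f"class {class_id} has labels {previous!r} and {row['label']!r}"
            )
    return rows


# Hypothetical sample content; the zebra row is invented for the demo.
sample = (
    "path,classification,label\n"
    "imgs/leopard_001.jpg,0,leopard\n"
    "imgs/zebra_014.jpg,1,zebra\n"
)
rows = validate_annotations(sample)
print(len(rows), rows[0]["label"])
```

For a file on disk, the same function can be called with the file's contents, for example `validate_annotations(open(csv_path, newline="").read())`.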
Data Splitting
MegaDetector-Classifier supports three splitting strategies, selected via split_type in config.yaml:
| Strategy | When to use | Extra column required |
|---|---|---|
| random | Balanced class distribution; not recommended for camera-trap bursts | None |
| location | Keeps all images from one camera location in the same split | Location |
| sequence | Groups burst images within 30-second windows before splitting | Photo_time (YYYY-MM-DD HH:MM:SS) |
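The sequence strategy can be pictured with the sketch below, which assigns a sequence ID to each Photo_time value. It assumes that "30-second window" means a new sequence starts whenever the gap to the previous image exceeds 30 seconds, and it ignores camera location. Both are assumptions, and `assign_sequences` is a hypothetical helper, so the repository's actual grouping rule may differ.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # the Photo_time format from the table


def assign_sequences(photo_times, max_gap_seconds=30):
    """Return one sequence ID per timestamp, in input order.

    Timestamps are visited in chronological order; a gap larger than
    max_gap_seconds to the previous image starts a new sequence.
    """
    parsed = [datetime.strptime(t, TIME_FORMAT) for t in photo_times]
    order = sorted(range(len(parsed)), key=lambda i: parsed[i])
    max_gap = timedelta(seconds=max_gap_seconds)
    ids = [0] * len(parsed)
    current = 0
    for rank, idx in enumerate(order):
        if rank > 0 and parsed[idx] - parsed[order[rank - 1]] > max_gap:
            current += 1
        ids[idx] = current
    return ids


# Made-up timestamps: three shots in one burst, then one several minutes later.
times = [
    "2024-03-01 06:15:00",
    "2024-03-01 06:15:10",
    "2024-03-01 06:15:35",
    "2024-03-01 06:20:00",
]
sequence_ids = assign_sequences(times)
print(sequence_ids)  # [0, 0, 0, 1]
```

The resulting IDs can then be treated as groups, so that an entire burst goes into either training or validation.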
> Camera-trap note: Random splitting is not recommended because burst images of the same animal can appear in both training and validation sets, causing artificially high validation accuracy. Use location or sequence splitting instead.
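To make the leakage concern concrete, here is a minimal sketch of a group-aware split, where the group is a camera location or a sequence ID. `group_split` and the location names are hypothetical and not taken from the repository; the point is only that whole groups, not individual images, are assigned to a split.

```python
import random


def group_split(groups, val_fraction=0.2, seed=0):
    """Split item indices into train and validation so that no group spans both.

    val_fraction is applied to the number of groups, not the number of images.
    """
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    n_val = max(1, round(len(unique_groups) * val_fraction))
    val_groups = set(unique_groups[:n_val])
    train_idx = [i for i, g in enumerate(groups) if g not in val_groups]
    val_idx = [i for i, g in enumerate(groups) if g in val_groups]
    return train_idx, val_idx


# One entry per image: the camera location it came from (made-up names).
locations = ["cam_A", "cam_A", "cam_B", "cam_B", "cam_B", "cam_C", "cam_D", "cam_D"]
train_idx, val_idx = group_split(locations, val_fraction=0.25)

train_groups = {locations[i] for i in train_idx}
val_groups = {locations[i] for i in val_idx}
print(train_groups & val_groups)  # set(): no location appears in both splits
```

Because the fraction applies to groups, the image-level ratio can drift from the target when groups differ widely in size.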
Demo Data
Download demo data to test the pipeline without your own dataset:
# Download and extract
wget https://zenodo.org/records/15376499/files/demo_data_clf.zip
unzip demo_data_clf.zip -d data/
Then set dataset_root: ./data/imgs in configs/config.yaml and run python main.py.
Loading PytorchWildlife classifiers
In addition to training a custom classifier, MegaDetector-Classifier ships a PyTorch-Wildlife–compatible inference layer. Pretrained species classifiers can be loaded with a single import — the same classification subpackage layout that PyTorch-Wildlife exposes:
from src.models import classification as pw_classification
# Pick any of the available loaders below
model = pw_classification.AI4GAmazonRainforest(device="cuda")
results = model.single_image_classification("path/to/image.jpg")
Available loaders:…
Notability: 2.0/10 (minimal traction, new repo)