microsoft/tgrep
Rust
Description: Trigram-indexed grep with a client/server architecture for fast regex search in large codebases locally
Language: Rust
License: MIT
Stars: 18
Forks: 4
Open issues: 2
Created: 2026-04-03T15:50:37Z
Pushed: 2026-06-12T02:45:33Z
Default branch: main
Fork: no
Archived: no
README:
tgrep
Trigram-indexed grep with a client/server architecture for fast regex search in large codebases.
Why?
Tools like grep and ripgrep scan every file on every search — O(total bytes) per query. In a 100k+ file monorepo, that's painfully slow. tgrep pre-builds a trigram index so searches only touch the small set of files that could match.
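To make the idea concrete, here is a minimal sketch of a trigram index, assuming byte-level trigrams and small integer file ids. The names `TrigramIndex`, `build`, and `candidates` are illustrative and are not taken from tgrep's source.

```rust
use std::collections::{BTreeSet, HashMap};

/// A trigram is any three consecutive bytes of a file or a query literal.
type Trigram = [u8; 3];

/// Collect the distinct trigrams of `text`.
fn trigrams(text: &[u8]) -> BTreeSet<Trigram> {
    text.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Inverted index: trigram -> ascending list of ids of files containing it.
/// (Hypothetical structure for illustration only.)
struct TrigramIndex {
    postings: HashMap<Trigram, Vec<u32>>,
    file_count: u32,
}

impl TrigramIndex {
    fn build(files: &[&str]) -> Self {
        let mut postings: HashMap<Trigram, Vec<u32>> = HashMap::new();
        for (id, content) in files.iter().enumerate() {
            for t in trigrams(content.as_bytes()) {
                // Files are visited in increasing id order, so each list stays sorted.
                postings.entry(t).or_default().push(id as u32);
            }
        }
        Self { postings, file_count: files.len() as u32 }
    }

    /// Files that contain every trigram of `literal`. This is a superset of the
    /// true matches, so each candidate still has to be scanned by the matcher.
    fn candidates(&self, literal: &str) -> Vec<u32> {
        let wanted = trigrams(literal.as_bytes());
        if wanted.is_empty() {
            // Fewer than three bytes: the index cannot narrow anything down.
            return (0..self.file_count).collect();
        }
        let mut result: Option<Vec<u32>> = None;
        for t in &wanted {
            let list: &[u32] = match self.postings.get(t) {
                Some(v) => v,
                None => &[],
            };
            result = Some(match result {
                None => list.to_vec(),
                Some(mut acc) => {
                    acc.retain(|id| list.binary_search(id).is_ok());
                    acc
                }
            });
        }
        result.unwrap_or_default()
    }
}
```

Only the files returned by `candidates` need to be opened and scanned, which is where the savings over a full scan come from.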
Start a server once, search instantly forever.
    tgrep index .        # build the trigram index
    tgrep serve .        # start server (watches for file changes)
    tgrep "fn main" .    # instant: auto-connects to running server
See [full benchmark results](BENCHMARKS.md) — up to 72x faster than ripgrep on large repos.
Benchmark highlights (avg latency per query, index pre-built)
| Repo | Files | Platform | ripgrep | tgrep | Speedup |
| --- | ---: | --- | ---: | ---: | ---: |
| chromium | 496K | macOS arm64 | 61,110ms | 2,630ms | 23x |
| chromium | 496K | Windows | 29,557ms | 2,491ms | 12x |
| gecko-dev | 388K | macOS arm64 | 35,413ms | 492ms | 72x |
| gecko-dev | 388K | Windows | 16,199ms | 310ms | 52x |
| gecko-dev | 388K | Linux | 1,931ms | 170ms | 11x |
| linux | 94K | Windows | 4,317ms | 934ms | 5x |
| rust | 59K | Windows | 1,989ms | 215ms | 9x |
| kubernetes | 30K | Windows | 1,489ms | 178ms | 8x |
| go | 15K | Windows | 450ms | 70ms | 6x |
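The Speedup column is consistent with the ripgrep latency divided by the tgrep latency, rounded to the nearest whole number. That rounding rule is an inference from the figures, not something the README states. A short sketch of the arithmetic:

```rust
/// Speedup as it appears in the table: baseline latency over tgrep latency,
/// rounded to the nearest integer (assumed rounding rule).
fn speedup(ripgrep_ms: f64, tgrep_ms: f64) -> u32 {
    (ripgrep_ms / tgrep_ms).round() as u32
}
```

For the gecko-dev row on macOS arm64, `speedup(35413.0, 492.0)` evaluates 35,413 / 492 ≈ 71.98 and rounds it to 72.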
Architecture
    tgrep ---TCP---> tgrep serve (multi-client)
    (client)              |
                     HybridIndex
                      /       \
              IndexReader    LiveIndex
              (mmap disk)    (in-memory overlay)
                   ^              ^
                   |              |
            Periodic Flush   File Watcher (notify)
            (50K files /     Background Indexer
             5 min)          (rayon parallel)
- IndexReader: mmap'd on-disk index (zero-copy, binary search on a sorted
  trigram lookup table)
- LiveIndex: in-memory overlay for files modified after server start, or
  still being built by the background indexer
- HybridIndex: merges both layers; the overlay takes precedence
- Background Indexer: builds the index in parallel batches of 500 files
  using rayon; queries are served immediately from partial data
- Periodic Flush: every 50K files or 5 minutes, the in-memory index is
  flushed to disk and the reader is swapped, keeping memory bounded
- File Watcher: the `notify` crate watches the repo and updates LiveIndex in
  real time
- TCP Server: JSON-RPC 2.0 over newline-delimited TCP; each connection is
  handled in a separate thread, so multiple clients can connect simultaneously
- File Cache: 50K-entry content cache behind an `RwLock`, so concurrent
  readers do not block one another
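The layering of IndexReader, LiveIndex, and HybridIndex can be sketched as follows. This is a simplified model under stated assumptions: the on-disk table is stood in for by a sorted `Vec` rather than a memory-mapped file, and the field and method names (`table`, `touched`, `reindex`, `lookup`) are invented for illustration.

```rust
use std::collections::{HashMap, HashSet};

type Trigram = [u8; 3];

/// Stand-in for the mmap'd on-disk index: a lookup table sorted by trigram
/// and searched with binary search. A real reader would borrow from the mapping.
struct IndexReader {
    table: Vec<(Trigram, Vec<u32>)>, // must be sorted by trigram
}

impl IndexReader {
    fn lookup(&self, t: &Trigram) -> &[u32] {
        match self.table.binary_search_by(|(key, _)| key.cmp(t)) {
            Ok(i) => self.table[i].1.as_slice(),
            Err(_) => &[],
        }
    }
}

/// In-memory overlay for files changed since the on-disk index was written.
#[derive(Default)]
struct LiveIndex {
    postings: HashMap<Trigram, Vec<u32>>,
    touched: HashSet<u32>, // files whose on-disk entries are stale
}

impl LiveIndex {
    fn reindex(&mut self, file: u32, content: &[u8]) {
        // Drop earlier overlay entries for this file, then add fresh ones.
        for list in self.postings.values_mut() {
            list.retain(|f| *f != file);
        }
        self.touched.insert(file);
        let unique: HashSet<Trigram> =
            content.windows(3).map(|w| [w[0], w[1], w[2]]).collect();
        for t in unique {
            self.postings.entry(t).or_default().push(file);
        }
    }
}

struct HybridIndex {
    disk: IndexReader,
    live: LiveIndex,
}

impl HybridIndex {
    /// Files containing `t`. Disk entries for files present in the overlay are
    /// ignored, which is one way to let the overlay take precedence.
    fn lookup(&self, t: &Trigram) -> Vec<u32> {
        let mut out: Vec<u32> = self
            .disk
            .lookup(t)
            .iter()
            .copied()
            .filter(|f| !self.live.touched.contains(f))
            .collect();
        if let Some(fresh) = self.live.postings.get(t) {
            out.extend(fresh);
        }
        out.sort_unstable();
        out.dedup();
        out
    }
}
```

Masking stale disk entries by file id means a modified file never has to be removed from the on-disk table in place; the next flush can simply write a fresh table.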
Performance
tgrep is designed to be significantly faster than ripgrep on large repos:
- Parallel search: candidate files are searched in parallel using rayon
- Fast query planning: sorted posting lists are intersected/unioned without
  unnecessary resorting, and on-disk posting lists skip redundant deduplication
- Memory-efficient full builds: index builds batch extraction and stream
  sorted postings, file entries, and lookup entries instead of retaining the
  full inverted index in memory
- Smart file walking: extension-based binary rejection (50+ formats),
  8KB content check, 1MB file size limit
- Concurrent reads: the `RwLock`-guarded cache lets many readers proceed at
  once without blocking one another
- Hot serving: queries work immediately during background index building;
  no need to wait for the full index
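Intersecting and unioning sorted posting lists without resorting is typically done with a two-pointer merge. The sketch below shows that general technique for ascending, duplicate-free lists of file ids; it is not taken from tgrep's query planner.

```rust
use std::cmp::Ordering;

/// Intersect two ascending, duplicate-free posting lists in O(a + b).
/// The output is already sorted, so no resorting is needed.
fn intersect_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len().min(b.len()));
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    out
}

/// Union two ascending, duplicate-free posting lists in O(a + b).
/// Equal ids are emitted once, so the output needs no deduplication pass.
fn union_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => {
                out.push(a[i]);
                i += 1;
            }
            Ordering::Greater => {
                out.push(b[j]);
                j += 1;
            }
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    // At most one of these tails is non-empty.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}
```

Because both operations preserve sort order, their results can be fed straight into further intersections or unions.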
See [BENCHMARKS.md](BENCHMARKS.md) for end-to-end large-repo benchmarks and Criterion microbenchmarks for query execution, trigram extraction, and index building.
Usage
Build the index
    tgrep index .                                          # index current directory
    tgrep index /path/to/repo                              # index a specific repo
    tgrep index . --index-path /tmp/idx                    # custom index location
    tgrep index . --exclude vendor --exclude third_party   # skip directories
Start the server
    tgrep serve .                          # start server (auto-builds index if missing)
    tgrep serve . --index-path /tmp/idx    # custom index location
    tgrep serve . --no-watch               # skip file watcher (saves memory)
    tgrep serve . --exclude node_modules   # exclude directories from indexing
The server builds the index in the background if none exists, and serves queries immediately from partial data. Multiple clients can connect simultaneously.
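The architecture notes describe the wire protocol as JSON-RPC 2.0 over newline-delimited TCP. A client for that kind of framing might look roughly like the sketch below. The `search` method name and the `pattern` parameter are hypothetical, since the README does not document the actual RPC methods, and the JSON is assembled by hand only to keep the example free of external crates.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpStream, ToSocketAddrs};

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one JSON-RPC 2.0 request terminated by a single newline.
/// "search" and "pattern" are hypothetical names, not tgrep's documented API.
fn request_line(id: u64, pattern: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"search\",\"params\":{{\"pattern\":\"{}\"}}}}\n",
        id,
        json_escape(pattern)
    )
}

/// Send one request line and read back one newline-terminated response.
fn call<A: ToSocketAddrs>(addr: A, line: &str) -> io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(line.as_bytes())?;
    let mut response = String::new();
    BufReader::new(stream).read_line(&mut response)?;
    Ok(response)
}
```

Escaping embedded newlines matters with this framing: a raw newline inside the pattern would split one request into two lines.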
Search
    tgrep "pattern" .                         # basic regex search
    tgrep "pattern" file1.rs file2.rs         # search multiple files/paths
    tgrep "TODO|FIXME" .                      # alternations
    tgrep '\w+(?!_test)' .                    # PCRE-style lookahead fallback
    tgrep "error" . -i                        # case-insensitive
    tgrep "error" . -S                        # smart-case (auto if all lowercase)
    tgrep -F "Vec" .                          # literal string
    tgrep "MyStruct" . -l                     # filenames only
    tgrep "pattern" . -c                      # count per file
    tgrep "pattern" . -o                      # only matching text
    tgrep "pattern" . -w                      # whole word
    tgrep "pattern" . -v                      # invert match
    tgrep "pattern" . -m 5                    # max 5 matches per file
    tgrep "pattern" . -g "*.rs"               # glob filter
    tgrep "pattern" . -g "*.rs" -g "*.toml"   # multiple globs (OR)
    tgrep "pattern" . -t rust                 # type filter
    tgrep "pattern" . -e "also_this"          # multiple patterns
    tgrep "pattern" . -A 3                    # 3 lines after match
    tgrep "pattern" . -B 2                    # 2 lines before match
    tgrep "pattern" . -C 3                    # 3 lines before & after
    tgrep "pattern" . --json                  # JSON output
    tgrep "pattern" . --vimgrep               # vim-compatible output
    tgrep "pattern" . --stats                 # show query plan & timing
    tgrep "pattern" . --no-index              # brute-force (skip index)
    tgrep "pattern" . -U                      # multiline matching
    tgrep "pattern" . -q                      # quiet: exit code only
    tgrep "pattern" . -L                      # files that DON'T match
    tgrep "pattern" . --no-filename           # suppress filenames
    tgrep "pattern" . -N                      # suppress line numbers
    tgrep --files .                           # list searchable files
    tgrep --files src/main.rs                 # list a single file if searchable
    tgrep --files -t rust .                   # list Rust files only
    tgrep --type-list                         # show all file types
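For a pattern such as `"TODO|FIXME"`, a trigram index can still help: each branch contributes an AND of its trigrams, and the branches are ORed together. The sketch below illustrates that idea for alternations of plain literals only. The `Plan` type and function names are hypothetical, the output format of `--stats` is not shown in the README, and a real planner would also have to handle regex metacharacters and case folding.

```rust
type Trigram = [u8; 3];

/// A hypothetical query plan over the trigram index.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Every listed trigram must be present in a candidate file.
    And(Vec<Trigram>),
    /// At least one sub-plan must hold.
    Or(Vec<Plan>),
    /// The index cannot narrow the search; every file is a candidate.
    All,
}

/// Plan for one literal: require all of its distinct trigrams.
fn plan_literal(lit: &str) -> Plan {
    let mut grams: Vec<Trigram> = lit
        .as_bytes()
        .windows(3)
        .map(|w| [w[0], w[1], w[2]])
        .collect();
    grams.sort_unstable();
    grams.dedup();
    if grams.is_empty() {
        Plan::All // shorter than three bytes
    } else {
        Plan::And(grams)
    }
}

/// Plan for `a|b|c`, assuming each branch is a plain literal.
fn plan_alternation(pattern: &str) -> Plan {
    let mut branches: Vec<Plan> = pattern.split('|').map(plan_literal).collect();
    // One unindexable branch makes the whole alternation unindexable.
    if branches.contains(&Plan::All) {
        return Plan::All;
    }
    if branches.len() == 1 {
        branches.remove(0)
    } else {
        Plan::Or(branches)
    }
}
```

A plan of `All` corresponds to the situation where the index offers no pruning and a full scan is the only option.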
Check status
tgrep status .
    Server status for /src/my-monorepo
    PID: 37980
    Port: 51043
    Files: 152
    Trigrams: 12265
    Cache: 2/50000
    Watcher: active
    Indexing: complete
Count files
    tgrep count-files .               # count text files (no server needed)
    tgrep count-files /path/to/repo   # scan a specific repo
Prints the count to stdout (scriptable) and details to stderr:
    284957
    284957 text files (47516 binary skipped, 0 errors) in...
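The split between text files and skipped binaries presumably follows the file-walking rules listed under Performance: extension-based rejection, an 8KB content check, and a 1MB size limit. A rough sketch of such a classifier is shown below. The extension list is a small invented sample, the constant and function names are hypothetical, and treating a NUL byte in the first 8KB as the binary signal is an assumption about what the content check does.

```rust
use std::path::Path;

const MAX_FILE_SIZE: u64 = 1024 * 1024; // 1MB size limit (assumed to mean 1 MiB)
const SNIFF_LEN: usize = 8 * 1024; // 8KB content check

/// A tiny illustrative sample; the README mentions 50+ formats.
const BINARY_EXTENSIONS: &[&str] = &["png", "jpg", "gif", "pdf", "zip", "exe", "so", "o"];

/// Decide whether a file should be treated as searchable text.
/// Assumption: the content check looks for a NUL byte in the first 8KB.
fn is_searchable(path: &Path, size: u64, content: &[u8]) -> bool {
    // Cheapest checks first: file size, then extension, then content.
    if size > MAX_FILE_SIZE {
        return false;
    }
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    if let Some(ext) = ext {
        if BINARY_EXTENSIONS.contains(&ext.as_str()) {
            return false;
        }
    }
    let head = &content[..content.len().min(SNIFF_LEN)];
    !head.contains(&0)
}
```

Ordering the checks from cheapest to most expensive means most binaries are rejected without reading any file content.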
Notability: 3.0/10. Routine repo from Microsoft with minimal traction.