This benchmark suite compares the performance of several in-memory string search libraries against a dataset of ~31,000 world cities and 2,500 fuzzy search queries.
Library | Description |
---|---|
Autypo | Fast, extensible multi-token fuzzy search engine designed for user input. |
Levenshtypo | Lightweight single-token fuzzy string matcher; the engine that Autypo is built on. |
Lucene (Fuzzy) | Token-based fuzzy search using Lucene’s `FuzzyQuery`. |
Lucene (Suggest) | Fast prefix and fuzzy lookup using Lucene’s `FuzzySuggester`. |
FuzzyWuzzy | .NET port of FuzzyWuzzy for whole-string similarity scoring. |
All benchmarks follow this flow:
- Load ~31,000 cities from a CSV file.
- Index the data (in-memory).
- Perform 2,500 fuzzy searches.
- Time and log the indexing and search durations.
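
The flow above can be sketched as a small generic harness. This is only an illustration of the steps, not the code in the benchmark projects: `BenchmarkSketch`, `Run`, `buildIndex` and `search` are hypothetical names, the three-city array stands in for the CSV load, and the exact-match lambda is a placeholder where each project would call its own library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class BenchmarkSketch
{
    // Times index construction and a batch of queries for one library.
    // buildIndex and search are hypothetical stand-ins for library-specific calls.
    public static (TimeSpan IndexTime, TimeSpan SearchTime, double AvgMsPerSearch, int Matches) Run<TIndex>(
        IReadOnlyList<string> cities,
        IReadOnlyList<string> queries,
        Func<IReadOnlyList<string>, TIndex> buildIndex,
        Func<TIndex, string, IEnumerable<string>> search)
    {
        // Index the data in memory and time it.
        var indexTimer = Stopwatch.StartNew();
        TIndex index = buildIndex(cities);
        indexTimer.Stop();

        // Run every query; counting the results forces lazy enumerables to be evaluated.
        int matches = 0;
        var searchTimer = Stopwatch.StartNew();
        foreach (string query in queries)
        {
            matches += search(index, query).Count();
        }
        searchTimer.Stop();

        double avgMs = queries.Count == 0
            ? 0.0
            : searchTimer.Elapsed.TotalMilliseconds / queries.Count;
        return (indexTimer.Elapsed, searchTimer.Elapsed, avgMs, matches);
    }

    public static void Main()
    {
        // Stand-in data: a real run would load the cities from the CSV file instead.
        string[] cities = { "Paris", "Berlin", "Lisbon" };
        string[] queries = { "paris", "berln" };

        // Exact case-insensitive matching is a placeholder, not a fuzzy matcher.
        var result = Run<List<string>>(
            cities,
            queries,
            list => list.ToList(),
            (index, q) => index.Where(c => string.Equals(c, q, StringComparison.OrdinalIgnoreCase)));

        // Log the indexing and search durations.
        Console.WriteLine($"Index:  {result.IndexTime.TotalMilliseconds:F2} ms");
        Console.WriteLine($"Search: {result.SearchTime.TotalMilliseconds:F2} ms");
        Console.WriteLine($"Avg:    {result.AvgMsPerSearch:F4} ms/search ({result.Matches} matches)");
    }
}
```

Keeping the index and search timers separate matters because the libraries trade one cost for the other: some spend longer building the index to make each lookup cheaper.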
Library | Supports Multi-Token | Index Time | Search Time | Avg Time/Search |
---|---|---|---|---|
Autypo | ✅ Yes | 163 ms | 3.34 s | 1.33 ms |
Levenshtypo | ❌ No | 25 ms | 312 ms | 0.12 ms |
Lucene (Fuzzy) | ✅ Yes | 776 ms | 2.91 s | 1.16 ms |
Lucene (Suggest) | ❌ No | 1.12 s | 503 ms | 0.20 ms |
FuzzyWuzzy | ❌ No | 1 ms | 4m 49s | 92.56 ms |
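
As a rough reading aid, and assuming the last column is simply the total search time divided by the 2,500 queries (the source does not state how it was computed), two of the rows work out as:

$$
\text{Avg Time/Search} = \frac{\text{Search Time}}{2500},
\qquad
\frac{312\ \text{ms}}{2500} \approx 0.12\ \text{ms (Levenshtypo)},
\qquad
\frac{503\ \text{ms}}{2500} \approx 0.20\ \text{ms (Lucene Suggest)}
$$

Under the same assumption, 4m 49s (289 s) over 2,500 queries would come to about 115.6 ms rather than the 92.56 ms listed for FuzzyWuzzy, so that row may have been measured or recorded differently.
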
These benchmarks are intended as indicative performance comparisons and are not meant to represent absolute or universally reliable results.
They were run under consistent conditions but may vary depending on hardware, OS, or runtime versions.
PC specifications have intentionally been omitted to emphasize relative performance trends, not raw numbers.
- AutypoBenchmark/
- LevenshtypoBenchmark/
- LuceneBenchmark/
- LuceneSuggestionsBenchmark/
- FuzzySharpBenchmark/
Each project contains its own `Program.cs` with the benchmark logic, for reproducibility.
To execute all benchmarks, run `.\run.cmd`.
Thanks to GeoNames.org for providing the open dataset of cities used in this demo.
Their free geographic data is an incredible resource for developers building location-aware applications.