Rebuilding the spellchecker
This is the table of contents for the “Rebuilding the spellchecker” series (finished in May 2021!), dedicated to explaining how Hunspell, the world’s most popular spellchecker, works, via its Python port, Spylls.
- Introduction
- “Just look in the dictionary, they said!”: Dictionary lookup, pt.1
- “Compounds and solutions”: Dictionary lookup, pt.2 (word compounding)
- Introduction to suggest algorithm
- “Hunspell and the order of edits”: Edit-based suggest
- “Well, akchualy…”: Search for similar words suggest
- “17 (ever so slightly) weird facts about most popular dictionary format”
- “I forgot how to spellcheck”: Summary of it all
This work is referenced by:
- espells is a JS/TS port of Spylls;
- spellbook is a Hunspell-compatible Rust spellchecker that lists the articles above as a resource for the early prototypes;
- LESPELL – A Multi-Lingual Benchmark Corpus of Spelling Errors to Develop Spellchecking Methods for Learner Language (Proceedings of the 13th Conference on Language Resources and Evaluation, LREC 2022, pages 697–706);
- An exploratory investigation of functional variation in South Asian online Englishes (Cambridge University Press, English Language & Linguistics, Volume 28, Issue 2).