Rustling

Rustling is a blazingly fast library for computational linguistics. It is written in Rust, with Python bindings.

Documentation: Python | Rust

Features

Language Models: N-gram language models with smoothing
- MLE: Maximum Likelihood Estimation (no smoothing)
- Lidstone: Lidstone (additive) smoothing
- Laplace: Laplace (add-one) smoothing
Word Segmentation: Models for segmenting unsegmented text into words
- LongestStringMatching: Greedy left-to-right longest match segmenter
- RandomSegmenter: Random baseline segmenter
Part-of-speech Tagging
- AveragedPerceptronTagger: Averaged perceptron tagger
CHAT Parsing: Parser for CHAT transcription files (CHILDES/TalkBank)
- CHAT: Read and query CHAT data from directories, files, strings, or ZIP archives
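
To make two of the techniques named above concrete, here is a minimal pure-Python sketch of additive (Lidstone/Laplace) smoothing for a bigram model and of greedy left-to-right longest matching. It does not use Rustling's API and is not taken from its source: the function names `additive_bigram_prob` and `longest_match_segment`, and the parameters `gamma` and `max_len`, are hypothetical and chosen only for illustration. Real implementations may also differ in details such as padding symbols and how the vocabulary is counted.

```python
from collections import Counter


def additive_bigram_prob(tokens, prev, word, gamma=1.0):
    """Estimate P(word | prev) from a token list with additive smoothing.

    gamma=0 is the unsmoothed MLE estimate, gamma=1 is Laplace (add-one),
    and any other positive gamma is general Lidstone smoothing.
    """
    vocab = set(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])  # how often each token appears as a context
    numerator = bigrams[(prev, word)] + gamma
    denominator = contexts[prev] + gamma * len(vocab)
    # With gamma=0 an unseen context has no probability mass to distribute.
    return numerator / denominator if denominator else 0.0


def longest_match_segment(text, lexicon, max_len=4):
    """Greedy segmentation: at each position, take the longest lexicon entry."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no lexicon entry matches.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


tokens = "the cat sat on the mat".split()
mle = additive_bigram_prob(tokens, "the", "cat", gamma=0.0)
laplace = additive_bigram_prob(tokens, "the", "cat", gamma=1.0)
segments = longest_match_segment("thecatsat", {"the", "cat", "sat", "cats"})
```

In this toy corpus "the" occurs twice as a context and is followed by "cat" once, so the MLE estimate is 1/2, while add-one smoothing over the five-word vocabulary gives 2/7. The segmentation call returns `["the", "cats", "a", "t"]` rather than the intended "the cat sat", which shows the known weakness of greedy matching: taking the longest entry first can leave the rest of the string unsegmentable.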

Performance

Benchmarked against pure Python implementations from NLTK, wordseg (v0.0.5), and pylangacq (v0.19.1). See benchmarks/ for full details and reproduction scripts.

| Component | Task | Speedup | Baseline |
|---|---|---|---|
| Language Models | Fit | 10x | NLTK |
| | Score | 2x | NLTK |
| | Generate | 80–112x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| | RandomSegmenter | 1.1x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| | Tagging | 7x | NLTK |
| CHAT Parsing | from_dir | 55x | pylangacq |
| | from_zip | 48x | pylangacq |
| | from_files | 63x | pylangacq |
| | from_strs | 116x | pylangacq |
| | words() | 3x | pylangacq |
| | utterances() | 15x | pylangacq |

Installation

Python

pip install rustling

Rust

cargo add rustling

License

MIT License
