A practical machine translation (MT) project built around real Lezgi texts. This hub brings together models, datasets, and a live Telegram bot to support bilingual communication and language technology for a low‑resource language.
Bidirectional Lezgi↔Russian translation with curated parallel data and a deployment-ready model.
Parallel corpora from Lezgi Wikipedia and Lezgi Gazet texts, with Russian translations.
Lezgi has limited digital resources. Open tools help communities, researchers, and educators.
Datasets, training notes, and model cards make the work transparent and reusable.
A Telegram bot makes translation available without installing any software.
Uses the trained NLLB model for sentence translation. You can also forward a message from another chat, and the bot will translate it. In auto mode the bot detects the source language, but very short texts can be misdetected.
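Why short texts are hard to detect can be illustrated with a toy heuristic. This is not the bot's detector, whose method is not described here. It is a minimal sketch that assumes a hypothetical marker list (`LEZGI_MARKERS`), a hypothetical function name (`guess_source_language`), and a hypothetical `min_words` threshold. It treats letter pairs that are common in Lezgi spelling but essentially absent from Russian as evidence for Lezgi.

```python
# Hypothetical marker list: letter pairs typical of Lezgi orthography.
# A real detector would likely use a trained classifier instead.
LEZGI_MARKERS = ("гъ", "къ", "кь", "хъ", "хь", "уь")


def guess_source_language(text, min_words=3):
    """Return "lez", "rus", or "uncertain" for a Cyrillic text."""
    lowered = text.lower()
    hits = sum(lowered.count(marker) for marker in LEZGI_MARKERS)
    if hits > 0:
        return "lez"
    # No Lezgi-specific evidence: only trust "rus" for longer inputs.
    if len(lowered.split()) < min_words:
        return "uncertain"
    return "rus"


print(guess_source_language("Дагълар"))
print(guess_source_language("Перем"))
print(guess_source_language("Я читаю книгу каждый день"))
```

A Lezgi word such as "Перем" contains none of the marker pairs, so a one-word message gives the heuristic nothing to work with. Very short inputs are ambiguous in the same way for any detector.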
Activates when you send a single word. It uses the M. M. Gadzhiev dictionary, with normalization and fuzzy search. For example, the plurals "Горы/Дагълар" (mountains) can still match the entries "Гора/Дагъ" (mountain) after morphological normalization. For Lezgi input, fuzzy suggestions are shown for near misses (e.g., "Пирем" -> "Перем", рубашка, "shirt").
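The lookup order described above (exact match, then normalization, then fuzzy search) can be sketched as follows. This is a minimal illustration under stated assumptions, not the bot's code: `TOY_DICT` stands in for the dictionary data, `PLURAL_SUFFIXES` is a hypothetical and very incomplete list of Lezgi plural endings, and fuzzy matching uses Python's standard `difflib`, which may differ from the bot's method.

```python
import difflib

# Toy stand-in for the dictionary (Lezgi headword -> Russian translation).
TOY_DICT = {
    "дагъ": "гора",
    "перем": "рубашка",
}

# Hypothetical plural endings, longest first; real morphology is richer.
PLURAL_SUFFIXES = ("лар", "лер", "ар", "ер")


def normalize(word):
    """Lowercase the word and strip one plural ending, if present."""
    word = word.strip().lower()
    for suffix in PLURAL_SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def lookup(word, cutoff=0.7):
    """Return (match_type, headword, translation), or None if nothing fits."""
    key = word.strip().lower()
    if key in TOY_DICT:
        return ("exact", key, TOY_DICT[key])
    stem = normalize(word)
    if stem in TOY_DICT:
        return ("normalized", stem, TOY_DICT[stem])
    # Fall back to the closest headword by similarity ratio.
    close = difflib.get_close_matches(key, list(TOY_DICT), n=1, cutoff=cutoff)
    if close:
        return ("fuzzy", close[0], TOY_DICT[close[0]])
    return None


print(lookup("Дагълар"))  # ('normalized', 'дагъ', 'гора')
print(lookup("Пирем"))    # ('fuzzy', 'перем', 'рубашка')
```

Trying the exact match first keeps valid headwords from being over-stripped by the suffix rules. The fuzzy step runs last so it only produces suggestions when nothing else matches.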
Have Lezgi texts to share? Reach out to contribute or improve the dataset.
Report translation issues or domain gaps to help prioritize improvements.
Use the model and dataset to create tools for education, media, and research.