Downloads lemmatization dictionaries from the upstream
michmech/lemmatization-lists repository and places them in the
configured Text.Data cache so that Text.Lemma can load them
without further network access.
Useful when you don't want to enable
config :text, auto_download_lemma_data: true (which would let
the package fetch on first lookup) but still want the data
available locally — e.g. as a build step in CI, or to ship the
dictionary alongside a release.
Source data is licensed under the Open Database License (ODbL)
by Michal Boleslav Měchura. The download includes the upstream
license file alongside the per-language .txt files.
Usage
# Download a single language pack.
mix text.download_lemma_data de
# Download several at once.
mix text.download_lemma_data de fr es it pt
# List the languages this task knows about.
mix text.download_lemma_data --list
Options
--force: re-downloads even when the file is already cached.
--list: prints the language codes available upstream and exits without downloading anything.
Where files land
Files are written to Text.Data.data_dir()/lemma/lemmatization-<lang>.txt,
by default ~/.cache/text/lemma/. Override the cache location
with:
config :text, data_dir: "/path/to/cache"
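For the CI use case, it can help to confirm after the download that the files landed where expected. The following is a minimal sketch under that assumption: the helper names lemma_file and lemma_cached are hypothetical and not part of the package, and the path layout is taken from the description above. The cache directory is passed in explicitly, since the script has no access to the value of Text.Data.data_dir().

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): prints the expected cache
# path for one language, following <data_dir>/lemma/lemmatization-<lang>.txt.
lemma_file() {
  printf '%s/lemma/lemmatization-%s.txt\n' "$1" "$2"
}

# Hypothetical helper: succeeds only if every listed language has a file
# in the cache directory given as the first argument.
lemma_cached() {
  cache_dir=$1
  shift
  for code in "$@"; do
    [ -f "$(lemma_file "$cache_dir" "$code")" ] || return 1
  done
  return 0
}

# Example use in a build step (assumes the default cache location):
#   mix text.download_lemma_data de fr
#   lemma_cached "$HOME/.cache/text" de fr || exit 1
```

If data_dir is overridden in the application config, pass that same path to lemma_cached instead of the default.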