1 | 1 | # LanguageIdentification.jl |
| 2 | + |
| 3 | +[Docs](https://guo-yong-zhi.github.io/LanguageIdentification.jl/dev) [CI](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci.yml) [CI nightly](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci-nightly.yml) [Codecov](https://codecov.io/gh/guo-yong-zhi/LanguageIdentification.jl) |
2 | 4 | # Installation |
3 | 5 | ```julia |
4 | 6 | import Pkg; Pkg.add("LanguageIdentification") |
5 | | -``` |
| 7 | +``` |
| 8 | +# Usage |
| 9 | +After loading the package, you must initialize it before use. The initialization parameters trade off accuracy, speed, and memory usage; see the documentation for details.
| 10 | +```julia |
| 11 | +using LanguageIdentification |
| 12 | +LanguageIdentification.initialize() |
| 13 | +``` |
| 14 | +Currently, `LanguageIdentification.jl` can identify 50 languages. The following command lists them; each language is represented by its [ISO 639-3](https://en.wikipedia.org/wiki/ISO_639_macrolanguage) code.
| 15 | +```julia |
| 16 | +LanguageIdentification.supported_languages() |
| 17 | +``` |
| 18 | +```julia |
| 19 | + ["ara", "bel", "ben", "bul", "cat", "ces", "dan", "deu", "ell", "eng", "epo", "fas", |
| 20 | + "fin", "fra", "hau", "hbs", "heb", "hin", "hun", "ido", "ina", "isl", "ita", "jpn", |
| 21 | + "kab", "kor", "kur", "lat", "lit", "mar", "mkd", "msa", "nds", "nld", "nor", "pol", |
| 22 | + "por", "ron", "rus", "slk", "spa", "swa", "swe", "tat", "tgl", "tur", "ukr", "vie", |
| 23 | + "yid", "zho"] |
| 24 | +``` |
| 25 | +This package provides a simple interface for identifying the language of a given text. It exports two functions:
| 26 | +- `langid`: returns the language code of the input text.
| 27 | +- `langprob`: returns the probability of each language for the input text.
| 28 | +```julia |
| 29 | +langid("This is a test.") |
| 30 | +``` |
| 31 | +```julia |
| 32 | +"eng" |
| 33 | +``` |
| 34 | +```julia |
| 35 | +langprob("这是一个测试。", topk=3) |
| 36 | +``` |
| 37 | +```julia |
| 38 | +["zho" => 0.157798836477618, |
| 39 | +"mar" => 0.11718444394383595, |
| 40 | +"ben" => 0.10440699125820749,] |
| 41 | +``` |
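In the `langprob` output above, even the top candidate has a modest probability (about 0.16 for `"zho"`), so callers may want to reject low-confidence results. The sketch below is one possible way to do that. `confident_langid`, its `threshold` keyword, and the `"und"` fallback (the ISO 639-3 code for "undetermined") are hypothetical names introduced here for illustration; they are not part of `LanguageIdentification.jl`. The helper only assumes its input has the same shape as the `langprob` output shown above, so it runs without the package loaded.

```julia
# Hypothetical helper, NOT part of LanguageIdentification.jl.
# `probs` is assumed to look like the `langprob` output shown above:
# a vector of `language code => probability` pairs.
function confident_langid(probs::AbstractVector{<:Pair};
                          threshold::Real=0.5, fallback::AbstractString="und")
    # Nothing to choose from: report "undetermined".
    isempty(probs) && return fallback
    # Index of the pair with the highest probability.
    i = argmax(map(last, probs))
    code, p = probs[i]
    # Accept the best candidate only if it is confident enough.
    return p >= threshold ? code : fallback
end

# Values copied from the `langprob` example above.
probs = ["zho" => 0.157798836477618,
         "mar" => 0.11718444394383595,
         "ben" => 0.10440699125820749]

confident_langid(probs; threshold=0.1)  # "zho"
confident_langid(probs; threshold=0.5)  # "und"
```

With the package initialized, the same helper could presumably be applied directly to the result of `langprob(text, topk=3)`. A suitable threshold depends on the text length and the initialization parameters, so the value 0.5 here is only a placeholder.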
| 42 | +# Benchmark |
| 43 | +todo |