Commit e586b21

update readme
1 parent 259f144 commit e586b21

1 file changed: +39 -1 lines changed


README.md

Lines changed: 39 additions & 1 deletion
@@ -1,5 +1,43 @@
# LanguageIdentification.jl

[![docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://guo-yong-zhi.github.io/LanguageIdentification.jl/dev) [![CI](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci.yml/badge.svg)](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci.yml) [![CI-nightly](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci-nightly.yml/badge.svg)](https://github.com/guo-yong-zhi/LanguageIdentification.jl/actions/workflows/ci-nightly.yml) [![codecov](https://codecov.io/gh/guo-yong-zhi/LanguageIdentification.jl/graph/badge.svg?token=lwDSoRUTmH)](https://codecov.io/gh/guo-yong-zhi/LanguageIdentification.jl)
# Installation
```julia
import Pkg; Pkg.add("LanguageIdentification")
```
# Usage
The package must be initialized after it is loaded. The initialization parameters trade off accuracy, speed, and memory usage; see the documentation for details.
```julia
using LanguageIdentification
LanguageIdentification.initialize()
```
`LanguageIdentification.jl` currently supports the identification of 50 languages. The following command lists them; each language is represented by its [ISO 639-3](https://en.wikipedia.org/wiki/ISO_639_macrolanguage) code.
```julia
LanguageIdentification.supported_languages()
```
```julia
["ara", "bel", "ben", "bul", "cat", "ces", "dan", "deu", "ell", "eng", "epo", "fas",
"fin", "fra", "hau", "hbs", "heb", "hin", "hun", "ido", "ina", "isl", "ita", "jpn",
"kab", "kor", "kur", "lat", "lit", "mar", "mkd", "msa", "nds", "nld", "nor", "pol",
"por", "ron", "rus", "slk", "spa", "swa", "swe", "tat", "tgl", "tur", "ukr", "vie",
"yid", "zho"]
```
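
As a rough illustration of how this list might be used, the sketch below checks whether a code is among the supported languages. `is_supported` is a hypothetical helper, not part of the package, and `supported` is a hand-copied excerpt of the list above so that the snippet runs without the package; with the package initialized, `LanguageIdentification.supported_languages()` would presumably supply the full vector.

```julia
# Excerpt of the list above (an assumption made for illustration); with the
# package loaded and initialized, LanguageIdentification.supported_languages()
# would take the place of this literal vector.
supported = ["deu", "eng", "fra", "jpn", "zho"]

# Hypothetical helper (not part of the package): true if `code` is in `codes`.
is_supported(code, codes) = code in codes

is_supported("eng", supported)  # true
is_supported("en", supported)   # false: "en" is a two-letter code, not ISO 639-3
```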
This package provides a simple interface for identifying the language of a given text. It exports two functions:
- `langid`: returns the language code of the given text.
- `langprob`: returns the probability of each language for the given text.
```julia
langid("This is a test.")
```
```julia
"eng"
```
```julia
langprob("这是一个测试。", topk=3)
```
```julia
["zho" => 0.157798836477618,
"mar" => 0.11718444394383595,
"ben" => 0.10440699125820749,]
```
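
Since `langprob` returns code/probability pairs, a caller may want to accept the top result only when it clearly beats the runner-up. The sketch below works on a literal copy of the output above, so it runs without the package. `confident_language` and its `margin` keyword are hypothetical names, not part of `LanguageIdentification.jl`, and the 0.02 default is an arbitrary value chosen for illustration.

```julia
# Literal copy of the `langprob` output shown above.
probs = ["zho" => 0.157798836477618,
         "mar" => 0.11718444394383595,
         "ben" => 0.10440699125820749]

# Hypothetical helper (not part of the package): return the top language code
# only if its probability exceeds the runner-up's by at least `margin`;
# otherwise return `nothing`.
function confident_language(probs; margin=0.02)
    isempty(probs) && return nothing
    # Sort explicitly rather than assume the input is already ordered.
    sorted = sort(probs; by=last, rev=true)
    length(sorted) == 1 && return first(sorted[1])
    gap = last(sorted[1]) - last(sorted[2])
    return gap >= margin ? first(sorted[1]) : nothing
end

confident_language(probs)               # "zho" (gap is about 0.0406)
confident_language(probs; margin=0.05)  # nothing
```

A suitable margin depends on the texts involved and would need to be tuned; short inputs tend to leave less room between the top candidates.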
# Benchmark
todo
