KWJA: Kyoto-Waseda Japanese Analyzer

KWJA is a Japanese language analyzer based on pre-trained language models. KWJA performs many language analysis tasks, including:

Typo correction
Tokenization
Morphological analysis
Named entity recognition
Dependency parsing
Predicate-argument structure (PAS) analysis
Coreference resolution
Discourse relation analysis

Requirements

Python: 3.9+
Dependencies: See pyproject.toml.

Getting Started

Install KWJA with pip:

$ pip install kwja

Perform language analysis with the kwja command (the result is in the KNP format):

# Analyze a text
$ kwja --text "月が綺麗ですね。死んでもいいわ。"

# Analyze a text file
$ kwja --file path/to/file.txt
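
To launch the same two invocations from a script, they can be wrapped with Python's standard subprocess module. This is a minimal sketch, not part of KWJA: the helper names build_kwja_command and run_kwja are hypothetical, only the --text and --file options shown above are used, and the output is assumed to be UTF-8.

```python
import subprocess
from typing import List, Optional


def build_kwja_command(text: Optional[str] = None, file: Optional[str] = None) -> List[str]:
    """Build the argument list for one of the two invocations shown above.

    Hypothetical helper: exactly one of `text` or `file` must be given.
    """
    if (text is None) == (file is None):
        raise ValueError("pass exactly one of text= or file=")
    if text is not None:
        return ["kwja", "--text", text]
    return ["kwja", "--file", file]


def run_kwja(text: Optional[str] = None, file: Optional[str] = None) -> str:
    """Run the kwja command and return its standard output (KNP format)."""
    completed = subprocess.run(
        build_kwja_command(text=text, file=file),
        capture_output=True,
        encoding="utf-8",  # assumption: the command writes UTF-8
        check=True,        # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with Japanese text and punctuation.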

Usage from Python

Make sure the kwja command is on your PATH:

$ which kwja
/path/to/kwja
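
The same check can be done from inside Python before any analysis starts. A minimal sketch using the standard library's shutil.which; the helper name find_command is hypothetical:

```python
import shutil
from typing import Optional


def find_command(name: str = "kwja") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def require_command(name: str = "kwja") -> str:
    """Return the path of `name`, or raise a readable error if it is missing."""
    path = find_command(name)
    if path is None:
        raise FileNotFoundError(f"{name} was not found on PATH; try: pip install {name}")
    return path
```

Calling require_command() early gives a clearer failure than an error raised later from inside the analysis call.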

Install rhoknp:

$ pip install rhoknp

Perform language analysis with a KWJA instance:

from rhoknp import KWJA
kwja = KWJA()
analyzed_document = kwja.apply("月が綺麗ですね。死んでもいいわ。")
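
The returned object can then be inspected. The sketch below assumes that it is a rhoknp Document whose sentences expose morphemes with text, reading, lemma, and pos attributes; check these names against the rhoknp version you install. The helpers morpheme_rows and demo are hypothetical and are not part of either library.

```python
from typing import Any, List, Tuple


def morpheme_rows(document: Any) -> List[Tuple[str, str, str, str]]:
    """Flatten an analyzed document into (surface, reading, lemma, POS) rows.

    Assumes `document.sentences` yields sentences, and each sentence has
    `.morphemes` whose items carry `.text`, `.reading`, `.lemma`, and `.pos`.
    """
    return [
        (m.text, m.reading, m.lemma, m.pos)
        for sentence in document.sentences
        for m in sentence.morphemes
    ]


def demo() -> None:
    """Not called automatically: requires kwja and rhoknp to be installed."""
    # Imported lazily so that morpheme_rows itself has no hard dependency.
    from rhoknp import KWJA

    kwja = KWJA()
    analyzed_document = kwja.apply("月が綺麗ですね。死んでもいいわ。")
    for row in morpheme_rows(analyzed_document):
        print("\t".join(row))
```

Because morpheme_rows only reads attributes, it can be exercised with simple stand-in objects in a test without loading any model.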

Citation

@InProceedings{植田2022,
  author    = {植田 暢大 and 大村 和正 and 児玉 貴志 and 清丸 寛一 and 村脇 有吾 and 河原 大輔 and 黒橋 禎夫},
  title     = {KWJA：汎用言語モデルに基づく日本語解析器},
  booktitle = {第253回自然言語処理研究会},
  year      = {2022},
  address   = {京都},
}
