Maschinenlesbare Textkorpora im Japanischen. Erstellung und Nutzung [Machine-Readable Text Corpora in Japanese: Creation and Use]

Narrog, Heiko

Electronic text corpora play a major role in literary and linguistic research on European languages. This has not yet been the case for Japanese, in part because the complexity of the writing system has been a major obstacle to processing Japanese text by computer. Recently, however, a number of major corpora have been published, especially in the fields of classical literature, newspapers, and language data collections supporting computational linguistics. This article gives a critical overview of representative corpora in the first two of these fields and discusses the options for creating one’s own electronic texts. It also introduces software for analyzing text corpora. The descriptions of the corpora and programs include information on availability and price.