ABSTRACT
This paper describes the application of the two-level morphology formalism to Basque and its use in building the XUXEN spelling checker/corrector. The application is intended to cover a large part of the language.

Because Basque is a highly inflected language, spelling checking and correction have been conceived as a by-product of a general-purpose morphological analyzer/generator. This analyzer serves as a basic tool for current and future work on the automatic processing of Basque.

An extension to continuation class specifications is proposed to deal with long-distance dependencies. It consists basically of two features added to the standard formalism that allow the lexicon builder to make the interdependencies between morphemes explicit.

User lexicons can be interactively enriched with new entries, enabling the checker from then on to recognize all the possible inflected forms derived from them.

Because the language was standardized late, writers do not always know the standard form to use and make errors. These "typical errors" are treated specifically by describing them within the two-level lexicon system. In this sense, XUXEN is intended as a useful tool for the standardization of present-day written Basque.
XUXEN: a spelling checker/corrector for Basque based on two-level morphology