Rule-Based Natural Language Processing Methods   Ozlem Aktas and Yalcin CEBI

176 pages. 2010.
LAP Lambert Academic Publishing
This book describes methods developed for rule-based natural language processing of Turkish and briefly explains an infrastructure, Rule-Based Automatical Corpus Generation (RB-CorGen), implemented to apply them. To test RB-CorGen on Turkish, the roots, stems and suffixes were obtained in cooperation with the Turkish Linguistic Association (Turk Dil Kurumu, TDK) and Dokuz Eylul University, College of Literature, Department of Linguistics; the defined tags and grammatical rules were stored in an XML file; and documents containing nearly 95 million word forms were collected in electronic form from five Turkish newspapers. Three new methods, Rule-Based Sentence Boundary Detection (RB-SBD), Rule-Based Morphological Analyser (RB-MA) and Rule-Based POS Tagging (RB-POST), were developed and analysed. The success rates of these methods were observed to increase as the number of rules grew.
