Abstract:
Persian follows a concatenative morphology, in which new words are formed by chaining morphemes together. Whenever two morphemes combine, the syllables at the morpheme boundary may undergo structural change. This study suggests that these syllabic alterations can be captured using a finite state approach. It further argues that syllabification can be incorporated into the process of lexicon building, so that the syllabification rules are encoded in the lexical knowledge when a lexicon is built using finite state methods. The rules captured here can also assist in processing syllabic alterations at word boundaries, which is particularly useful for processing meter in Persian poetry.
Machine-generated summary:
Building a Syllabic Analyzer for Persian Using Finite State Transducers. Mohammad Amin Mahdavi, Department of Computer Engineering, Imam Khomeini International University, Qazvin, Iran. mahdavi Hresearchattic.
Keywords: Persian language, syllabic alterations, epenthesis, finite state morphology, lexicon, syllabification.
Syllabic alterations in Persian can occur in combination with epenthesis, where additional consonants have to be inserted at the morpheme boundaries.
It is argued in this paper that, if the syllabic alteration rules are incorporated into the lexicon building process using a finite state approach, the resulting transducer is able to suggest the proper syllabification for a given word.
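What "suggesting the proper syllabification" involves can be illustrated with a minimal sketch. It is not the paper's transducer: it is a plain Python function, and it assumes a hypothetical ASCII transliteration (`A` for the long vowel â) and the commonly described CV(C)(C) syllable template, neither of which is taken from the text above. Under those assumptions every syllable starts exactly one consonant before its vowel, so the split is deterministic.

```python
# Hypothetical transliteration: short vowels a, e, o; long vowels A (for â), i, u.
VOWELS = set("aeoAiu")


def syllabify(word):
    """Split a transliterated word into CV(C)(C) syllables.

    Assumes every syllable has exactly one onset consonant, so each
    syllable starts at the consonant immediately before its vowel.
    """
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not nuclei or nuclei[0] != 1:
        raise ValueError("word must begin with one consonant followed by a vowel")
    for prev, nxt in zip(nuclei, nuclei[1:]):
        # Between two vowels: up to two coda consonants plus one onset.
        if not 1 <= nxt - prev - 1 <= 3:
            raise ValueError("no valid CV(C)(C) split between two vowels")
    if len(word) - 1 - nuclei[-1] > 2:
        raise ValueError("final coda is longer than two consonants")
    starts = [i - 1 for i in nuclei] + [len(word)]
    return [word[a:b] for a, b in zip(starts, starts[1:])]


print(syllabify("ketAb"))   # the bare stem
print(syllabify("ketAbi"))  # stem plus a vowel-initial suffix
```

Comparing the two calls shows the kind of alteration at issue: the stem-final consonant closes the last syllable of the bare stem, but becomes the onset of a new syllable once a vowel-initial suffix is attached. An input with two adjacent vowels is rejected, which is the situation in which epenthesis is needed.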
The premises: This study is based on two premises: 1) the phonological structure of Persian syllables may be captured by orthography; 2) syllabic alterations in Persian can only happen at morpheme boundaries.
i.e. the transducer) will have the surface form of the word at the lower level, while capturing the orthographic representation of phonology and syllabic structure at the upper level.
A few tools have been developed to compile extended regular expressions into finite state automata (FSA) and finite state transducers (FST) (Karttunen, Chanod, Grefenstette, & Schiller, 1996; Mohri, 1996; van Noord & Gerdemann, 2001).
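To make the relation between a regular expression and an automaton concrete, here is a hand-written automaton of the kind such tools would produce. It is only a sketch: the expression `C V C? C?` (a CV(C)(C) syllable), the vowel inventory, and the ASCII transliteration with `A` for â are assumptions made for illustration, and none of the cited tools is used.

```python
# Hypothetical transliteration: any character outside VOWELS counts as a consonant.
VOWELS = set("aeoAiu")

# Deterministic automaton for the expression  C V C? C?
# state -> {symbol class -> next state}
TRANSITIONS = {
    0: {"C": 1},  # onset consonant
    1: {"V": 2},  # nucleus
    2: {"C": 3},  # first coda consonant (optional)
    3: {"C": 4},  # second coda consonant (optional)
}
ACCEPTING = {2, 3, 4}


def classify(ch):
    """Map a character to its symbol class, V or C."""
    return "V" if ch in VOWELS else "C"


def accepts(syllable):
    """Return True if the automaton accepts the string as one syllable."""
    state = 0
    for ch in syllable:
        state = TRANSITIONS.get(state, {}).get(classify(ch))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING


for candidate in ["ke", "tAb", "dust", "st", "dustt"]:
    print(candidate, accepts(candidate))
```

The table has one state per position in the template, and the three accepting states correspond to the CV, CVC and CVCC shapes. A transducer extends the same idea by labelling each arc with a pair of symbols, one for the upper level and one for the lower level.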
Role of silent /i/ in syllabic alteration: Every word in Persian must end in either a consonant or a long vowel.
Similarly, the upper level of the transducer in Figure 6 will have to be cascaded with a series of rewrite rules to apply the syllabic alterations at the morpheme boundary.
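A cascade of rewrite rules can be sketched as an ordered list of substitutions, each applied to the output of the previous one. The sketch below is an approximation with Python regular expressions, not compiled transducers. The notation is hypothetical: `+` marks a morpheme boundary, `.` a syllable boundary, `A` stands for â, and `y` is a placeholder for the epenthetic consonant, whose actual choice depends on the morphemes involved.

```python
import re

V = "aeoAiu"  # hypothetical vowel inventory (A stands for â)

# Ordered rewrite rules applied to the upper-level string.
RULES = [
    # 1. Epenthesis: insert a consonant between two vowels that meet
    #    at a morpheme boundary ("y" is only a placeholder).
    (re.compile(rf"([{V}])\+(?=[{V}])"), r"\1+y"),
    # 2. Resyllabification: a morpheme-final consonant becomes the onset
    #    of a following vowel-initial morpheme.
    (re.compile(rf"([^{V}.+])\+(?=[{V}])"), r".\1"),
    # 3. Any remaining morpheme boundary is a plain syllable boundary.
    (re.compile(r"\+"), "."),
]


def cascade(upper):
    """Apply the rewrite rules in order, feeding each result to the next."""
    for pattern, replacement in RULES:
        upper = pattern.sub(replacement, upper)
    return upper


def surface(upper):
    """Drop syllable marks to obtain a lower-level (surface) string."""
    return upper.replace(".", "")


for form in ["ke.tAb+i", "ke.tAb+hA", "dA.nA+i"]:
    result = cascade(form)
    print(form, "->", result, "->", surface(result))
```

The order matters: epenthesis runs first so that the inserted consonant, rather than the boundary itself, separates the two vowels before the remaining boundaries are turned into syllable breaks. In a finite state toolkit each rule would be compiled to a transducer and the cascade would be their composition.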