Daniel Zeman: Using Unsupervised Paradigm Acquisition for Prefixes

We describe a simple method of unsupervised morpheme segmentation of words in an unknown language. All that is needed is a raw text corpus (or a list of words) in the given language. The algorithm identifies word parts that occur in many words and interprets them as morpheme candidates (prefixes, stems and suffixes). There are two main phases: /morpheme learning/ and /morpheme segmentation/ proper. In the first phase, we learn morpheme candidates and filter them until we obtain lists of known morphemes. In the second phase, we return to the original words and use the morpheme lists to segment the words into morphemes. In Zeman (2007) we were only able to cut a word into at most two parts: the stem and the suffix. The main innovation over Zeman (2007) is the ability to learn prefixes. We propose two algorithms for prefixes. The “reversed word” method is just the stem-suffix algorithm applied to a reversed word. The “rule-based” method is more conservative: the required properties are specified, and all prefixes complying with these constraints are learned. Two segmentation algorithms have been tested: a strict (precision-oriented) one and a less strict one. The paper reports on more experiments than were included in the main Morpho Challenge competition. The combination of the Zeman (2007) stem-suffix learning, the rule-based prefix learning and the less strict segmentation is currently the most successful one. The resulting F-score of morpheme labeling depends heavily on the language, ranging from 0.23 (Arabic) to 0.50 (English). The error analysis section shows how typos affect the results. The current algorithm cannot use word frequencies and has no means of identifying typos. Numerous examples from the data are shown, and further suggestions for future work are made.

References:

Daniel Zeman. 2007.
/Unsupervised Acquiring of Morphological Paradigms from Tokenized Text./ In: Working Notes for the Cross Language Evaluation Forum (CLEF) 2007 Workshop, Budapest, Hungary. ISSN 1818-8044. Revised version to appear in C. Peters et al. (eds.): CLEF 2007, LNCS 5152, pp. 892–899, Springer-Verlag, Berlin / Heidelberg, Germany, 2008.
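
The “reversed word” method mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `learn_suffixes` is a hypothetical toy learner that merely counts shared word endings, and it stands in for the paradigm acquisition of Zeman (2007) without reproducing it. The parameters `min_count`, `max_len` and `min_stem` are likewise invented here. Only the wrapper `learn_prefixes_reversed` reflects the idea stated in the abstract: reverse every word, run a suffix learner, and reverse the learned endings back.

```python
from collections import Counter


def learn_suffixes(words, min_count=3, max_len=4, min_stem=2):
    """Toy stand-in for a stem-suffix learner (NOT the Zeman 2007 algorithm).

    Keeps every word ending of length 1..max_len that is shared by at least
    min_count distinct words, leaving a stem of at least min_stem characters.
    """
    counts = Counter()
    for word in set(words):
        for k in range(1, max_len + 1):
            if len(word) - k >= min_stem:
                counts[word[-k:]] += 1
    return {suffix for suffix, c in counts.items() if c >= min_count}


def learn_prefixes_reversed(words, **kwargs):
    """Reversed-word trick: learn suffixes of reversed words, then reverse them back."""
    reversed_words = [w[::-1] for w in words]
    return {s[::-1] for s in learn_suffixes(reversed_words, **kwargs)}


# Tiny hypothetical word list, chosen only to exercise the functions.
words = ["undo", "unfit", "unlock", "redo", "refit", "relock"]
prefixes = learn_prefixes_reversed(words, min_count=3)
print(sorted(prefixes))
```

On this toy list the candidates include `un` and `re`, but also the spurious single letters `u` and `r`. A real system would need a filtering step of the kind the abstract describes before such candidates could be treated as known morphemes.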