If this is your first time downloading the Hutmegs evaluation package, please give us some information about yourself and the system you plan to use it on. (If you have already provided this information, just click the "submit form and download package" button.)
We collect this information so that we can inform you about updates. You will only receive email from us if you check the appropriate checkbox. Thank you for your time.
The Helsinki University of Technology Morphological Evaluation Gold Standard package, Hutmegs, is a collection of files (Makefiles, Perl scripts and sample data files) and documentation. The scripts and makefiles have been tested on a Linux operating system. Hutmegs is free to use for non-commercial purposes as long as a reference is made to the source of the material and the corresponding technical report is cited in scientific work.
The Finnish gold-standard segmentations are based on the FINTWOL analyzer, which is a commercial product. To obtain the complete Finnish Gold Standard, a missing component must be licensed from Lingsoft, Inc. The missing component is a file containing the FINTWOL analyses for the word forms included in the Gold Standard. This list of FINTWOL analyses has the product name LS Hutmegs-Fintwol (version 1.0), and the one-time license fee for non-commercial activities is 600 euros (as of March 2005). To purchase this component, contact Lingsoft by e-mail: info@lingsoft.fi. If the component is not purchased, the user will have access to all Hutmegs scripts and documentation, but only to a sample Gold Standard containing the analyses of no more than 700 Finnish word forms.
Likewise, the CELEX database is a prerequisite for accessing the complete English Gold Standard. Two CELEX files describing English morphology must be available in order to generate the English gold-standard segmentations. Non-commercial licenses are available from the Linguistic Data Consortium. The CELEX database is described on the following page: http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC96L14. The non-member price is 150 US dollars (as of March 2005). The Hutmegs package provides sample gold-standard segmentations for roughly 600 English word forms, which can be viewed without access to the CELEX database.
Page maintained by morpho at aalto.fi, last updated Tue Jul 14 15:12:07 2015