Analysis of the ORTHOTEL Corpus: The Contribution of Automatic Treatment to the Classification of Spelling Errors

Jun 30, 2018 Scientific research & Postgraduate Studies, ICT Engineering

Authors	Véronique Aubergé, Nada Ghneim, Rahia Belrhali
Published in	Langue française, 124(1):90-103, December 1999

Abstract

This study presents organized statistical data extracted from ORTHOTEL, a large corpus of 15,000 forms submitted by Minitel users who were unsure of a word's spelling. The corpus was processed automatically to isolate and analyse the errors. Half of the forms in the corpus are correctly spelled, which indicates the users' degree of linguistic insecurity. An automatic text-to-phone system applied to the misspelled forms shows that a large proportion are homophones of a correct word from a reference lexicon of 80,000 canonical forms. An alignment algorithm then classifies the orthographic transformations that account for deviations from the reference lexicon.
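
The abstract does not describe the alignment algorithm itself, so the following is only an illustrative sketch of the general idea, assuming a plain character-level edit-distance (Levenshtein) alignment between an erroneous form and its reference form. The function names `align` and `classify`, the operation labels, and the example words are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (operation, segment in erroneous form, segment in reference form)
Edit = Tuple[str, str, str]


def align(error: str, reference: str) -> List[Edit]:
    """Align `error` against `reference` with unit-cost edit distance.

    Assumption: a generic Levenshtein alignment, not the paper's algorithm.
    """
    n, m = len(error), len(reference)
    # cost[i][j] = edit distance between error[:i] and reference[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if error[i - 1] == reference[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match or substitution
                cost[i - 1][j] + 1,        # extra character in the erroneous form
                cost[i][j - 1] + 1,        # character missing from the erroneous form
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if error[i - 1] == reference[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                label = "match" if sub == 0 else "substitution"
                edits.append((label, error[i - 1], reference[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            edits.append(("insertion", error[i - 1], ""))
            i -= 1
        else:
            edits.append(("omission", "", reference[j - 1]))
            j -= 1
    edits.reverse()
    return edits


def classify(error: str, reference: str) -> List[Edit]:
    """Keep only the transformations that deviate from the reference form."""
    return [edit for edit in align(error, reference) if edit[0] != "match"]
```

Under these assumptions, `classify("apeler", "appeler")` reports a single omission of `p`, and counting such labelled transformations over a whole corpus would give the kind of error statistics the abstract refers to. A real classification of French spelling errors would likely need to work on multi-letter graphemes and phonetic context rather than single characters.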

Link to read full paper

https://doi.org/10.3406/lfr.1999.6308
