Combining Corpus-Based Features for Selecting Best Natural Language Sentences

Foaad Khosmood and Robert Levinson
International Conference on Machine Learning and Applications
Honolulu, Hawaii

Automated paraphrasing of natural language text has many interesting applications, from aiding in better translations to generating language in a better and more appropriate style. In this paper, we address the problem of picking the best English sentence from a set of machine-generated paraphrases, each designed to express the same content as a human-generated original. We present a system that scores sentences based on examples in large corpora.