Using Roget’s Thesaurus to Determine the Similarity of Texts
Jeremy Ellman

228 pages. 2010.
LAP Lambert Academic Publishing
This thesis addresses the problem of extracting a representation of a text's meaning from its content. The solution investigated uses Roget's Thesaurus as an external knowledge source and can be applied to texts of any length or complexity. The resulting document representation can then be compared with others, giving a new method for assessing text similarity. All coherent texts contain embedded sequences of words that are related in meaning. These sequences can be detected by identifying simple relationships between the thesaural entries in which the words are found. Identifying the initial sequences drives the addition of further related words, forming conceptually related “lexical chains”. Every coherent text contains many lexical chains of different lengths and strengths, which may be used to represent its broad subject matter. By identifying the key concept of each chain, and relating this to its presence we may...
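
The idea outlined in the abstract can be illustrated with a minimal sketch. This is not the method from the thesis: the thesaurus below is a hypothetical toy dictionary with invented category labels (not Roget's real categories), a "chain" is simply the set of words sharing a category, chain strength is taken to be chain length, and cosine similarity is an assumed choice of comparison measure.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy thesaurus: word -> set of category labels.
# Entries are invented for illustration; they are not Roget's categories.
TOY_THESAURUS = {
    "ship": {"vessel", "travel"},
    "boat": {"vessel"},
    "harbour": {"vessel", "place"},
    "voyage": {"travel"},
    "bank": {"money", "place"},
    "loan": {"money"},
    "interest": {"money", "feeling"},
}


def build_chains(words):
    """Group words that share a thesaurus category, one chain per category."""
    chains = defaultdict(list)
    for word in words:
        for category in TOY_THESAURUS.get(word.lower(), ()):
            chains[category].append(word.lower())
    # A single word is not a chain, so keep categories with two or more words.
    return {cat: ws for cat, ws in chains.items() if len(ws) >= 2}


def profile(words):
    """Represent a text as a mapping: category -> chain length (assumed strength)."""
    return {cat: len(ws) for cat, ws in build_chains(words).items()}


def cosine(p, q):
    """Cosine similarity between two sparse category profiles."""
    dot = sum(weight * q.get(cat, 0) for cat, weight in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


text_a = "the ship left the harbour on a long voyage".split()
text_b = "a boat and a ship in the harbour".split()
text_c = "the bank charged interest on the loan".split()

sim_ab = cosine(profile(text_a), profile(text_b))  # texts share the "vessel" chain
sim_ac = cosine(profile(text_a), profile(text_c))  # no shared chain categories
```

In this toy setup the two nautical texts share a chain category and score above zero, while the nautical and financial texts share none and score zero. A real system would also need word sense disambiguation and a richer notion of chain strength, neither of which is attempted here.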