BOOK WORLD

Automatic construction of labeled clusters of named entities for IR
Henock Tilahun Teffera


64 pages. 2011.
LAP Lambert Academic Publishing
In this study we harvest labeled clusters of semantically similar named entities, which can serve as a first step for web document clustering. We first collect ~44,000 named entities from a thesaurus constructed by Dekang Lin, who applied a word similarity measure based on the words' distributional patterns. Using these similarity metrics and the CLUTO clustering software, we create 2,000 clusters of semantically similar named entities. We then collect ~305,500 label-instance pairs from the 2007 English Wikipedia dump and implement a labeling algorithm presented by Benjamin Van Durme and M. Pasca (2008) to assign a label to each cluster. This automatic labeling task assigns a label that describes the majority of the named entities in 924 of the clusters, which is 46.2% of all clusters. Finally, we evaluate both the clustering and labeling tasks on 86 randomly selected clusters, on the basis of two native English speaker evaluators' subjective...
 