Show simple item record

dc.contributor.author: Henschel, Andreas
dc.contributor.author: Woon, Wei Lee
dc.contributor.author: Wachter, Thomas
dc.contributor.author: Madnick, Stuart
dc.date.accessioned: 2011-10-24T20:56:46Z
dc.date.available: 2011-10-24T20:56:46Z
dc.date.issued: 2009-09
dc.identifier.uri: http://hdl.handle.net/1721.1/66564
dc.description.abstract: We compare a family of algorithms for the automatic generation of taxonomies by adapting the Heymann algorithm in various ways. The core algorithm determines the generality of terms and iteratively inserts them into a growing taxonomy. Variants of the algorithm are created by altering how, and how often, the generality of terms is calculated. We analyse the performance and the complexity of the variants, combined with a systematic threshold evaluation, on a set of seven manually created benchmark sets. As a result, betweenness centrality calculated on unweighted similarity graphs often performs best but requires threshold fine-tuning and is computationally more expensive than closeness centrality. Finally, we show how an entropy-based filter can lead to more precise taxonomies. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Cambridge, MA; Alfred P. Sloan School of Management, Massachusetts Institute of Technology [en_US]
dc.relation.ispartofseries: MIT Sloan School of Management Working Paper;4758-09
dc.relation.ispartofseries: CISL Working Paper;2009-12
dc.title: Comparison of Generality Based Algorithm Variants for Automatic Taxonomy Generation [en_US]


