dc.contributor.author: Woon, Wei Lee
dc.contributor.author: Madnick, Stuart E.
dc.contributor.author: Firat, Ayse
dc.contributor.author: Ziegler, Blaine
dc.contributor.author: Seshasai, Satwik
dc.date.accessioned: 2016-06-02T15:43:28Z
dc.date.available: 2016-06-02T15:43:28Z
dc.date.issued: 2009-04
dc.identifier.uri: http://hdl.handle.net/1721.1/102838
dc.description.abstract: The planning and management of research and development is a challenging process, compounded by the large amount of information that is available. The goal of this project is to mine science and technology databases for patterns and trends that facilitate the formation of research strategies. The information sources we exploit are diverse and include academic journals, patents, blogs and news stories. The intended outputs of the project include growth forecasts for various technological sectors (with an emphasis on sustainable energy), an improved understanding of the underlying research landscape, and the identification of influential researchers or research groups. This paper focuses on the development of techniques to both organize and visualize the data in a way that reflects the semantic relationships between keywords. We studied the use of the joint term frequencies of pairs of keywords as a means of characterizing this semantic relationship; this is based on the intuition that terms which frequently appear together are more likely to be closely related. Among the results reported herein: (1) using appropriate tools and methods, exploitable patterns and information can be extracted from publicly available databases; (2) an adaptation of the Normalized Google Distance (NGD) formalism can provide measures of keyword distance that facilitate keyword clustering and hierarchical visualization; (3) a further adaptation of the NGD formalism can provide an asymmetric measure of keyword distance, allowing the automatic creation of a keyword taxonomy; and (4) an adaptation of the Latent Semantic Approach (LSA) can be used to identify concepts underlying collections of keywords.
dc.language.iso: en_US
dc.publisher: Massachusetts Institute of Technology. Engineering Systems Division
dc.relation.ispartofseries: ESD Working Papers;ESD-WP-2009-04
dc.title: Technology Forecasting Using Data Mining and Semantics: First Annual Report
dc.type: Working Paper
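
The abstract refers to keyword distances derived from joint term frequencies via the Normalized Google Distance (NGD). As a rough illustration only, the sketch below computes the standard, unadapted NGD from hit counts; the adaptations developed in the report itself are not described in this record and are not reproduced here. The function name `ngd` and all counts in the usage note are hypothetical.

```python
import math


def ngd(fx, fy, fxy, n):
    """Standard Normalized Google Distance between two terms.

    fx, fy : number of documents containing each term on its own
    fxy    : number of documents containing both terms
    n      : total number of documents indexed (assumed known)

    This is the textbook formula, not the adapted measure the report
    develops. Terms that never co-occur get an infinite distance.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom == 0:
        # Both terms appear in every document, so they are indistinguishable.
        return 0.0
    return (max(lx, ly) - lxy) / denom
```

For example, with hypothetical counts `ngd(100, 50, 25, 1000)` evaluates to log(4)/log(20), roughly 0.46. A smaller value means the two keywords appear together more often relative to how often each appears alone, which matches the intuition stated in the abstract.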
