
Technology Forecasting Using Data Mining and Semantics: First Annual Report

Author(s)
Woon, Wei Lee; Madnick, Stuart E.; Firat, Ayse; Ziegler, Blaine; Seshasai, Satwik
Download: esd-wp-2009-04.pdf (2.216Mb)
Abstract
The planning and management of research and development is a challenging process, made more difficult by the large amount of information available. The goal of this project is to mine science and technology databases for patterns and trends that facilitate the formation of research strategies. The information sources we exploit are diverse and include academic journals, patents, blogs and news stories. The intended outputs of the project include growth forecasts for various technological sectors (with an emphasis on sustainable energy), an improved understanding of the underlying research landscape, and the identification of influential researchers or research groups. This paper focuses on the development of techniques to organize and visualize the data in a way that reflects the semantic relationships between keywords. We studied the use of the joint term frequencies of pairs of keywords as a means of characterizing this semantic relationship; this is based on the intuition that terms which frequently appear together are more likely to be closely related. Some of the results reported herein show that: (1) with appropriate tools and methods, exploitable patterns and information can be extracted from publicly available databases; (2) an adaptation of the Normalized Google Distance (NGD) formalism can provide measures of keyword distance that facilitate keyword clustering and hierarchical visualization; (3) a further adaptation of the NGD formalism can provide an asymmetric measure of keyword distance that allows the automatic creation of a keyword taxonomy; and (4) an adaptation of the Latent Semantic Approach (LSA) can be used to identify concepts underlying collections of keywords.
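
To make the co-occurrence intuition concrete, the sketch below computes the standard Normalized Google Distance from hit counts. It is not the adapted measure developed in the paper, whose details are not given in this abstract. The function name `ngd` and all counts in the usage note are hypothetical: `fx` and `fy` are the numbers of documents containing each keyword, `fxy` is the number containing both, and `n` is the total number of documents indexed.

```python
import math


def ngd(fx: int, fy: int, fxy: int, n: int) -> float:
    """Standard Normalized Google Distance between two keywords.

    fx, fy : number of documents containing each keyword (hypothetical inputs)
    fxy    : number of documents containing both keywords
    n      : total number of documents in the database
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each keyword must occur in at least one document")
    if n <= min(fx, fy):
        raise ValueError("n must exceed the smaller of the two term counts")
    if fxy <= 0:
        # Keywords that never co-occur are treated as infinitely far apart.
        return math.inf

    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator
```

With made-up counts, `ngd(1000, 100, 10, 100000)` evaluates to about 0.667, while two keywords that always appear together, as in `ngd(100, 100, 100, 10000)`, are at distance 0. A matrix of such pairwise distances is the kind of input a clustering or hierarchical visualization step could take.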
Date issued
2009-04
URI
http://hdl.handle.net/1721.1/102838
Publisher
Massachusetts Institute of Technology. Engineering Systems Division
Series/Report no.
ESD Working Papers; ESD-WP-2009-04

Collections
  • Engineering Systems Division (ESD) Working Paper Series

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.