dc.contributor.advisor: Stuart E. Madnick and John R. Williams. [en_US]
dc.contributor.author: Chuang, Shin Wee, 1978- [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Civil and Environmental Engineering. [en_US]
dc.date.accessioned: 2005-06-02T19:10:12Z
dc.date.available: 2005-06-02T19:10:12Z
dc.date.copyright: 2004 [en_US]
dc.date.issued: 2004 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/17915
dc.description: Thesis (S.M.)--Massachusetts Institute of Technology, Engineering Systems Division, Technology and Policy Program; and, (S.M.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004. [en_US]
dc.description: Includes bibliographical references (p. 76-77). [en_US]
dc.description.abstract: Web wrapping technologies were developed in the 1990s, in the middle of the dot-com boom, to facilitate the extraction of web data. In recent years, the underlying architecture of web wrapping technologies has also been used for other applications, such as information integration between legacy systems in large enterprises. Despite the relatively widespread use of this technology, there is currently no uniform way of characterizing web wrapping toolkits, unlike, say, a digital camera, which can be described in terms of the size of its sensor or its storage capacity. The focus of this thesis is therefore to develop a taxonomy, or classification scheme, that can be used to effectively describe a web wrapping toolkit in terms of its retrieval, extraction and conversion features. For this purpose, some 20 toolkits were studied, and verification tests were performed on the 9 for which evaluation copies were available. The last part of the thesis discusses two policy Acts that are closely related to data extraction: the EU Database Directive and the HR3261 Database and Collection of Information Misappropriation Act. A comparative analysis of the two Acts was performed, and their respective implications for the database-producing industry were examined. [en_US]
dc.description.statementofresponsibility: by Shin Wee Chuang. [en_US]
dc.format.extent: 77 p. [en_US]
dc.format.extent: 6449867 bytes
dc.format.extent: 6449671 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: application/pdf
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582
dc.subject: Technology and Policy Program. [en_US]
dc.subject: Civil and Environmental Engineering. [en_US]
dc.title: A taxonomy and analysis of web wrapping technologies [en_US]
dc.type: Thesis [en_US]
dc.description.degree: S.M. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
dc.contributor.department: Massachusetts Institute of Technology. Engineering Systems Division
dc.contributor.department: Technology and Policy Program
dc.identifier.oclc: 56728256 [en_US]

