DSpace@MIT (MIT Libraries)

Structuring Heterogeneous Real Estate Market Evidence Using LLMs: A Provenance-Aware Analytical Framework

Author(s)
Hahmann, Luca; Xie, Richard
Advisor
Torous, Walter
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Real estate market analysis at early analytical stages relies on evidence drawn from heterogeneous public and semi-public sources, including brokerage research, administrative datasets, planning documents, and narrative commentary. These inputs differ systematically in scope, definition, temporal framing, and institutional construction. Market indicators are frequently consumed through static reports and summary tables that obscure provenance, comparability constraints, and evidentiary gaps. As a result, analytical conclusions often depend on implicit assumptions about how market information is constructed and aligned before formal underwriting or quantitative modeling begins. This thesis develops an evidence-centric framework for structuring and inspecting real estate market information prior to inference, using large language models (LLMs) to translate unstructured report artifacts into structured evidence objects. The framework treats market indicators, narrative claims, and source dependencies as constructed analytical objects. Each observation is represented together with explicit metadata describing geographic scope, temporal reference, definitional disclosure, and upstream data dependencies. This representation supports disciplined comparison across sources and makes uncertainty, non-equivalence, and missing context explicit. The methodological contribution consists of a KPI taxonomy, a context-aware data model, and a layered processing architecture. Observations are preserved with their original construction context and aligned only when comparability conditions are satisfied. Uncertainty is encoded through coverage, recency, and dispersion indicators. Visualization functions as an interface for evidentiary inspection, enabling users to navigate indicators, examine parallel representations, and trace reported values to their sources. The framework is demonstrated through an application to U.S. multifamily market reports. 
A single-source case study illustrates how brokerage reports combine headline KPIs, narrative claims, submarket tables, time-series elements, and transaction summaries within a single artifact. A multi-source case study examines contemporaneous reports for the same market and shows how differences in segmentation logic, measurement conventions, and temporal aggregation shape apparent agreement and disagreement. In both cases, structural non-equivalence remains visible as an analytical feature. The results show that explicit representation of evidentiary structure supports clearer interpretation of market claims independent of predictive modeling. The thesis positions LLM-enabled structuring as foundational infrastructure for real estate market analysis and as a prerequisite for downstream quantitative and causal research. Future work may extend the framework to additional data sources, longitudinal analysis of reporting behavior, and institutional deployment across investment workflows.
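
The evidence-object idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual schema or code: the class name `Observation`, its field names, the helpers `non_equivalence` and `uncertainty`, and the specific definitions of coverage, recency, and dispersion used here are all assumptions chosen for illustration. The sketch only shows how an observation might carry its construction context, how a comparability check might report reasons for non-equivalence instead of silently aligning values, and how simple uncertainty indicators might be computed.

```python
from dataclasses import dataclass
from datetime import date
from statistics import pstdev
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Observation:
    """One reported indicator value plus its construction context (hypothetical schema)."""
    kpi: str                      # indicator name, e.g. "vacancy_rate"
    value: float
    geography: str                # geographic scope as stated by the source
    period_end: date              # temporal reference of the value
    definition: Optional[str]     # definitional disclosure; None if undisclosed
    source: str                   # report or dataset the value came from
    upstream: Tuple[str, ...] = ()  # upstream data dependencies, if disclosed


def non_equivalence(a: Observation, b: Observation) -> List[str]:
    """Return the reasons two observations are not directly comparable.

    An empty list means the (assumed) comparability conditions are satisfied.
    """
    reasons = []
    if a.kpi != b.kpi:
        reasons.append("kpi")
    if a.geography != b.geography:
        reasons.append("geography")
    if a.period_end != b.period_end:
        reasons.append("period")
    if a.definition is None or b.definition is None:
        reasons.append("definition undisclosed")
    elif a.definition != b.definition:
        reasons.append("definition")
    return reasons


def uncertainty(obs: List[Observation], expected_sources: int, as_of: date) -> Dict[str, float]:
    """Compute illustrative coverage, recency, and dispersion indicators.

    Assumed definitions: coverage is the share of expected sources that report
    the indicator, recency is the age in days of the newest observation, and
    dispersion is the population standard deviation of the reported values.
    """
    values = [o.value for o in obs]
    return {
        "coverage": len({o.source for o in obs}) / expected_sources,
        "recency_days": float(min((as_of - o.period_end).days for o in obs)),
        "dispersion": pstdev(values) if len(values) > 1 else 0.0,
    }
```

In this sketch, two observations would be aligned only when `non_equivalence` returns an empty list; otherwise the returned reasons stay attached to the pair, so that the non-equivalence remains visible to the analyst.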
Date issued
2026-02
URI
https://hdl.handle.net/1721.1/165539
Department
Massachusetts Institute of Technology. Center for Real Estate. Program in Real Estate Development.
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
