
dc.contributor.author: Platnick, Daniel
dc.contributor.author: Alirezaie, Marjan
dc.contributor.author: Rahnama, Hossein
dc.date.accessioned: 2025-01-10T20:56:01Z
dc.date.available: 2025-01-10T20:56:01Z
dc.date.issued: 2024-12-02
dc.identifier.uri: https://hdl.handle.net/1721.1/157953
dc.description.abstract: This paper advances contextual image understanding within perspective-aware Ai (PAi), an emerging paradigm in human–computer interaction that enables users to perceive and interact through each other’s perspectives. While PAi relies on multimodal data (such as text, audio, and images), challenges in data collection, alignment, and privacy have led us to focus on enabling the contextual understanding of images. To achieve this, we developed perspective-aware scene graph generation with LLM post-processing (PASGG-LM). This framework extends traditional scene graph generation (SGG) by incorporating large language models (LLMs) to enhance contextual understanding. PASGG-LM integrates classical scene graph outputs with LLM post-processing to infer richer contextual information, such as emotions, activities, and social contexts. To test PASGG-LM, we introduce the context-aware scene graph generation task, where the goal is to generate a context-aware situation graph describing the input image. We evaluated PASGG-LM pipelines using state-of-the-art SGG models, including Motifs, Motifs-TDE, and RelTR, and showed that fine-tuning LLMs, particularly GPT-4o-mini and Llama-3.1-8B, improves performance in terms of R@K, mR@K, and mAP. Our method is capable of generating scene graphs that capture complex contextual aspects, advancing human–machine interaction by enhancing the representation of diverse perspectives. Future directions include refining contextual scene graph models and expanding multimodal data integration for PAi applications in domains such as healthcare, education, and social robotics. [en_US]
dc.publisher: Multidisciplinary Digital Publishing Institute [en_US]
dc.relation.isversionof: http://dx.doi.org/10.3390/info15120766 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Multidisciplinary Digital Publishing Institute [en_US]
dc.title: Enabling Perspective-Aware Ai with Contextual Scene Graph Generation [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Platnick, D.; Alirezaie, M.; Rahnama, H. Enabling Perspective-Aware Ai with Contextual Scene Graph Generation. Information 2024, 15, 766. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Media Laboratory [en_US]
dc.relation.journal: Information [en_US]
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2024-12-27T14:02:43Z
dspace.date.submission: 2024-12-27T14:02:43Z
mit.journal.volume: 15 [en_US]
mit.journal.issue: 12 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

