Learning sentiment and semantic relatedness in user generated content using neural models
Author(s)
Nassif, Henry Michel
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
James Glass and Mitra Mohtarami.
Abstract
Online review platforms and discussion forums are filled with insights that are critical to unlocking the value in user-generated content. In this thesis, we investigate two major Natural Language Processing (NLP) research areas, Aspect-Based Sentiment Analysis (ABSA) and Community Question Answering (cQA) ranking problems, for the purposes of harnessing and understanding the sentiment and semantics expressed in review platforms and discussion forums. Building on recent trends in deep learning, this work applies neural networks to solve these tasks. We design neural-based models, including Convolutional Neural Networks (CNNs) and Long Short-Term Memory Networks (LSTMs), to capture the semantic and sentiment information.
Aspect-Based Sentiment Analysis is concerned with predicting the aspect categories mentioned in a sentence and the sentiments associated with each aspect category. We refer to these tasks as Aspect Category Detection and Aspect Category Sentiment Prediction, respectively. We present a neural-based model with convolutional layers and a Multi-Layer Perceptron (MLP) to address these tasks. The model uses word vector representations generated with word2vec and computes the convolutional vectors of the user-generated reviews. These vectors are then used to predict the aspect categories and their corresponding sentiments. We evaluate the performance of our ABSA models on a restaurant review dataset and show that our results on the aspect category detection and aspect category sentiment prediction tasks outperform the baselines.
The Community Question Answering system is concerned with automatically finding the related questions in an existing set of questions, and finding the relevant answers to a new question. We address these ranking problems, which we respectively refer to as Similar-Question Retrieval and Answer Selection. We present a neural-based model with stacked bidirectional LSTMs and an MLP to address these tasks. The model generates the vector representations of the question-question or question-answer pairs and computes their semantic similarity scores. These scores are then used to rank and predict relevancies. Extensive experiments demonstrate that our cQA models for the question retrieval and answer selection tasks outperform the baselines when enough training data is available.
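As a rough illustration of the ranking step described in the abstract (scoring question-answer pairs by semantic similarity and sorting by that score), here is a minimal Python sketch. It is not the thesis's model: it assumes each text has already been encoded as a fixed-length vector, and it swaps in plain cosine similarity for the learned similarity computed by the stacked bidirectional LSTMs and MLP. The names `cosine_similarity`, `rank_candidates`, and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidates):
    """Return (candidate_id, score) pairs sorted from most to least similar.

    `candidates` maps a candidate id to its vector. In the thesis these
    vectors would come from a trained neural encoder; here they are assumed
    to be given.
    """
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for encoded texts (hypothetical values).
query = [1.0, 0.0, 1.0]
candidates = {
    "a1": [1.0, 0.0, 1.0],
    "a2": [0.0, 1.0, 0.0],
    "a3": [1.0, 1.0, 0.0],
}
ranking = rank_candidates(query, candidates)
```

In this toy setup, `ranking` lists the candidate whose vector points in the same direction as the query first and the orthogonal one last. A learned model would replace both the vectors and the scoring function, but the final sort-by-score step would look much the same.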
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 113-124).
Date issued
2016
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.