Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder

Vosoughi, Soroush; Vijayaraghavan, Prashanth; Roy, Deb K

dc.contributor.author	Vosoughi, Soroush
dc.contributor.author	Vijayaraghavan, Prashanth
dc.contributor.author	Roy, Deb K
dc.date.accessioned	2016-09-20T14:25:48Z
dc.date.available	2016-09-20T14:25:48Z
dc.date.issued	2016-07
dc.identifier.isbn	9781450340694
dc.identifier.uri	http://hdl.handle.net/1721.1/104352
dc.description.abstract	We present Tweet2Vec, a novel method for generating general-purpose vector representations of tweets. The model learns tweet embeddings using a character-level CNN-LSTM encoder-decoder. We trained our model on 3 million randomly selected English-language tweets. The model was evaluated using two methods: tweet semantic similarity and tweet sentiment categorization, outperforming the previous state-of-the-art in both tasks. The evaluations demonstrate the power of the tweet embeddings generated by our model for various tweet categorization tasks. The vector representations generated by our model are generic, and hence can be applied to a variety of tasks. Though the model presented in this paper is trained on English-language tweets, the method presented can be used to learn tweet embeddings for different languages.	en_US
dc.language.iso	en_US
dc.publisher	Association for Computing Machinery (ACM)	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/2911451.2914762	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	Vosoughi	en_US
dc.title	Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder	en_US
dc.type	Article	en_US
dc.identifier.citation	Vosoughi, Soroush, Prashanth Vijayaraghavan, and Deb Roy. "Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder." Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR ’16, July 17-21, 2016, Pisa, Italy.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Media Laboratory	en_US
dc.contributor.department	Program in Media Arts and Sciences (Massachusetts Institute of Technology)	en_US
dc.contributor.approver	Vosoughi, Soroush	en_US
dc.contributor.mitauthor	Vosoughi, Soroush
dc.contributor.mitauthor	Vijayaraghavan, Prashanth
dc.contributor.mitauthor	Roy, Deb K
dc.relation.journal	Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '16	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dspace.embargo.terms	N	en_US
dc.identifier.orcid	https://orcid.org/0000-0002-2564-8909
dc.identifier.orcid	https://orcid.org/0000-0002-5826-1591
dc.identifier.orcid	https://orcid.org/0000-0002-4333-7194
mit.license	OPEN_ACCESS_POLICY	en_US

Files in this item

Name: tweet2vec_vvr.pdf
Size: 646.3Kb
Format: PDF
Description: Main article

This item appears in the following Collection(s)

MIT Open Access Articles