dc.contributor.author: Vijayaraghavan, P
dc.contributor.author: Roy, D
dc.date.accessioned: 2021-11-02T12:24:29Z
dc.date.available: 2021-11-02T12:24:29Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/1721.1/137065
dc.description.abstract: © Springer Nature Switzerland AG 2020. Recently, generating adversarial examples has become an important means of measuring the robustness of a deep learning model. Adversarial examples help us identify the susceptibilities of the model and further counter those vulnerabilities by applying adversarial training techniques. In the natural language domain, small perturbations in the form of misspellings or paraphrases can drastically change the semantics of the text. We propose a reinforcement learning based approach towards generating adversarial examples in black-box settings. We demonstrate that our method is able to fool well-trained models for (a) the IMDB sentiment classification task and (b) the AG’s news corpus news categorization task with significantly high success rates. We find that the adversarial examples generated are semantics-preserving perturbations to the original text. [en_US]
dc.language.iso: en
dc.publisher: Springer International Publishing [en_US]
dc.relation.isversionof: 10.1007/978-3-030-46147-8_43 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: arXiv [en_US]
dc.title: Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Vijayaraghavan, P and Roy, D. 2020. "Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model." Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11907 LNAI.
dc.contributor.department: Massachusetts Institute of Technology. Media Laboratory
dc.relation.journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2021-07-01T16:55:21Z
dspace.orderedauthors: Vijayaraghavan, P; Roy, D [en_US]
dspace.date.submission: 2021-07-01T16:55:22Z
mit.journal.volume: 11907 LNAI [en_US]
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

