A generalized solution to verify authorship and detect style change in multi-authored documents
Author(s)
Leekha, Rohan; Vandam, Courtland
Download: 3625007.3627589.pdf (921.7Kb)
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
Identifying changes in style can be used to detect multi-authored social media accounts, plagiarism, compromised accounts, and author contributions in long documents. We propose an approach to recognize changes in authorship using large language models. Our approach leverages sentence-level contextual embeddings and semantic relationships. First, we expand the training set by adding adversarial examples to the minority class [5], [13], [17]. Then, we fine-tune a sequence classification transformer model to detect style change. Our approach outperforms all PAN21 baselines, with macro F1-scores of 0.80, 0.74, and 0.70 for detecting style change points between paragraphs, identifying the author of each paragraph from a closed set, and detecting style change points between sentences, respectively. It also performs better than the leading competitors in PAN22. In addition, we achieve a five percent improvement in macro F1-score (0.78) on the newly introduced DarkReddit+ dataset for authorship verification.
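The results above are reported as macro F1-scores. As a reading aid, the following is a minimal sketch of how a macro F1-score is commonly computed: the F1-score of each class is calculated separately and the per-class scores are averaged without weighting, so a rare "style change" class counts as much as the majority class. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper or its evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1-scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical): 1 = style change, 0 = no change.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
score = macro_f1(y_true, y_pred)
```

In this toy case the majority class has an F1-score of 0.8 and the minority class 2/3, so the macro average sits between the two and is pulled down by the weaker minority-class score.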
Description
ASONAM '23, November 6–9, 2023, Kusadasi, Turkiye
Date issued
2023-11-06
Department
Lincoln Laboratory
Publisher
ACM
Citation
Leekha, Rohan and Vandam, Courtland. 2023. "A generalized solution to verify authorship and detect style change in multi-authored documents."
Version: Final published version
ISBN
979-8-4007-0409-3