MIT Libraries, DSpace@MIT

Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"

Author(s)
Patrick Winston; Genesis; Finlayson, Mark Alan
Download
archive.zip (8.145Mb)
Other Contributors
Genesis
Advisor
Patrick Winston
Terms of use
Creative Commons Attribution 4.0 International http://creativecommons.org/licenses/by/4.0/
Abstract
This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the Journal of Digital Scholarship in the Humanities (DSH), ca. 2016.

The archive contains several types of files. First, it contains the annotation guides that were used to train the annotators. The guides are numbered to match the team numbers in Table 6 of the article. Included are not only the detailed guides for some layers, as produced by the original developers of the specification, but also our synopsis guides for each layer, which served as reference and further training material for the annotators. Also of interest are the general annotator and adjudicator training guides, which outline the procedures the teams followed when conducting annotation. Those organizing their own annotation projects may find this material useful.

Second, the archive contains a comprehensive manifest, in Excel spreadsheet format, listing the word counts, sources, types, and titles (in both Russian and English) of all the texts in the corpus.

Finally, the archive contains the corpus data files themselves, in Story Workbench format, an XML-encoded stand-off annotation scheme. The scheme is described in the file format specification, which is also included in the archive. These files can be parsed with any standard XML parser, or loaded and edited with the Story Workbench annotation tool, which is also freely available.
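Because the corpus files are XML, a first look at one of them needs nothing beyond a general-purpose parser. The following is a minimal sketch in Python that tallies how often each element tag occurs in a document. It deliberately assumes nothing about the Story Workbench schema: the function names and the embedded sample document are hypothetical placeholders, and the actual element names and layer structure should be taken from the file format specification included in the archive.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_text: str) -> Counter:
    """Count how often each element tag occurs in an XML document."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())


def count_tags_in_file(path: str) -> Counter:
    """Same tally, reading the XML from a file on disk."""
    root = ET.parse(path).getroot()
    return Counter(element.tag for element in root.iter())


# Hypothetical stand-in document for illustration only.
# This is NOT the real Story Workbench schema.
SAMPLE = """<doc>
  <layer name="tokens"><item id="1">Once</item><item id="2">upon</item></layer>
  <layer name="notes"><item id="3" ref="1"/></layer>
</doc>"""

if __name__ == "__main__":
    for tag, n in count_tags(SAMPLE).most_common():
        print(tag, n)
```

Running `count_tags_in_file` on a file extracted from the archive would give a quick inventory of the element types it contains, which can then be checked against the specification before writing a schema-aware reader.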
Date issued
2015-12-02
URI
http://hdl.handle.net/1721.1/100054

Collections
  • CSAIL Work Products
