Multimodal generative models for storytelling
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Jacob Andreas and Hendrik Strobelt.
Storytelling is an open-ended task that entails creative thinking and requires a constant flow of ideas. Generative models have recently gained momentum thanks to their ability to identify the inner structure of complex data and to learn efficiently from unlabeled data. Natural language generation (NLG) for storytelling is especially challenging because the generated text must follow an overall theme while remaining creative and diverse enough to engage the reader. Competitive story generation models still suffer from repetition, are unable to consistently condition on a theme, and struggle to produce a grounded, evolving storyboard. Published story visualization architectures that generate images require descriptive text depicting the scene to be illustrated. It therefore seems promising to evaluate an interactive multimodal generative platform that collaborates with writers to tackle the complex story-generation task. With co-creation, writers contribute their creative thinking, while generative models support their constant workflow. In this work, we introduce a system and a web-based demo, FairyTailor, for machine-in-the-loop visual story co-creation. Users can create a cohesive children's story by weaving generated texts and retrieved images together with their own input. FairyTailor adds another modality and modifies the text generation process to produce a coherent and creative sequence of text and images. To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-creation of both texts and images. It allows users to give feedback on co-created stories and share their results. We release the demo source code for other researchers' use.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 41-45).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.