Reading and annotating novels (in English, Spanish, Russian, German or Japanese) for NLP
David Bamman, Professor
Information, School of
Closed. This professor is continuing with Fall 2023 apprentices on this project; no new apprentices needed for Spring 2024.
LitBank is an annotated dataset of fiction to support tasks in natural language processing and the computational humanities. While it currently exists for English, we'll be branching out to create similar resources for other languages as well. The primary research will involve carrying out linguistic annotations (e.g., reading novels and marking the people and places contained within them), and exploring the affordances of this data for work in cultural analytics. This URAP is for students who are interested in the field of cultural analytics (using empirical methods to study culture) and come from a humanities/social science background.
Role: Research will involve annotating novels for the linguistic phenomena described above. We will contextualize this research by reading and discussing papers in cultural analytics. Participation in biweekly one-hour group meetings to discuss progress and questions is required.
Qualifications: A background in humanities or social sciences, and an interest in the digital humanities/cultural analytics, is preferred. No programming experience is required, but this research can be a stepping stone for students looking to carry out future computational research in this space.
Hours: 9-11 hrs
Off-Campus Research Site: Online
Engineering, Design & Technologies; Arts & Humanities; Digital Humanities and Data Science