Charles J. Fillmore's theory of Frame Semantics posits that the meanings of words and linguistic expressions are largely understood against the background of the semantic frames they evoke in the mind of the hearer. Semantic frames are clusters of concepts, or gestalts, which include a description of the participants in the frame (called frame elements) and a list of words or phrases (lexical units) that evoke the frame. Many frames represent events, like the frame Arriving, which has frame elements such as the Theme, which moves, and the Goal, to which the Theme moves, and which is evoked by expressions like approach, enter, get (to), make it (to), and reach. Other frames represent relations (such as the Locative_relation frame, evoked by expressions like near, over, out (of), and beyond), states (such as Being_in_operation), and entities (such as Clothing and Food). The FrameNet project at the International Computer Science Institute in Berkeley (http://framenet.icsi.berkeley.edu) has been ongoing since 1997. It has defined more than 1,200 such frames covering more than 13,000 lexical units, and has manually annotated more than 200,000 examples of how they are evoked in sentences and which parts of those sentences fill which frame elements (roles). The FrameNet lexical database has been downloaded roughly 5,000 times and is widely used by natural language processing (NLP) researchers and software developers around the world. (See https://framenet.icsi.berkeley.edu/fndrupal/framenet_users for a partial list.)
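
To make the relationship between frames, frame elements, lexical units, and annotation concrete, here is a minimal sketch in Python. It is an illustration only: the class names (Frame, Annotation), the helper functions, and the example sentence are hypothetical and are not FrameNet's actual schema, software, or data. The Arriving frame is reduced to the two frame elements and five lexical units mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical model of a frame: a name, its frame elements, and its lexical units."""
    name: str
    frame_elements: list[str]
    lexical_units: list[str]


@dataclass
class Annotation:
    """Hypothetical model of one annotated sentence.

    `target` is the character span of the expression that evokes the frame;
    `fe_spans` maps each frame element to the span of text that fills it.
    """
    text: str
    frame: Frame
    target: tuple[int, int]
    fe_spans: dict[str, tuple[int, int]] = field(default_factory=dict)

    def filler(self, fe: str) -> str:
        """Return the text that fills the given frame element."""
        start, end = self.fe_spans[fe]
        return self.text[start:end]


def span(text: str, phrase: str) -> tuple[int, int]:
    """Return the (start, end) character span of the first occurrence of phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def frames_evoked_by(lemma: str, frames: list[Frame]) -> list[str]:
    """Return the names of all frames that list the lemma as a lexical unit."""
    return [f.name for f in frames if lemma in f.lexical_units]


# Only the frame elements and lexical units named in the text above are included.
ARRIVING = Frame(
    name="Arriving",
    frame_elements=["Theme", "Goal"],
    lexical_units=["approach", "enter", "get (to)", "make it (to)", "reach"],
)

# An invented example sentence; "reached" is an inflected form of the lexical unit "reach".
sentence = "The hikers reached the summit"
example = Annotation(
    text=sentence,
    frame=ARRIVING,
    target=span(sentence, "reached"),
    fe_spans={
        "Theme": span(sentence, "The hikers"),
        "Goal": span(sentence, "the summit"),
    },
)

print(frames_evoked_by("reach", [ARRIVING]))
print(example.filler("Theme"), "|", example.filler("Goal"))
```

In this sketch, annotation is stored as character spans over the sentence rather than as copied strings, so several frame elements can be recorded against the same text without duplicating it. That is a design choice made for the illustration, not a description of how FrameNet stores its data.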

This project will combine a seminar on Frame Semantics as part of NLP with a set of small projects that make concrete improvements to the FrameNet database and software. Students will learn about both current theories of lexical semantics and the practical realities of an NLP project.

This will be a 2-credit course and will meet twice a week, ideally once in person and once remotely, but this may change depending on the public health situation.

Topics to be covered in the seminar will include:
• A brief overview of linguistics and the work of Charles J. Fillmore (emeritus UC Berkeley)
• Introduction to cognitive linguistics and lexical semantics
• Frame Semantics: Frames, Lexical units (LUs), and Frame Elements (FEs)
• The FrameNet project, its history, and present status
• FrameNet annotation theory and practice
• FrameNet, NLP, and automatic semantic role labeling (ASRL)
• FrameNet applications (such as information extraction and question answering)
• FrameNets in languages other than English and in specialized domains

The projects are intended to be both educational and useful contributions to the development of FrameNet; most will be suitable for a group of 2 or 3 students, but individual projects are also possible. Examples of possible projects are:
• Annotating more sentences for existing, unannotated LUs
• Adding more common senses of certain LUs, e.g. grass
• Correcting known errors in FrameNet annotation
• Filling in more FEs in annotation semiautomatically
• Connecting with FrameNets in specialized domains
• Improving the generation of the XML currently exported
• Creating a JSON and/or graph representation and software to export it
• Improving the annotation tools available for FrameNet
• Making more corpora available for annotation, with a more robust importation system
• Analyzing errors in the output of ASRL systems and how to reduce them

Day-to-day supervisor for this project: Dr. Collin Baker, Staff Researcher

Qualifications: Students from a variety of backgrounds are welcome; familiarity with linguistics, programming, statistics/data science, and more than one language family would be very helpful. If interested, please send email to Collin Baker with the following as PDF documents attached:
• a statement of why you are interested in the project
• your resume/CV (1 or 2 pages)
• a copy of your unofficial transcript from UC Berkeley (not a screen shot, please!)

