NucScholar: Natural Language Processing for Nuclear Science References
Bethany Goldblum, Research Engineer
Nuclear Engineering
Closed. This professor is continuing with Spring 2024 apprentices on this project; no new apprentices needed for Fall 2024.
NucScholar is a software platform in development for the retrieval, categorization, and recommendation of nuclear physics literature. Researchers and nuclear data evaluators currently identify and process bibliographic information through the Nuclear Science References (NSR) database, the starting point for all nuclear structure evaluations and a platform of critical importance to the nuclear data pipeline. However, NSR is limited in capability (it relies on a fixed set of human-derived keywords) and depends heavily on human intelligence tasks (e.g., bibliographic entries are generated manually by subject matter experts), so it is resource- and time-intensive to maintain. NucScholar provides the foundation for a sea change in NSR, using a modern software framework and natural language processing tools to automatically collate and process nuclear science literature. It further expands the volume and variety of bibliographic information available to the nuclear data community without heavy reliance on human intervention.
Role: Responsibilities of this position may include text extraction and mining, named entity recognition, NLP algorithm development, data analysis, website development, UI/UX, etc. The student is required to attend and participate in a weekly research group meeting. This assistantship provides opportunities for authorship of peer-reviewed journal articles. Successful candidates will have a passion for data science and interest in natural language processing for applications.
Qualifications: Required:
Lower Division Physics (7 Series) and math through Math 54; Programming fundamentals
Desired (or what you'll learn):
Upper division undergraduate standing; Proficiency in Python programming; Familiarity with a Linux/Unix environment; Completion of NE101 Nuclear Reactions and Radiation (or equivalent); Experience with methods and tools; Data visualization
Hours: 9-11 hrs
Off-Campus Research Site: Remote
Related website: http://appliedphysics.nuc.berkeley.edu/