Language Models for Particle Detectors
Haichen Wang, Professor
Physics
Applications for Spring 2025 are closed for this project.
Particle detectors like the ATLAS detector at the Large Hadron Collider are complex apparatuses whose "language" consists of data recorded by sub-detectors and sophisticated readout modules. Inspired by the revolution that large language models have brought to natural language processing, this project ultimately aims to develop one or more language models at different scales that can understand the detector vocabulary and translate detector readouts into physics objects such as particle tracks, electrons, etc.
Role: Specifically, this project involves:
1. Learn the basics of particle detectors and language models
2. Perform data processing on simulation data
3. Train language models and evaluate their performance
The above steps are expected to be carried out roughly in order. Discussing results with researchers at LBNL and actively learning advanced Python programming and ML techniques are important parts of this work; related materials will be made available.
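To give a flavor of the data-processing step, here is a minimal sketch of turning simulated detector hits into a token "vocabulary" that a language model could consume. The toy geometry (a few layers, a handful of phi bins) and all function names are hypothetical illustrations, not the project's actual pipeline:

```python
import numpy as np

# Toy detector granularity (hypothetical): 4 layers, 16 azimuthal bins.
N_LAYERS, N_PHI_BINS = 4, 16

def hits_to_tokens(layers, phis):
    """Quantize (layer, phi) hits into integer tokens, one per detector cell.

    The resulting vocabulary has N_LAYERS * N_PHI_BINS tokens, so a
    sequence of hits becomes a sequence of token ids, analogous to words.
    """
    phi_bins = np.floor((phis % (2 * np.pi)) / (2 * np.pi) * N_PHI_BINS).astype(int)
    return layers * N_PHI_BINS + phi_bins

# Simulate one "event": a straight track crossing every layer near phi = 1.0,
# with a little measurement noise.
rng = np.random.default_rng(0)
layers = np.arange(N_LAYERS)
phis = 1.0 + rng.normal(0.0, 0.01, size=N_LAYERS)
tokens = hits_to_tokens(layers, phis)
print(tokens.tolist())  # → [2, 18, 34, 50]
```

In a real workflow the simulation output would carry far richer information (energy deposits, timing, sub-detector identifiers), but the idea is the same: discretize readouts into a finite vocabulary so that sequence models can be trained on them.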
Qualifications: Applicants should be interested in particle physics and should have completed coursework in data structures and programming. Experience with machine learning programming is a strong plus, and familiarity with data analysis using NumPy and Pandas is also desired. A physics major is helpful but not required; we encourage EECS students to apply.
Hours: 9-11 hrs
Off-Campus Research Site: Our default mode of operation will be virtual, meeting on Zoom weekly and communicating via Slack and email. This research project will be directed by Dr. Xiangyang Ju of Lawrence Berkeley National Lab.
Mathematical and Physical Sciences