Saul Perlmutter, Professor

Closed (1) Machine Learning classification of astronomical transients

Applications for Spring 2022 are now closed for this project.

One of the main challenges in time-domain astronomy is classifying different types of high-energy transients in large datasets based solely on imaging data or sparse spectroscopic observations. As part of this project, the student will use state-of-the-art machine learning techniques to identify different classes of transients in the latest Dark Energy Camera (DECam) and Dark Energy Spectroscopic Instrument (DESI) data. For students interested in aspects of transients beyond classification, this project can also be extended to analyses of interesting transients, for example kilonova candidates, which are expected to emerge from the merger of two neutron stars or of a neutron star and a black hole. Analyses of these transients will be used to infer the physical properties of the mergers and/or to constrain the rate of neutron star mergers in the Universe.
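To give a concrete flavor of the classification task, here is a minimal sketch of a convolutional classifier for transient image cutouts, written in PyTorch. The input shape, network architecture, and class list are illustrative assumptions for this listing, not the project's actual DECam/DESI pipeline.

```python
# Minimal sketch: a small CNN that assigns class probabilities to
# transient image cutouts. The 3-band 51x51 input and the class list
# are invented placeholders, not the actual survey data model.
import torch
import torch.nn as nn

CLASSES = ["SN Ia", "SN II", "kilonova candidate", "artifact"]  # hypothetical

class TransientCNN(nn.Module):
    def __init__(self, n_classes=len(CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(64), nn.ReLU(),  # infers the flattened size
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))

model = TransientCNN()
cutouts = torch.randn(8, 3, 51, 51)   # stand-in for real image cutouts
probs = torch.softmax(model(cutouts), dim=1)
print(probs.shape)                    # torch.Size([8, 4])
```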


Day-to-day supervisor for this project: Antonella Palmese, Postdoc

Qualifications: Proficiency in Python coding. Experience with numpy/scipy/matplotlib/pandas, machine learning, and parallel computing is a plus.

Weekly Hours: to be negotiated

Off-Campus Research Site: Lawrence Berkeley National Lab, Building 50, or Campbell Hall. This project can also be carried out remotely with regular Zoom meetings.


Closed (2) Finding (with Deep Learning) and Modeling Strong Gravitational Lenses

Applications for Spring 2022 are now closed for this project.

Strong gravitational lenses are very rare occurrences, and they are a powerful tool for studying dark matter and dark energy, two mysterious entities that together account for 95% of the energy in the universe. We have found over 2000 strong gravitational lenses in a large imaging survey, the DESI Legacy Surveys (covering 1/3 of the sky, http://legacysurvey.org/), using deep neural networks. Our Hubble Space Telescope program provides images with exquisite detail of these systems, allowing us to construct detailed models of these lenses in order to better understand dark matter and measure the expansion rate of the universe. We have developed a fast GPU-based lens modeling code (the fastest in the world) and will apply it to our systems with Hubble images. There is even a chance we may find a strongly lensed and highly magnified supernova. We expect to find hundreds *more* lenses, and we aim to complete our next search by May 2022. By then we will likely have found 3000 lenses, more than doubling the number of known lenses.
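For a first taste of lens modeling, here is a minimal sketch of the simplest strong-lens model, the singular isothermal sphere (SIS), applying the one-dimensional lens equation beta = theta - theta_E * sign(theta). A production code, like the GPU-based one mentioned above, fits far more flexible mass and light profiles to pixel-level data; this is purely illustrative.

```python
# Minimal sketch: image positions and magnifications for a singular
# isothermal sphere (SIS) lens with Einstein radius theta_E, for a
# point source at angle beta from the lens (all angles in arcsec).

def sis_image_positions(beta, theta_E):
    """Solve the 1D SIS lens equation beta = theta - theta_E*sign(theta)."""
    if abs(beta) < theta_E:
        # Source inside the Einstein radius: two images straddle the lens.
        return [beta + theta_E, beta - theta_E]
    return [beta + theta_E]  # outside: a single image

def sis_magnification(theta, theta_E):
    """Signed magnification of an SIS image at angle theta."""
    return 1.0 / (1.0 - theta_E / abs(theta))

# Example: a source 0.3" from a lens with a 1.0" Einstein radius.
for theta in sis_image_positions(0.3, 1.0):
    print(f"image at {theta:+.2f} arcsec, "
          f"magnification {sis_magnification(theta, 1.0):+.2f}")
```

When the source lies inside the Einstein radius it is multiply imaged; the negative magnification of the second image indicates a parity flip.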

The student will focus on our next search for new lenses, the search for lensed supernovae, or strong lens modeling.

Day-to-day supervisor for this project: Dr. Xiaosheng Huang, Staff Researcher

Qualifications: Proficiency in Python coding is a necessity. Experience with numpy/scipy/matplotlib/pandas and TensorFlow/JAX/PyTorch is highly desirable, and machine learning experience is a plus. Preferred but not required: 1) experience with parallel computing (we use supercomputers at NERSC, https://www.nersc.gov/); 2) experience with training neural nets on GPUs (we train on Google Colab and at NERSC); 3) experience with HDF5/PyTables; 4) knowledge of gravitational lensing, or even some experience modeling gravitational lensing systems; 5) experience with any of the following: modeling galaxy light profiles, image coaddition, and transient detection by image subtraction; 6) experience with web development.

Weekly Hours: 9-11 hrs

Off-Campus Research Site: Lawrence Berkeley National Lab, Building 50. In the current situation this project will likely be carried out remotely with regular Zoom meetings.

Related website: https://sites.google.com/usfca.edu/neuralens

Closed (3) Locating Historical Supernovae

Closed. This professor is continuing with Fall 2021 apprentices on this project; no new apprentices needed for Spring 2022.

Type Ia supernovae are used to measure the rate at which our Universe is expanding, today and over the past 10 billion years. Supernova distances can be measured more accurately if we know about the other stars with which they were born. Large numbers of nearby supernovae have been discovered by amateur astronomers and are still used for such cosmology measurements. But for many, their locations are not known well enough to measure the stars present around them in their parent galaxies. This project seeks to find historical images of such supernovae and measure their locations to the accuracy needed for modern cosmology.

We have a URAP student who has made tremendous progress on this project so far, but she will be graduating soon. We are looking for a student to work with her to learn how to find historical supernova images, process them to determine the locations of known foreground stars, and use that information to determine the supernova's location on the sky. Hunting for historical images involves finding digital images on-line, contacting the amateur astronomers thought to have taken such images, and digitizing images of supernovae from scientific papers. The student would also be asked to help prepare a scientific journal article presenting the results, so that other scientists can use them.
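A key step is the astrometric solution. The sketch below, assuming the astropy package, fits a World Coordinate System (WCS) from foreground stars with known catalog positions and then converts the supernova's measured pixel position to sky coordinates; every coordinate in it is an invented placeholder.

```python
# Minimal sketch of the astrometric step: fit a WCS from stars with
# known catalog positions, then convert the supernova's pixel position
# to RA/Dec. All coordinates below are made-up placeholders.
import numpy as np
from astropy.coordinates import SkyCoord
from astropy.wcs.utils import fit_wcs_from_points

# Pixel positions of matched foreground stars in the historical image...
x = np.array([102.3, 310.8, 87.1, 401.5, 250.0])
y = np.array([55.7, 120.4, 300.2, 280.9, 150.3])

# ...and their catalog sky positions (e.g., from Gaia).
stars = SkyCoord(ra=[150.101, 150.115, 150.089, 150.122, 150.108],
                 dec=[2.201, 2.208, 2.215, 2.219, 2.209], unit="deg")

# Fit a tangent-plane (TAN) projection from the matched pairs.
wcs = fit_wcs_from_points((x, y), stars, projection="TAN")

# Convert the supernova's measured pixel position to sky coordinates.
sn = wcs.pixel_to_world(215.6, 190.2)
print(sn.to_string("hmsdms"))
```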

The learning outcomes include a deeper understanding of supernovae, how they are discovered and used for cosmology, and how new measurements of supernova locations can be disseminated and used by the public.

Day-to-day supervisor for this project: Dr. Greg Aldering

Qualifications: For the early stages of the project, the primary qualification is persistence in tracking down leads to additional supernova images. In parallel, the student will learn how to execute existing on-line software, so they should be comfortable learning on-line scientific applications. As the student learns more, they will need to become comfortable with some software development to measure supernova locations and keep track of the results in an on-line database (currently within AWS). The ability to work in a small team is essential, including a commitment to keeping the team up-to-date on progress throughout the semester.

Weekly Hours: 9-11 hrs

Off-Campus Research Site: Once the student is trained, some of the research can be done off-site since it is mostly on-line.

Most meetings take place at Lawrence Berkeley National Lab, Building 50.

Closed (4) Public Editor: The Citizen Science Solution to Media Misinformation / DemoWatch / Government Archives

Closed. This professor is continuing with Fall 2021 apprentices on this project; no new apprentices needed for Spring 2022.

Public Editor is a collaborative news assessment platform that brings the public together to improve their own media literacy, evaluate the quality of information circulating on the Internet, and share their results with the broader public. Participating students will get first-hand experience building, refining, and launching a national-scale data science project that aims to engage thousands of public volunteers and news readers.

Students on this project will analyze data from Public Editor. Working alongside a national coalition of social science researchers and journalists, a Nobel Laureate, cognitive scientists, and software designers/developers, students will test the robustness of the Public Editor system, fortify it against attacks by trolls, and implement gamification features to ensure volunteers enjoy their experience. Students will create (1) a Jupyter notebook performing validation studies on the system to ensure the accuracy and reliability of users' labels (see the sketch below); (2) Red Team scenarios and solutions to those scenarios; and (3) a codebase for automatically updating users' badges, points, and leaderboard status based upon their activity data.
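As a hint of what the validation notebook in (1) might contain, here is a minimal sketch comparing one volunteer's labels against a small set of expert "gold" labels, reporting raw agreement and the chance-corrected Cohen's kappa. The labels and coding scheme are invented for illustration.

```python
# Minimal sketch of a label-validation check: compare a volunteer's
# labels against expert "gold" labels on the same ten items
# (1 = "contains a reasoning problem", 0 = "no problem"; invented data).
from sklearn.metrics import accuracy_score, cohen_kappa_score

volunteer = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
expert    = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

print("raw agreement:", accuracy_score(expert, volunteer))     # 0.8
print("Cohen's kappa:", cohen_kappa_score(expert, volunteer))  # ~0.62
```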

Public Editor can take up to 12 students.

You can also apply to join the sister project DemoWatch, which identifies common sequences of interaction between protesters and governments and the key decision points that result in violence, peace, and everything in between. The DemoWatch project has collected and is curating over 8,000 news articles describing the interactions between police and protesters during the Occupy movement. This semester, students will work with senior researchers and professors from Goodly Labs, NYU, and the Univ. of Michigan to (1) implement a multi-level time-series model that will analyze curated DemoWatch data to find patterns of peaceful and violent activity; and (2) create a text classifier, via supervised machine learning, that is capable of scanning through news articles about protests to identify important data for analysis. Ideally, the semester will end with (1) a Jupyter notebook that intakes DemoWatch data and outputs data-enriched models of police/protester interaction and (2) a Jupyter notebook that intakes DemoWatch data and creates a text classifier via supervised ML (see the sketch below).
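As a hint of what the supervised text classifier in (2) might look like, here is a minimal scikit-learn sketch: TF-IDF features feeding a logistic regression, trained on a few hand-labeled snippets. The snippets and labels are invented placeholders, not DemoWatch data.

```python
# Minimal sketch of a supervised text classifier: TF-IDF features plus
# logistic regression. The snippets and labels are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Police arrested a dozen demonstrators after midnight.",
    "Organizers and city officials negotiated a permit for the march.",
    "Officers used pepper spray to disperse the crowd.",
    "The rally ended peacefully with speeches in the park.",
]
labels = ["violent", "peaceful", "violent", "peaceful"]  # hand-coded

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["Protesters clashed with riot police downtown."]))
```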

OR

The Research Ready project seeks two students to help improve and maintain archives of government activity that researchers, journalists, and the public can easily query. The project team has already scraped the web for document files while retaining document metadata; identified and extracted meaningful data objects within the documents; linked those objects to external databases; prepared all this compiled textual data for computational analysis in R and Python; and partnered with the Social Science Research Council to host their newly formed database so that the public and other researchers can launch their own studies of the data. Now, we need people with text analysis and data visualization skills to help us improve this data and visualize it for the public and researchers. A successful semester would end with an improved database, and some nifty visualizations that make these tools more compelling for the public.

In your application, please specify which of these three projects you are most interested in (Public Editor, DemoWatch, or Government Archives).


Students will be expected to complete 40 hours of work over the semester and to contribute to system design, data analysis, and reports and presentations of findings and challenges throughout. In doing so, they will work alongside the coalition described above to develop a key social good technology of the future.

We are especially looking for students who are interested in the following roles:

(a.) Community builders – to grow the citizen science community through recruiting, a community forum, social media outreach, etc., and to gauge the efficacy of different recruitment and retention strategies.
(b.) Software engineers to build/improve Public Editor’s Chrome extension, Mobile App, and automated data flow, and add functionality for volunteers, journalists, and readers. Must have relevant JavaScript experience for this role.
(c.) ML/Algorithm Designers – to test and improve Public Editor’s scoring and user reputation algorithms, and build its text classification algorithms. Experience with NLP is desirable.

Day-to-day supervisor for this project: Post-Doc

Qualifications: The applicant should be in good standing. Public Editor is seeking students with experience analyzing data in Python and interest in using Amazon’s cloud computing tools like AWS Lambda. DemoWatch is seeking students with experience and interest in data modeling and supervised ML. Research Ready (Government Archives) is seeking students with experience in text analysis and data visualization.

Weekly Hours: 3-5 hrs

Off-Campus Research Site: remote

Related website: http://publiceditor.io/
Related website: https://www.goodlylabs.org/liberating-archives