Clinically-Informed Self-Supervised Learning of Medical Images
Ahmed Alaa, Professor
Electrical Engineering and Computer Science
Closed. This professor is continuing with Fall 2023 apprentices on this project; no new apprentices needed for Spring 2024.
Self-supervised learning (SSL) is a crucial driver of much of the recent progress in vision and language modeling. In the SSL paradigm, a model is pre-trained through a pretext task that involves only unlabeled data; the pre-trained representation is then fine-tuned on downstream tasks of interest, where only a small number of labeled examples may be available. The transferability of pre-trained representations to downstream tasks hinges on the quality of the pretext tasks used for pre-training. Existing SSL approaches for visual representation learning are typically designed for diverse, object-centric data sets, so their pretext tasks aim to learn generic visual features that are not informed by prior (domain-specific) knowledge. In clinical applications, the data of interest are medical images that carry a very strong inductive bias, i.e., all images in a data set capture the same phenomena. This project will investigate, theoretically and empirically, different methodologies for incorporating prior clinical knowledge into self-supervised pre-training of visual representations.
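As a rough illustration of what a pretext task on unlabeled data looks like, the sketch below builds training pairs for rotation prediction, a generic pretext task of the kind described above and not the method this project will develop. The function names, the seed parameter and the tiny 2x2 "images" are all hypothetical and chosen only to keep the example self-contained.

```python
import random


def rotate90(image):
    """Rotate a 2D image (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_pretext_batch(images, seed=0):
    """Turn unlabeled images into (rotated_image, rotation_label) pairs.

    The label k in {0, 1, 2, 3} means the image was rotated k * 90 degrees.
    It is derived from the data itself, so no human annotation is needed.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    batch = []
    for image in images:
        k = rng.randrange(4)
        rotated = [list(row) for row in image]  # copy; leave the input intact
        for _ in range(k):
            rotated = rotate90(rotated)
        batch.append((rotated, k))
    return batch


# Hypothetical unlabeled "images": tiny 2x2 intensity grids.
unlabeled = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
pretext_batch = make_rotation_pretext_batch(unlabeled)
```

A model pre-trained to predict the label k from the rotated image would then be fine-tuned on the labeled downstream task. Rotation is a domain-agnostic choice; a clinically informed pretext task would replace it with one that encodes prior knowledge about the images.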
Role: The undergraduate will develop novel methodologies for self-supervised learning tailored to medical imaging data sets and will be supervised directly by the PI. Students will be expected to meet with their supervisor at least twice a week. They will be trained to conduct literature reviews, engage in scientific writing, formulate practical problems in an abstract form, develop new algorithms and run experiments.
Successful applicants should have a strong background or interest in machine learning, computer vision and/or statistics, and are expected to commit a minimum of 12 hours/week. Experience with Python is required.
Qualifications: Python (required), machine learning (required), computer vision (desirable).
Hours: 12 or more hours per week
Digital Humanities and Data Science; Engineering, Design & Technologies