Maryam Vareth, Researcher

Closed (1) Medical Imaging Research: k-space MRI Reconstruction Using Deep Learning

Closed. This professor is continuing with Fall 2020 apprentices on this project; no new apprentices needed for Spring 2021.

One of the fastest growing fields of research in medical imaging during the last several years is the use of machine learning methods for image reconstruction.

This project aims to use deep learning approaches in image reconstruction to accelerate Magnetic Resonance Imaging (MRI) acquisition and, as a result, reduce MRI examination times for patients. Two of the most influential developments in this area during the last two decades have been parallel imaging and compressed sensing. Both of these rapid imaging techniques are based on the principle of reducing the number of lines acquired in k-space, which reduces the scan time, and then exploiting the redundancy in the measured data during image reconstruction. In parallel imaging, the redundancy arises from the simultaneous acquisition of the MR signal with multiple receive coils; in compressed sensing, it derives from the observation that images are generally compressible. Machine learning approaches have adopted similar strategies for the acceleration of MRI, and these strategies set the main design criteria for this project. The majority of existing work has focused on designing better reconstruction models given a predetermined acquisition trajectory, ignoring the question of trajectory optimization. In this project, we also focus on learning acquisition trajectories given a fixed image reconstruction model.
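To illustrate the idea of acquiring fewer k-space lines, the following is a minimal NumPy sketch, not the project's actual pipeline. It assumes a single-coil, 2D Cartesian acquisition and uses a synthetic rectangle in place of real scanner data. The function names, the acceleration factor of 4, and the fully sampled center fraction are illustrative assumptions. The reconstruction shown is a plain zero-filled inverse FFT, which is the baseline that learned reconstruction models try to improve on.

```python
import numpy as np


def undersample_mask(num_lines, acceleration=4, center_fraction=0.08, seed=0):
    """Random 1D mask over phase-encode lines (hypothetical helper).

    Each line is kept with probability 1/acceleration, and a small block of
    low-frequency lines around the center of k-space is always kept.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(num_lines) < 1.0 / acceleration
    num_center = int(round(num_lines * center_fraction))
    start = (num_lines - num_center) // 2
    mask[start:start + num_center] = True
    return mask


def zero_filled_recon(image, mask):
    """Simulate k-space, drop the unsampled lines, and invert with zeros."""
    # Centered 2D FFT: image space -> k-space
    kspace = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
    # Keep only the sampled rows (phase-encode lines); others become zero
    kspace_us = kspace * mask[:, None]
    # Centered inverse 2D FFT: k-space -> image space
    recon = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace_us)))
    return np.abs(recon)


# Synthetic test object standing in for an MR image (assumption)
image = np.zeros((128, 128))
image[32:96, 48:80] = 1.0

mask = undersample_mask(128)
recon = zero_filled_recon(image, mask)
full = zero_filled_recon(image, np.ones(128, dtype=bool))

print("lines acquired:", int(mask.sum()), "of", mask.size)
print("mean abs error (undersampled):", float(np.abs(recon - image).mean()))
```

With a fully sampled mask the round trip recovers the input image, while the undersampled reconstruction shows aliasing artifacts. Learning the acquisition trajectory, in this simplified picture, would amount to choosing which entries of `mask` are set rather than drawing them at random.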

To make the image reconstruction problem realistic, we will use a large-scale database of raw (complex-valued) k-space data obtained directly from MRI scanners (https://fastmri.med.nyu.edu/).

The specific selection of tasks will depend on the skill sets and interests of the students and could include developing, implementing, refining, and testing algorithms and workflows to achieve the specific goals of this project. The students will work in teams and closely with graduate students and post-docs.

Possible Tasks:
• Re-implementation and/or standardization of existing approaches
• Writing modules for validation
• Generic testing and debugging
• Presenting work at group meetings
• Formal presentation at the end of the semester to the BIDS community
• Contributing to a manuscript, given successful progress

References
• fastMRI: An Open Dataset and Benchmarks for Accelerated MRI (https://arxiv.org/pdf/1811.08839.pdf)
• https://github.com/facebookresearch/fastMRI
• https://github.com/mrirecon/bart


Qualifications: Students from various majors are encouraged to apply, including but not limited to EECS, BioE, CS, and Data Science. We are looking for 2 highly inquisitive students who have:
(Required)
• Interest in open source software development, data science, medical imaging, machine learning, engineering, and healthcare research
• Great teamwork (e.g. communication skills, punctuality, organization)
• Proficiency in programming languages (Python and/or MATLAB)
• Working knowledge of Tensorflow/Keras or Pytorch
• Working knowledge of version control (e.g. GitHub)
• Familiarity with Linux/Unix environment
(Recommended)
• Working knowledge of basic machine learning and deep learning (cost function, cross-validation, overfitting, error analysis, etc.)
• Working knowledge of signal processing and image processing
• CS 188 and/or CS 189
• EE 120 and/or EE 145B

Weekly Hours: 9-11 hrs

Off-Campus Research Site: During the Fall 2020 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: https://bids.berkeley.edu/
Related website: https://innovateforhealth.berkeley.edu/

Closed (2) Developing an open-source software package for generating synthetic electronic health record data

Closed. This professor is continuing with Fall 2020 apprentices on this project; no new apprentices needed for Spring 2021.

Research access to electronic health record (EHR) data is limited due to patient privacy concerns. Creating synthetic EHR data (data that models realistic patterns and yet does not correspond to real patient records) provides a potential mechanism to expand data access.

Although the academic and commercial sectors have developed successful methodologies for generating realistic synthetic EHR data, these methodologies are not in common use despite a great need. Commercial products are closed source and expensive. Academic solutions are sometimes open source but often buggy, unportable, and difficult to use, especially for clinical users. Furthermore, different generators are implemented in different languages and packages, making direct comparison and benchmarking laborious. Finally, validation of the realism and privacy-preserving properties of generated synthetic datasets is often not incorporated into the generation pipeline.

This project aims to create a portable, usable, consolidated, and open-source software package for generating synthetic EHR data. The student will gain knowledge of open-source software, generative unsupervised machine learning techniques such as generative adversarial networks, and user-interface design, as well as exposure to EHR data and healthcare data science. The student will be expected to present their work at the end of the semester and may have the opportunity to contribute to a manuscript describing the developed software package.

The specific selection of tasks will depend on the skill sets and interests of the student and could include the following:

• Re-implementation and/or standardization of existing synthetic EHR generators, for example:
• MedGAN (Choi et al. 2017, https://github.com/mp2893/medgan)
• CorGAN (Torfi et al. 2020, https://github.com/astorfi/cor-gan)
• Writing modules for validation of synthetic data realism and privacy-preserving properties
• Developing a command line interface or dashboard for users to interact with the package
• Generic testing and debugging
• Testing the package on a real EHR dataset (e.g. MIMIC-III, https://mimic.physionet.org/)
• Presenting work at group meetings
• Formal presentation at the end of the semester to the BIDS community
• Contributing to a manuscript
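
As a rough illustration of what a validation module for realism and privacy might contain, the following is a minimal sketch and not part of the planned package. It assumes patient records are encoded as binary matrices (rows are patients, columns are codes), and both function names are hypothetical. The realism check compares per-code prevalence between real and synthetic data, and the privacy check is a deliberately crude one that counts synthetic rows that exactly duplicate a real row. The demo data are random draws, not EHR data.

```python
import numpy as np


def dimension_wise_prevalence(real, synthetic):
    """Per-column prevalence in each dataset and their mean absolute gap."""
    real_p = real.mean(axis=0)
    synth_p = synthetic.mean(axis=0)
    return real_p, synth_p, float(np.abs(real_p - synth_p).mean())


def exact_match_rate(real, synthetic):
    """Fraction of synthetic rows that are byte-identical to some real row."""
    real = np.ascontiguousarray(real, dtype=np.uint8)
    synthetic = np.ascontiguousarray(synthetic, dtype=np.uint8)
    real_rows = {row.tobytes() for row in real}
    hits = sum(row.tobytes() in real_rows for row in synthetic)
    return hits / len(synthetic)


# Random stand-in data (assumption): 500 patients, 20 binary codes,
# with "real" and "synthetic" drawn from the same per-code prevalences.
rng = np.random.default_rng(0)
prevalence = rng.uniform(0.05, 0.5, size=20)
real = (rng.random((500, 20)) < prevalence).astype(np.uint8)
synthetic = (rng.random((500, 20)) < prevalence).astype(np.uint8)

_, _, gap = dimension_wise_prevalence(real, synthetic)
print("mean absolute prevalence gap:", gap)
print("exact-match rate:", exact_match_rate(real, synthetic))
```

A small prevalence gap suggests the generator reproduces marginal code frequencies, but it says nothing about correlations between codes. A low exact-match rate is a necessary condition for privacy rather than a sufficient one.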


References
• Choi, E., Biswal, S., Malin, B., Duke, J., Stewart, W. F., Sun, J. (2017). Generating Multi-label Discrete Patient Records using Generative Adversarial Networks. 68(PG-1-20), 1–20.
• Torfi, A., Fox, E. A. (2020, January 25). COR-GAN: Correlation-Capturing Convolutional Neural Networks for Generating Synthetic Healthcare Records. ArXiv.Org.

Day-to-day supervisor for this project: Haley Hunter-Zinck

Qualifications:
(Required):
• Interest in open-source software development, data science, machine learning, and healthcare research
• Great teamwork (e.g. communication skills, punctuality, organization)
• Proficiency in Python
• Majoring in EECS, BioE, CS, data science, math, statistics, or other related discipline
• Working knowledge of Tensorflow/Keras or Pytorch
• Working knowledge of version control (e.g. GitHub)
(Recommended):
• Familiarity with machine learning
• Experience with portable package creation (e.g. Docker)
• CS 188 and/or CS 189

Weekly Hours: 9-11 hrs

Off-Campus Research Site: During the Fall 2020 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: https://bids.berkeley.edu/
Related website: https://innovateforhealth.berkeley.edu/

Closed (3) Anomaly Detection using Deep Learning for Fundamental Physics Discovery

Closed. This professor is continuing with Fall 2020 apprentices on this project; no new apprentices needed for Spring 2021.

This is an exciting time in fundamental physics, with many current or planned experiments producing complex data. There are many experimental and theoretical hints for new phenomena (such as dark matter), but we have not had any significant evidence for new particles or forces of nature since the discovery of the Higgs boson in 2012. This could be because our experiments are not sensitive enough, because the new particles are rare, or because we are not looking in the right place. The goal of this project is to investigate this last possibility. We have developed a variety of deep learning methods to automatically explore the high-dimensional data with as little model bias as possible (“less than supervised”). This project will involve developing, integrating, and/or applying deep learning-based anomaly detection techniques to a variety of physical systems including collider physics (e.g. the Large Hadron Collider) and indirect dark matter detection (e.g. Gaia space observatory).

The exact work will depend on the experience, availability, interest, and progress of the student. Research in this area is at the intersection of theory, experiment, and applied statistics/machine learning. At least 6-8 hours per week are typically needed to make significant progress.

References:
https://indico.cern.ch/event/809820/contributions/3708303/attachments/1971116/3347225/SummaryTalk.pdf
https://lhco2020.github.io/homepage/


Day-to-day supervisor for this project: Benjamin Nachman

Qualifications:
(Required):
• Interest in particle physics and machine learning solutions to physics challenges. Majoring/minoring in Physics or Astronomy would be great, but this is not required; majors in EECS/CS/Data Science/Math/Statistics or related disciplines with significant interest in physics topics would be most welcome.
• Experience with at least one programming language (Java/C++/Matlab/Python/Julia/etc.). The research will mostly be carried out in Python, so Python experience is desired but not required.
• Great teamwork (e.g. communication skills, punctuality, organization).
(Recommended):
• Basic probability and statistics.
• Experience with communication and collaboration tools like Slack and GitHub.
• Experience with deep learning packages like Keras/Tensorflow or PyTorch.

Weekly Hours: to be negotiated

Off-Campus Research Site: During the Fall 2020 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: http://bnachman.web.cern.ch/bnachman/