Maryam Vareth, Innovate for Health Co-Director

Closed (1) Medical Imaging Research: k-space MRI Reconstruction Using Deep Learning

Applications for fall 2021 are now closed for this project.

One of the fastest growing fields of research in medical imaging during the last several years is the use of machine learning methods for image reconstruction.

This project aims to use deep learning approaches in image reconstruction to accelerate Magnetic Resonance Imaging (MRI) acquisition and, as a result, reduce MRI examination times for patients. Two of the most influential developments in this area during the last two decades have been parallel imaging and compressed sensing. Both of these rapid imaging techniques reduce the number of lines acquired in k-space, which shortens the scan time, and then exploit the redundancy in the measured data during image reconstruction. In parallel imaging, the redundancy arises from the simultaneous acquisition of the MR signal with multiple receive coils; in compressed sensing, it derives from the observation that images are generally compressible. Machine learning approaches have adopted similar strategies for the acceleration of MRI, which set the main design criteria for this project. The majority of existing work has focused on designing better reconstruction models given a pre-determined acquisition trajectory, ignoring the question of trajectory optimization. In this project, we also focus on learning acquisition trajectories given a fixed image reconstruction model.
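To make the idea of acquiring fewer k-space lines concrete, the following is a minimal sketch (not the project's method) of retrospective Cartesian undersampling followed by a zero-filled inverse FFT, the naive baseline that parallel imaging, compressed sensing, and learned reconstructions aim to improve on. The function names, the random line-selection scheme, the `acceleration` and `num_center_lines` parameters, and the toy square phantom are all assumptions made for illustration; real scanner data is multi-coil and complex-valued and would need its own loading code.

```python
import numpy as np


def undersample_kspace(image, acceleration=4, num_center_lines=8, seed=0):
    """Retrospectively drop k-space lines (rows) of a 2D image.

    Hypothetical helper: keeps each line with probability 1/acceleration
    and always keeps a small block of low-frequency center lines.
    """
    rng = np.random.default_rng(seed)
    # Centered 2D FFT: image domain -> k-space
    kspace = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
    num_lines = kspace.shape[0]
    # Random line selection (an assumed sampling pattern, not a learned one)
    mask = rng.random(num_lines) < 1.0 / acceleration
    # Always keep the center of k-space, which carries most image energy
    center = num_lines // 2
    half = num_center_lines // 2
    mask[center - half:center + half] = True
    # Zero out the lines that were "not acquired"
    return kspace * mask[:, None], mask


def zero_filled_reconstruction(kspace):
    """Inverse FFT with missing lines left at zero; returns the magnitude image."""
    return np.abs(np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace))))


# Toy phantom: a bright square on a dark background
image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0

kspace_under, mask = undersample_kspace(image, acceleration=4)
recon = zero_filled_reconstruction(kspace_under)
print(f"kept {mask.sum()} of {mask.size} k-space lines")
```

The zero-filled image shows aliasing artifacts along the undersampled axis. A reconstruction model tries to remove them, while a learned acquisition trajectory would replace the random `mask` with one chosen to make that job easier.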

To make the image reconstruction problem realistic, we will use a large-scale database of raw (complex-valued) k-space data obtained directly from MRI scanners (https://fastmri.med.nyu.edu/).

The specific selection of tasks will depend on the skill sets and interest of the students and could include developing, implementing, refining, and testing algorithms and workflows to achieve the specific goals of this project. The students will work in teams and closely with graduate students and post-docs.

Possible Tasks:
• Re-implementation and/or standardization of existing approaches
• Writing modules for validation
• Generic testing and debugging
• Presenting work at group meetings
• Formal presentation at the end of the semester to the BIDS community
• Upon successful progress, contribute to a manuscript

References
• fastMRI: An Open Dataset and Benchmarks for Accelerated MRI (https://arxiv.org/pdf/1811.08839.pdf)
• https://github.com/facebookresearch/fastMRI
• https://github.com/mrirecon/bart


Qualifications: Students from various majors are encouraged to apply, including but not limited to EECS, BioE, CS, and Data Science. We are looking for 2 highly inquisitive students who have:
(Required)
• Interest in open source software development, data science, medical imaging, machine learning, engineering, and healthcare research
• Great teamwork (e.g. communication skills, punctuality, organization)
• Proficiency in programming languages (Python and/or MATLAB)
• Working knowledge of TensorFlow/Keras or PyTorch
• Working knowledge of version control (e.g. GitHub)
• Familiarity with Linux/Unix environment
(Recommended)
• Working knowledge of basic machine learning and deep learning (cost function, cross-validation, overfitting, error analysis, etc.)
• Working knowledge of signal processing and image processing
• CS 188 and/or CS 189
• EE 120 and/or EE 145B

Weekly Hours: 9-11 hrs

Off-Campus Research Site: During the Fall 2021 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: https://bids.berkeley.edu/
Related website: https://innovateforhealth.berkeley.edu/

Closed (2) Developing an open-source software package for generating synthetic electronic health record data

Closed. This professor is continuing with Spring 2021 apprentices on this project; no new apprentices needed for Fall 2021.

Research access to electronic health record (EHR) data is limited due to patient privacy concerns. Creating synthetic EHR data (data that models realistic patterns and yet does not correspond to real patient records) provides a potential mechanism to expand data access.

Although the academic and commercial sectors have developed successful methodologies for generating realistic synthetic EHR data, these methodologies are not in common use despite a great need. Commercial products are closed source and expensive. Academic solutions are sometimes open source but often buggy, unportable, and difficult to use, especially for clinical users. Furthermore, different generators are implemented in different languages and packages, making direct comparison and benchmarking laborious. Finally, validation of the realism and privacy-preserving properties of generated synthetic datasets is often not incorporated into the generation pipeline.

This project aims to create a portable, usable, consolidated, and open-source software package for generating synthetic EHR data. The student will gain knowledge of open-source software, generative unsupervised machine learning techniques such as generative adversarial networks, and user-interface design, in addition to gaining exposure to EHR data and healthcare data science. The student will be expected to present their work at the end of the semester and may have the opportunity to contribute to a manuscript describing the developed software package.

The specific selection of tasks will depend on the skill sets and interest of the student and could include the following:

• Re-implementation and/or standardization of existing synthetic EHR generators, for example:
• MedGAN (Choi et al. 2017, https://github.com/mp2893/medgan)
• CorGAN (Torfi et al. 2020, https://github.com/astorfi/cor-gan)
• Writing modules for validation of synthetic data realism and privacy-preserving properties
• Developing a command line interface or dashboard for users to interact with the package
• Generic testing and debugging
• Testing the package on a real EHR dataset (e.g. MIMIC-III, https://mimic.physionet.org/)
• Presenting work at group meetings
• Formal presentation at the end of the semester to the BIDS community
• Contribute to writing a manuscript
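
As an illustration of what a realism-validation module from the task list above might start with, here is a minimal sketch of a dimension-wise probability check: for binary patient-by-code matrices, compare how often each code occurs in the real data and in the synthetic data. The function names, the summary statistics, and the toy Bernoulli data are assumptions for illustration only; the second random sample merely stands in for the output of a generator, and no real EHR data or generator is involved.

```python
import numpy as np


def dimension_wise_probability(real, synthetic):
    """Per-column occurrence rate for two binary (patients x codes) matrices."""
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    if real.shape[1] != synthetic.shape[1]:
        raise ValueError("real and synthetic data must have the same columns")
    return real.mean(axis=0), synthetic.mean(axis=0)


def realism_summary(real, synthetic):
    """Hypothetical summary: mean absolute gap and Pearson correlation of rates."""
    p_real, p_syn = dimension_wise_probability(real, synthetic)
    return {
        "mean_abs_error": float(np.mean(np.abs(p_real - p_syn))),
        "pearson_r": float(np.corrcoef(p_real, p_syn)[0, 1]),
    }


# Toy data: four made-up codes with assumed prevalences
rng = np.random.default_rng(0)
prevalence = np.array([0.30, 0.10, 0.05, 0.50])
real = rng.random((1000, 4)) < prevalence
synthetic = rng.random((1000, 4)) < prevalence  # stand-in for generator output

print(realism_summary(real, synthetic))
```

A check like this only covers marginal frequencies. Correlations between codes and privacy properties such as membership disclosure would need separate tests.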


References
• Choi, E., Biswal, S., Malin, B., Duke, J., Stewart, W. F., Sun, J. (2017). Generating Multi-label Discrete Patient Records using Generative Adversarial Networks. Proceedings of Machine Learning for Healthcare 2017, PMLR 68.
• Torfi, A., Fox, E. A. (2020, January 25). COR-GAN: Correlation-Capturing Convolutional Neural Networks for Generating Synthetic Healthcare Records. ArXiv.Org.

Qualifications:
(Required):
• Interest in open-source software development, data science, machine learning, and healthcare research
• Great teamwork (e.g. communication skills, punctuality, organization)
• Proficiency in Python
• Majoring in EECS, BioE, CS, data science, math, statistics, or other related discipline
• Working knowledge of TensorFlow/Keras or PyTorch
• Working knowledge of version control (e.g. GitHub)
(Recommended):
• Familiarity with machine learning
• Experience with portable package creation (e.g. Docker)
• CS 188 and/or CS 189

Weekly Hours: 9-11 hrs

Off-Campus Research Site: During the Fall 2020 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: https://bids.berkeley.edu/
Related website: https://innovateforhealth.berkeley.edu/

Closed (3) Anomaly Detection using Deep Learning for Fundamental Physics Discovery

Closed. This professor is continuing with Spring 2021 apprentices on this project; no new apprentices needed for Fall 2021.

This is an exciting time in fundamental physics, with many current or planned experiments producing complex data. There are many experimental and theoretical hints for new phenomena (such as dark matter), but we do not yet have any significant evidence for new particles or forces of nature since the discovery of the Higgs boson in 2012. This could be because our experiments are not sensitive enough, because the new particles are rare, or because we are not looking in the right place. The goal of this project is to investigate this last possibility. We have developed a variety of deep learning methods to automatically explore the high-dimensional data with as little model bias as possible (“less than supervised”). This project will involve developing, integrating, and/or deploying deep learning-based anomaly detection techniques for a variety of physical systems, including collider physics (e.g. the Large Hadron Collider) and indirect dark matter detection (e.g. Gaia space observatory).
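
As a rough illustration of anomaly detection with little model bias, one common family of approaches scores each event by how poorly a model trained on typical data reconstructs it. The sketch below is an assumption-laden stand-in, not the group's method: it replaces a deep autoencoder with a linear one (PCA via SVD) so that it runs with NumPy alone, and the function names, feature dimensions, and toy "background" and "anomaly" samples are all hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(background, latent_dim):
    """Fit a PCA 'autoencoder' on background events (rows are events)."""
    mean = background.mean(axis=0)
    # Rows of vt are the principal directions of the centered data
    _, _, vt = np.linalg.svd(background - mean, full_matrices=False)
    return mean, vt[:latent_dim]


def anomaly_score(events, mean, components):
    """Squared reconstruction error after projecting onto the latent subspace."""
    centered = events - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)


# Toy data: background lives near a 2D subspace of a 5D feature space
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 5))
background = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 5))
anomalies = 3.0 * rng.normal(size=(20, 5))  # events that ignore that structure

mean, components = fit_linear_autoencoder(background, latent_dim=2)
background_scores = anomaly_score(background, mean, components)
anomaly_scores = anomaly_score(anomalies, mean, components)
print(f"mean background score: {background_scores.mean():.4f}")
print(f"mean anomaly score:    {anomaly_scores.mean():.4f}")
```

Events with large scores are candidates for closer inspection. In practice the linear map would be replaced by a neural network, and the choice of features and threshold would depend on the physical system.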

The exact work will depend on the experience, availability, interest, and progress of the student. Research in this area is at the intersection of theory, experiment, and applied statistics/machine learning. At least 6-8 hours per week are typically needed to make significant progress.

References:
https://indico.cern.ch/event/809820/contributions/3708303/attachments/1971116/3347225/SummaryTalk.pdf
https://lhco2020.github.io/homepage/


Qualifications:
(Required):
• Interest in particle physics and machine learning solutions to physics challenges. Majoring/minoring in Physics or Astronomy would be great, but this is not required; majors in EECS/CS/Data Science/Math/Statistics or related disciplines with significant interest in physics topics would be most welcome.
• Experience with at least one programming language (Java/C++/MATLAB/Python/Julia/etc.). The research will mostly be carried out in Python, so Python experience is desired but not required.
• Great teamwork (e.g. communication skills, punctuality, organization).
(Recommended):
• Basic probability and statistics.
• Experience with communication and collaboration tools like Slack and GitHub.
• Experience with deep learning packages like Keras/TensorFlow or PyTorch.

Weekly Hours: to be negotiated

Off-Campus Research Site: During the Fall 2020 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: http://bnachman.web.cern.ch/bnachman/

Closed (4) Developing a dashboard for primary care clinicians to support their situational awareness

Applications for fall 2021 are now closed for this project.

Situational awareness for primary care physicians plays an important role in providing proactive care for patients with complex health conditions. Many factors may impede physicians' situational awareness, particularly the need to aggregate scattered data from a patient's electronic medical records. A dashboard that automatically pulls high-priority information from a patient's medical data and presents it so that clinicians can understand the patient's overall health status in less time would therefore improve the quality of care in primary care offices.

So far, several dashboards have been developed for single specialties, but our goal in this project is to address a patient's overall health status as a whole person rather than each disease in isolation.


This project aims to create a portable, usable, consolidated, and open-source software platform for generating a dashboard to present personalized high-priority patient data. The student will gain knowledge of open-source visualization platforms, visualization techniques, user-interface design, EHR data analysis, and healthcare data analytics, in addition to gaining exposure to product management skills. The student will be expected to present their work at the end of the semester and may have the opportunity to contribute to a manuscript describing the developed dashboard.


The specific selection of tasks will depend on the skill sets and interest of the student and could include the following:

• Building data visualization platforms on the web using D3.js
• Storytelling through data visualization: building an open-source, expressive visual storytelling environment for presenting timelines
• Qualitative analysis of interview data and statistical analysis of survey data
• Evaluation of user experience, cognitive load, and performance of the product for primary care clinicians
• Entrepreneurial tasks including communication with all stakeholders to provide the proper level of cooperation and developing pitches for various audiences and goals
• Presenting work at group meetings
• Formal presentation at the end of the semester to the BIDS community
• Contribute to writing a manuscript

References
● Timeline storyteller: https://timelinestoryteller.com
● Timelines Revisited: A Design Space and Considerations for Expressive Storytelling: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/10/Brehmer-TVCG-2017.pdf
● Visualization dashboards for EHR data improve situational awareness, decrease https://www.aiin.healthcare/topics/business-intelligence/visualization-dashboards-ehr-data-improve-situational-awareness



Day-to-day supervisors for this project: Drs. Maryam Vareth and Akram Bayat

Qualifications:
(Required)
● Interest in data science, clinical data visualization, and product management
● Great teamwork (e.g., communication skills, punctuality, organization)
● Proficiency in Python for programming tasks
(Recommended)
● Familiarity with data analytics for analysis tasks
● Entrepreneurship experience

Weekly Hours: 12 or more hours

Off-Campus Research Site: During the Fall 2021 semester, we will only meet and communicate via Zoom, Slack, and email.

Related website: https://bids.berkeley.edu/
Related website: https://innovateforhealth.berkeley.edu/