Optimizing and automating a cell counting image-processing pipeline for histological comparisons in Alzheimer’s Disease
Lea Grinberg, Professor
UC San Francisco
Applications for Fall 2024 are closed for this project.
The importance of histology to neuropathological research cannot be overstated. As a neurology lab, the histological characterization of proteopathies (including tauopathies, Aβ-amyloidosis, synucleinopathies, etc.) is core to our operations. From determining the severity and progression of the pathology to identifying areas of selective vulnerability, immunohistochemistry and microscopy are critical tools for the scientific investigation of neurodegenerative diseases.
Quantification of the images obtained from these techniques, however, is a computationally expensive task. Traditionally, this quantification is done by hand, with a human interpreter selecting neurons/cells of interest and segregating cell or expression types using tools like Fiji/ImageJ. This has several disadvantages: the accuracy of the quantification depends on the experience of the researcher (a considerable issue of control with respect to scientific rigor), and humans are orders of magnitude slower than computers at such repetitive tasks.
Despite the speed of computers, automated segmentation remains a nontrivial problem. Due to a combination of confounding factors, histological sections from human brains are difficult for computers to sort: a simple binary threshold does not typically suffice, and often the morphology of the cell is a necessary consideration. With the advent of computer vision and access to UCSF’s high-performance cluster, we believe that we have the opportunities and resources necessary to greatly accelerate our data pipeline. We are looking for a student to assist in optimizing and automating our cell counting process.
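As a rough illustration of why a bare threshold is not enough (a toy sketch, not our existing model or protocol), the snippet below thresholds a synthetic image and then applies one very simple morphological criterion, object area, to discard specks. The function names, the threshold, and the minimum area are all hypothetical values chosen for the example; real sections need far richer shape and intensity features.

```python
from collections import deque

import numpy as np


def label_components(mask):
    """Label 4-connected foreground regions of a boolean mask (BFS flood fill)."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    rows, cols = mask.shape
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and labels[r, c] == 0:
                current += 1
                labels[r, c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current


def count_cells(image, threshold, min_area):
    """Return (objects found by thresholding alone, objects that also pass the area filter)."""
    mask = image > threshold
    labels, n_objects = label_components(mask)
    # Area (pixel count) of each labeled object; index 0 is background, so drop it.
    areas = np.bincount(labels.ravel(), minlength=n_objects + 1)[1:]
    return n_objects, int(np.sum(areas >= min_area))


# Synthetic image: two cell-sized blobs plus two single-pixel specks of debris.
image = np.zeros((20, 20))
image[2:8, 2:8] = 1.0      # 6 x 6 blob, area 36
image[12:17, 10:16] = 1.0  # 5 x 6 blob, area 30
image[0, 19] = 1.0         # speck
image[19, 0] = 1.0         # speck

raw_count, filtered_count = count_cells(image, threshold=0.5, min_area=10)
print(raw_count, filtered_count)
```

On this toy image, thresholding alone reports four objects, while the area filter keeps only the two blobs. Telling neurons from glia requires considerably more than an area cutoff, which is where learned models come in.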
The specific aims are:
1. Image Registration: A consequence of our novel multiplex histology process is that slides scanned in different rounds sometimes need to be aligned. There is an existing protocol for this, but it is slow and unoptimized; we would like to port it to a GPU-accelerated technique using either Vulkan or CUDA and run it on the UCSF High-Performance Cluster.
2. Neuron Segmentation: Use computer-vision techniques to create a binary mask separating neurons from glia and background. There is also an existing model and protocol for this, but it has inefficiencies that we would like to address.
3. Better integrate the various scripts used to set up files and prepare for automated cell counting. We currently use a whole host of different scripts to process each tissue case, but a single interface or Jupyter notebook that calls on existing libraries to unify the process and file hierarchies would be ideal to increase simplicity, reduce the chance of error, and lower the barrier to entry for less technically fluent users.
4. Apply the new-and-improved cell-counting protocol to a multiplex histology project.
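To give a flavor of the registration aim above, here is a minimal sketch of phase correlation, one standard FFT-based way to estimate the translation between two scans. This is an assumption-laden illustration, not our existing protocol: it recovers only whole-pixel shifts with wrap-around, whereas real slides may also need rotation, scaling, or non-rigid alignment. The function name `estimate_shift` is hypothetical. Because the work is dominated by FFTs, this style of method is a natural candidate for GPU acceleration.

```python
import numpy as np


def estimate_shift(fixed, moving):
    """Estimate the integer (row, col) shift that aligns `moving` to `fixed`.

    Applying np.roll(moving, shift, axis=(0, 1)) should reproduce `fixed`
    when the two images differ only by a circular translation.
    """
    fixed_ft = np.fft.fft2(fixed)
    moving_ft = np.fft.fft2(moving)
    # Normalized cross-power spectrum: keeps phase, discards magnitude.
    cross_power = fixed_ft * np.conj(moving_ft)
    cross_power /= np.abs(cross_power) + 1e-12
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, correlation.shape)
    )


# Synthetic test: shift a random "slide" and recover the correction.
rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = np.roll(fixed, (3, -5), axis=(0, 1))

shift = estimate_shift(fixed, moving)
aligned = np.roll(moving, shift, axis=(0, 1))
print(shift, np.allclose(aligned, fixed))
```

The returned shift is the correction to apply to the moving image, so it is the negative of the displacement that was introduced.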
Role: The student will assist a research associate with optimizing the data pipelines and writing new tools and scripts. For significant tools and optimizations, students will be fairly credited on posters and academic papers. The goal for this project is a methods paper. In addition, there is a possibility for students to later take on their own projects, provided there is enough interest and enthusiasm for the work.
Qualifications:
1. Coding Experience (Python, ML, Computer Vision, Data Science)
a. Frankly, we are looking for those with more of a computer science/data science background for this position. The primary language of this position is Python, and we will happily teach you the necessary neuroanatomy if you do not have that background!
b. The ability to write robust, modular, and well-documented code that is extensible, testable, maintainable, etc.
c. I would recommend that you be fluent in Python (at least comfortable with CS 61A, CS 61B, Data 100 type of material); ideally, you will have some experience with or exposure to TensorFlow/PyTorch, or some other kind of computer vision experience. Pandas fluency will also make your life much easier, as we deal with lots of tabular data.
2. ~10 hours/week commitment. The work can be remote or mostly remote, but I would encourage coming to the lab at least once a week to touch base, review code, discuss with group members, and integrate into our lab :)
3. Detail-oriented, organized, quick learner, and good with groups
a. You will be working with other lab members and also parsing through existing code, so the ability to work with others is a must.
b. Code must be properly commented and documented so that it remains useful in the future.
4. Bonuses:
a. Familiarity with image technicalities and properties (if you’ve done any kind of imaging or photography work, that will serve you well).
b. If you’ve done work or projects in computer vision before, that will be a great asset.
c. Passion for neurodegenerative disease research: we are a lab and our goal is to further the knowledge and understanding of how these diseases come to be and progress
Day-to-day supervisor for this project: Ian Oh, Post-Doc
Hours: 9-11 hrs
Off-Campus Research Site: The work takes place at the Grinberg Lab, located at the UCSF Mission Bay campus. After the initial training, part of the work may take place at the facilities of the Berkeley Institute for Data Science on the UC Berkeley campus.
Related website: http://grinberglab.ucsf.edu
Biological & Health Sciences