  • UC Berkeley
  • College of Letters & Science
Berkeley University of California

URAP

Project Descriptions
Fall 2025


Website integration of DNA Sequencing Facility sample submission, data handling, and pipeline development / optimization.

Scott Geller, Research Scientist  
Molecular and Cell Biology  

Applications for Fall 2025 are closed for this project.

We are a campus research unit located in Barker Hall at the northwest corner of the UC Berkeley campus. We primarily support on-campus molecular scientists and related professionals (graduate students, postdoctoral fellows, staff, etc.) with their DNA sequencing and analysis needs. As DNA sequencing technologies advance, so do the associated computational needs for data handling, manipulation, storage, analysis, presentation, and communication. We are looking to meet the demands of larger datasets by expanding our computational capabilities in support of our diverse customer base. Computational work and systems organization are necessary for proper processing of customer data, and we intend to work on these needs, but we also need to further develop our outward-facing website to accommodate both sample submission and customer data retrieval.

Role: The initial project involves improving and streamlining our Oxford Nanopore sequencing pipeline, which relies on Python scripting and data processing/manipulation on Savio, the Berkeley Research Computing high-performance computing cluster. Our initial deliverable is to expedite the computational processing and quality control of our Nanopore sequencing data (we sequence multiple samples every day, each with thousands or tens of thousands of reads). At present the process is fairly manual and can be time-intensive; our hope is to develop computational approaches that enable more automated handling, processing, and quality control assessment of the sequencing reads and alignments. This part of the project requires fluency in Python and comfort with the command line in order to test and validate data manipulation strategies on Savio.
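
As a rough illustration of the kind of automated read quality control described above, here is a minimal Python sketch. It is not the facility's actual pipeline. It assumes reads arrive as standard four-line FASTQ records with Phred+33 quality encoding; the function names (`read_fastq`, `n50`, `summarize`) and the thresholds (`min_length`, `min_q`) are hypothetical choices made for illustration only.

```python
import io
import math
from typing import Dict, Iterator, List, NamedTuple, TextIO


class ReadStats(NamedTuple):
    name: str
    length: int
    mean_q: float


def read_fastq(handle: TextIO) -> Iterator[ReadStats]:
    """Yield per-read stats from 4-line FASTQ records (Phred+33 assumed)."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input (or a blank line)
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        # Average in error-probability space, then convert back to Phred.
        probs = [10 ** (-(ord(c) - 33) / 10) for c in qual]
        mean_q = -10 * math.log10(sum(probs) / len(probs)) if probs else 0.0
        yield ReadStats(header[1:], len(seq), mean_q)


def n50(lengths: List[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(handle: TextIO, min_length: int = 200, min_q: float = 10.0) -> Dict[str, int]:
    """Count reads, reads passing both (hypothetical) thresholds, and N50."""
    stats = list(read_fastq(handle))
    passed = [s for s in stats if s.length >= min_length and s.mean_q >= min_q]
    return {
        "reads": len(stats),
        "passed": len(passed),
        "n50": n50([s.length for s in stats]),
    }


if __name__ == "__main__":
    demo = "@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\n!!!!\n"
    print(summarize(io.StringIO(demo), min_length=5, min_q=10.0))
```

In practice a script along these lines could be run once per sample on the cluster, with the summaries collected into a single report. Suitable thresholds would depend on the library type and would need to be validated against real data.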

In addition, we are looking for students with an interest in HTML markup, web design, Drupal, WordPress, CSS, JavaScript, etc. to help modernize our website. Ideally, we would like to lay the groundwork for sample submissions through our website, essentially moving to a completely digital interface for the DNA sequencing customer (with the exception of needing to send the physical sample to the lab).

Qualifications: Experience with Python, Savio, the command line, or bioinformatics would be a plus. Web page coding/markup using WordPress, Drupal, CSS, or JavaScript would also be useful. Database configuration and integration is a plus.

Day-to-day supervisor for this project: Scott Geller, Staff Researcher

Hours: 6-8 hrs

Related website: https://mcb.berkeley.edu/barker/dnaseq/home
Related website: https://ucberkeleydnasequencing.com/

 Biological & Health Sciences   Engineering, Design & Technologies


Office of Undergraduate Interdisciplinary Studies, Undergraduate Division
College of Letters & Science, University of California, Berkeley