Priya Moorjani, Professor

Open (1) Using machine learning for chromosome ancestry painting in human populations

Open. Apprentices needed for the fall semester. Enter your application on the web beginning August 15th. The deadline to apply is Monday, August 27th at 9 AM.

As sequencing costs plummet, the number of sequenced human genomes is increasing exponentially. A critical challenge is to develop well-powered methods that can be applied to genetic data from heterogeneous populations, including admixed populations (those with ancestry from multiple ancestral groups). The genome of an admixed individual is a mosaic of chromosomal segments inherited from distinct ancestral populations. Knowing the ancestry of each segment has a wide range of applications, from disease mapping to inferring population history. This ancestry is not directly observed, however, and must be inferred from the data. To model admixture in a more flexible framework, we propose to apply machine learning methods such as Hidden Markov Models, Conditional Random Fields, and Support Vector Machines that can robustly classify ancestry at each locus in the genome. The genomic ancestry information provided by our method can then be leveraged to identify disease associations or targets of selection.
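As a rough illustration of the Hidden Markov Model idea described above (a minimal sketch, not the lab's method), the Python example below labels each locus of a single haploid chromosome with its most likely ancestry using the Viterbi algorithm. The function name `viterbi_ancestry`, the constant per-locus `switch_prob`, and the toy allele frequencies are all assumptions made for illustration; a real method would model recombination distance, diploid genotypes, and phasing error.

```python
import math


def viterbi_ancestry(haplotype, freqs, switch_prob=0.01):
    """Most likely ancestry label at each locus of one haploid chromosome.

    haplotype:   list of 0/1 alleles, one per locus
    freqs:       dict mapping ancestry label -> list of allele-1 frequencies
    switch_prob: assumed constant probability of an ancestry switch between
                 adjacent loci (hypothetical simplification)
    """
    states = list(freqs)
    if len(states) < 2:
        raise ValueError("need at least two ancestral populations")

    def emit(state, i):
        # Log-probability of the observed allele; clamp to avoid log(0).
        p = min(max(freqs[state][i], 1e-6), 1 - 1e-6)
        return math.log(p if haplotype[i] == 1 else 1 - p)

    def trans(a, b):
        # Stay with probability 1 - switch_prob, else switch uniformly.
        if a == b:
            return math.log(1 - switch_prob)
        return math.log(switch_prob / (len(states) - 1))

    # Uniform prior over ancestries at the first locus.
    score = {s: -math.log(len(states)) + emit(s, 0) for s in states}
    backpointers = []
    for i in range(1, len(haplotype)):
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: score[r] + trans(r, s))
            new_score[s] = score[prev] + trans(prev, s) + emit(s, i)
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy data: allele 1 is common in population "A" and rare in population "B".
toy_freqs = {"A": [0.9] * 10, "B": [0.1] * 10}
toy_haplotype = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
painting = viterbi_ancestry(toy_haplotype, toy_freqs)
```

With these toy inputs, the inferred painting places a single ancestry switch in the middle of the chromosome, because one switch is cheaper than explaining five mismatched alleles in a row.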

The undergraduate will take on one of two roles: 1) develop and extend existing methods available in the lab, or 2) perform simulations to test performance (accuracy and runtime) under various demographic parameters. The student will learn about population genetics and machine learning, and will contribute to research publications associated with this work.

Qualifications: Proficiency in Python or C++ (required); prior experience in genomic data analysis (desirable); knowledge of statistics and population genetics theory (desirable); familiarity with machine learning (desirable). We prefer to recruit sophomores or juniors, with the expectation that they will work towards an honors thesis in their senior year.

Open (2) Evolution of mutation rate across primates

Open. Apprentices needed for the fall semester. Enter your application on the web beginning August 15th. The deadline to apply is Monday, August 27th at 9 AM.

Germline mutations are the ultimate source of genetic differences among individuals and across species; they provide the raw material for selection to act on and play a role in many diseases. Because mutations accumulate steadily over time, they provide a record of the time elapsed and hence a “molecular clock” for dating evolutionary events. However, despite strong constraints on the replication machinery, recent studies have shown that both the mutation rate and the mutation spectrum evolve rapidly across closely related species and also vary among humans. Thus, to investigate the causes of interspecies variation in mutation rate and to build robust models of evolution, we are interested in obtaining direct, pedigree-based estimates of mutation rates in humans and other primates. This will allow us to learn about the determinants of mutation rate and the mechanisms impacting its evolution across species.

The undergraduate will take responsibility for: 1) applying standard pipelines for sequence alignment and mapping to identify de novo mutations in pedigrees, and 2) comparing variation in mutation rates across species. The student will learn about cutting-edge methods for mapping and alignment of human sequence data, and will contribute to research publications associated with this work.
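To give a sense of the core logic behind identifying de novo mutations in a pedigree, the Python sketch below flags sites where a child is heterozygous while both parents are homozygous for the reference allele, then converts a count of such sites into a per-base-pair, per-generation rate. The function names, the tuple layout, and the toy numbers are hypothetical and chosen only for illustration; real pipelines also filter on read depth, genotype quality, and allele balance, and correct for false negatives.

```python
def find_de_novo_candidates(trio_genotypes):
    """Return sites that look like de novo mutations in a parent-child trio.

    trio_genotypes: iterable of (site, child, mother, father), where each
    genotype is the number of alternate alleles carried (0, 1, or 2).
    """
    return [
        site
        for site, child, mother, father in trio_genotypes
        # Child carries one alternate allele that neither parent carries.
        if child == 1 and mother == 0 and father == 0
    ]


def mutation_rate_per_bp_per_generation(n_de_novo, callable_bp):
    """Estimate the mutation rate from a de novo count in one child.

    The factor of 2 reflects the two haploid genome copies (one from each
    parent) in which a new mutation could have arisen.
    """
    return n_de_novo / (2 * callable_bp)


# Toy trio: only the site at position 200 fits the de novo pattern.
toy_trio = [
    ("chr1:100", 1, 1, 0),  # inherited from the mother
    ("chr1:200", 1, 0, 0),  # candidate de novo mutation
    ("chr1:300", 0, 0, 0),  # no variant
    ("chr1:400", 2, 1, 1),  # inherited from both parents
]
candidates = find_de_novo_candidates(toy_trio)

# Hypothetical counts chosen for round numbers, not real data.
toy_rate = mutation_rate_per_bp_per_generation(60, 3_000_000_000)
```

Comparing such per-generation estimates across species is one simple way to frame the second task, provided the callable genome size is computed consistently for each pedigree.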

Qualifications: Proficiency in Python or C++ (required); prior experience in genomic data analysis (desirable); knowledge of statistics and population genetics theory (desirable); familiarity with machine learning (desirable). We prefer to recruit sophomores or juniors, with the expectation that they will work towards an honors thesis in their senior year.

Weekly Hours: more than 12 hrs

Related website: https://moorjanilab.org/