Heather Haveman, Professor

Closed (1) Oski Lab - Web-Scraping | Text Parsing | Machine Learning | Databases with Start-Ups and Entrepreneurs

Closed. This professor is continuing with Spring 2017 apprentices on this project; no new apprentices needed for Fall 2017.

For this project, URAP apprentices will develop code to collect and clean data for a variety of research projects. Apprentices will apply their programming skills to scrape product data from publicly available websites and to turn messy, unstructured data sets into shiny, clean data sets available for reproducible research.

Participants will scrape data on a number of markets and phenomena. They will also develop code to analyze the discourse surrounding these markets as found in electronic forums visited by market participants and articles published in the mainstream and specialized press. Students will work with machine learning packages for text analysis to help analyze the millions of observations collected via our web scrapers.

We are collaborating with a Berkeley business incubator that will have 10 start-ups residing just off campus this fall. The most dedicated and productive apprentices will have the opportunity to work with start-up founders on real world problems.

You can find more about our research projects here: http://www.oskilab.com.

Cyrus Dioun will be the day-to-day contact to field questions, troubleshoot problems, and address everyday issues.

Undergraduates who are proficient in programming languages and statistics will help us collect, clean, and analyze large data sets.

Day-to-day supervisor for this project: Cyrus Dioun

Qualifications: Advanced coding skills. Proficiency with machine learning and distributed computing.

Weekly Hours: to be negotiated

Related website: http://www.oskilab.com
Related website: http://bids.berkeley.edu/people/cyrus-dioun

Closed (2) Computer Vision: Classifying Photographs Using Deep Learning

Closed. This professor is continuing with Spring 2017 apprentices on this project; no new apprentices needed for Fall 2017.

We are looking for a few talented, irreverent advanced computer scientists who enjoy challenges, puzzles, and problem solving. We have collected hundreds of thousands of photographs with associated labels and want to use packages such as Caffe to train an algorithm to accurately classify them.

Last semester our team worked on normalizing the color and clarity of the photos and quantifying the color composition. This semester we will work on recognizing shapes and finding a way to parallelize and speed up this data-intensive task.
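For applicants wondering what "quantifying the color composition" might look like in practice, here is a minimal sketch in Python. It is not the team's actual method: the function name `color_composition`, the six hue bins, and the saturation and brightness thresholds are all assumptions chosen for illustration, and pixels are assumed to be already loaded as (R, G, B) tuples in the 0-255 range.

```python
import colorsys
from collections import Counter

# Hypothetical hue bins; the real project may use a different color model.
HUE_BINS = ["red", "yellow", "green", "cyan", "blue", "magenta"]


def color_composition(pixels, min_saturation=0.2, min_value=0.2):
    """Return the fraction of pixels falling in each coarse color bin.

    pixels: iterable of (r, g, b) tuples with values in 0-255 (assumed input).
    Pixels that are too gray or too dark are counted as "neutral".
    """
    pixels = list(pixels)
    counts = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation or v < min_value:
            counts["neutral"] += 1
        else:
            # Shift by half a bin so hues near 0 and near 1 both count as red.
            index = min(int(((h + 1.0 / 12.0) % 1.0) * 6), 5)
            counts[HUE_BINS[index]] += 1
    total = len(pixels)
    return {name: n / total for name, n in counts.items()} if total else {}


sample_pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (128, 128, 128)]
composition = color_composition(sample_pixels)
```

Because each photo is summarized independently, a function like this could be mapped over many images in parallel, which is one reason a per-image summary is a convenient starting point for a data-intensive task.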

This project is funded by an Amazon Web Services Grant and will be run on EC2 instances.


Apprentices will meet bi-weekly and will write code and implement packages to analyze and classify photographs.

Day-to-day supervisor for this project: Cyrus Dioun, Ph.D. candidate

Qualifications: Advanced coding abilities in Python and Matlab. Courage. Persistence in the face of seemingly insurmountable odds.

Weekly Hours: to be negotiated

Closed (3) Chefs and Cooks: Exploring Race and Gender in Fine Dining and Food Writing

Closed. This professor is continuing with Spring 2017 apprentices on this project; no new apprentices needed for Fall 2017.

Who is a chef? What does a chef look like? How are the chef and the chef identity related to the food chefs prepare and/or the value of that food? How is the image or narrative of the chef shaped by or influenced by gender and race? Much of the sociological research about food focuses on either the processes of agro-ecology and the production of ingredients (farming, agriculture, the flawed food production system) or the process of eating (who eats what, what does this say about individuals’ or groups’ position in the social hierarchy, etc.). This project examines the neglected realm of professional cooking with an eye towards the gender and race dynamics of the professional fine dining kitchen. By collecting and analyzing data from the top food industry magazines and publications, we will examine some of the complexities and major trends surrounding the cultural meaning of food and fine food cooking in contemporary New York City and San Francisco.

Students' primary responsibility is data collection. Apprentices will scan, read, and code magazine articles from the leading fine food magazines in the United States, learning the data collection and analysis process through social scientific content analysis and text analysis.

Day-to-day supervisor for this project: Gillian Gualtieri, Ph.D. candidate

Qualifications: Attention to detail; timeliness; interest in the topic; restaurant industry experience (not required, but helpful). Apprentices must be able to travel regularly and independently to the San Francisco Public Library in downtown San Francisco (via BART or other means).

Weekly Hours: 6-9 hrs

Off-Campus Research Site: 100 Larkin St San Francisco, CA 94102

Open (4) Charter Schools and the Business Age: Web-Scraping and Text Analysis

Open. Apprentices needed for the fall semester. Please do NOT contact faculty before September 11th (the start of the 4th week of classes)! Enter your application on the web beginning August 16th. The deadline to apply is Tuesday, August 29th at 8 AM.

If you’re interested in how today’s business age structures organizations and their messages to stakeholders, want to contribute to a team data collection & analysis effort focused on innovation in education, and are willing to challenge yourself through hands-on learning, then this is the project for you!

Here’s our focus: How does the push to run schools like businesses—complete with performance targets, incentives, and top-down oversight—shape the growing charter school sector, which is politically justified as an innovation incubator? Which charters survive and thrive in this political climate—those that stress standards-based rigor and college-readiness, or those that prioritize independent thinking and socio-emotional development?

To answer these questions, our team will extract and analyze mission statements (MSs) from the websites of every U.S. charter school open today. I am looking for outside-the-box, independent thinkers/tinkerers with significant computer science (CS) and/or coding skills to collaborate on web-scraping charters’ sites, cleaning HTML, performing text analysis and machine learning (natural language processing, dictionary methods, and topic models), managing databases, and running statistical regressions.
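To give prospective apprentices a feel for two of the steps named above, cleaning HTML and applying a dictionary method, here is a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not the project's actual pipeline: the class `TextExtractor`, the functions `clean_html` and `dictionary_counts`, the term lists in `DICTIONARIES`, and the sample page are all hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_html(html):
    """Strip tags from an HTML string and return the visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical term lists; a real dictionary would be built and validated
# by the research team.
DICTIONARIES = {
    "standards": {"rigor", "rigorous", "standards", "college", "achievement"},
    "independence": {"independent", "creative", "curiosity", "emotional", "inquiry"},
}


def dictionary_counts(text):
    """Count how many tokens in the text match each theme's term list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        for theme, terms in DICTIONARIES.items():
            if token in terms:
                counts[theme] += 1
    return dict(counts)


# Made-up page standing in for a scraped charter school site.
sample_page = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Our Mission</h1>"
    "<p>We prepare students for college through rigorous standards "
    "and independent inquiry.</p></body></html>"
)
mission_text = clean_html(sample_page)
theme_counts = dictionary_counts(mission_text)
```

Counts like these could then be normalized by statement length and used as variables in a regression, though the choice of terms and themes is the substantive research decision and is not settled by the code.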


Day-to-day supervisor for this project: Jaren Haber, Ph.D. candidate

Qualifications: Basic to moderate experience with Python is a must; advanced skills in Python are a plus. Knowledge of HTML is also useful, and some analyses may use R. Other important qualities: Independent initiative, collaborative spirit, and timeliness in completing tasks.

Weekly Hours: to be negotiated

Related website: https://github.com/URAP-charter
Related website: https://osf.io/zgh5u/