
URAP

Project Descriptions
Spring 2025


Big Data Preparation

Anastassia Fedyk, Professor  
Business, Haas School  

Applications for Spring 2025 are closed for this project.

This project focuses on data: hand-collecting new data, improving existing data, and structuring large, messy data. Very large data sets are now at the heart of many social science research questions. However, those data sets can be plagued by big data problems: missing data, bad data, and duplicate data.

This project is a great starting point for younger students (freshmen and sophomores) interested in working hands-on with data, and it can be a great segue to the more technical project 'Big Data Global Employment Dynamics.' Students will learn and practice the data analysis and preparation steps required before statistical methods such as machine learning can be applied to social science questions.

Role: Key tasks will include:
- Analyzing/annotating data sets in preparation for analysis.
- Identifying additional/alternative data sets that could help to answer key research questions.
- Linking multiple data sets and running summary statistics.
- Visualizing data in useful ways to communicate ideas.

Students involved in this project will perfect their skills in:
- Understanding big data in the context of real-world questions.
- Data validation/annotation/visualization.
- Effective communication, both written and verbal.

Qualifications: Required:
- Highly organized
- Detail-oriented
- Interested in big data

Must be willing to put in 10 hours/week every week, with no exceptions.

Hours: 9-11 hrs

Related website: https://sites.google.com/berkeley.edu/fedyk

 Digital Humanities and Data Science   Social Sciences


Office of Undergraduate Interdisciplinary Studies, Undergraduate Division
College of Letters & Science, University of California, Berkeley