Big Data for Global Employment Dynamics
Anastassia Fedyk, Professor
Business, Haas School
Applications for Spring 2024 are closed for this project.
This is an opportunity to work with a large dataset of over 400 million employment profiles (resumes) to understand global employment dynamics and firm performance. In this project, we will leverage techniques from big data and machine learning to structure and analyze large textual data, which can help address questions ranging from individual career outcomes to firm performance to automation and the future of work. What are the effects of employment shocks such as the collapse of Lehman Brothers? Which firms are investing in new technologies such as AI, and what are the consequences of these investments? Which skillsets are booming and which are becoming obsolete in the modern economy? Students will practice the techniques needed to work with large datasets (on the order of hundreds of gigabytes).
Role: Key tasks will include:
- Working with textual employment data (similar to LinkedIn profiles) to extract and understand job titles, skills, and demographics.
- Using a range of statistical tools to analyze the relationship between employment metrics and firm outcomes.
Students who work on this project will increase their knowledge of:
- Working efficiently with real-world large datasets in the hundreds of gigabytes
- Using econometric and machine learning techniques
- Understanding and conducting scientific research processes
Qualifications: Required:
- Comprehensive experience with at least one of Python, C/C++, R, or MATLAB (i.e., WITHOUT excessive reliance on specific packages such as Pandas);
- Foundational computer science courses (Data Structures & Algorithms);
- Experience writing object-oriented, efficient, modular code, NOT just Jupyter notebooks.
Preferred:
- Foundational Statistics and/or Machine Learning courses;
- Practical experience with Machine Learning and/or Econometric analysis.
Must be willing to put in 10 hours/week every week, with no exceptions.
Hours: 9-11 hrs
Related website: https://sites.google.com/berkeley.edu/fedyk
Digital Humanities and Data Science, Social Sciences