Research on labor economics and inequality in the United States
Paul Gertler, Professor
Business, Haas School
Closed. This professor is continuing with Fall 2023 apprentices on this project; no new apprentices needed for Spring 2024.
Project 4. Research on labor economics and inequality in the United States
Project abstract: This project has two goals. First, it investigates the extent to which racial preferences in labor demand practices prior to the Civil Rights Act of 1964 reinforced occupational sorting or exclusion of black employees across occupations. Undergraduates working on this goal will assist with the compilation of digitized historical records and related tasks, including manipulating census data. Second, the project applies sentiment analysis to text data provided by students subject to gender-stereotyped assessments by teachers. Undergraduates working on these activities should be proficient in Python libraries such as nltk, spaCy, and PyTorch, among others.
Note: This work is not in collaboration with Prof. Gertler, but with Joan Martinez (Postdoctoral Researcher).
Role: The goal is to match undergraduates to tasks based on interest and skills. We expect undergraduates to contribute mainly to the empirical and data-intensive aspects of the project. We use Python, R, and Stata.
The tasks include, for example:
1) Digitizing novel historical data
2) Web scraping and Optical Character Recognition of historical records
3) Combining data from the US census and other sources to construct new data sets
4) Visualizing the results in graphs and maps
5) Applying sentiment analysis and novel methods
6) Compiling a structured dataset with sentiments at the student level
7) Exploring the relationship between methods and seat distribution in the classroom using econometric methods
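As a rough illustration of tasks 5 and 6, the sketch below scores short texts and aggregates the scores into one row per student. It is only a minimal example under stated assumptions: the word lexicon, the function names, and the sample records are all hypothetical, and the toy lexicon stands in for the libraries the project actually uses (nltk, spaCy, PyTorch).

```python
from collections import defaultdict
from statistics import mean

# Toy lexicon, purely illustrative (hypothetical). It stands in for a
# real sentiment model or lexicon from a library such as nltk or spaCy.
TOY_LEXICON = {
    "good": 1.0, "great": 1.0, "helpful": 0.5,
    "bad": -1.0, "unfair": -1.0, "boring": -0.5,
}


def score_text(text):
    """Return the average lexicon score of matched words, or 0.0 if none match."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return mean(hits) if hits else 0.0


def student_level_sentiment(records):
    """Aggregate (student_id, text) pairs into one row per student."""
    by_student = defaultdict(list)
    for student_id, text in records:
        by_student[student_id].append(score_text(text))
    return [
        {"student_id": sid, "n_texts": len(scores), "mean_sentiment": mean(scores)}
        for sid, scores in sorted(by_student.items())
    ]


# Made-up example records, not project data.
records = [
    ("s01", "The class was great and the teacher was helpful."),
    ("s01", "Grading felt unfair."),
    ("s02", "Boring lecture."),
]

rows = student_level_sentiment(records)
for row in rows:
    print(row)
```

The resulting list of rows has one entry per student and could be written to a CSV file for later analysis in R or Stata.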
Qualifications: We are looking for undergraduates with experience scripting in Python (using packages like layout-parser and tesseract), doing sentiment analysis (using nltk, spaCy, and PyTorch, among others), and using R (packages like tidyverse, tidymodels, and ggplot2). These skills can also be learned and expanded while working on the project.
Day-to-day supervisor for this project: Joan Martinez, Post-Doc
Hours: 6-8 hrs
Related website: https://joanjmartinez.com
Social Sciences Engineering, Design & Technologies