Anastassia Fedyk, Professor

Closed (1) Big Data for Global Employment Dynamics

Applications for Spring 2019 are now closed for this project.

This is an opportunity to work with a large dataset of over 400 million employment profiles (resumes) in order to understand global employment dynamics and firm performance. In this project, we will leverage techniques from big data and machine learning to structure and analyze large textual data, which can help address questions ranging from individual career outcomes to firm performance to automation and the future of work. What are the effects of employment shocks such as the collapse of Lehman Brothers? Can we build successful trading strategies based on the information contained in resumes of different firms' employees? Which skillsets are booming and which are becoming obsolete in the modern economy? This project will begin by introducing students to the techniques they will need to work with this data, first guiding them through specific small-scale tasks that replicate existing findings and then gradually transitioning to more open-ended questions.

Students will start with guided replication of existing findings and gradually move to more independent work. Key tasks will include:
- Replicating preliminary findings for a trading strategy based on firms' employee characteristics
- Working with textual employment data (similar to LinkedIn profiles) to extract and understand job titles, skills, and demographics

Students who work on this project will gain experience in:
- Working efficiently with real-world large datasets in the hundreds of gigabytes
- Using data science / econometric / machine learning techniques
- Understanding and carrying out the scientific research process

Qualifications:

Required:
- Experience with at least one of: Python, R, Matlab
- At least one foundational computer science course or equivalent

Preferred:
- Practical experience with data science and/or econometric analysis
- Students studying applied mathematics, statistics, computer science, or finance/economics with a quantitative focus

Weekly Hours: 9-11 hrs

Related website: https://sites.google.com/berkeley.edu/fedyk

Closed (2) Global Information Flows and Financial Markets

Applications for Spring 2019 are now closed for this project.

This project introduces students to the increasingly complex landscape of financial news. With millions of articles published each day, investors' task of understanding and trading on relevant information becomes ever more challenging. Which news stories are relevant and market moving? How can traders tell new information from "old news"? In this project, students will work with hundreds of gigabytes of textual data directly from premier news providers such as Dow Jones in order to understand how investors' cognitive limitations affect the link between news and stock price dynamics. The project will begin with guided replication of existing findings from Bloomberg News using a new dataset of all news stories published between 1999 and 2018 by Dow Jones, to see how stock price results differ across news providers. As students become more comfortable working with the news data, they will learn to conduct increasingly novel and open-ended analysis.

Students involved in this project will gain:
- Understanding of the structure of financial news
- Experience working with large textual datasets and natural language processing techniques
- Experience working with the most widely used financial datasets and evaluating stock price dynamics
- Familiarity with the research process

Qualifications:

Required qualifications:
- Experience working with textual data
- Working knowledge of at least one of: Python, R, Matlab
- At least one foundational computer science course or equivalent

Preferred qualifications:
- Students studying applied mathematics, computer science, quantitative finance/economics, or statistics

Weekly Hours: 9-11 hrs

Related website: https://sites.google.com/berkeley.edu/fedyk