Sean Gailmard, Professor

Open (1) Property Tax Rates and the Long-run Distribution of Wealth in the Early U.S. South

Open. Apprentices needed for the fall semester. Please do NOT contact faculty before September 11th (the start of the 4th week of classes)! Enter your application on the web beginning August 16th. The deadline to apply is Tuesday, August 29th at 8 AM.

Farmland was an important component of personal wealth in the 19th-century U.S. South. Yet little is known about the distribution of land wealth in the South during this period, or about the relationship between the wealth distribution and property tax rates. This project will investigate variation across counties and over time in the wealth distribution and property tax rates in the U.S. state of Georgia in order to answer important questions related to the political economy of development.

The core of the project involves manually transcribing Georgia county tax digests, which are available sporadically for 137 Georgia counties from 1793 to 1892. The digests list names of taxpayers and assessments of value for various types of property and assets, and they indicate who owed the poll tax, which is a specific sum to be paid by a person between 21 and 60 years of age. Thus, the records include all men 21 and over and women who owned property.

Primary Task: Apprentices will manually transcribe data on wealth holdings and taxes paid from early Georgia tax digests. Apprentices will locate the tax digest images found on ancestry.com and transcribe individuals’ data into a database.

Secondary Tasks: Writing summaries of relevant literature in Political Science, Economics, and History; occasional group meetings with Prof. Gailmard and other apprentices in order to discuss progress and the “big picture” of the project; weekly email communication with group members to discuss progress and logistics.

Learning Outcomes: Apprentices will gain experience in creating and managing a large dataset and have the opportunity to improve skills needed for historical, demographic, and social science research.

Day-to-day supervisor for this project: Jason Poulos, Ph.D. candidate

Qualifications: Students with any background can be trained, although previous experience with databases (e.g., Google Sheets) is preferred. History or social science majors with an interest in early American history will find this project compelling because of the nature of the data.

Weekly Hours: 3-6 hrs

Off-Campus Research Site: Apprentices can conduct most work from their personal computers after initial training.

Related website: http://search.ancestry.com/search/db.aspx?dbid=1729

Open (2) Amnesty and Loyalty in the Postbellum South

The amnesty proclamation issued by President Johnson immediately following the American Civil War granted near-universal amnesty to former Confederates, with the exception of fourteen classes of individuals, including voluntary participants in the rebellion with an estimated value of taxable property over $20,000. This project investigates the impact of presidential amnesty policy on public service participation and political competition after universal amnesty was granted in 1868.

Primary Task: Apprentices will manually transcribe 1860 and 1870 Census data for a large sample of adult white male slaveholders living in the South at the time of the 1860 Census. Specifically, apprentices will locate the slaveholders in census images found on ancestry.com and transcribe individuals’ data into a database.

Secondary Tasks: Extracting data from historical records into spreadsheets using optical character recognition (OCR); R programming for data cleaning and dataset assembly; occasional group meetings with Prof. Gailmard and other apprentices in order to discuss progress and the “big picture” of the project; weekly email communication with group members to discuss progress and logistics.

Learning Outcomes: Apprentices will gain experience in creating and managing a large dataset and have the opportunity to improve skills needed for historical, demographic, and social science research.

Day-to-day supervisor for this project: Jason Poulos, Ph.D. candidate

Qualifications: Students with any background can be trained, although previous experience with databases (e.g., Google Sheets) is strongly preferred. Experience with OCR software (e.g., ABBYY FineReader, Tabula, Tesseract) and/or R programming is desirable but not essential. History or social science majors with an interest in early American history will find this project compelling because of the nature of the data.

Weekly Hours: 3-6 hrs

Off-Campus Research Site: Apprentices can conduct most work from their personal computers after initial training.

Related website: http://search.ancestry.com/search/db.aspx?dbid=7667
Related website: https://usa.ipums.org/usa/slavepums/documentation/about.html

Open (3) A Natural Experiment on the Economic and Political Behavior of 1901 Oklahoma Land Lottery Winners

At the turn of the 20th century, the central and western parts of Oklahoma were opened to settlement by non-Native Americans. Land formerly occupied by the Kiowa, Comanche, Apache, and Wichita tribes was opened by lottery in the summer of 1901 and split into two districts: El Reno and Lawton. About 170,000 people registered for the drawing for a chance to win one of 13,000 lots of 160 acres each. Lottery participants were drawn in rank order: the person who drew number one was given first choice, the person who drew number two was given second choice, etc.

The project will investigate the effect of levels of lottery winnings on the political behavior (e.g., candidacy and officeholding) and economic behavior (e.g., education, literacy, labor force participation, occupation, and home ownership) of lottery winners.

Primary Tasks: Extracting data from lottery record PDFs into spreadsheets using optical character recognition (OCR); R programming for merging datasets, data cleaning, and dataset assembly.
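
To give a sense of what the cleaning and merging steps can look like, here is a minimal sketch. It is written in Python rather than R for illustration only, and it assumes hypothetical field names ("name", "draw_number", "occupation") and a simple exact match on a normalized name; the actual records, fields, and matching rules used in the project may differ.

```python
import re
import unicodedata


def normalize_name(raw):
    """Lowercase a name, drop accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(text.split())


def merge_on_name(lottery_rows, census_rows):
    """Left-join lottery rows to census rows on the normalized name.

    Only unambiguous matches (exactly one census row with the same
    normalized name) are linked; all other lottery rows get None.
    """
    census_by_name = {}
    for row in census_rows:
        census_by_name.setdefault(normalize_name(row["name"]), []).append(row)

    merged = []
    for row in lottery_rows:
        matches = census_by_name.get(normalize_name(row["name"]), [])
        match = matches[0] if len(matches) == 1 else None
        merged.append({**row, "occupation": match["occupation"] if match else None})
    return merged


# Hypothetical example rows, not real lottery or census data.
lottery = [
    {"name": "  SMITH,  John ", "draw_number": 1},
    {"name": "Mary  O'Neil", "draw_number": 2},
]
census = [
    {"name": "smith john", "occupation": "farmer"},
    {"name": "Jane Doe", "occupation": "teacher"},
]
merged = merge_on_name(lottery, census)
```

Normalizing before joining matters because OCR output often differs from a second source in case, spacing, and punctuation. Leaving ambiguous matches unlinked is a conservative choice made for this sketch, not a rule taken from the project.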

Secondary Tasks: Reviews of relevant literature and/or searches of historical newspapers; occasional group meetings with Prof. Gailmard and other apprentices in order to discuss progress and the “big picture” of the project; weekly email communication with group members to discuss progress and logistics.

Learning Outcomes: Apprentices will gain experience in creating and managing a large dataset and have the opportunity to improve skills needed for historical, demographic, and social science research.

Day-to-day supervisor for this project: Jason Poulos, Ph.D. candidate

Qualifications: Previous experience working with spreadsheet databases (e.g., Google Sheets) is required. Experience with OCR software (e.g., ABBYY FineReader, Tabula, Tesseract) and/or R programming is desirable but not essential. Social science majors with an interest in political economy or American history will find this project compelling because of the nature of the data.

Weekly Hours: 3-6 hrs

Off-Campus Research Site: Apprentices can conduct most work from their personal computers after initial training.

Related website: http://www.okhistory.org/publications/enc/entry.php?entry=LA016
Related website: http://www.okhistory.org/research/elreno

Closed (4) DeepCensus: A Framework for Automated Transcription of Handwritten Information within Historical Census Manuscripts

Closed. This professor is continuing with Spring 2017 apprentices on this project; no new apprentices needed for Fall 2017.

The quantity of accessible historical microdata digitized from census manuscripts has increased exponentially since 2000, and is expected to reach over 1.1 billion individual records by 2018. These large-scale microdata are invaluable for researchers investigating the demographic and social transformations that shape our society.

The proposed project will develop a framework for automatically transcribing handwritten content in order to produce digitized transcriptions of census manuscripts. Census manuscripts are currently transcribed by hand, a resource-intensive process that is susceptible to transcription error. DeepCensus draws heavily from the interdisciplinary field of Computer Vision, in which multiple-author offline handwritten text recognition is an active research area.

Primary Tasks: (1) Apprentices will write Bash shell scripts to recursively download census images from the internet. (2) Most of the work will involve writing Python scripts to preprocess the census images (i.e., segment, normalize, and binarize images). (3) If the first two steps are complete, apprentices will write a Python script to implement deep bidirectional recurrent neural networks (BRNNs) using the Keras neural networks library on a test collection of preprocessed census images.
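
As a rough illustration of the preprocessing in step (2), the sketch below normalizes and binarizes a tiny grayscale image using only the Python standard library. It assumes 8-bit grayscale pixels stored as nested lists and uses Otsu's method to pick the threshold; both are choices made for this example, not details confirmed by the project description, and real scripts would more likely rely on an image-processing library. Segmentation is not covered here.

```python
def normalize(image):
    """Linearly rescale pixel intensities to the range 0..255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


def otsu_threshold(image):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the dark class (ink); the rest are background.
    """
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, threshold):
    """Mark dark pixels (ink) as 1 and background as 0."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


# Hypothetical 3x4 patch: three dark "ink" pixels on a light background.
page = [
    [200, 210, 205, 220],
    [198, 40, 50, 215],
    [202, 45, 210, 208],
]
scaled = normalize(page)
threshold = otsu_threshold(scaled)
binary = binarize(scaled, threshold)
```

A global threshold like this is the simplest option; scanned manuscripts with uneven lighting or bleed-through often need local (adaptive) thresholding instead.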

Secondary Tasks: Occasional group meetings with Prof. Gailmard and other apprentices in order to discuss progress and the “big picture” of the project; weekly email communication with group members to discuss progress and logistics.

Learning Outcomes: Apprentices will gain hands-on programming experience in preprocessing images for handwritten text recognition, and if time allows, implementing deep networks on image data.

Day-to-day supervisor for this project: Jason Poulos, Ph.D. candidate

Qualifications: Experience programming in Python is required; experience with git and Bash shell scripting is desirable but not essential.

Weekly Hours: 6-9 hrs

Off-Campus Research Site: Apprentices can conduct most work from their personal computers after initial training.

Related website: https://wwwee.ee.bgu.ac.il/~dinstein/stip2002/HandwritingRecognitionSurveyPAMI.pdf
Related website: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3949202/