Machine Learning Methods for Computational Social Science

Tarr, Alexander

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/99999/fk4795n664

Title:	Machine Learning Methods for Computational Social Science
Authors:	Tarr, Alexander
Advisors:	Imai, Kosuke
Contributors:	Electrical Engineering Department
Keywords:	causal inference; gerrymandering; machine learning; MCMC; SVM; video data
Subjects:	Political science; Artificial intelligence; Statistics
Issue Date:	2021
Publisher:	Princeton, NJ : Princeton University
Abstract:	Contributing to the rising popularity of computational social science, this dissertation presents new methods grounded in machine learning for solving several important problems in political science. In Chapter 2, adapted from coauthored work in Fifield et al. (2020), we present a new algorithm for sampling redistricting plans from arbitrary distributions. We formulate redistricting as a graph-cut problem and adapt an image segmentation algorithm from the computer vision literature to construct a Metropolis-Hastings-style algorithm for sampling graph partitions. We then validate our algorithm using a small-scale map for which all possible redistricting plans can be enumerated, finding that our method samples from the true distribution. Lastly, we apply our algorithm to a more realistic redistricting problem using data from New Hampshire. In Chapter 3, adapted from coauthored work with June Hwang and Kosuke Imai, we develop a fully automated video processing system for encoding information in political campaign advertisement videos. Our approach applies state-of-the-art algorithms to replicate a subset of variables in the human-labeled Wesleyan Media Project (WMP) data, performing tasks including video summarization, facial recognition, text recognition, speech recognition, audio classification, and text classification. We validate our method using the WMP data from the 2012 and 2014 election cycles, finding that machine coding is competitive with human coding for most of the variables considered in our study. In Chapter 4, adapted from coauthored work in Tarr and Imai (2021), we adapt the support vector machine (SVM) algorithm to address the balancing problem in causal inference. We first establish SVM as a kernel balancing method by showing that the soft-margin SVM dual problem computes weights which balance functions in a reproducing kernel Hilbert space. We then show that the SVM cost parameter controls a trade-off between balance and sample size, allowing us to use path algorithms to give exact characterizations of how balance and causal effect estimates change over the path. We validate our method using simulation data, showing that our algorithm is competitive with leading balancing methods. Finally, we conduct an empirical study using the right heart catheterization data from Connors et al. (1996).
URI:	http://arks.princeton.edu/ark:/99999/fk4795n664
Alternate format:	The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material:	Academic dissertations (Ph.D.)
Language:	en
Appears in Collections:	Electrical Engineering

Files in This Item:

File	Size	Format
Tarr_princeton_0181D_13868.pdf	14.41 MB	Adobe PDF
