Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp019z903229x
Title: Women’s Impact on GitHub: A Machine Learning Approach
Authors: Swenson, Hannah
Advisors: LaPaugh, Andrea
Department: Computer Science
Class Year: 2016
Abstract: Not only are women underrepresented in computer science, but their share of the field has been declining since the 1980s. To analyze women’s potential impact on the programming world, I study female contribution and gender dynamics in Open Source Software (OSS) communities. GitHub, an OSS platform that allows programmers worldwide to collaborate on software projects, is the largest publicly available source of OSS project data. I implement a supervised machine learning algorithm to predict, given certain factors, whether a programming team on GitHub has female members, in the hope of showing positive correlations between having women on the team and project productivity or success. While my linear regression model predicts whether a team has women with some degree of accuracy, it is limited by training data that contains few women. This sparsity in the dataset is consistent with, and speaks to, the broader issue of female underrepresentation in computer science.
Extent: 46 pages
URI: http://arks.princeton.edu/ark:/88435/dsp019z903229x
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File: Swenson_Hannah_2016_Thesis.pdf
Size: 1.07 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.