Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01db78tf67g
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Liu, Han | -
dc.contributor.author | Tian, Amy | -
dc.date.accessioned | 2017-07-26T20:13:10Z | -
dc.date.available | 2017-07-26T20:13:10Z | -
dc.date.created | 2017-04-08 | -
dc.date.issued | 2017-4-8 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01db78tf67g | -
dc.description.abstract | Often in modern multivariate analysis, data analysts rely solely on statistical estimators to explore the data. We are interested in using the notion of visual dependence to verify numerical tests of dependence (specifically, we focus on correlation metrics) and applying the results to portfolio selection, a setting that involves high-dimensional data sets. High-dimensional visualization is problematic because the number of pairwise plots to sort through increases quadratically as the number of variables increases. We present a visualization system that actively learns the user's concept of “visual correlation”, applies the resulting fitted classifier to unlabeled data to form a visual correlation graph \(\hat{G}=(V,E)\), and outputs the difference between \(\hat{G}\) and some given numerical correlation graph \(\hat{G}^{\text{num}}\). Specifically, we focus on the active learning and graph comparison components of the visualization system. We perform a simulation study with parameters that mimic the intended qualities of the system in order to select the best active learning method to use in the visualization system for the financial application. We compile various graph summarization metrics to compute the difference between two graphs (e.g. \(\hat{G}\) and \(\hat{G}^{\text{num}}\)), and propose and verify a procedure for selecting \(\hat{G}^*\), the numerical correlation graph most similar to the base graph \(\hat{G}\). Furthermore, we propose a simple but effective stock selection procedure that, given a correlation graph, selects a “buy and hold” portfolio of \(k\) stocks which are as uncorrelated with each other as possible, a proxy for independence.
Numerical correlation graphs \(\hat{G}^{i, \text{num}}\), \(i \in I\) (the set of all correlation metrics), are formed from healthcare stock price data; the data is fed into the visualization system to create \(\hat{G}\); portfolios \(P^i\) are selected from \(\hat{G}^{i, \text{num}}\); and yearly returns are compiled. The results indicate that the portfolio \(P^*\), which is selected from \(\hat{G}^*\), is the top performer. Furthermore, all portfolios \(P^i, i \in I\) outperform the S&P 500, suggesting that a more sophisticated selection strategy could yield even better returns. The visualization system may be further applied to improve upon other portfolio management techniques. | en_US
dc.language.iso | en_US | en_US
dc.title | A High-Dimensional Visualization System with Applications to Portfolio Selection | en_US
dc.type | Princeton University Senior Theses | -
pu.date.classyear | 2017 | en_US
pu.department | Operations Research and Financial Engineering | en_US
pu.pdf.coverpage | SeniorThesisCoverPage | -
pu.contributor.authorid | 960877667 | -
pu.contributor.advisorid | 960033799 | -
pu.certificate | Applications of Computing Program | en_US
Appears in Collections: Operations Research and Financial Engineering, 2000-2019
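
Illustrative note on the abstract: the graph comparison and stock selection steps it describes can be pictured with a minimal sketch. This is not the thesis's implementation. The thresholding rule, the edge-count difference, the greedy selection heuristic, the toy correlation values, and all function names (`correlation_graph`, `edge_difference`, `select_portfolio`) are assumptions made purely for illustration.

```python
from itertools import combinations


def correlation_graph(corr, threshold):
    """Return the edge set {(i, j), i < j} where |corr[i][j]| >= threshold.

    Assumption: an edge means "correlated"; the thesis may define edges differently.
    """
    n = len(corr)
    return {(i, j) for i, j in combinations(range(n), 2)
            if abs(corr[i][j]) >= threshold}


def edge_difference(edges_a, edges_b):
    """Count edges that appear in exactly one of the two graphs."""
    return len(edges_a ^ edges_b)


def select_portfolio(n, edges, k):
    """Greedily pick up to k vertices with no edge between any chosen pair.

    Vertices are visited in order of increasing degree (ties broken by index),
    so weakly connected stocks are considered first. This is a heuristic,
    not an optimal independent-set solver.
    """
    neighbors = {v: set() for v in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    chosen = []
    for v in sorted(range(n), key=lambda v: (len(neighbors[v]), v)):
        if len(chosen) == k:
            break
        if not neighbors[v] & set(chosen):
            chosen.append(v)
    return chosen


# Toy 4-asset correlation matrix (made-up numbers, for illustration only).
corr = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.8, 0.1],
    [0.1, 0.8, 1.0, 0.3],
    [0.2, 0.1, 0.3, 1.0],
]
g_num = correlation_graph(corr, threshold=0.5)   # numerical correlation graph
g_vis = {(0, 1)}                                 # hypothetical visual correlation graph
diff = edge_difference(g_vis, g_num)
portfolio = select_portfolio(len(corr), g_num, k=2)
print(diff, portfolio)
```

In this sketch, the numerical graph with the smallest `edge_difference` to the visual graph would play the role of \(\hat{G}^*\), and `select_portfolio` stands in for choosing \(k\) mutually uncorrelated stocks from it.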

Files in This Item:
File | Size | Format
Tian_Amy_Thesis.pdf | 6.69 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.