Machine Learning for Decision Making: Applications to Off-Policy Learning and Combinatorial Optimization

Lu, Hao

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/99999/fk4zw2wz49

Title:	Machine Learning for Decision Making: Applications to Off-Policy Learning and Combinatorial Optimization
Authors:	Lu, Hao
Advisors:	Wang, Mengdi
Contributors:	Operations Research and Financial Engineering Department
Subjects:	Operations research Artificial intelligence
Issue Date:	2021
Publisher:	Princeton, NJ : Princeton University
Abstract:	In this thesis, we discuss machine learning methods for decision-making problems in off-policy learning and combinatorial optimization.We start with off-policy learning problems, specifically in the scenario of healthcare. In the first part, we model the clinical pathway optimization for knee replacement. Based on episodic claims from previous cases, we view pathway optimization as an intelligence crowdsourcing problem and learn the optimal decision policy from data by imitating the best expert at every intermediate state. We develop a reinforcement learning-based pipeline that uses value iteration, state compression, aggregation learning, and kernel representation to predict the best treatment policy. In the second part, we adopt the bootstrapping fitted Q-evaluation (FQE) algorithm for policy evaluation with off-policy data in sepsis treatment. Our method achieves reliable point estimates and confidence regions with neural network function approximators. We then explore combinatorial optimization problems from both empirical and theoretical perspectives. First, we study the empirical capacitated vehicle routing problem (CVRP) with a reinforcement learning approach. We present Learn to Improve (L2I), the first learning-based approach for CVRP that is efficient in solving speed and at the same time outperforms OR methods. Then we take a theoretical point of view in analyzing the computational-statistical gap of certain combinatorial problems. More specifically, we look at the hypothesis testing of inferring the existence of combinatorial structures in undirected graphical models. We quantify the minimum computational complexity required to attain the information-theoretic limits based on an oracle computational model, which is determined by two intrinsic quantities of the graph.
URI:	http://arks.princeton.edu/ark:/99999/fk4zw2wz49
Alternate format:	The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material:	Academic dissertations (Ph.D.)
Language:	en
Appears in Collections:	Operations Research and Financial Engineering

Files in This Item:

File	Size	Format
Lu_princeton_0181D_13652.pdf	5.34 MB	Adobe PDF	View/Download

Show full item record

Search

Browse