Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/99999/fk4zw2wz49
Title: | Machine Learning for Decision Making: Applications to Off-Policy Learning and Combinatorial Optimization |
Authors: | Lu, Hao |
Advisors: | Wang, Mengdi |
Contributors: | Operations Research and Financial Engineering Department |
Subjects: | Operations research Artificial intelligence |
Issue Date: | 2021 |
Publisher: | Princeton, NJ : Princeton University |
Abstract: | In this thesis, we discuss machine learning methods for decision-making problems in off-policy learning and combinatorial optimization.We start with off-policy learning problems, specifically in the scenario of healthcare. In the first part, we model the clinical pathway optimization for knee replacement. Based on episodic claims from previous cases, we view pathway optimization as an intelligence crowdsourcing problem and learn the optimal decision policy from data by imitating the best expert at every intermediate state. We develop a reinforcement learning-based pipeline that uses value iteration, state compression, aggregation learning, and kernel representation to predict the best treatment policy. In the second part, we adopt the bootstrapping fitted Q-evaluation (FQE) algorithm for policy evaluation with off-policy data in sepsis treatment. Our method achieves reliable point estimates and confidence regions with neural network function approximators. We then explore combinatorial optimization problems from both empirical and theoretical perspectives. First, we study the empirical capacitated vehicle routing problem (CVRP) with a reinforcement learning approach. We present Learn to Improve (L2I), the first learning-based approach for CVRP that is efficient in solving speed and at the same time outperforms OR methods. Then we take a theoretical point of view in analyzing the computational-statistical gap of certain combinatorial problems. More specifically, we look at the hypothesis testing of inferring the existence of combinatorial structures in undirected graphical models. We quantify the minimum computational complexity required to attain the information-theoretic limits based on an oracle computational model, which is determined by two intrinsic quantities of the graph. |
URI: | http://arks.princeton.edu/ark:/99999/fk4zw2wz49 |
Alternate format: | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu |
Type of Material: | Academic dissertations (Ph.D.) |
Language: | en |
Appears in Collections: | Operations Research and Financial Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
Lu_princeton_0181D_13652.pdf | 5.34 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.