Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp011n79h726v
Title: | Improving Accuracy of Counterfactual Estimation for Sales Forecasting Using an Ensemble of ARMA, ANN and BSTS Models |
Authors: | Ogutu, Lencer |
Advisors: | Fan, Jianqing; Guerzhoy, Michael |
Department: | Operations Research and Financial Engineering |
Certificate Program: | Center for Statistics and Machine Learning |
Class Year: | 2020 |
Abstract: | Counterfactuals have become increasingly standard in causal inference, mostly in fields dealing with quantitative social research. One of the biggest challenges in causal inference is how to accurately predict the counterfactual against which causal impact can be gauged. A well-developed and still popular technique for assessing causal impact is difference-in-differences (D-in-D), which assumes a treatment group and a control group whose features are similar except for an intervention applied only to the treatment group. The control group serves as the counterfactual in this case, and causal impact is calculated as the difference between what is observed in the two groups. However, this approach has drawbacks that have necessitated research into other techniques of causal impact analysis. These drawbacks include the expectation that market data follow an ideal randomized design, which is rarely the case (such data typically exhibit a low signal-to-noise ratio); the fact that D-in-D does not account for seasonal variation; and the fact that it is confounded by the effects of unobserved variables and their interactions. This paper therefore proposes combining Bayesian Structural Time Series (BSTS) models, Artificial Neural Networks (ANN) and Autoregressive Moving Average (ARMA) models into a single ensemble forecasting model to predict the counterfactual and estimate more accurately the causal impact of an intervention on the metrics of interest. I fit the three models to a daily sales time series for solar products and use them to make forecasts, from which I calculate their Mean Absolute Percentage Error (MAPE). The analysis finds that the forecasting accuracy of an ensemble model is higher than that of each of the individual models. Additionally, the ensemble constructed as a weighted average of the individual models, with weights determined by a regression on these three models, is more accurate than the ensemble built from a simple average. |
URI: | http://arks.princeton.edu/ark:/88435/dsp011n79h726v |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Operations Research and Financial Engineering, 2000-2019 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
OGUTU-LENCER-THESIS.pdf | | 3.16 MB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.
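
Illustrative note on the abstract: the two ensemble constructions it compares (a simple average versus a regression-weighted average, scored by MAPE) can be sketched roughly as follows. This is a minimal sketch, not the thesis's implementation. It assumes the regression is ordinary least squares with an intercept, and it uses synthetic data in place of the solar product sales series. The three forecast columns are hypothetical stand-ins for the ARMA, ANN and BSTS outputs, and all function names are chosen here for illustration.

```python
import numpy as np


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)


def fit_regression_weights(actual, forecasts):
    """Regress the observed series on the model forecasts (OLS with intercept).

    Assumption: the abstract only says the weights come from "a regression";
    plain least squares is used here as one plausible reading.
    Returns [intercept, w_1, ..., w_k].
    """
    design = np.column_stack([np.ones(len(actual)), forecasts])
    coef, *_ = np.linalg.lstsq(design, actual, rcond=None)
    return coef


def combine(coef, forecasts):
    """Apply fitted intercept and weights to a matrix of model forecasts."""
    return coef[0] + forecasts @ coef[1:]


# Synthetic daily series with a weekly cycle (hypothetical data, not from the thesis).
rng = np.random.default_rng(0)
n = 200
actual = 100 + 10 * np.sin(2 * np.pi * np.arange(n) / 7) + rng.normal(0, 2, n)

# Hypothetical stand-ins for the three individual model forecasts.
forecasts = np.column_stack([
    actual + rng.normal(0, 5, n),   # stand-in for ARMA
    actual + rng.normal(1, 4, n),   # stand-in for ANN
    actual + rng.normal(-1, 3, n),  # stand-in for BSTS
])

simple_ensemble = forecasts.mean(axis=1)
coef = fit_regression_weights(actual, forecasts)
weighted_ensemble = combine(coef, forecasts)

print("MAPE, simple average:     ", round(mape(actual, simple_ensemble), 3))
print("MAPE, regression-weighted:", round(mape(actual, weighted_ensemble), 3))
```

In this sketch the weights are fitted and scored on the same sample, so the comparison is optimistic. A fairer check would fit the weights on one period and compute MAPE on a later, held-out period.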