Modeling with Attention in Demand Forecasting and Beyond
Author(s)
Ocejo Elizondo, Clemente
Advisor
Perakis, Georgia
Abstract
Time series forecasting is an important task in many fields, from supply chain management to weather forecasting. Traditionally, many simple models have extrapolated trends and seasonal patterns from individual time series in order to forecast future values, but only recently have deep neural networks (DNNs) been leveraged to capture complex relationships both between and within time series. Recent advances in Transformer architectures have shown promising results in this domain but have yet to show success in retail settings. In these settings, forecasts are needed at granular levels (product-store), where the data is quite sparse. In this work, we will develop new Transformer-based models to successfully predict the demand of a retailer in a medical device manufacturing setting. We will do this by proposing new positional encoding methods that aim to capture the trends that specific medical products follow. We also propose new attention mechanisms that attend over features and time series independently to generate more descriptive interactions. Ultimately, we hope to combine this Transformer with more traditional time series models such as Holt-Winters as a way to relieve the Transformer of some of the predictive responsibility, since Transformers require relatively large amounts of data to train compared to traditional time series methods.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology