Demand Forecasting in Retail

Comprehensive research exploring machine learning models for predicting retail demand using the M5 Competition Dataset from Walmart, featuring hierarchical forecasting and deep learning approaches.

github.com/pompos02/M5-CompetitionDemandForecasting

Overview

This project, based on my thesis "Demand Forecasting in Retail", investigates multiple forecasting approaches ranging from traditional statistical methods to deep learning models. The research addresses the critical business challenge of inventory management and demand prediction across hierarchical sales data.

The project leverages the M5-Competition Dataset, one of the most comprehensive real-world retail forecasting challenges, featuring 42,840 hierarchical time series from Walmart stores across the United States spanning over 5 years of daily sales data.

Dataset Characteristics

Core Statistics:

  • 42,840 hierarchical time series from Walmart stores
  • 1,969 days of daily sales data (January 2011 - June 2016)
  • 3,049 individual products across 3 categories and 7 departments
  • 10 stores spanning 3 states (California, Texas, Wisconsin)
  • Rich contextual data: pricing, calendar events, SNAP eligibility

Hierarchical Structure:

The dataset enables analysis at 12 aggregation levels:

  • Total sales (1 series)
  • State level (3 series)
  • Store level (10 series)
  • Category and Department levels
  • Product-Store combinations (30,490 series)

Figure: M5 Dataset hierarchical structure showing organization from total sales down to individual product-store combinations
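
The list above names only some of the 12 levels. As a cross-check, the sketch below enumerates all twelve as the M5 competition defines them and confirms that they add up to 42,840 series. The dictionary labels are informal names chosen for this example, not identifiers from the dataset.

```python
# Series counts for the 12 M5 aggregation levels, derived from the
# dataset dimensions quoted above (3 states, 10 stores, 3 categories,
# 7 departments, 3,049 products).
n_states, n_stores = 3, 10
n_categories, n_departments, n_products = 3, 7, 3049

levels = {
    "total": 1,
    "state": n_states,
    "store": n_stores,
    "category": n_categories,
    "department": n_departments,
    "state x category": n_states * n_categories,
    "state x department": n_states * n_departments,
    "store x category": n_stores * n_categories,
    "store x department": n_stores * n_departments,
    "product": n_products,
    "product x state": n_products * n_states,
    "product x store": n_products * n_stores,
}

total_series = sum(levels.values())
for name, count in levels.items():
    print(f"{name:<20} {count:>6}")
print(f"{'all levels':<20} {total_series:>6}")
```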

Technical Stack

  • Data Processing: pandas, NumPy (handling ~3GB raw data expanding to ~15GB after feature engineering)
  • Machine Learning: scikit-learn, LightGBM
  • Deep Learning: PyTorch, PyTorch Lightning, Darts (time series library)
  • Statistical Models: statsmodels (for Exponential Smoothing)
  • Visualization: Matplotlib, Seaborn

Data Exploration Insights

Overall Sales Patterns:

  • Clear upward trend with seasonal variations and weekly cyclicality
  • Notable sales drops on Christmas Day due to store closures
  • Strong annual cyclicality with holiday peaks
  • Consistent weekend sales dominance

Figure: Overall sales trends showing clear upward trajectory with seasonal patterns

Geographic Distribution:

  • California accounts for ~45% of total sales with notable drops in 2013 and 2015
  • Texas shows ~35% of sales with more stable patterns
  • Wisconsin exhibits ~20% with significant seasonal variations

Figure: Sales comparison across states showing California's dominance in sales volume

Product Category Performance:

  • FOODS category dominates (65% of sales) with consistent weekly seasonality
  • HOUSEHOLD (20%) shows strong promotional sensitivity
  • HOBBIES (15%) exhibits high seasonality and event-driven spikes

Figure: Product category comparison showing FOODS as the dominant category

Store-Level Analysis:

Figure: Individual store performance showing significant variations within and across states

Department-Level Patterns:

Figure: Sales patterns across departments showing FOODS_3 dominance

Seasonality Analysis:

  • Annual peaks: November-December (40-60% sales increase)
  • Low periods: January-February post-holiday decline
  • Weekend dominance: 20-30% higher sales than weekdays
  • State-specific patterns: Wisconsin shows inverted seasonal trends

Figure: Annual seasonal patterns

Figure: Weekly patterns showing weekend dominance

Figure: State-specific seasonal patterns after trend removal and scaling

Comprehensive Feature Engineering

Price-Based Features:

  • Current selling price and historical statistics (min, max, mean, std)
  • Price momentum indicators (weekly, monthly, annual trends)
  • Price volatility measures and normalization
  • Cross-product price relationships
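
As an illustration of the price features listed above, the following pandas sketch derives per-series statistics, a normalised price, and a weekly momentum ratio. The column names follow M5's `sell_prices.csv`. The toy values and the exact feature definitions (price divided by the series maximum, price divided by the previous week's price) are assumptions made for this example and may differ from the thesis code.

```python
import pandas as pd

# Toy stand-in for M5's sell_prices.csv (store_id, item_id, wm_yr_wk, sell_price).
prices = pd.DataFrame({
    "store_id": ["CA_1"] * 4 + ["TX_1"] * 4,
    "item_id": ["FOODS_3_090"] * 8,
    "wm_yr_wk": [11101, 11102, 11103, 11104] * 2,
    "sell_price": [1.50, 1.50, 1.25, 1.50, 1.60, 1.60, 1.60, 1.40],
})

# One price history per product-store combination.
grouped = prices.groupby(["store_id", "item_id"])["sell_price"]

# Historical statistics, broadcast back onto every row of the series.
for stat in ("min", "max", "mean", "std"):
    prices[f"price_{stat}"] = grouped.transform(stat)

# Normalised price (assumed definition): current price / series maximum.
prices["price_norm"] = prices["sell_price"] / prices["price_max"]

# Weekly momentum (assumed definition): current price / previous week's price.
# The first week of each series has no predecessor and stays NaN.
prices["price_momentum_w"] = prices["sell_price"] / grouped.shift(1)

print(prices[["store_id", "wm_yr_wk", "sell_price", "price_norm", "price_momentum_w"]])
```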

Temporal Features:

  • Calendar features: Day of month, week, month, year, day of week
  • Holiday and event markers (religious, cultural, sporting events)
  • Weekend indicators and seasonal decomposition
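
The calendar features above can be sketched with pandas datetime accessors. The frame is a toy stand-in for M5's `calendar.csv`; `event_name_1` is a real column in that file, while the derived column names are hypothetical.

```python
import pandas as pd

# Seven days around Memorial Day 2016 (Friday 27 May to Thursday 2 June).
calendar = pd.DataFrame({
    "date": pd.date_range("2016-05-27", periods=7, freq="D"),
    "event_name_1": [None, None, None, "MemorialDay", None, None, None],
})

dates = calendar["date"].dt
calendar["day_of_month"] = dates.day
calendar["week_of_year"] = dates.isocalendar().week.astype(int)
calendar["month"] = dates.month
calendar["year"] = dates.year
calendar["day_of_week"] = dates.dayofweek          # Monday = 0, Sunday = 6
calendar["is_weekend"] = (calendar["day_of_week"] >= 5).astype(int)
calendar["is_event"] = calendar["event_name_1"].notna().astype(int)

print(calendar[["date", "day_of_week", "is_weekend", "is_event"]])
```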

Statistical Features:

  • Rolling window statistics: 7, 14, 28-day moving averages
  • Lag features: Sales history from 1, 7, 14, 21, 28 days prior
  • Exponentially weighted moving averages
  • Trend indicators using first differences
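
A minimal pandas sketch of the lag, rolling, exponentially weighted, and first-difference features follows. It runs on synthetic data with hypothetical column names (`id`, `d`, `sales`). Every rolling feature is computed on the series shifted by one day, so that the value for day t uses only sales up to day t-1. That shift is an assumption about how leakage was avoided, not a detail confirmed by the source.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_days = 60
sales = pd.DataFrame({
    "id": ["A"] * n_days + ["B"] * n_days,        # two toy product-store series
    "d": list(range(1, n_days + 1)) * 2,           # day index within each series
    "sales": rng.poisson(2.0, size=2 * n_days),
})
sales = sales.sort_values(["id", "d"]).reset_index(drop=True)
by_series = sales.groupby("id")["sales"]

# Lag features: sales 1, 7, 14, 21 and 28 days earlier, never crossing series.
for lag in (1, 7, 14, 21, 28):
    sales[f"lag_{lag}"] = by_series.shift(lag)

# Rolling means over the previous 7, 14 and 28 days (excluding the current day).
for window in (7, 14, 28):
    sales[f"roll_mean_{window}"] = by_series.transform(
        lambda s, w=window: s.shift(1).rolling(w).mean()
    )

# Exponentially weighted mean and first difference of the lagged series.
sales["ewm_mean_7"] = by_series.transform(lambda s: s.shift(1).ewm(span=7).mean())
sales["diff_1"] = by_series.transform(lambda s: s.shift(1).diff())

print(sales.loc[28:32, ["id", "d", "sales", "lag_7", "roll_mean_7", "diff_1"]])
```

Grouping by series before shifting matters: without it, the first rows of series B would inherit lags from the tail of series A.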

Models Implemented

1. Linear Regression

Baseline statistical approach using 28-day lag features

  • Simple interpretable model for benchmark comparison
  • Utilizes engineered features from preprocessing pipeline

2. Exponential Smoothing

Holt-Winters method with additive seasonality

  • 7-day seasonal period capturing weekly patterns
  • Individual models per product-store combination
  • No external features - pure time series approach
  • Excellent computational efficiency

3. LightGBM (Three Architectures)

Gradient boosted decision trees at multiple aggregation levels:

  • Item-Store Level: Individual models per product-store combination
  • Category-Store Level: Multivariate models per category-store (28 models per horizon)
  • Store Level: Comprehensive models per store handling all products
  • Feature-rich approach incorporating all engineered features
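
The three architectures differ only in how the training frame is partitioned before a model is fitted to each partition. The sketch below shows that partitioning on a toy frame and does not train LightGBM itself. The column names mirror M5's `item_id`, `cat_id` and `store_id`, and the rows are invented.

```python
import pandas as pd

# Toy long-format training frame: 3 items in 2 stores.
train = pd.DataFrame({
    "item_id": ["FOODS_1_001", "FOODS_1_002", "HOBBIES_1_001"] * 2,
    "cat_id": ["FOODS", "FOODS", "HOBBIES"] * 2,
    "store_id": ["CA_1"] * 3 + ["TX_1"] * 3,
    "sales": [3, 0, 1, 2, 1, 0],
})

# Grouping keys that define one model each, per architecture.
architectures = {
    "item-store": ["item_id", "store_id"],
    "category-store": ["cat_id", "store_id"],
    "store": ["store_id"],
}

models_needed = {
    name: train.groupby(keys).ngroups for name, keys in architectures.items()
}
rows_per_model = {name: len(train) / n for name, n in models_needed.items()}

for name in architectures:
    print(f"{name:<15} models={models_needed[name]}  rows/model={rows_per_model[name]:.1f}")
```

The finer the partition, the more models there are and the fewer rows each one sees, which is the trade-off behind the item-level results reported later.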

4. LSTM Neural Network

Global deep learning model (one network shared across all series):

  • 4 hidden layers with 128 neurons each
  • 28-day input window predicting next 7 days
  • Single model trained on all time series
  • MinMax scaling (0-1) applied per time series
  • Selected features: price, temporal, and statistical indicators
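
The data preparation implied by this setup (per-series MinMax scaling, then 28-day input windows paired with 7-day targets, pooled across series) can be sketched in NumPy. The function names are hypothetical, only the sales channel is shown, and the network itself is omitted.

```python
import numpy as np

INPUT_LEN, OUTPUT_LEN = 28, 7

def minmax_scale(series):
    """Scale one series to [0, 1]; also return (min, range) for inverting forecasts."""
    lo, hi = series.min(), series.max()
    span = hi - lo if hi > lo else 1.0  # constant series: avoid division by zero
    return (series - lo) / span, (lo, span)

def make_windows(series, input_len=INPUT_LEN, output_len=OUTPUT_LEN):
    """Slice a 1-D series into (input, target) pairs with a stride of one day."""
    n = len(series) - input_len - output_len + 1
    X = np.stack([series[i : i + input_len] for i in range(n)])
    y = np.stack([series[i + input_len : i + input_len + output_len] for i in range(n)])
    return X, y

rng = np.random.default_rng(0)
all_series = rng.poisson(3.0, size=(5, 100)).astype(float)  # 5 toy series, 100 days

X_parts, y_parts = [], []
for series in all_series:
    scaled, _ = minmax_scale(series)   # scaling parameters are per series
    X, y = make_windows(scaled)
    X_parts.append(X)
    y_parts.append(y)

# A single global model would be trained on the pooled windows.
X_all = np.concatenate(X_parts)[..., None]   # (samples, 28, 1 feature)
y_all = np.concatenate(y_parts)              # (samples, 7)
print(X_all.shape, y_all.shape)
```

For brevity the scaling here uses the full series. In practice the minimum and range would be taken from the training portion only, so that no information from the forecast period leaks into the inputs.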

Feature Importance Analysis

Random Forest Analysis:

  • Price features (mean, std) show highest importance scores
  • Temporal index critical for capturing seasonality
  • Product variety (item_nunique) impacts store demand significantly
  • Price momentum (monthly/yearly trends) highly informative

Figure: Random Forest feature importance showing price and temporal features as key predictors

Mutual Information Analysis:

  • Captures non-linear relationships missed by correlation
  • All price-related variables show high mutual information
  • Different features provide unique complementary information

Figure: Mutual Information analysis highlighting non-linear relationships between features and sales

SHAP Analysis (LightGBM):

  • Recent sales history (lag-1) shows highest impact
  • Feature significance decreases with temporal distance
  • Product-specific characteristics (item_id) contribute substantially
  • Multiple features work together for optimal predictions

Figure: SHAP analysis revealing feature contribution patterns for one-step-ahead sales prediction

Performance Results

Overall Model Comparison:

Model             | MAE  | RMSE | WRMSSE | Rank
LSTM              | 1.14 | 1.43 | 0.884  | 1st
Exp. Smoothing    | 1.11 | 1.44 | 0.888  | 2nd
LGBM-store        | 1.17 | 1.46 | 0.894  | 3rd
LGBM-category     | 1.14 | 1.46 | 0.898  | 4th
Linear Regression | 1.14 | 1.47 | 0.914  | 5th
LGBM-item         | 1.22 | 1.57 | 0.957  | 6th
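
For readers unfamiliar with WRMSSE, the sketch below computes it for a flat set of series. It follows the M5 definition of RMSSE: forecast squared error scaled by the in-sample squared error of the one-step naive forecast, measured from the first non-zero sale. The function names and toy numbers are made up for the example. The official metric additionally averages over all 12 aggregation levels and weights series by recent dollar sales, which this sketch does not reproduce.

```python
import numpy as np

def rmsse(history, actual, forecast):
    """RMSSE for one series: forecast MSE over the in-sample naive one-step MSE."""
    history = np.asarray(history, dtype=float)
    # Start the scale at the first non-zero demand, as in M5.
    nonzero = np.flatnonzero(history)
    start = nonzero[0] if nonzero.size else 0
    scale = np.mean(np.diff(history[start:]) ** 2)
    errors = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return np.sqrt(np.mean(errors ** 2) / scale)

def wrmsse(histories, actuals, forecasts, weights):
    """Weighted sum of per-series RMSSE; weights are normalised to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    scores = [rmsse(h, a, f) for h, a, f in zip(histories, actuals, forecasts)]
    return float(np.dot(weights, scores))

histories = [[0, 0, 2, 4, 2, 4], [1, 2, 3, 4]]
actuals = [[3, 3], [5, 6]]
forecasts = [[2, 4], [5, 6]]

print(rmsse(histories[0], actuals[0], forecasts[0]))           # 0.5
print(wrmsse(histories, actuals, forecasts, weights=[3, 1]))   # 0.375
```

A series whose history is constant after its first sale has a zero scale and would need special handling. The sketch leaves that case out.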

Key Findings:

  • LSTM Excellence: Best WRMSSE (0.884) - would rank 11th in M5 competition (only 0.009 points from winner)
  • Exponential Smoothing Surprise: Best MAE (1.11) despite simplicity - would achieve 23rd place in M5
  • Store-Level LGBM: Best balance among LGBM variants
  • Item-Level Challenges: Insufficient data per model led to poor generalization

Error Distribution Analysis

RMSE Distribution:

  • All models show strong concentration of RMSE values in 0-1 range
  • LSTM demonstrates most consistent error distribution with lowest outlier counts (267 cases with RMSE > 10)
  • LGBM-item exhibits highest maximum RMSE (~120) indicating severe overfitting cases

Figure: RMSE distribution across models

Figure: RMSE box plot revealing model stability

MAE Distribution:

  • MAE distributions show smoother patterns than RMSE with reduced outlier impact
  • Median performance similar across models, but variance differs significantly
  • Less sensitivity to outliers compared to RMSE

Figure: MAE distribution patterns

Figure: MAE box plot comparison
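
The different outlier sensitivity of MAE and RMSE noted in this section can be seen in a small worked example. The error values are invented purely for illustration.

```python
import numpy as np

def mae(errors):
    return float(np.mean(np.abs(errors)))

def rmse(errors):
    return float(np.sqrt(np.mean(errors ** 2)))

# Ten forecast errors of magnitude 0.5 ...
clean = np.array([0.5, -0.5] * 5)
# ... and the same errors with one hypothetical large miss.
with_outlier = clean.copy()
with_outlier[-1] = -10.0

for label, errors in (("clean", clean), ("with outlier", with_outlier)):
    print(f"{label:<13} MAE={mae(errors):.2f}  RMSE={rmse(errors):.2f}")
```

One large miss roughly triples the MAE (0.50 to 1.45) but multiplies the RMSE more than sixfold (0.50 to about 3.20), because squaring gives large errors disproportionate weight.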

Category-Specific Performance

Outstanding Performance:

  • CA_1 HOUSEHOLD: RMSE = 1.03 (excellent accuracy)
  • CA_1 HOBBIES: RMSE = 1.26 (strong performance)
  • Texas stores: Consistent performance across all categories

Challenging Categories:

  • CA_3 FOODS: RMSE = 2.50 (highest error due to high purchase frequency variability)
  • WI_2 FOODS: RMSE = 1.74 (seasonal pattern complexity)
  • FOODS category generally more difficult than HOUSEHOLD/HOBBIES

Conclusion

This study shows that the LSTM achieved the best overall WRMSSE (0.884), approaching top M5 competition results. Its margin over Exponential Smoothing (0.888) is small, however, and Exponential Smoothing recorded the best MAE (1.11), so traditional methods remain highly competitive while offering significant advantages in computational efficiency and interpretability.

The results offer practical guidance for model selection: the choice should balance accuracy requirements, computational constraints, and interpretability needs in the specific business context. They also indicate that the implemented methodology carries over to real-world retail forecasting applications.
