Project Overview
Applied the complete Box-Jenkins workflow to a real monthly commodity price series: stationarity testing, model identification, parameter estimation, diagnostic checking, and 8-month forecasting. The project demonstrates end-to-end time series analysis as practiced in quantitative finance and macro research roles.
📋Problem Statement
Determine whether a monthly commodity price series follows a stationary process, identify the most appropriate ARMA structure, and produce a defensible 8-month ahead forecast with calibrated uncertainty bands.
🎯Analytical Approach
Followed the Box-Jenkins four-stage methodology: (1) stationarity assessment using the ADF test with BIC-selected lag order; (2) model identification via the ACF decay pattern and PACF cutoff; (3) AR(1) estimation by maximum likelihood with arima(), whose default method (CSS-ML) uses conditional sum of squares only to obtain starting values; (4) adequacy validation through the residual ACF and the Ljung-Box portmanteau test.
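The identification, estimation, and diagnostic stages can be sketched in base R roughly as follows. This is a minimal illustration on simulated data, not the project's series: the seed, the simulated mean of 19.7, and the AR coefficient of 0.66 are assumptions chosen only to resemble the reported fit. Stage 1 is left out here because base R has no ADF test.

```r
# Sketch of Box-Jenkins stages 2-4 on SIMULATED data (not the project's series).
set.seed(42)  # arbitrary seed, for reproducibility only

# Simulate 98 monthly points from an AR(1) around a mean of 19.7.
# The values 19.7 and 0.66 are illustrative assumptions.
sim_values <- 19.7 + as.numeric(arima.sim(model = list(ar = 0.66), n = 98))
z_sim <- ts(sim_values, start = c(2015, 1), frequency = 12)

# Stage 2: identification. For an AR(1), the ACF should decay gradually
# and the PACF should be small beyond lag 1.
acf_vals  <- acf(z_sim,  lag.max = 12, plot = FALSE)
pacf_vals <- pacf(z_sim, lag.max = 12, plot = FALSE)

# Stage 3: estimation. arima() uses method = "CSS-ML" unless told otherwise.
fit_sim <- arima(z_sim, order = c(1, 0, 0))

# Stage 4: diagnostics. Ljung-Box test on the residuals with 10 lags.
lb_sim <- Box.test(residuals(fit_sim), lag = 10, type = "Ljung-Box")

print(coef(fit_sim))
print(lb_sim)
```

A large Ljung-Box p-value here only indicates no detected autocorrelation in the residuals up to lag 10; it says nothing about breaks or seasonality.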
💾Data & Variables
98 monthly observations of a commodity price variable z_t, spanning January 2015 to February 2023. Data were imported from Excel using read_excel(), then converted to a ts() object with monthly frequency (frequency = 12). BIC-based lag selection in the ADF regression chose 1 lag, which is consistent with a parsimonious model structure.
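A minimal sketch of the ts() conversion, assuming the price column has already been read into a numeric vector. The vector `prices` below is a placeholder, and the file name in the commented-out import line is hypothetical.

```r
# In the project the values came from Excel, for example:
# raw <- readxl::read_excel("prices.xlsx")   # hypothetical file name
# Placeholder values stand in for the real price column here.
prices <- seq_len(98)

# Monthly series starting in January 2015.
z <- ts(prices, start = c(2015, 1), frequency = 12)

start(z)   # first period: 2015, month 1
end(z)     # last period:  2023, month 2
```

Checking end(z) against the expected final month is a cheap way to catch a wrong start date or a miscounted row.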
🔧Methods & Tests
ADF unit root test (urca::ur.df, drift specification): τ = −3.804 < critical value −2.89 at 5% → the unit-root null is rejected, so the series is treated as stationary. ACF shows gradual decay; PACF cuts off sharply at lag 1 → AR(1) selected. Model estimated via arima(order = c(1,0,0)). Because arima() reports the process mean μ under the label 'intercept', the true AR intercept is recovered as φ₀ = μ(1 − φ₁) = 6.7165.
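The test and the intercept conversion might look like the sketch below, run on simulated data rather than the project's series. The maximum lag of 12 passed to ur.df is an assumption (the write-up only states that BIC selected 1 lag), and the ADF step is skipped if the urca package is not installed.

```r
# Simulated stand-in for the price series (illustrative parameters, not the real data).
set.seed(1)
y <- 19.7 + as.numeric(arima.sim(model = list(ar = 0.66), n = 98))

# ADF test with drift; lag order chosen by BIC up to an ASSUMED maximum of 12.
if (requireNamespace("urca", quietly = TRUE)) {
  adf  <- urca::ur.df(y, type = "drift", lags = 12, selectlags = "BIC")
  tau  <- adf@teststat[1, "tau2"]      # test statistic for the unit-root null
  crit <- adf@cval["tau2", "5pct"]     # 5% critical value
  cat("tau =", tau, " 5% critical value =", crit, "\n")
}

# Fit the AR(1). The coefficient named "intercept" is the process mean mu.
fit      <- arima(y, order = c(1, 0, 0))
mu_hat   <- unname(coef(fit)["intercept"])
phi1_hat <- unname(coef(fit)["ar1"])

# Convert the mean to the AR intercept: phi0 = mu * (1 - phi1).
phi0_hat <- mu_hat * (1 - phi1_hat)
cat("mu =", mu_hat, " phi1 =", phi1_hat, " phi0 =", phi0_hat, "\n")
```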
✨Key Results
AR(1) model: z_t = 6.7165 + 0.6588 × z_{t−1} + u_t. Ljung-Box test: χ²(10) = 3.51, p = 0.9668, so there is no evidence of residual autocorrelation, which is consistent with model adequacy. The 8-month point forecasts converge toward the long-run mean μ = φ₀/(1 − φ₁) ≈ 19.68: March 2023 = 20.20, April = 20.02, declining to 19.71 by October 2023. Interval forecasts widen with the horizon, as the forecast-error variance grows toward the unconditional variance of the process.
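The forecast path can be checked by hand from the reported coefficients. The sketch below starts from the reported March 2023 forecast and applies the AR(1) recursion. Because it uses rounded inputs, it is an arithmetic check rather than a reproduction of the original predict() output.

```r
# Reported AR(1) estimates.
phi0 <- 6.7165
phi1 <- 0.6588

# Start from the reported March 2023 forecast and iterate
# z_hat[h] = phi0 + phi1 * z_hat[h - 1] for the remaining 7 months.
fc <- numeric(8)
fc[1] <- 20.20
for (h in 2:8) {
  fc[h] <- phi0 + phi1 * fc[h - 1]
}
names(fc) <- month.abb[3:10]   # Mar ... Oct

# Long-run mean implied by the coefficients.
mu <- phi0 / (1 - phi1)

print(round(fc, 2))
print(round(mu, 3))
```

The recursion gives 20.02 for April and 19.71 for October, matching the reported values, and each step moves the forecast closer to the implied mean.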
🧠Key Learnings
AR(1) processes exhibit mean-reversion: the gap between the forecast and μ shrinks by a factor of φ₁ at each step, so forecasts converge geometrically to μ. The distinction between R's reported 'intercept' (= μ) and the true AR intercept (φ₀ = μ(1 − φ₁)) is a common source of error in practitioner models. Residual whiteness is a necessary but not sufficient condition for model validity; structural breaks and seasonality warrant separate checks.
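The mean-reversion and interval-widening statements follow from a short derivation. It is written here in the notation above, with σ_u² denoting the variance of u_t (a symbol introduced for this sketch) and assuming |φ₁| < 1.

```latex
% Point forecast: substitute \phi_0 = \mu(1 - \phi_1) into the recursion
\hat{z}_{T+h} = \phi_0 + \phi_1 \hat{z}_{T+h-1}
\;\Longrightarrow\;
\hat{z}_{T+h} - \mu = \phi_1\left(\hat{z}_{T+h-1} - \mu\right)
                    = \phi_1^{\,h}\left(z_T - \mu\right)

% Forecast-error variance: sum of the squared moving-average weights
\operatorname{Var}\!\left(z_{T+h} - \hat{z}_{T+h}\right)
  = \sigma_u^2 \sum_{j=0}^{h-1} \phi_1^{\,2j}
  = \sigma_u^2 \,\frac{1 - \phi_1^{\,2h}}{1 - \phi_1^{\,2}}
  \;\xrightarrow[h \to \infty]{}\; \frac{\sigma_u^2}{1 - \phi_1^{\,2}}
```

The first line shows why the forecast gap decays by φ₁ per step; the second shows why the bands widen but level off at the unconditional variance instead of growing without bound.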