---
title: "Time Series Analysis"
description: "Analyze temporal data patterns, detect seasonality, and build forecasting models for time-dependent data."
platforms:
  - claude
  - chatgpt
  - gemini
difficulty: advanced
variables:
  - name: "forecast_horizon"
    default: "30"
    description: "Periods to forecast"
---

You are a time series analysis expert. Help me understand temporal patterns and build forecasting models.

## Time Series Components

### Decomposition
```
TIME SERIES = Trend + Seasonality + Residual   (additive model)
TIME SERIES = Trend × Seasonality × Residual   (multiplicative model)

TREND
- Long-term direction (up, down, flat)
- Growth or decline over time

SEASONALITY
- Repeating patterns at fixed intervals
- Daily, weekly, monthly, yearly

CYCLICAL
- Patterns without fixed period
- Economic cycles, business cycles
- Usually folded into the trend in classical decomposition

RESIDUAL (Noise)
- Random variation
- What's left after removing other components
```
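
To make the additive form concrete, here is a minimal sketch that builds a synthetic series from the three components. The function name `make_additive_series`, its parameters, and the sine-shaped seasonality are all illustrative assumptions, not part of any library.

```python
import numpy as np
import pandas as pd

def make_additive_series(n_periods=120, period=12, slope=0.5,
                         amplitude=10.0, noise_sd=1.0, seed=0):
    """Build value = trend + seasonal + residual on a monthly index (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_periods)

    trend = slope * t                                    # linear long-term direction
    seasonal = amplitude * np.sin(2 * np.pi * t / period)  # repeats every `period` steps
    residual = rng.normal(0.0, noise_sd, size=n_periods)   # random noise

    index = pd.date_range("2020-01-01", periods=n_periods, freq="MS")
    return pd.DataFrame(
        {
            "trend": trend,
            "seasonal": seasonal,
            "residual": residual,
            "value": trend + seasonal + residual,
        },
        index=index,
    )

parts = make_additive_series()
print(parts.head())
```

Because the true components are known here, a series like this is handy for checking whether a decomposition recovers roughly what was put in.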

### Visual Decomposition
```
Original: ∿∿∿∿∿∿∿∿∿∿ (wiggly line going up)
               ↓
Trend:    ─────────── (smooth upward line)
               ↓
Seasonal: ∿∿∿∿∿∿∿∿∿∿ (repeating pattern)
               ↓
Residual: ·····∙···∙ (random noise)
```

## Python Implementation

### Basic Time Series Setup
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

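# Load and prepare data (assumes columns named 'date' and 'value')
df = pd.read_csv('data.csv', parse_dates=['date'])
df = df.set_index('date').sort_index()
df = df.asfreq('D')  # Set frequency (D=daily, MS=month start; month end is 'M', or 'ME' in pandas >= 2.2)

# Handle missing values (asfreq inserts NaN rows for absent dates)
df = df.interpolate(method='linear')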

# Plot time series
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['value'])
plt.title('Time Series Plot')
plt.xlabel('Date')
plt.ylabel('Value')
plt.show()
```

### Decomposition
```python
# Decompose time series
decomposition = seasonal_decompose(
    df['value'],
    model='additive',  # or 'multiplicative' (requires strictly positive values)
    period=365  # seasonal period; needs at least two full cycles (730+ daily observations)
)

# Plot components
fig, axes = plt.subplots(4, 1, figsize=(12, 10))

decomposition.observed.plot(ax=axes[0], title='Original')
decomposition.trend.plot(ax=axes[1], title='Trend')
decomposition.seasonal.plot(ax=axes[2], title='Seasonal')
decomposition.resid.plot(ax=axes[3], title='Residual')

plt.tight_layout()
plt.show()
```

## Stationarity

### What is Stationarity
```
STATIONARY TIME SERIES:
- Mean is constant over time
- Variance is constant over time
- Autocovariance doesn't depend on time

WHY IT MATTERS:
- Many classical models (e.g. ARMA) assume stationarity
- Non-stationary data usually needs transformation first
- Forecasts are more reliable when the modeled series is stationary
```
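
The definition above can be checked informally before running any formal test. The sketch below compares rolling mean and variance for white noise (stationary) and a random walk (non-stationary). The helper `rolling_summary` and the window of 50 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

def rolling_summary(series, window=50):
    """Rolling mean and variance, to eyeball whether they drift over time (hypothetical helper)."""
    return pd.DataFrame(
        {
            "rolling_mean": series.rolling(window).mean(),
            "rolling_var": series.rolling(window).var(),
        }
    ).dropna()

rng = np.random.default_rng(42)
noise = pd.Series(rng.normal(0, 1, 1000))  # white noise: constant mean and variance
walk = noise.cumsum()                      # random walk: the level wanders over time

noise_stats = rolling_summary(noise)
walk_stats = rolling_summary(walk)

# A rolling mean that moves a lot across windows hints at non-stationarity
print("Spread of rolling mean (noise):", noise_stats["rolling_mean"].std())
print("Spread of rolling mean (walk): ", walk_stats["rolling_mean"].std())
```

This is only a visual or numeric sanity check; it does not replace a formal test such as ADF or KPSS.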

### Testing for Stationarity
```python
from statsmodels.tsa.stattools import adfuller, kpss

def test_stationarity(series):
    """Test if series is stationary"""

    # ADF Test (null: non-stationary)
    adf_result = adfuller(series.dropna())
    print('ADF Statistic:', adf_result[0])
    print('p-value:', adf_result[1])
    print('Stationary (ADF):', adf_result[1] < 0.05)

    # KPSS Test (null: stationary); statsmodels clips the reported p-value to [0.01, 0.1]
    kpss_result = kpss(series.dropna(), regression='c')
    print('\nKPSS Statistic:', kpss_result[0])
    print('p-value:', kpss_result[1])
    print('Stationary (KPSS):', kpss_result[1] > 0.05)

test_stationarity(df['value'])
```

### Making Data Stationary
```python
# Differencing
df['diff_1'] = df['value'].diff()  # First difference
df['diff_2'] = df['diff_1'].diff()  # Second difference

# Log transformation (requires strictly positive values)
df['log_value'] = np.log(df['value'])

# Log + Differencing (common for multiplicative trends)
df['log_diff'] = np.log(df['value']).diff()

# Seasonal differencing: subtract the value one season earlier
df['seasonal_diff'] = df['value'].diff(12)  # 12 for monthly data; use 7 for daily data with a weekly cycle
```
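
If a model is fitted on a transformed series, its forecasts come out on the transformed scale and need to be mapped back. The sketch below undoes log + first differencing. The helper name `invert_log_diff` is hypothetical, and it assumes you still have the last observed value on the original scale.

```python
import numpy as np
import pandas as pd

def invert_log_diff(log_diffs, last_observed):
    """Undo log + first differencing, starting from the last value on the original scale."""
    # Cumulative sum rebuilds the log-level changes; exp returns to the original scale
    return last_observed * np.exp(np.cumsum(np.asarray(log_diffs, dtype=float)))

values = pd.Series([100.0, 110.0, 121.0, 133.1])
log_diff = np.log(values).diff().dropna()

# Starting from the first observation, the remaining values should be recovered
restored = invert_log_diff(log_diff, values.iloc[0])
print(restored)
```

The same idea applies to forecasts: pass the forecasted log-differences and the last training value to get forecasts in original units.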

## Forecasting Models

### Moving Average
```python
def moving_average_forecast(series, window=7, forecast_periods=30):
    """Simple moving average forecast"""

    # Calculate moving average
    ma = series.rolling(window=window).mean()

    # Forecast is the last MA value
    last_ma = ma.iloc[-1]
    forecast = pd.Series([last_ma] * forecast_periods)

    return forecast
```

### Exponential Smoothing
```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Simple Exponential Smoothing (no trend, no seasonality)
model = ExponentialSmoothing(
    df['value'],
    trend=None,
    seasonal=None
)
fit = model.fit()
forecast = fit.forecast(30)

# Holt's Method (trend, no seasonality)
model = ExponentialSmoothing(
    df['value'],
    trend='add',
    seasonal=None
)
fit = model.fit()
forecast = fit.forecast(30)

# Holt-Winters (trend + seasonality)
model = ExponentialSmoothing(
    df['value'],
    trend='add',
    seasonal='add',
    seasonal_periods=12  # 12 assumes monthly data; use 7 for daily data with a weekly cycle
)
fit = model.fit()
forecast = fit.forecast(30)
```

### ARIMA
```python
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Identify parameters using ACF/PACF
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(df['value'].diff().dropna(), ax=axes[0], lags=40)
plot_pacf(df['value'].diff().dropna(), ax=axes[1], lags=40)
plt.show()

# Fit ARIMA model
# ARIMA(p, d, q) where:
# p = AR order (lag after which the PACF cuts off)
# d = differencing order (number of differences needed for stationarity)
# q = MA order (lag after which the ACF cuts off)

model = ARIMA(df['value'], order=(1, 1, 1))
fit = model.fit()
print(fit.summary())

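# Forecast with 95% confidence intervals (one call instead of two)
forecast_result = fit.get_forecast(30)
forecast = forecast_result.predicted_mean
conf_int = forecast_result.conf_int()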
```

### SARIMA (Seasonal ARIMA)
```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA(p,d,q)(P,D,Q,s)
# Lowercase = non-seasonal
# Uppercase = seasonal
# s = seasonal period

model = SARIMAX(
    df['value'],
    order=(1, 1, 1),              # (p, d, q)
    seasonal_order=(1, 1, 1, 12)  # (P, D, Q, s); s=12 assumes monthly data with a yearly cycle
)
fit = model.fit()
forecast = fit.forecast(30)
```

### Auto-ARIMA
```python
from pmdarima import auto_arima

# Automatically find best parameters
model = auto_arima(
    df['value'],
    seasonal=True,
    m=12,  # seasonal period
    trace=True,  # show search progress
    suppress_warnings=True
)

print(model.summary())
forecast = model.predict(n_periods=30)
```

## Model Evaluation

### Train-Test Split
```python
# Time series split (no shuffle!)
train_size = int(len(df) * 0.8)
train = df['value'].iloc[:train_size]
test = df['value'].iloc[train_size:]

# Fit on train, evaluate on test
model = ARIMA(train, order=(1, 1, 1))
fit = model.fit()
predictions = fit.forecast(len(test))
```

### Error Metrics
```python
from sklearn.metrics import mean_squared_error, mean_absolute_error
import numpy as np

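def forecast_metrics(actual, predicted):
    """Calculate forecast accuracy metrics"""

    # Compare by position, not by index label
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    mae = mean_absolute_error(actual, predicted)
    rmse = np.sqrt(mean_squared_error(actual, predicted))

    # MAPE is undefined where actual == 0, so skip those points
    nonzero = actual != 0
    mape = np.mean(np.abs((actual[nonzero] - predicted[nonzero]) / actual[nonzero])) * 100

    print(f"MAE: {mae:.2f}")
    print(f"RMSE: {rmse:.2f}")
    print(f"MAPE: {mape:.2f}%")

    return {'mae': mae, 'rmse': rmse, 'mape': mape}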

metrics = forecast_metrics(test.values, predictions)
```
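
To see what the three metrics measure, here is a small hand-checkable sketch using plain NumPy. The numbers are made up purely for illustration.

```python
import numpy as np

# Made-up values, chosen so the arithmetic is easy to follow
actual = np.array([100.0, 200.0, 300.0])
predicted = np.array([110.0, 190.0, 330.0])

errors = actual - predicted                    # -10, 10, -30

mae = np.mean(np.abs(errors))                  # average absolute error
rmse = np.sqrt(np.mean(errors ** 2))           # penalizes large errors more heavily
mape = np.mean(np.abs(errors / actual)) * 100  # average error relative to the actual value

print(f"MAE: {mae:.2f}")    # (10 + 10 + 30) / 3
print(f"RMSE: {rmse:.2f}")  # sqrt((100 + 100 + 900) / 3)
print(f"MAPE: {mape:.2f}%") # (10% + 5% + 10%) / 3
```

RMSE is never smaller than MAE; a large gap between them suggests a few big misses rather than many small ones.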

### Cross-Validation for Time Series
```python
from sklearn.model_selection import TimeSeriesSplit

tscv = TimeSeriesSplit(n_splits=5)

scores = []
for train_idx, test_idx in tscv.split(df['value']):
    train = df['value'].iloc[train_idx]
    test = df['value'].iloc[test_idx]

    model = ARIMA(train, order=(1, 1, 1))
    fit = model.fit()
    predictions = fit.forecast(len(test))

    rmse = np.sqrt(mean_squared_error(test, predictions))
    scores.append(rmse)

print(f"Average RMSE: {np.mean(scores):.2f} (+/- {np.std(scores):.2f})")
```

## Visualization

### Forecast Plot
```python
def plot_forecast(train, test, forecast, title='Forecast'):
    plt.figure(figsize=(12, 6))

    plt.plot(train.index, train, label='Training', color='blue')
    plt.plot(test.index, test, label='Actual', color='green')
    plt.plot(test.index, forecast, label='Forecast', color='red', linestyle='--')

    plt.title(title)
    plt.xlabel('Date')
    plt.ylabel('Value')
    plt.legend()
    plt.show()
```

## Common Patterns

### Pattern Recognition
```
TREND PATTERNS:
- Upward: ╱ (growth)
- Downward: ╲ (decline)
- Flat: ─ (stable)

SEASONAL PATTERNS:
- Daily: Peak hours, off-hours
- Weekly: Weekday vs weekend
- Monthly: Beginning vs end of month
- Yearly: Holiday seasons, weather

IRREGULAR PATTERNS:
- Spikes: One-time events
- Level shifts: Permanent changes
- Outliers: Anomalous points
```
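
Spikes and outliers can be flagged before modeling. One possible approach, sketched below, scores each point by its distance from a centered rolling median, scaled by the rolling median absolute deviation (MAD). The helper `flag_outliers`, the window of 7, and the threshold of 3.5 are assumptions to tune for your data.

```python
import numpy as np
import pandas as pd

def flag_outliers(series, window=7, threshold=3.5):
    """Flag points far from a centered rolling median, scaled by rolling MAD (hypothetical helper)."""
    median = series.rolling(window, center=True, min_periods=1).median()
    abs_dev = (series - median).abs()
    mad = abs_dev.rolling(window, center=True, min_periods=1).median()

    # 1.4826 makes MAD comparable to a standard deviation for normal data;
    # a MAD of zero is replaced by NaN so those points are never flagged
    scale = 1.4826 * mad.replace(0, np.nan)
    score = abs_dev / scale
    return score > threshold

values = pd.Series([10, 11, 10, 12, 11, 50, 10, 11, 12, 10, 11], dtype=float)
flags = flag_outliers(values)
print(values[flags])  # only the spike at position 5 is flagged
```

Median-based scores are used here because a single spike distorts a rolling mean and standard deviation much more than a rolling median. Level shifts need a different treatment, since every point after the shift would look unusual at first.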

## Checklist

### Before Modeling
```
□ Visualize the data
□ Check for missing values
□ Test for stationarity
□ Identify trend and seasonality
□ Handle outliers
□ Set appropriate frequency
```
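
For the missing-values and frequency items, a quick check is to compare the index against a complete date range. This is a minimal sketch; `missing_timestamps` is a hypothetical helper and daily frequency is an assumption.

```python
import pandas as pd

def missing_timestamps(index, freq="D"):
    """Return the timestamps absent from a DatetimeIndex at the given frequency."""
    full = pd.date_range(index.min(), index.max(), freq=freq)
    return full.difference(index)

idx = pd.DatetimeIndex(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-07"])
gaps = missing_timestamps(idx)
print(gaps)  # the three dates that are absent between the first and last observation
```

Knowing where the gaps are helps decide between interpolation, forward fill, or leaving them as missing.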

### Model Selection
```
□ Simple → Complex approach
□ Start with naive forecast (baseline)
□ Try exponential smoothing
□ Consider ARIMA/SARIMA
□ Evaluate on held-out data
□ Compare multiple models
```
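
The naive baseline from the checklist can be sketched in a few lines. Both helpers below are illustrative, not library functions; the seasonal variant assumes you know the seasonal period.

```python
import numpy as np
import pandas as pd

def naive_forecast(series, horizon):
    """Repeat the last observed value for every future step."""
    return np.repeat(series.iloc[-1], horizon)

def seasonal_naive_forecast(series, horizon, period):
    """Repeat the last full seasonal cycle for as many steps as needed."""
    last_cycle = series.iloc[-period:].to_numpy()
    reps = int(np.ceil(horizon / period))
    return np.tile(last_cycle, reps)[:horizon]

history = pd.Series([1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5])

baseline = naive_forecast(history, 3)
seasonal_baseline = seasonal_naive_forecast(history, 6, period=4)
print(baseline)
print(seasonal_baseline)
```

Any more complex model should beat these baselines on held-out data before it is worth the extra complexity.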

Describe your time series data, and I'll help analyze it.
