IEEE Conference Submission · 2026

An Explainable CNN-BiLSTM for Forecasting Industrial Air Pollution in Indian Metros.

A hybrid deep-learning framework that pairs 1D convolutional feature extraction with bidirectional recurrent sequence modeling, and decomposes SHAP attributions jointly across seasonal and diurnal axes — turning a black-box AQI forecaster into a policy-grade tool. Evaluated on six years of CPCB hourly data across Delhi, Mumbai, Kolkata, and Chennai.


0.9874

R² on Delhi

13.83

RMSE

23.2%

RMSE↓ from CNN block

08:00

Peak attribution (IST)

4

Indian metros evaluated

6

Pollutants modeled

Overview

What this paper does, in a paragraph.

Industrial air pollution in Indian metropolitan areas remains a serious public health problem, and AQI readings cross hazardous levels in cities like Delhi almost every winter. For policy interventions and citizen advisories to actually work, the underlying forecasts have to be both accurate and interpretable. This paper proposes a CNN-BiLSTM model paired with SHAP GradientExplainer, and decomposes the explainability output across season and hour-of-day to surface actionable intervention windows.

Hybrid Architecture

1D convolution extracts local temporal motifs; a stacked bidirectional LSTM models forward accumulation and backward dissipation; a small fully connected head produces the forecast.

Explainability

SHAP GradientExplainer gives per-timestep, per-feature attribution. Aggregated globally, then stratified seasonally and diurnally.

Multi-City Honesty

Strong on Delhi and Mumbai; clearly weaker on Chennai because the current six-feature input cannot capture sea-breeze meteorology. We say so explicitly.

Architecture

CNN-BiLSTM with post-hoc SHAP attribution.

Three sequential stages, then a separate explainability pass.

Input

24-hour Multi-Pollutant Window

PM₂.₅, PM₁₀, NO₂, CO, SO₂, O₃ · shape (B, 24, 6)

Stage 1 · CNN

1D Conv + BatchNorm + ReLU + MaxPool

64 filters · kernel=3 · dropout=0.2

Stage 2 · BiLSTM

Stacked Bidirectional LSTM

128 → 64 hidden units per direction

Stage 3 · Head

Fully Connected Prediction Head

128 → 64 → 1 · dropout=0.3

Output

Scalar AQI ŷ

Inverse min-max scaled to AQI range

Post-hoc

SHAP GradientExplainer

φᵢ,ₜ · per-feature, per-timestep attribution
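The stage dimensions listed above can be checked with a little shape bookkeeping. The sketch below is a minimal illustration, not the authors' code. The filter count, kernel size, hidden sizes, and head widths come from the cards above; the convolution padding (1), the pooling width (2), and the choice to feed only the last timestep of the second BiLSTM layer into the head are assumptions that the page does not state. All function names are hypothetical.

```python
def conv1d_len(length, kernel, padding=0, stride=1):
    """Output length of a 1D convolution along the time axis."""
    return (length + 2 * padding - kernel) // stride + 1


def pool1d_len(length, pool, stride=None):
    """Output length of 1D max pooling (stride defaults to the pool width)."""
    stride = stride or pool
    return (length - pool) // stride + 1


def trace_shapes(batch=32, window=24, features=6):
    """Return (stage, shape) pairs for one forward pass."""
    shapes = [("input", (batch, window, features))]

    # Stage 1: 64 filters, kernel=3 (from the page); padding=1 is ASSUMED.
    t = conv1d_len(window, kernel=3, padding=1)
    shapes.append(("conv1d+bn+relu", (batch, t, 64)))
    # Pool width 2 is ASSUMED.
    t = pool1d_len(t, pool=2)
    shapes.append(("maxpool", (batch, t, 64)))

    # Stage 2: a bidirectional layer concatenates both directions,
    # so the feature width is twice the per-direction hidden size.
    shapes.append(("bilstm_1", (batch, t, 2 * 128)))
    # ASSUMED: only the last timestep of the second layer feeds the head.
    shapes.append(("bilstm_2_last", (batch, 2 * 64)))

    # Stage 3: 128 -> 64 -> 1 fully connected head.
    shapes.append(("fc_hidden", (batch, 64)))
    shapes.append(("output", (batch, 1)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:16s} {shape}")
```

Under these assumptions the second BiLSTM layer emits 2 × 64 = 128 features, which matches the 128-unit input of the prediction head.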

Results

CNN-BiLSTM beats the five directly comparable baselines on the Delhi test set.

We compare against ARIMA, SVR, Random Forest, LSTM, GRU, and CNN-LSTM. All metrics are computed on the inverse-scaled AQI; deep models share a 15% chronological test split.
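The evaluation protocol described here can be sketched with the standard library alone. The code assumes predictions come out of the network in min-max-scaled units and are mapped back to the AQI range before any metric is computed; the function names, the toy values, and the [0, 500] range used in the usage lines are illustrative assumptions, not the authors' pipeline.

```python
import math


def chronological_split(series, test_frac=0.15):
    """Hold out the final `test_frac` of a time-ordered series (no shuffling)."""
    cut = len(series) - int(round(len(series) * test_frac))
    return series[:cut], series[cut:]


def inverse_min_max(scaled, lo, hi):
    """Map values from the [0, 1] scale back to the original range [lo, hi]."""
    return [lo + s * (hi - lo) for s in scaled]


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (%), and R² on values in original AQI units."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a true value is zero; AQI is assumed positive here.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Toy usage with a hypothetical AQI range of [0, 500].
y_true = inverse_min_max([0.1, 0.2, 0.4, 0.8], lo=0.0, hi=500.0)
y_pred = inverse_min_max([0.1, 0.3, 0.4, 0.7], lo=0.0, hi=500.0)
print(regression_metrics(y_true, y_pred))
```

Computing the metrics after the inverse transform keeps MAE and RMSE in AQI units, which is what makes the rows of the table comparable with one another.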

RMSE comparison (lower is better)

Walk-forward ARIMA shown for completeness; not directly comparable to chronological deep-learning evaluation.

Model	MAE	RMSE	MAPE (%)	R²
ARIMA (2,1,2)†	2.27	7.25	2.08	0.9336
SVR (RBF)	13.71	18.72	7.80	0.9770
Random Forest	13.74	17.80	9.52	0.9792
LSTM	14.41	18.27	9.29	0.9780
GRU	14.72	18.53	9.28	0.9774
CNN-LSTM	10.89	14.19	6.71	0.9868
CNN-BiLSTM (Ours)	9.74	13.83	5.99	0.9874

† ARIMA evaluated via walk-forward one-step-ahead validation on a 30-day subset, feeding the most recent observed value before each prediction.
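Walk-forward one-step-ahead validation, as described in the footnote, lets the forecaster see every observation up to hour t before it predicts hour t+1. The sketch below shows only the evaluation loop. A naive persistence forecaster stands in for ARIMA so that the example runs without third-party packages; the function names and the toy series are hypothetical.

```python
import math


def persistence_forecast(history):
    """Stand-in model: predict that the next value equals the last observed one."""
    return history[-1]


def walk_forward(series, n_test, forecast=persistence_forecast):
    """One-step-ahead walk-forward evaluation over the last `n_test` points.

    `n_test` must be at least 1 and smaller than len(series).
    """
    history = list(series[:-n_test])
    preds, actuals = [], []
    for actual in series[-n_test:]:
        preds.append(forecast(history))  # forecaster sees observed values only
        actuals.append(actual)
        history.append(actual)           # reveal the true value before the next step
    mse = sum((p - a) ** 2 for p, a in zip(preds, actuals)) / n_test
    return preds, math.sqrt(mse)


preds, rmse = walk_forward([100, 110, 120, 130, 125, 135], n_test=3)
print(preds, round(rmse, 2))
```

Swapping `persistence_forecast` for a function that fits a real ARIMA model on `history` would reproduce the protocol more closely. The footnote's figures also come from a 30-day subset rather than the shared 15% test split, which is why the ARIMA row is listed for completeness rather than as a like-for-like comparison.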

Multi-City Generalization

Strong on Delhi and Mumbai. Weaker on Chennai — and we say why.

Chennai's R² of 0.68 is not a failure to hide. The current six-feature input doesn't capture sea-breeze meteorology, which is a major driver of air quality on the Tamil Nadu coast. Adding wind direction and a sea-breeze index is a natural next step.

R² across cities

Climate diversity sharpens the model's limits.

RMSE across cities

Mumbai's low absolute error reflects its lower AQI variance.

City	MAE	RMSE	MAPE (%)	R²
Delhi	9.74	13.83	5.99	0.9874
Mumbai	5.13	6.28	7.89	0.9388
Kolkata	8.87	10.86	16.40	0.8993
Chennai	15.47	21.69	18.91	0.6808
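The remark about Mumbai's variance can be checked from the table itself. If R² is computed in the usual way as 1 − MSE / Var(y) on each city's test set (an assumption; the page does not give the formula), then Var(y) = RMSE² / (1 − R²), and the implied standard deviation of the test-set AQI follows directly. The dictionary below copies the reported RMSE and R² values; the function name is hypothetical.

```python
import math

# city: (RMSE, R²) as reported in the multi-city table
REPORTED = {
    "Delhi": (13.83, 0.9874),
    "Mumbai": (6.28, 0.9388),
    "Kolkata": (10.86, 0.8993),
    "Chennai": (21.69, 0.6808),
}


def implied_std(rmse, r2):
    """Solve R² = 1 - RMSE² / Var(y) for sqrt(Var(y))."""
    return rmse / math.sqrt(1.0 - r2)


for city, (rmse, r2) in REPORTED.items():
    print(f"{city:8s} implied AQI std ~ {implied_std(rmse, r2):6.1f}")
```

Under that assumption Delhi's test-set AQI has an implied standard deviation of roughly 123, against roughly 25 for Mumbai. That is consistent with Mumbai posting a lower RMSE than Delhi despite a lower R².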

SHAP Explainability

Why explainability is the actual contribution.

A high-R² forecaster is not the new thing. The new thing is decomposing SHAP attributions jointly across season and hour-of-day, so a policymaker can read off when each pollutant matters most. Three findings stand out.

Global

PM₂.₅ dominates

φ̄ = 0.0033 globally, followed by PM₁₀ at 0.0025. Consistent with PM₂.₅'s heavy weight in the CPCB sub-index formula.

Seasonal

PM₁₀ overtakes during monsoon

Rainfall preferentially scavenges fine PM₂.₅, leaving the coarse fraction dominant. Counterintuitive at first glance, clean once you know the chemistry.

Diurnal

08:00 IST is the intervention window

Both PM fractions peak sharply at 08:00 IST, coinciding with the morning traffic rush. NO₂ and O₃ peak between 10:00 and 14:00 IST, in line with the photochemical cycle.

Global feature importance

Mean |φ| aggregated over the Delhi test set.

Seasonal attribution shift

Notice the PM₂.₅ → PM₁₀ inversion during monsoon.

Diurnal attribution by pollutant

Hour-of-day SHAP attribution. The morning rush window is shaded in the paper's figure.
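The global, seasonal, and diurnal views in this section are all averages of |φ| over different groupings of the same attributions. The sketch below assumes the attributions have already been computed and stored as one timesteps × features matrix per test window, paired with the timestamp of the hour being forecast. The four-season month boundaries, the choice to group by forecast hour, the averaging over timesteps, and every function and variable name are assumptions made for illustration; the paper may define these differently.

```python
from collections import defaultdict
from datetime import datetime

FEATURES = ["PM2.5", "PM10", "NO2", "CO", "SO2", "O3"]


def season_of(month):
    """ASSUMED four-season split for India; the paper may use other boundaries."""
    if month in (1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "pre-monsoon"
    if month in (6, 7, 8, 9):
        return "monsoon"
    return "post-monsoon"


def feature_importance(phi):
    """Collapse one (timesteps x features) matrix to mean |phi| per feature."""
    steps = len(phi)
    return [sum(abs(row[j]) for row in phi) / steps for j in range(len(FEATURES))]


def stratified_importance(samples, key):
    """Average per-feature importance within each group returned by `key`."""
    sums = defaultdict(lambda: [0.0] * len(FEATURES))
    counts = defaultdict(int)
    for timestamp, phi in samples:
        group = key(timestamp)
        for j, value in enumerate(feature_importance(phi)):
            sums[group][j] += value
        counts[group] += 1
    return {
        group: {f: total / counts[group] for f, total in zip(FEATURES, totals)}
        for group, totals in sums.items()
    }


# Toy usage: two hypothetical windows of two timesteps each.
samples = [
    (datetime(2023, 7, 1, 8), [[1.0, -2.0, 0.0, 0.0, 0.0, 0.0]] * 2),
    (datetime(2023, 1, 1, 8), [[3.0, 1.0, 0.0, 0.0, 0.0, 0.0]] * 2),
]
global_imp = stratified_importance(samples, key=lambda ts: "all")
seasonal_imp = stratified_importance(samples, key=lambda ts: season_of(ts.month))
diurnal_imp = stratified_importance(samples, key=lambda ts: ts.hour)
print(seasonal_imp["monsoon"]["PM10"], seasonal_imp["monsoon"]["PM2.5"])
```

Because the three views differ only in the `key` function, a joint season-by-hour table needs nothing more than `key=lambda ts: (season_of(ts.month), ts.hour)`.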

Ablation Study

What each component is actually doing.

Component-level ablation on the Delhi test set. Removing the CNN block hurts the most (+4.18 RMSE), which suggests that local convolutional feature extraction, not recurrence alone, is doing the heavy lifting.

ΔRMSE when component removed

Higher bar = component matters more.

Configuration	MAE	RMSE	R²	ΔRMSE
CNN-BiLSTM (Full)	9.74	13.83	0.9874	—
w/o Bidirection	10.85	14.00	0.9871	+0.17
w/o BatchNorm	12.38	16.22	0.9827	+2.39
w/o CNN	13.85	18.01	0.9787	+4.18
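The ΔRMSE column, and what appears to be the source of the headline 23.2% figure, can be reproduced from the RMSE column alone. The sketch below assumes the percentage is the RMSE reduction relative to the configuration without the CNN block, that is (18.01 − 13.83) / 18.01; the page does not spell out the denominator, and the function name is hypothetical.

```python
# RMSE values copied from the ablation table
ABLATION_RMSE = {
    "CNN-BiLSTM (Full)": 13.83,
    "w/o Bidirection": 14.00,
    "w/o BatchNorm": 16.22,
    "w/o CNN": 18.01,
}


def ablation_deltas(rmse_by_config, full_key="CNN-BiLSTM (Full)"):
    """Return {config: (delta_rmse, percent_reduction_from_restoring_component)}."""
    full = rmse_by_config[full_key]
    deltas = {}
    for name, rmse in rmse_by_config.items():
        if name == full_key:
            continue
        delta = rmse - full
        # ASSUMED denominator: the RMSE of the ablated configuration.
        deltas[name] = (round(delta, 2), round(100.0 * delta / rmse, 1))
    return deltas


for name, (delta, pct) in ablation_deltas(ABLATION_RMSE).items():
    print(f"{name:16s} dRMSE = +{delta:.2f}   reduction = {pct:.1f}%")
```

With that denominator the CNN row gives 4.18 / 18.01 ≈ 23.2%, matching the figure quoted at the top of the page.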

Authors

Akash Nath

Dept. of Computer Science & Engineering
Assam University, Silchar, India

akashnath.aus@gmail.com

Pragyat Jyoti Baruah

Dept. of Computer Science & Engineering
Assam University, Silchar, India

Arnab Paul

Dept. of Computer Science & Engineering
Assam University, Silchar, India

Arun Jyoti Nath

Ecology and Environmental Science
Assam University, Silchar, India

Tirthanka Borah

Dept. of Computer Science & Engineering
Assam University, Silchar, India

Kamalesh Debnath

Department of Management Studies
NIT Silchar, Assam, India

Citation

If you build on this work, please cite us.

BibTeX entry below.

nath2026aqi.bib

@inproceedings{nath2026aqi,
  title     = {An Explainable Deep Learning Architecture for Forecasting
               Industrial Atmospheric Pollutants of Indian Metropolitan Cities},
  author    = {Nath, Akash and Baruah, Pragyat Jyoti and Paul, Arnab and
               Nath, Arun Jyoti and Borah, Tirthanka and Debnath, Kamalesh},
  booktitle = {Proceedings},
  year      = {2026}
}