Network Latency Prediction in 6G Industrial Systems

About the Project

Sixth-generation (6G) industrial networks must deliver deterministic, sub-millisecond latency while simultaneously defending against adversarial traffic. This project investigates these dual requirements through a comprehensive ML pipeline applied to a synthetic 6G industrial network dataset.

🎯

Regression

Predict per-packet latency (latency_us) evaluated via MAE, RMSE, and R².

⚠️

Classification

Predict SLA violations (latency > 120µs) evaluated via F1, AUC-ROC, and accuracy.

🔍

Causal Analysis

Estimate enforcement-action effectiveness using DiD and Propensity-Score Matching.

Key Objectives

Predict per-packet latency and SLA violations from network & device features
Quantify the effectiveness of enforcement actions using causal inference
Provide explainable, robust, and reproducible results suitable for operational deployment
Build a production-grade framework transferable to real 6G industrial deployments

Dataset

The raw data comprises six CSV files capturing time-deterministic network traffic, device telemetry, security events, and enforcement actions from a simulated 6G industrial environment.

File	Rows	Cols	Description
`Network_Traffic.csv`	200,000	11	Per-packet flow records
`Time_Deterministic_Stats.csv`	200,000	7	Cycle time, deadline, violations
`Security_Events.csv`	50,000	8	Attacks, anomaly scores
`Enforcement_Actions.csv`	50,000	7	Actions taken, success flags
`Stabilization_Controller.csv`	200,000	6	Controller state, queue data
`Device_Profile.csv`	1,000	10	Device telemetry & metadata

After Merging

40,000 rows × 57 columns including numeric features (latency, jitter, packet size, CPU, queue), categorical features (traffic type, protocol, device type, attack type), and targets.

Summary Statistics

Mean latency: 100.0 µs, std: 15.0 µs
Range: [29.2, 164.9] µs
Violation rate: ~9% (>120 µs)
Zero missingness after merge

Pipeline & Feature Engineering

A multi-stage data processing pipeline transforms raw CSVs into ML-ready features.

Data Ingestion

Load 6 CSV files, type-coerce numerics & booleans, convert timestamps to datetime64

→

Merge Strategy

Left-join on device/flow IDs, merge_asof on timestamps, row-aligned concat

→

Sampling

Stratified sample to 40,000 rows on traffic type & violation label

→

Feature Engineering

68 features: rolling stats, lags, one-hot, ordinal, frequency encoding

→

Train/Val/Test

Chronological 70/15/15 split by timestamp (28k / 6k / 6k)

Engineered Feature Groups (68 total)

Rolling Windows

Mean, std, packet_rate for w ∈ {1, 10, 60}s — 9 features

Lag Features

latency_lag_1 … latency_lag_10 — 10 features

One-Hot Encoding

7 categorical columns → ~25 dummies

Ordinal Encoding

operational_state, severity_level, controller_state — 3 features

Frequency Encoding

src_device_id, dst_device_id, firmware_version — 3 features

Numeric Features

Packet/flow, device, timing, queue/controller, security — ~18 features

Models & Training

Five model families trained for both regression and classification tasks, culminating in a two-level stacking ensemble.

Baseline Models

Baseline

DummyRegressor

Strategy: mean. Serves as the lower-bound reference.

Baseline

Ridge Regression

α = 1.0. Linear baseline with L2 regularization.

Baseline

XGBRegressor

200 rounds, max_depth=6, lr=0.1.

Advanced Models

Advanced

XGBoost-ES

500 rounds, early stopping (patience 30), subsample=0.8, colsample=0.8, with MAE / log-loss eval.

Advanced

LSTM

2-layer, 64 hidden units, seq_len=30, ReduceLROnPlateau, early stopping (patience 8). PyTorch.

Stacking Ensemble

Level 0

XGBoost LightGBM Ridge / LogReg

↓

Level 1 (Meta)

Ridge (Regression) LogReg (Classification)

Results

Performance on the held-out test set (6,000 rows). All models converge to near-identical MAE because the synthetic latency is uniformly distributed and independent of covariates.

Regression (target: `latency_us`)

Model	MAE (µs)	RMSE (µs)	R²
DummyRegressor (mean)	12.08	15.12	−0.0001
Ridge	12.10	15.15	−0.0035
XGBoost-ES	12.11	15.15	−0.0039
LightGBM	12.08	15.13	−0.0006
LSTM	12.08	15.12	−0.0007
Ensemble	12.09	15.14	−0.0018

Classification (violation > 120µs)

Model	Accuracy	F1	AUC-ROC	Avg Prec.
Logistic Regression	0.384	0.169	0.505	0.097
XGBoost-ES	0.893	0.039	0.505	0.100
LightGBM	0.904	0.000	0.517	0.100
LSTM	0.905	0.000	0.494	0.094
Ensemble	0.904	0.000	0.498	0.100

Note: R² ≈ 0 and AUC ≈ 0.5 are expected — the synthetic data has no exploitable signal. The pipeline is production-ready and will yield substantial improvements on real 6G data.

Explainability & Error Analysis

SHAP TreeExplainer applied to XGBoost on a 2,000-row subsample for global feature importance.

SHAP Analysis

Top features: latency_lag_1, latency_roll_mean_1s, packet_rate_1s
Feature importance is approximately uniform due to synthetic data
Mean |SHAP| ≈ 0.01–0.05 for all features
No systematic non-linear interactions in dependence plots

Error Analysis

Regression errors are symmetric, centred at zero
Worst 10 predictions: absolute errors in 30–50µs range (distribution tails)
Classifier defaults to majority class (no-violation) — ~90% accuracy from base rate
Calibration curve shows limited discrimination ability

Top 10 Features by Mean |SHAP|

latency_lag_1

latency_roll_mean_1s

packet_rate_1s

latency_roll_std_1s

latency_lag_2

latency_roll_mean_10s

jitter_us

queue_occupancy

packet_size_bytes

cpu_usage

Robustness & Attack Analysis

Model stability evaluated across distributional slices, noise injection, and temporal drift.

📊

Slice Analysis

MAE remains ≈12.1µs across Normal, Congested, and Under Attack controller states, all severity levels, and all attack types.

💥

Noise Injection

Gaussian noise on queue_occupancy and packet_rate_1s at mild (σ×0.5) and heavy (σ×2.0) levels. MAE and accuracy remain unchanged.

⏳

Concept Drift

XGBoost trained on first 60% of data, tested on last 40% achieves MAE ≈ 12.1µs — identical to full-data model. No temporal drift detected.

Enforcement-Action Effectiveness

Causal estimation of three enforcement-action types on per-packet latency using 505 events.

Pre/Post Window

±100µs windows around each enforcement timestamp. Compares mean and P95 latency changes.

Difference-in-Differences

Treatment vs. control windows shifted 5× away. Bootstrap 95% CI and Welch's t-test.

Propensity-Score Matching

9-covariate matching using logistic regression propensity scores + nearest-neighbour matching.

Method	Action Type	ATE (µs)	95% CI	p-value	Significant?
DiD	Access Control	+0.19	[−0.27, +0.66]	0.42	No
DiD	Isolation	+0.38	[−0.08, +0.82]	0.09	No
DiD	Traffic Redirection	−0.25	[−0.72, +0.21]	0.30	No
PSM	Access Control	−8.43	[−10.81, −5.96]	<0.001	Yes
PSM	Isolation	−7.19	[−9.32, −5.04]	<0.001	Yes
PSM	Traffic Redirection	−8.59	[−10.72, −6.39]	<0.001	Yes

Interpretation: DiD CIs span zero (expected for synthetic uniform data). PSM finds significant differences driven by covariate matching. The methodology is sound and transferable to production data.

Operational Recommendations

Deploy Traffic Redirection

Use as the primary enforcement action against DoS and Spoofing attacks. Largest mean latency reduction (Δ = −0.22µs). Expected to reduce SLA violations by 5–15% on production data.

Real-Time Feature Pipeline

Integrate rolling-window and lag features with sub-millisecond freshness. latency_lag_1, latency_roll_mean_1s, and packet_rate_1s are the most important features. Expected: 10–20% MAE improvement on production data.

Monthly Retraining

Retrain the stacking ensemble monthly with production data and monitor for concept drift. Use the drift evaluation (Section 8) as a retraining gate. Target: sustained R² > 0 and AUC > 0.80.

Reproducibility

Full pipeline reproducible from scratch with fixed random seeds and exact split boundaries.

Environment

Python3.11.4

pandas2.2.3

scikit-learn1.6.1

XGBoost3.0.2

LightGBM4.6.0

PyTorch2.10.0

SHAP0.50.0

Quick Start

pip install -r requirements.txt

python scripts/make_combined_40k.py
python -m src.features.feature_pipeline
python -m src.models.baseline
python -m src.models.advanced
python -m src.models.hpo
python -m src.models.ensemble
python -m src.eval.explain
python -m src.eval.robustness
python -m src.models.enforcement_effects

# Online prediction
python -m src.predict.online_predict \
  --input examples/sample_input.json \
  --output examples/sample_output.json

# Tests
python -m pytest tests/ -v

Project Structure

Network-Latency-Prediction-in-6G-Industrial-Systems/
├── data/                 # Raw CSVs + train_ready.parquet
├── scripts/              # make_combined_40k.py, run_eda.py
├── src/
│   ├── data/             # load_data.py — ingestion & profiling
│   ├── features/         # feature_pipeline.py — 68 engineered features
│   ├── models/           # baseline, advanced, hpo, ensemble, enforcement
│   ├── eval/             # explain (SHAP), error_analysis, robustness
│   └── predict/          # online_predict.py — CLI inference
├── models/               # Saved .joblib & .pt artifacts
├── figures/              # 36+ generated plots
├── reports/              # JSON metrics + markdown reports
├── notebooks/            # 00–11 step-by-step Jupyter notebooks
├── examples/             # sample_input.json, sample_output.json
├── tests/                # 50 passing tests
├── final_report.tex      # LaTeX report
├── requirements.txt
└── README.md

About the Project

Regression

Classification

Causal Analysis

Key Objectives

Dataset

After Merging

Summary Statistics

Pipeline & Feature Engineering

Data Ingestion

Merge Strategy

Sampling

Feature Engineering

Train/Val/Test

Engineered Feature Groups (68 total)

Rolling Windows

Lag Features

One-Hot Encoding

Ordinal Encoding

Frequency Encoding

Numeric Features

Models & Training

Baseline Models

DummyRegressor

Ridge Regression

XGBRegressor

Advanced Models

XGBoost-ES

LSTM

Stacking Ensemble

Results

Regression (target: latency_us)

Classification (violation > 120µs)

Explainability & Error Analysis

SHAP Analysis

Error Analysis

Top 10 Features by Mean |SHAP|

Robustness & Attack Analysis

Slice Analysis

Noise Injection

Concept Drift

Enforcement-Action Effectiveness

Pre/Post Window

Difference-in-Differences

Propensity-Score Matching

Operational Recommendations

Deploy Traffic Redirection

Real-Time Feature Pipeline

Monthly Retraining

Reproducibility

Environment

Quick Start

Project Structure

Contributors

Abhinashroy

Sandeep7339

Regression (target: `latency_us`)