| name | gnnwr-spatial-analysis |
| description | Use when analyzing spatial or spatiotemporal data with geographic non-stationarity, building GNNWR/GTNNWR models, generating spatial coefficient maps, or interpreting geographically varying regression results. Triggers on keywords like spatial regression, GWR, GNNWR, spatial non-stationarity, geographic weighting, coefficient mapping, PM2.5 spatial modeling, land price spatial analysis. |
| license | MIT |
| metadata | {"author":"SteadfastAsArt","version":"1.0.0"} |
GNNWR Spatial Intelligent Analysis
Spatial/spatiotemporal regression with GNNWR (Geographically Neural Network Weighted Regression). Produces publication-ready coefficient maps and diagnostic reports.
Quick Reference
from gnnwr import models, datasets, utils
import pandas as pd
Data → Model → Results (Minimal)
data = pd.read_csv("data.csv")
train, val, test = datasets.init_dataset(
    data=data, test_ratio=0.2, valid_ratio=0.1,
    x_column=["x1", "x2", "x3"], y_column=["y"],
    spatial_column=["lon", "lat"],
    batch_size=32, process_fn="minmax_scale"
)
model = models.GNNWR(train, val, test, use_gpu=True, optimizer="Adam", start_lr=0.01)
model.run(max_epoch=200, early_stop=30)
result = model.reg_result(only_return=True)
print(model.result())
Spatiotemporal (GTNNWR)
train, val, test = datasets.init_dataset(
    data=data, ...,
    spatial_column=["lon", "lat"],
    temp_column=["year", "month"],
    use_model="gtnnwr"
)
model = models.GTNNWR(train, val, test, use_gpu=True)
Large-Scale (N > 10k) — KNN Mode
train, val, test = datasets.init_dataset(
    data=data, ..., knn_k=500
)
API Essentials
init_dataset Key Parameters
| Parameter | Default | Notes |
|---|---|---|
| knn_k | None | KNN sparse distance; None=full matrix |
| process_fn | "minmax_scale" | or "standard_scale" |
| spatial_fun | BasicDistance | Euclidean; or ManhattanDistance |
| Reference | None | "train", "train_val", or custom DataFrame |
| sample_seed | 42 | Reproducibility |
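To see why knn_k matters at scale, the back-of-the-envelope sketch below compares the storage needed for a full distance matrix with a KNN-sparse one. It is only an estimate: the helper `distance_matrix_bytes` is hypothetical (not part of gnnwr), and it assumes one float32 value per stored distance, which may not match how the library actually stores distances.

```python
def distance_matrix_bytes(n_samples, n_reference=None, knn_k=None, bytes_per_value=4):
    """Rough size in bytes of the spatial distance input.

    Hypothetical helper: assumes one float32 (4 bytes) per stored distance.
    The real library may use another dtype or store extra index arrays.
    """
    if n_reference is None:
        n_reference = n_samples
    # With knn_k set, each sample keeps only its k nearest reference points.
    per_row = n_reference if knn_k is None else min(knn_k, n_reference)
    return n_samples * per_row * bytes_per_value


full = distance_matrix_bytes(50_000)
sparse = distance_matrix_bytes(50_000, knn_k=500)
print(f"full matrix: {full / 1e9:.1f} GB")    # 50_000 * 50_000 * 4 bytes = 10.0 GB
print(f"knn_k=500:   {sparse / 1e9:.1f} GB")  # 50_000 * 500 * 4 bytes = 0.1 GB
```

Under these assumptions the full matrix grows quadratically with N, while the KNN variant grows linearly.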
Model Hyperparameters
| Parameter | Recommended | Notes |
|---|---|---|
| optimizer | "Adam" | Also: SGD, AdamW, Adagrad, RMSprop |
| start_lr | 0.01–0.1 | Critical tuning point |
| drop_out | 0.2 | 0.0–0.5 |
| dense_layers | None (auto) | Auto: power-of-2 sequence from input_dim to n_coef |
| early_stop | 20–50 | Patience; -1=disabled |
| batch_norm | True | Stabilizes training |
| use_ols | True | OLS-initialized output layer |
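As an illustration of what a "power-of-2 sequence from input_dim to n_coef" could look like, here is one plausible reading. The function `power_of_two_layers` is a hypothetical stand-in, not the library's code, and the exact rule in gnnwr (for example, which endpoints are included) may differ.

```python
def power_of_two_layers(input_dim, n_coef):
    """Hidden-layer widths: descending powers of two, from the largest
    power of two <= input_dim down to the smallest one still > n_coef.

    Illustrative only; the rule used by the library may differ.
    """
    layers = []
    size = 1 << (input_dim.bit_length() - 1)  # largest power of two <= input_dim
    while size > n_coef:
        layers.append(size)
        size //= 2
    return layers


print(power_of_two_layers(1000, 4))  # [512, 256, 128, 64, 32, 16, 8]
```

Passing an explicit list to dense_layers overrides the automatic choice, which can help when the default network is too deep for a small dataset.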
Diagnostics (DIAGNOSIS)
diag = model._test_diagnosis
diag.R2()
diag.RMSE()
diag.AIC()
diag.AICc()
diag.F1_Global()
diag.F2_Global()
diag.F3_Local()
With lite=True (enabled automatically when N > 10k), only R² and RMSE are computed; hat-matrix diagnostics are skipped.
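For reference, R² and RMSE can be checked by hand from observed and predicted values. The sketch below uses the standard textbook definitions. The helper `r2_and_rmse` is hypothetical, and the library's own diagnostics may differ in detail (for example, in how values are denormalized first).

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R², RMSE) using the standard definitions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


r2, rmse = r2_and_rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(f"R2={r2:.2f}, RMSE={rmse:.2f}")  # R2=0.80, RMSE=0.50
```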
Visualization Patterns
1. Folium Interactive Maps (built-in)
viz = utils.Visualize(model, lon_lat_columns=["lon", "lat"], zoom=5)
m1 = viz.display_dataset(name="all", y_column="y")
m1.save("dataset_map.html")
for col in [c for c in result.columns if c.startswith("coef_")]:
    m = viz.coefs_heatmap(data_column=col, steps=20)
    m.save(f"map_{col}.html")
m3 = viz.dot_map(result, "lon", "lat", "denormalized_pred_result", zoom=5)
2. Matplotlib Static Maps (publication-ready)
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np
result = model.reg_result(only_return=True)
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
coef_cols = [c for c in result.columns if c.startswith("coef_")]
for ax, col in zip(axes.flat, coef_cols):
    sc = ax.scatter(
        result["lon"], result["lat"],
        c=result[col], cmap="RdYlBu_r", s=5, alpha=0.8,
        vmin=result[col].quantile(0.02), vmax=result[col].quantile(0.98)
    )
    ax.set_title(col.replace("coef_", "β_"), fontsize=14)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    plt.colorbar(sc, ax=ax, shrink=0.8)
plt.suptitle("Spatially Varying Coefficients (GNNWR)", fontsize=16)
plt.tight_layout()
plt.savefig("coefficients_map.png", dpi=300, bbox_inches="tight")
3. Residual Spatial Distribution
y_column = "y"  # target column name passed to init_dataset (a string, not a list)
result["residual"] = result["denormalized_pred_result"] - result[y_column]
fig, ax = plt.subplots(figsize=(10, 8))
sc = ax.scatter(
    result["lon"], result["lat"],
    c=result["residual"], cmap="coolwarm", s=5,
    vmin=-result["residual"].abs().quantile(0.95),
    vmax=result["residual"].abs().quantile(0.95)
)
ax.set_title("Spatial Residual Distribution")
plt.colorbar(sc, ax=ax, label="Residual")
plt.savefig("residuals_map.png", dpi=300, bbox_inches="tight")
4. Prediction vs Observed Scatter
fig, ax = plt.subplots(figsize=(8, 8))
y_column = "y"  # target column name passed to init_dataset (a string, not a list)
ax.scatter(result[y_column], result["denormalized_pred_result"], s=3, alpha=0.5)
lim = [result[y_column].min(), result[y_column].max()]
ax.plot(lim, lim, "r--", linewidth=2, label="1:1 line")
ax.set_xlabel("Observed"); ax.set_ylabel("Predicted")
ax.set_title(f"GNNWR: R²={model._test_diagnosis.R2().item():.4f}")
ax.legend()
plt.savefig("pred_vs_obs.png", dpi=300, bbox_inches="tight")
5. GeoPandas + Contextily (with basemap)
import geopandas as gpd
import contextily as ctx
gdf = gpd.GeoDataFrame(result, geometry=gpd.points_from_xy(result.lon, result.lat), crs="EPSG:4326")
gdf_web = gdf.to_crs(epsg=3857)
fig, ax = plt.subplots(figsize=(12, 10))
gdf_web.plot(column="coef_x1", ax=ax, cmap="RdYlBu_r", legend=True,
             markersize=5, alpha=0.7, legend_kwds={"shrink": 0.6})
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Positron)
ax.set_title("β_x1 Spatial Variation")
ax.set_axis_off()
plt.savefig("coef_basemap.png", dpi=300, bbox_inches="tight")
Workflow Checklist
- EDA: Check spatial distribution, feature correlations, OLS baseline
- Data split: init_dataset with appropriate ratios and sample_seed=42
- Train: Start with defaults, tune start_lr and early_stop
- Diagnose: R², RMSE, F1 (GNNWR vs OLS), F2 (spatial weight significance)
- Visualize: Coefficient maps (spatial non-stationarity), residual maps (model adequacy), pred vs obs
- Interpret: Where do coefficients vary most? Which variables show strongest non-stationarity? (F3_Local)
- Report: Model summary table + coefficient maps + diagnostic statistics
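For the Interpret step, a quick descriptive ranking of coefficient spread can show which variables to map first. The sketch below ranks coefficient columns by interquartile range. The function `rank_by_spread` and the toy values are hypothetical, and this ranking is only a descriptive aid, not a substitute for the F3_Local test.

```python
import statistics


def rank_by_spread(coef_columns):
    """Sort coefficient columns by interquartile range, widest first.

    coef_columns maps a column name to a sequence of local coefficients.
    """
    spreads = {}
    for name, values in coef_columns.items():
        q1, _, q3 = statistics.quantiles(values, n=4)
        spreads[name] = q3 - q1
    return sorted(spreads.items(), key=lambda item: item[1], reverse=True)


# Toy values for illustration only (not real model output).
toy = {
    "coef_x1": [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.10],
    "coef_x2": [-0.8, -0.2, 0.3, 0.9, 1.4, 0.1, -0.5],
}
for name, iqr in rank_by_spread(toy):
    print(f"{name}: IQR={iqr:.3f}")
```

With a pandas result table, the input could be built as `{c: result[c].tolist() for c in coef_cols}`.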
Common Pitfalls
- Forgot spatial_column: Model degenerates to global regression
- N > 10k without knn_k: OOM on distance matrix; use knn_k=500–2000
- start_lr too high: Loss explodes; start with 0.01
- No early_stop: Overfitting; always set early_stop=20–50
- Interpreting normalized coefficients: reg_result() returns denormalized predictions, but the coefficients remain on the normalized scale
- GTNNWR without temp_column: Silently falls back to GNNWR behavior
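To make the normalized-coefficient pitfall concrete, the sketch below converts a local slope from the min-max-scaled space back to original units. It assumes both the feature and the target were min-max scaled, and that the local model is linear in the scaled features. The helper `rescale_minmax_coef` is hypothetical, so check the scalers your dataset actually used before relying on it.

```python
def rescale_minmax_coef(coef_norm, x_min, x_max, y_min, y_max):
    """Convert a slope from min-max-scaled space to original units.

    With x' = (x - x_min) / (x_max - x_min) and
         y' = (y - y_min) / (y_max - y_min),
    a slope b in y' = b * x' + ... becomes b * (y_max - y_min) / (x_max - x_min)
    in original units.
    """
    return coef_norm * (y_max - y_min) / (x_max - x_min)


# A normalized slope of 0.5 with x in [0, 200] and y in [10, 110]
# corresponds to 0.5 * 100 / 200 = 0.25 y-units per x-unit.
print(rescale_minmax_coef(0.5, x_min=0.0, x_max=200.0, y_min=10.0, y_max=110.0))  # 0.25
```

Applying the same factor to a whole coefficient column keeps its spatial pattern unchanged and only rescales the color bar.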