| name | geomaster |
| description | Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, and 7 programming languages (Python, R, Julia, JavaScript, C++, Java, Go) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task. |
| license | MIT License |
| metadata | {"skill-author":"K-Dense Inc."} |
GeoMaster
GeoMaster is a comprehensive geospatial science skill covering the full spectrum of geographic information systems, remote sensing, spatial analysis, and machine learning for Earth observation. This skill provides expert knowledge across 70+ topics with 500+ code examples in 7 programming languages.
Installation
Core Python Geospatial Stack
conda install -c conda-forge gdal rasterio fiona shapely pyproj geopandas
uv pip install geopandas rasterio fiona shapely pyproj
Remote Sensing & Image Processing
uv pip install rsgislib torchgeo eo-learn
uv pip install earthengine-api
GIS Software Integration
conda install -c conda-forge grass
conda install -c conda-forge saga-gis
Machine Learning for Geospatial
uv pip install torch-geometric tensorflow-caney
uv pip install libpysal esda mgwr
uv pip install scikit-learn xgboost lightgbm
Point Cloud & 3D
uv pip install laspy pylas
uv pip install open3d pdal
uv pip install opendm
Network & Routing
uv pip install osmnx networkx
uv pip install osrm pyrouting
Visualization
uv pip install cartopy contextily mapclassify
uv pip install folium ipyleaflet keplergl
uv pip install pydeck pythreejs
Big Data & Cloud
uv pip install dask-geopandas
uv pip install xarray rioxarray
uv pip install pystac-client planetary-computer
Database Support
conda install -c conda-forge postgis
conda install -c conda-forge spatialite
uv pip install geoalchemy2
Additional Programming Languages
Quick Start
Reading Satellite Imagery and Calculating NDVI
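import rasterio
import numpy as np
# Band indices assume a stacked Sentinel-2 GeoTIFF where band 4 is red (B04) and band 8 is NIR (B08)
with rasterio.open('sentinel2.tif') as src:
    # Cast to float before any arithmetic so unsigned integer sums cannot overflow
    red = src.read(4).astype('float32')
    nir = src.read(8).astype('float32')
    profile = src.profile
# Divide only where the denominator is non-zero; other pixels stay at 0
denom = nir + red
ndvi = np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)
profile.update(count=1, dtype=rasterio.float32)
with rasterio.open('ndvi.tif', 'w', **profile) as dst:
    dst.write(ndvi.astype(rasterio.float32), 1)
print(f"NDVI range: {ndvi.min():.3f} to {ndvi.max():.3f}")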
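The NDVI ratio is undefined wherever NIR + red is zero, which is common in nodata borders. The sketch below isolates that step on small in-memory arrays so it can be checked without a GeoTIFF. The helper name `normalized_difference` and the sample values are illustrative assumptions, not part of rasterio or numpy.

```python
import numpy as np

def normalized_difference(a, b, fill=0.0):
    """Return (a - b) / (a + b) as float32, writing `fill` where a + b == 0."""
    # Cast first: subtracting unsigned integers directly would wrap around
    a = np.asarray(a, dtype="float32")
    b = np.asarray(b, dtype="float32")
    denom = a + b
    out = np.full(denom.shape, fill, dtype="float32")
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Made-up digital numbers; the [0, 1] pixel has NIR = red = 0 (nodata)
nir = np.array([[3000, 0], [5000, 1000]], dtype="uint16")
red = np.array([[1000, 0], [5000, 3000]], dtype="uint16")
ndvi = normalized_difference(nir, red)
print(ndvi)
```

The same helper works for any two-band normalized difference, such as NDWI with green and NIR.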
Spatial Analysis with GeoPandas
import geopandas as gpd
zones = gpd.read_file('zones.geojson')
points = gpd.read_file('points.geojson')
if zones.crs != points.crs:
    points = points.to_crs(zones.crs)
joined = gpd.sjoin(points, zones, how='inner', predicate='within')
stats = joined.groupby('zone_id').agg({
    'value': ['count', 'mean', 'std', 'min', 'max']
}).round(2)
print(stats)
Google Earth Engine Time Series
import ee
import pandas as pd
ee.Initialize(project='your-project-id')
roi = ee.Geometry.Point([-122.4, 37.7]).buffer(10000)
s2 = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
      .filterBounds(roi)
      .filterDate('2020-01-01', '2023-12-31')
      .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20)))
def add_ndvi(image):
    ndvi = image.normalizedDifference(['B8', 'B4']).rename('NDVI')
    return image.addBands(ndvi)
s2_ndvi = s2.map(add_ndvi)
def extract_series(image):
    stats = image.reduceRegion(
        reducer=ee.Reducer.mean(),
        geometry=roi.centroid(),
        scale=10,
        maxPixels=1e9
    )
    return ee.Feature(None, {
        'date': image.date().format('YYYY-MM-dd'),
        'ndvi': stats.get('NDVI')
    })
series = s2_ndvi.map(extract_series).getInfo()
df = pd.DataFrame([f['properties'] for f in series['features']])
df['date'] = pd.to_datetime(df['date'])
print(df.head())
Core Concepts
Coordinate Reference Systems (CRS)
Understanding CRS is fundamental to geospatial work:
- Geographic CRS: EPSG:4326 (WGS 84) - uses lat/lon degrees
- Projected CRS: EPSG:3857 (Web Mercator) - uses meters
- UTM Zones: EPSG:326xx (North), EPSG:327xx (South), where xx is the zone number - minimizes distortion within each 6°-wide zone
See coordinate-systems.md for comprehensive CRS reference.
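The EPSG:326xx / EPSG:327xx pattern above can be computed from a longitude and latitude. The sketch below uses only the standard 6°-wide zone rule. The function name `utm_epsg` is an assumption for illustration, and it deliberately ignores the special zones around Norway and Svalbard and the polar regions outside UTM coverage.

```python
import math

def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    if not -180 <= lon <= 180 or not -90 <= lat <= 90:
        raise ValueError("lon/lat out of range")
    # Zones are 6 degrees wide and numbered 1-60 eastward from 180 W;
    # lon == 180 would compute as zone 61, so clamp it to 60
    zone = min(int(math.floor((lon + 180) / 6)) + 1, 60)
    base = 32600 if lat >= 0 else 32700  # northern vs southern hemisphere
    return base + zone

print(utm_epsg(-122.4, 37.7))  # San Francisco: 32610
print(utm_epsg(151.2, -33.9))  # Sydney: 32756
```

The result can be passed to a reprojection call such as GeoPandas `to_crs(epsg=...)` before measuring areas or distances.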
Vector vs Raster Data
Vector Data: Points, lines, polygons with discrete boundaries
- Shapefiles, GeoJSON, GeoPackage, PostGIS
- Best for: administrative boundaries, roads, infrastructure
Raster Data: Grid of cells with continuous values
- GeoTIFF, NetCDF, HDF5, COG
- Best for: satellite imagery, elevation, climate data
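What ties a raster's grid of cells to real-world coordinates is an affine transform. The sketch below uses the six-coefficient ordering (a, b, c, d, e, f) that rasterio's `Affine` objects follow, where x = a*col + b*row + c and y = d*col + e*row + f. The function names and the sample grid are hypothetical, and the inverse lookup assumes an unrotated, north-up raster.

```python
def pixel_to_world(transform, row, col, center=True):
    """Map a (row, col) cell index to (x, y) map coordinates."""
    a, b, c, d, e, f = transform
    offset = 0.5 if center else 0.0  # cell centre vs upper-left corner
    col_f, row_f = col + offset, row + offset
    return a * col_f + b * row_f + c, d * col_f + e * row_f + f

def world_to_pixel(transform, x, y):
    """Return the (row, col) of the cell containing (x, y) on a north-up grid."""
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("rotated grids are not handled in this sketch")
    return int((y - f) // e), int((x - c) // a)

# Hypothetical 10 m grid whose upper-left corner sits at (500000, 4200000)
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4200000.0)
print(pixel_to_world(transform, row=0, col=0))
print(world_to_pixel(transform, 500025.0, 4199985.0))
```

The negative `e` coefficient encodes that row numbers increase southward while y coordinates increase northward.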
Spatial Data Types
| Type | Examples | Libraries |
|---|---|---|
| Vector | Shapefiles, GeoJSON, GeoPackage | GeoPandas, Fiona, GDAL |
| Raster | GeoTIFF, NetCDF, IMG | Rasterio, GDAL, Xarray |
| Point Cloud | LAZ, LAS, PCD | Laspy, PDAL, Open3D |
| Topology | TopoJSON, TopoArchive | TopoJSON, NetworkX |
| Spatiotemporal | Trajectories, Time-series | MovingPandas, PyTorch Geometric |
OGC Standards
Key Open Geospatial Consortium standards:
- WMS: Web Map Service - raster maps
- WFS: Web Feature Service - vector data
- WCS: Web Coverage Service - raster coverage
- WPS: Web Processing Service - geoprocessing
- WMTS: Web Map Tile Service - tiled maps
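As an illustration of how these services are called, a WMS request is an ordinary HTTP GET with key-value parameters. The sketch below assembles a WMS 1.3.0 GetMap URL with the standard library only. The endpoint `https://example.com/wms`, the layer name `landcover`, and the helper name `wms_getmap_url` are placeholders, not a real service.

```python
from urllib.parse import urlencode

def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL; bbox is (minx, miny, maxx, maxy) in `crs`."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint and layer; the bounding box is in Web Mercator metres
url = wms_getmap_url("https://example.com/wms", "landcover",
                     (-13630000, 4540000, -13620000, 4550000), 512, 512)
print(url)
```

One caveat: in WMS 1.3.0 the BBOX axis order follows the CRS definition, so a request in EPSG:4326 expects latitude before longitude. Using EPSG:3857 here sidesteps that.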
Common Operations
Remote Sensing Operations
Spectral Indices Calculation
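import rasterio
import numpy as np
def calculate_indices(image_path, output_path, scale=10000.0):
    """Calculate NDVI, EVI, SAVI, and NDWI from a stacked Sentinel-2 image.
    Assumes bands are stored in Sentinel-2 order (2=blue, 3=green, 4=red, 8=NIR)
    and that dividing digital numbers by `scale` gives 0-1 reflectance.
    """
    with rasterio.open(image_path) as src:
        # The EVI and SAVI constants are defined for reflectance, so rescale first
        blue = src.read(2).astype('float32') / scale
        green = src.read(3).astype('float32') / scale
        red = src.read(4).astype('float32') / scale
        nir = src.read(8).astype('float32') / scale
        profile = src.profile
    # The small epsilon avoids division by zero on nodata pixels
    ndvi = (nir - red) / (nir + red + 1e-8)
    evi = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
    savi = 1.5 * (nir - red) / (nir + red + 0.5)
    ndwi = (green - nir) / (green + nir + 1e-8)
    # Match the output dtype declared in the profile
    indices = np.stack([ndvi, evi, savi, ndwi]).astype('float32')
    profile.update(count=4, dtype=rasterio.float32)
    with rasterio.open(output_path, 'w', **profile) as dst:
        dst.write(indices)
calculate_indices('sentinel2.tif', 'indices.tif')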
Image Classification
from sklearn.ensemble import RandomForestClassifier
import geopandas as gpd
import rasterio
from rasterio.features import rasterize
import numpy as np
def classify_imagery(raster_path, training_gdf, output_path):
    """Train Random Forest classifier and classify imagery."""
    with rasterio.open(raster_path) as src:
        image = src.read()
        profile = src.profile
        transform = src.transform
    X_train, y_train = [], []
    for _, row in training_gdf.iterrows():
        mask = rasterize(
            [(row.geometry, 1)],
            out_shape=(profile['height'], profile['width']),
            transform=transform,
            fill=0,
            dtype=np.uint8
        )
        pixels = image[:, mask > 0].T
        X_train.extend(pixels)
        y_train.extend([row['class_id']] * len(pixels))
    X_train = np.array(X_train)
    y_train = np.array(y_train)
    rf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
    rf.fit(X_train, y_train)
    image_reshaped = image.reshape(image.shape[0], -1).T
    prediction = rf.predict(image_reshaped)
    prediction = prediction.reshape(profile['height'], profile['width'])
    profile.update(dtype=rasterio.uint8, count=1)
    with rasterio.open(output_path, 'w', **profile) as dst:
        dst.write(prediction.astype(rasterio.uint8), 1)
    return rf
Vector Operations
import geopandas as gpd
from shapely.ops import unary_union
gdf['buffer_1km'] = gdf.geometry.to_crs(epsg=32633).buffer(1000)
intersects = gdf[gdf.geometry.intersects(other_geometry)]
contains = gdf[gdf.geometry.contains(point_geometry)]
gdf['centroid'] = gdf.geometry.centroid
gdf['convex_hull'] = gdf.geometry.convex_hull
gdf['simplified'] = gdf.geometry.simplify(tolerance=0.001)
intersection = gpd.overlay(gdf1, gdf2, how='intersection')
union = gpd.overlay(gdf1, gdf2, how='union')
difference = gpd.overlay(gdf1, gdf2, how='difference')
Terrain Analysis
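import rasterio
import numpy as np
def calculate_terrain_metrics(dem_path, azimuth=315, altitude=45):
    """Calculate slope and aspect (degrees) and hillshade (-1 to 1) from a DEM in a projected CRS."""
    with rasterio.open(dem_path) as src:
        dem = src.read(1).astype('float64')
        transform = src.transform
    # Cell size in map units; assumes a north-up raster whose horizontal units match the elevation units
    xres, yres = transform.a, abs(transform.e)
    # np.gradient returns the derivative along rows (southward) first, then along columns (eastward)
    dy, dx = np.gradient(dem, yres, xres)
    slope = np.degrees(np.arctan(np.hypot(dx, dy)))
    # Aspect is the compass bearing of the downslope direction (0 = north, 90 = east)
    aspect = np.degrees(np.arctan2(-dx, dy)) % 360
    azimuth_rad = np.radians(azimuth)
    altitude_rad = np.radians(altitude)
    slope_rad = np.radians(slope)
    # Illumination is highest where the surface faces the sun
    hillshade = (np.sin(altitude_rad) * np.cos(slope_rad) +
                 np.cos(altitude_rad) * np.sin(slope_rad) *
                 np.cos(azimuth_rad - np.radians(aspect)))
    return slope, aspect, hillshade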
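Slope and aspect conventions are easy to get wrong, so it helps to check them on a surface with a known answer. The sketch below is an illustrative test rather than part of any library. It builds a synthetic plane that rises toward the east on a 10 m grid, assumes row 0 is the northern edge, and treats aspect as the compass bearing of the downslope direction. The helper name `slope_aspect` is hypothetical.

```python
import numpy as np

def slope_aspect(dem, cell_size):
    """Slope and downslope aspect in degrees for a north-up grid with square cells."""
    # d_row is the change per map unit going south, d_col going east
    d_row, d_col = np.gradient(dem.astype("float64"), cell_size, cell_size)
    slope = np.degrees(np.arctan(np.hypot(d_col, d_row)))
    aspect = np.degrees(np.arctan2(-d_col, d_row)) % 360
    return slope, aspect

# Synthetic plane: elevation rises 10 m per 10 m cell toward the east
cols = np.arange(5)
dem = np.tile(cols * 10.0, (5, 1))
slope, aspect = slope_aspect(dem, cell_size=10.0)
print(slope[2, 2], aspect[2, 2])
```

A rise of 1 m per metre should give a 45 degree slope. Because the surface drops toward the west, the aspect should be 270.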
Network Analysis
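import osmnx as ox
import networkx as nx
G = ox.graph_from_place('San Francisco, CA', network_type='drive')
# Add 'speed_kph' and 'travel_time' (seconds) attributes to every edge
G = ox.add_edge_speeds(G)
G = ox.add_edge_travel_times(G)
# nearest_nodes takes X (longitude) first, then Y (latitude)
orig_node = ox.distance.nearest_nodes(G, -122.4, 37.7)
dest_node = ox.distance.nearest_nodes(G, -122.3, 37.8)
route = nx.shortest_path(G, orig_node, dest_node, weight='travel_time')
# Accessibility: number of nodes reachable within 5 minutes (300 s) of each node.
# This loops over every node, so it is slow on large graphs.
accessibility = {}
for node in G.nodes():
    subgraph = nx.ego_graph(G, node, radius=300, distance='travel_time')
    accessibility[node] = len(subgraph.nodes())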
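The accessibility count above amounts to a shortest-path search that stops at a travel-time budget. To show what that involves, here is a standard-library Dijkstra with a cutoff on a tiny hand-made graph. The graph, its travel times, and the function name `reachable_within` are invented for illustration, and this is not how OSMnx or NetworkX implement it internally.

```python
import heapq

def reachable_within(graph, source, cutoff):
    """Return {node: travel_time} for nodes reachable from source within cutoff seconds."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, seconds in graph.get(node, {}).items():
            new_cost = cost + seconds
            # Only keep paths that stay inside the time budget
            if new_cost <= cutoff and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best

# Hypothetical directed road graph: node -> {neighbor: travel time in seconds}
roads = {
    "A": {"B": 120, "C": 400},
    "B": {"C": 150, "D": 200},
    "C": {"D": 60},
    "D": {},
}
print(reachable_within(roads, "A", cutoff=300))
```

With a 300 s budget, node C is reached through B (270 s) rather than by the direct 400 s edge, and D stays out of reach.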
Detailed Documentation
Comprehensive reference documentation is organized by topic:
- Core Libraries - GDAL, Rasterio, Fiona, Shapely, PyProj, GeoPandas fundamentals
- Remote Sensing - Satellite missions, optical/SAR/hyperspectral analysis, image processing
- GIS Software - QGIS/PyQGIS, ArcGIS/ArcPy, GRASS, SAGA integration
- Scientific Domains - Marine, atmospheric, hydrology, agriculture, forestry applications
- Advanced GIS - 3D GIS, spatiotemporal analysis, topology, network analysis
- Programming Languages - R, Julia, JavaScript, C++, Java, Go geospatial tools
- Machine Learning - Deep learning for RS, spatial ML, GNNs, XAI for geospatial
- Big Data - Distributed processing, cloud platforms, GPU acceleration
- Industry Applications - Urban planning, disaster management, precision agriculture
- Specialized Topics - Geostatistics, optimization, ethics, best practices
- Data Sources - Satellite data catalogs, open data repositories, API access
- Code Examples - 500+ code examples across 7 programming languages
Common Workflows
End-to-End Land Cover Classification
import rasterio
from rasterio.features import rasterize
import geopandas as gpd
from sklearn.ensemble import RandomForestClassifier
import numpy as np
training = gpd.read_file('training_polygons.gpkg')
with rasterio.open('sentinel2.tif') as src:
    bands = src.read()
    profile = src.profile
    out_shape, transform = src.shape, src.transform
    # Training polygons must share the raster's CRS before they are rasterized
    training = training.to_crs(src.crs)
X, y = [], []
for _, row in training.iterrows():
    # Burn each polygon into a mask aligned with the raster grid
    mask = rasterize([(row.geometry, 1)], out_shape=out_shape,
                     transform=transform, fill=0, dtype='uint8')
    pixels = bands[:, mask > 0].T
    X.extend(pixels)
    y.extend([row['class']] * len(pixels))
model = RandomForestClassifier(n_estimators=100, max_depth=20)
model.fit(np.array(X), np.array(y))
# Predict every pixel, then restore the (rows, cols) layout
pixels_reshaped = bands.reshape(bands.shape[0], -1).T
prediction = model.predict(pixels_reshaped)
classified = prediction.reshape(bands.shape[1], bands.shape[2])
profile.update(dtype=rasterio.uint8, count=1, nodata=255)
with rasterio.open('classified.tif', 'w', **profile) as dst:
    dst.write(classified.astype(rasterio.uint8), 1)
Flood Hazard Mapping Workflow
Time Series Analysis for Vegetation Monitoring
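import ee
import pandas as pd
from scipy import stats
ee.Initialize(project='your-project')
# x and y are placeholders for the longitude and latitude of the site of interest
roi = ee.Geometry.Point([x, y]).buffer(5000)
landsat = (ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
           .filterBounds(roi)
           .filterDate('2015-01-01', '2024-12-31')
           .filter(ee.Filter.lt('CLOUD_COVER', 20)))
def add_ndvi(img):
    # Apply the Collection 2 Level-2 surface reflectance scale and offset before taking the ratio
    sr = img.select(['SR_B5', 'SR_B4']).multiply(0.0000275).add(-0.2)
    ndvi = sr.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI')
    return img.addBands(ndvi)
landsat_ndvi = landsat.map(add_ndvi).select('NDVI')
# getRegion returns a header row followed by one row per pixel per image
ts = landsat_ndvi.getRegion(roi, 30).getInfo()
df = pd.DataFrame(ts[1:], columns=ts[0]).dropna(subset=['NDVI'])
# The 'time' column holds milliseconds since the Unix epoch
df['date'] = pd.to_datetime(df['time'], unit='ms')
# Regress against decimal years so the slope really is NDVI per year
years = df['date'].dt.year + (df['date'].dt.dayofyear - 1) / 365.25
slope, intercept, r_value, p_value, std_err = stats.linregress(
    years, df['NDVI'].astype(float)
)
print(f"Trend: {slope:.6f} NDVI/year (p={p_value:.4f})")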
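A trend reported as "NDVI/year" only has that unit when the regression's x-axis is measured in years. Satellite observations are irregularly spaced, so regressing against the observation index gives a different number. The sketch below demonstrates the difference with the standard library alone. The dates and NDVI values are synthetic, and the helper names `decimal_year` and `ols_slope` are hypothetical.

```python
from datetime import date

def decimal_year(d):
    """Convert a date to a fractional year, e.g. 1 July 2020 -> about 2020.5."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days

def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Synthetic, irregularly spaced observations constructed to gain 0.01 NDVI per year
dates = [date(2015, 3, 1), date(2015, 9, 15), date(2017, 6, 1),
         date(2020, 1, 10), date(2024, 8, 20)]
years = [decimal_year(d) for d in dates]
ndvi = [0.40 + 0.01 * (t - 2015) for t in years]

print(f"slope per year: {ols_slope(years, ndvi):.4f}")
print(f"slope per observation index: {ols_slope(list(range(len(ndvi))), ndvi):.4f}")
```

Only the first slope recovers the constructed rate. The second depends on how the acquisitions happen to be spaced.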
Multi-Criteria Suitability Analysis
import rasterio
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# All criterion rasters are assumed to share one grid (extent, resolution, CRS)
paths = {
    'slope': 'slope.tif',
    'distance_to_water': 'water_dist.tif',
    'soil_quality': 'soil.tif',
    'land_use': 'landuse.tif'
}
criteria = {}
for key, path in paths.items():
    # Context managers close each file once its band has been read
    with rasterio.open(path) as src:
        criteria[key] = src.read(1).astype('float32')
weights = {'slope': 0.3, 'distance_to_water': 0.2,
           'soil_quality': 0.3, 'land_use': 0.2}
normalized = {}
for key, raster in criteria.items():
    scaled = MinMaxScaler().fit_transform(raster.reshape(-1, 1))
    # Lower slope and shorter distance to water are better, so invert those criteria
    if key in ['slope', 'distance_to_water']:
        normalized[key] = 1 - scaled
    else:
        normalized[key] = scaled
suitability = sum(normalized[key] * weights[key] for key in criteria)
suitability = suitability.reshape(criteria['slope'].shape)
with rasterio.open(paths['slope']) as src:
    profile = src.profile
profile.update(dtype=rasterio.float32, count=1)
with rasterio.open('suitability.tif', 'w', **profile) as dst:
    dst.write(suitability.astype(rasterio.float32), 1)
Performance Tips
- Use Spatial Indexing: R-tree indexes let spatial queries test only nearby candidates instead of every geometry, which is often an order of magnitude or more faster on large layers
  gdf.sindex
- Chunk Large Rasters: Process in blocks to avoid memory errors
  with rasterio.open('large.tif') as src:
      for _, window in src.block_windows(1):
          block = src.read(window=window)
- Use Dask for Big Data: Parallel processing on large datasets
  import rioxarray
  raster = rioxarray.open_rasterio('large.tif', chunks={'band': 1, 'x': 1024, 'y': 1024})
- Enable GDAL Caching: Speed up repeated reads
  from osgeo import gdal
  gdal.SetCacheMax(2**30)
- Use Arrow for I/O: Faster file reading/writing (requires the pyogrio engine and pyarrow)
  gdf.to_file('output.gpkg', use_arrow=True)
- Reproject Once: Do all analysis in a single projected CRS
- Use Efficient Formats: Prefer GeoPackage over Shapefile, and Parquet for large datasets
- Simplify Geometries: Reduce complexity when precision isn't critical (the tolerance is in CRS units)
  gdf['geometry'] = gdf.geometry.simplify(tolerance=0.0001)
- Use COG for Cloud: Cloud-Optimized GeoTIFF for remote data
- Enable Parallel Processing: scikit-learn estimators and several other libraries accept n_jobs=-1 to use all cores
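The "Chunk Large Rasters" tip relies on covering the image with tiles that fit in memory. When a file's internal blocks are not a convenient size, the tiles can be generated by hand. The sketch below is standard library only, and the function name `iter_windows` is an assumption for illustration.

```python
def iter_windows(height, width, block=1024):
    """Yield (row_off, col_off, n_rows, n_cols) tiles that cover a height x width grid."""
    for row_off in range(0, height, block):
        for col_off in range(0, width, block):
            # Tiles on the bottom and right edges are clipped to the raster extent
            yield (row_off, col_off,
                   min(block, height - row_off),
                   min(block, width - col_off))

windows = list(iter_windows(2500, 3000, block=1024))
print(len(windows), windows[-1])
```

With rasterio, each tuple could be turned into `rasterio.windows.Window(col_off, row_off, n_cols, n_rows)` and passed to `src.read(window=...)`. Note that `Window` takes the column offset first.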
Best Practices
- Always Check CRS before any spatial operation
  assert gdf1.crs == gdf2.crs, "CRS mismatch!"
- Use Appropriate CRS:
  - Geographic (EPSG:4326) for global data, storage
  - Projected (UTM) for area/distance calculations
  - Web Mercator (EPSG:3857) for web mapping only
- Validate Geometries before operations: repair what can be repaired, then drop anything still invalid
  gdf['geometry'] = gdf.geometry.make_valid()
  gdf = gdf[gdf.is_valid]
- Handle Missing Data appropriately, for example by dropping rows with missing or empty geometries
  gdf = gdf[~(gdf.geometry.isna() | gdf.geometry.is_empty)]
- Document Projections in metadata
- Use Vector Tiles for web maps with many features
- Apply Cloud Masking for optical imagery
- Calibrate Radiometric Values for quantitative analysis
- Preserve Lineage for reproducible research
- Use Appropriate Spatial Resolution for your analysis scale
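The advice to use a projected CRS for distance calculations follows from how degrees behave: one degree of longitude covers less ground the further it is from the equator. This can be shown with the haversine formula and the standard library alone. The sketch assumes a spherical Earth with the mean radius below, so the values are approximate, and `haversine_m` is an illustrative name.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

for lat in (0, 45, 60):
    km = haversine_m(0, lat, 1, lat) / 1000
    print(f"1 degree of longitude at latitude {lat}: about {km:.1f} km")
```

A buffer of "0.01 degrees" therefore means a different ground distance at every latitude, which is why buffers and areas are better computed after reprojecting to UTM or another suitable projected CRS.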