GeoParquet

Overview

GeoParquet is a specification for storing geospatial vector data in the Parquet file format. It adds geospatial data types and metadata, enabling efficient storage and analysis of spatial data in big data environments. GeoParquet stores geometries as Well-Known Binary (WKB) and records spatial metadata as JSON under the "geo" key in the file footer.

Key Concepts

  • Geometry column: stores WKB-encoded geometries
  • Spatial metadata: includes the CRS, geometry types, and bounding box
  • Column metadata: describes each geometry column in the Parquet schema
  • Encoding: WKB is used for geometry serialization
  • Bounding box: stored in metadata so readers can filter spatially
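
To make the WKB encoding concrete, here is a minimal sketch that serializes a single 2D point by hand using only the Python standard library. The helper names (encode_point_wkb, decode_point_wkb) are hypothetical and exist only for illustration; real pipelines would normally rely on a geometry library rather than packing bytes manually.

```python
import struct

WKB_LITTLE_ENDIAN = 1  # WKB byte-order flag: 1 means little-endian
WKB_POINT = 1          # WKB geometry type code for a 2D Point


def encode_point_wkb(x: float, y: float) -> bytes:
    """Serialize a 2D point as little-endian WKB (illustrative helper)."""
    # Layout: 1-byte order flag, 4-byte geometry type, two 8-byte doubles.
    return struct.pack("<BIdd", WKB_LITTLE_ENDIAN, WKB_POINT, x, y)


def decode_point_wkb(data: bytes) -> tuple:
    """Parse a little-endian WKB point back into (x, y)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", data)
    if byte_order != WKB_LITTLE_ENDIAN or geom_type != WKB_POINT:
        raise ValueError("this sketch only handles little-endian 2D points")
    return (x, y)


wkb = encode_point_wkb(-122.4, 37.8)
print(len(wkb))               # 21 bytes: 1 + 4 + 8 + 8
print(decode_point_wkb(wkb))  # (-122.4, 37.8)
```

Each value in a geometry column is a byte string of this kind; polygons and multi-part geometries follow the same pattern with nested counts and coordinate arrays.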

Metadata

{
  "version": "1.0.0",
  "primary_column": "geometry",
  "columns": {
    "geometry": {
      "encoding": "WKB",
      "geometry_types": ["Polygon", "MultiPolygon"],
      "crs": {...},
      "bbox": [-180, -90, 180, 90]
    }
  }
}
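
As a rough illustration of how the bounding box supports spatial filtering, the sketch below parses a metadata document shaped like the one above and checks whether a file could contain features inside a query window. The function names (bbox_intersects, file_may_match) are hypothetical. The "crs" entry is left out because the sample elides it, and the sketch assumes boxes that do not cross the antimeridian.

```python
import json

# Metadata shaped like the sample above; "crs" is omitted because the
# sample elides its contents.
geo_json = """
{
  "version": "1.0.0",
  "primary_column": "geometry",
  "columns": {
    "geometry": {
      "encoding": "WKB",
      "geometry_types": ["Polygon", "MultiPolygon"],
      "bbox": [-180, -90, 180, 90]
    }
  }
}
"""


def bbox_intersects(a, b):
    """Return True if two [minx, miny, maxx, maxy] boxes overlap."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (a_minx <= b_maxx and b_minx <= a_maxx
            and a_miny <= b_maxy and b_miny <= a_maxy)


def file_may_match(geo_metadata, query_bbox):
    """Decide from metadata alone whether a file is worth reading."""
    column = geo_metadata["columns"][geo_metadata["primary_column"]]
    bbox = column.get("bbox")
    if bbox is None:
        return True  # no bbox recorded, so the file cannot be ruled out
    return bbox_intersects(bbox, query_bbox)


geo_metadata = json.loads(geo_json)
print(file_may_match(geo_metadata, [10.0, 45.0, 11.0, 46.0]))  # True
```

A reader that applies this check before opening row groups can skip whole files whose extent lies outside the area of interest.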

Benefits over Shapefile/GeoJSON

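  • No 2 GB file size limit (a Shapefile constraint)
  • Columnar access for analytics: queries read only the columns they need
  • Better compression through Parquet's per-column encoding and compression
  • Type preservation: column types are stored in the schema rather than inferred from text (GeoJSON) or constrained by DBF field types (Shapefile)
  • Cloud-native access patterns: readers can fetch the footer and selected row groups with range requests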

Appendix

Created: 2025-12-13 | Modified: 2025-12-13
