pyogrio - bulk-oriented spatial vector file I/O using GDAL/OGR
Pyogrio provides fast, bulk-oriented read and write access to GDAL/OGR vector data sources, such as ESRI Shapefile, GeoPackage, GeoJSON, and several others. Vector data sources typically have geometries, such as points, lines, or polygons, and associated records with potentially many columns worth of data.
The typical use is to read or write these data sources to/from GeoPandas GeoDataFrames. Because the geometry column is optional, reading or writing only non-spatial data is also possible. Hence, GeoPackage attribute tables, DBF files, or CSV files are also supported.
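As a minimal sketch of this typical use, assuming `pyogrio`, `geopandas`, and `shapely` are installed, the following writes a small GeoDataFrame to a GeoPackage and reads it back, once with geometry and once as attributes only. The helper name `roundtrip_points` and the file name `points.gpkg` are illustrative and not part of Pyogrio.

```python
from pathlib import Path


def roundtrip_points(directory):
    """Write two points to a GeoPackage, then read them back two ways.

    Hypothetical helper for illustration; not part of pyogrio.
    """
    # Imported inside the function so the geospatial stack is only
    # required when the helper is actually called.
    import geopandas as gpd
    from shapely.geometry import Point
    from pyogrio import read_dataframe, write_dataframe

    path = str(Path(directory) / "points.gpkg")  # illustrative file name

    gdf = gpd.GeoDataFrame(
        {"name": ["a", "b"], "value": [1, 2]},
        geometry=[Point(0, 0), Point(1, 1)],
        crs="EPSG:4326",
    )
    # The driver is inferred from the ".gpkg" extension.
    write_dataframe(gdf, path)

    # Spatial read: a GeoDataFrame with the requested column plus geometry.
    spatial = read_dataframe(path, columns=["name"])

    # Attribute-only read: a plain pandas DataFrame without a geometry column.
    attributes = read_dataframe(path, read_geometry=False)

    return spatial, attributes
```

Calling `roundtrip_points` with a writable directory returns the two frames. The second call shows the non-spatial case described above, where the geometry column is simply skipped.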
Pyogrio is fast because it uses pre-compiled bindings for GDAL/OGR to read and write the data records in bulk. This approach avoids multiple steps of converting to and from Python data types within Python, so performance becomes primarily limited by the underlying I/O speed of data source drivers in GDAL/OGR.
We have seen speedups of more than 5-10x when reading files and more than 5-20x when writing files, compared to row-by-row approaches (e.g. Fiona).
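To make the bulk-oriented idea more concrete, the sketch below uses the lower-level `pyogrio.raw.read` function, which returns whole arrays for a layer rather than one Python object per record. It assumes `pyogrio` is installed and that `path` points to an existing vector file. The helper name `summarize_layer` is hypothetical, and the exact contents of the returned metadata may differ between Pyogrio versions.

```python
def summarize_layer(path):
    """Read a layer in bulk and summarize the arrays that come back.

    Hypothetical helper for illustration; not part of pyogrio.
    """
    # Imported lazily so pyogrio is only required when the helper is called.
    from pyogrio.raw import read

    # One call returns layer metadata, feature IDs (may be None),
    # an array of geometries, and one array per attribute column.
    meta, fids, geometry, field_data = read(str(path))

    return {
        "fields": list(meta["fields"]),
        "n_features": len(geometry),
        "column_lengths": [len(column) for column in field_data],
    }
```

Each attribute column arrives as a single array covering all features, which is the kind of bulk transfer the speedups above refer to. A row-by-row reader would instead build one Python record per feature.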
Contents
- About
- Concepts and Terminology
- Supported vector formats
- Installation
- Introduction to Pyogrio
- Display GDAL version
- List available drivers
- List available layers
- Read basic information about a data layer
- Read a data layer into a GeoPandas GeoDataFrame
- Read a subset of columns
- Read a subset of features
- Filter records by attribute value
- Filter records by spatial extent
- Filter records by a geometry
- Execute a SQL query
- Force geometries to be read as 2D geometries
- Read without geometry into a Pandas DataFrame
- Read feature bounds
- Write a GeoPandas GeoDataFrame
- Appending to an existing data source
- Reading from compressed files / archives
- Reading from remote filesystems
- Reading and writing DateTimes
- Dataset and layer creation options
- Reading from and writing to in-memory datasets
- Configuration options
- API reference
- Error handling
- Limitations and Known Issues