Assets: read, metadata, VRT, download, GeoParquet
The asset-level surface of pyramids.stac: open a single asset, read its
extension metadata without touching the file, mosaic an asset across items into
a lazy VRT, download assets locally, and round-trip Items through GeoParquet.
- Read one asset: load_asset dispatches by media type (COG/GeoTIFF → Dataset, NetCDF/Zarr → NetCDF, GRIB → open_grib, JPEG2000 → Dataset); which_engine previews the reader without opening; resolved_href returns the (optionally signed) href without opening.
- Extension metadata: read_extension_metadata turns a STAC Item's proj / raster / eo fields into a grid + band-metadata dict (CRS, geotransform, shape, nodata/scale/offset, band names) without opening the asset, the way stackstac / odc-stac / rio-tiler do.
- VRT mosaic: build_vrt_from_stac stitches one asset across many items into a lazy GDAL VRT read on demand via /vsicurl/.
- Download: download_item copies assets to local files (optional stac-asset, shipped in the [stac] extra).
- GeoParquet: to_geoparquet / from_geoparquet serialize an ItemCollection to a single columnar file and back (optional pyarrow, the [parquet] extra).
Reading assets
pyramids.stac._loader
Open a STAC asset as a pyramids Dataset / NetCDF, dispatched by type.
Takes a STAC Item + asset_key (or an Asset directly), resolves the
asset href, and opens it with the right GDAL-backed reader chosen by the asset's
media_type (with the href extension as a fallback):
| media_type / extension | reader |
|---|---|
| image/tiff... / .tif .tiff | :meth:Dataset.read_file |
| image/jp2 / .jp2 .jpx | :meth:Dataset.read_file |
| application/x-netcdf / .nc .nc4 .cdf | :meth:NetCDF.read_file |
| application/wmo-grib2 / .grib2 .grb | :func:pyramids.grib.open_grib |
| application/vnd+zarr / .zarr | :meth:NetCDF.read_file (GDAL Zarr) |
Everything is duck-typed — pyramids does not import or depend on pystac; the
Item / Asset contract is read via getattr + dict lookup (pystac.Asset has
.href / .media_type; raw STAC JSON uses {"href":..., "type":...}). Assets
resolve to pyramids' GDAL-backed wrappers.
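The dispatch rule and the duck-typed Item / Asset contract described above can be sketched in plain Python. This is an illustrative sketch, not pyramids' implementation: `read_field` and `pick_reader` are hypothetical names, the returned labels are only the reader names from the table written as strings, and matching the table's `image/tiff...` entry by prefix is an assumption.

```python
from os.path import splitext
from urllib.parse import urlparse

# Media-type prefixes and extensions copied from the table above.
# The labels are plain strings for illustration, not callables.
_BY_MEDIA_TYPE = [
    ("image/tiff", "Dataset.read_file"),
    ("image/jp2", "Dataset.read_file"),
    ("application/x-netcdf", "NetCDF.read_file"),
    ("application/wmo-grib2", "pyramids.grib.open_grib"),
    ("application/vnd+zarr", "NetCDF.read_file"),
]
_BY_EXTENSION = {
    ".tif": "Dataset.read_file",
    ".tiff": "Dataset.read_file",
    ".jp2": "Dataset.read_file",
    ".jpx": "Dataset.read_file",
    ".nc": "NetCDF.read_file",
    ".nc4": "NetCDF.read_file",
    ".cdf": "NetCDF.read_file",
    ".grib2": "pyramids.grib.open_grib",
    ".grb": "pyramids.grib.open_grib",
    ".zarr": "NetCDF.read_file",
}


def read_field(asset, attr_name, json_key=None):
    """Read a field from raw STAC JSON (dict key) or a pystac-like object (attribute)."""
    if isinstance(asset, dict):
        return asset.get(json_key or attr_name)
    return getattr(asset, attr_name, None)


def pick_reader(asset):
    """Choose a reader label by media type, falling back to the href extension."""
    media_type = (read_field(asset, "media_type", json_key="type") or "").lower()
    for prefix, reader in _BY_MEDIA_TYPE:
        if media_type.startswith(prefix):
            return reader
    href = read_field(asset, "href") or ""
    # Use the URL path only, so a query string does not hide the extension.
    extension = splitext(urlparse(href).path)[1].lower()
    if extension in _BY_EXTENSION:
        return _BY_EXTENSION[extension]
    raise ValueError(f"no supported reader for type={media_type!r}, href={href!r}")
```

Because the lookup goes through `read_field`, the same call works for a raw `{"href": ..., "type": ...}` dict and for any object exposing `.href` / `.media_type`.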
which_engine(item_or_asset, asset_key=None)
Return the reader name :func:load_asset would use, without opening.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| item_or_asset | Any | A STAC Item or Asset (pystac object or raw dict). | required |
| asset_key | str \| None | Asset name when passing an Item; | None |
Returns:
| Type | Description |
|---|---|
| str | One of |
Examples:
- A COG asset dispatches to the GDAL reader:
- A GRIB2 asset (recognised by extension when type is absent):
- An Item + asset key resolves the named asset:
Source code in src/pyramids/stac/_loader.py
resolved_href(item_or_asset, asset_key=None, *, signer=None)
Return an asset's resolved (optionally signed) href without opening it.
The read-free companion to :func:load_asset: it resolves the asset href
and, when a signer is given, applies signer.sign_href — but never opens
the asset. Useful for building a VRT over many assets
(:func:pyramids.stac.build_vrt_from_stac), pre-flighting URLs, or
debugging what load_asset would open.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| item_or_asset | Any | A STAC Item (pystac object or raw dict) or an Asset. | required |
| asset_key | str \| None | Asset name when passing an Item; | None |
| signer | Any | Optional signer; when given, its | None |
Returns:
| Type | Description |
|---|---|
| str | The resolved asset href, signed when a |
Raises:
| Type | Description |
|---|---|
| StacAssetError | The asset is missing or has no href (subclasses :class: |
Examples:
- Resolve a plain asset href:
- Resolve an item's asset and sign it with a simple signer:
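The resolve-then-sign behaviour can be sketched without pyramids installed. The sketch below is an assumption-laden stand-in, not the library code: `QueryTokenSigner` and `resolve_href_sketch` are hypothetical names, it handles raw dicts only, it assumes `sign_href` takes an href string and returns a string, and it raises a plain `KeyError` where the real function raises `StacAssetError`.

```python
class QueryTokenSigner:
    """Hypothetical signer that appends a token as a query parameter."""

    def __init__(self, token):
        self.token = token

    def sign_href(self, href):
        separator = "&" if "?" in href else "?"
        return f"{href}{separator}token={self.token}"


def resolve_href_sketch(item_or_asset, asset_key=None, *, signer=None):
    """Resolve an asset href from raw STAC JSON and optionally sign it; nothing is opened."""
    asset = item_or_asset
    if asset_key is not None:
        assets = item_or_asset.get("assets", {})
        if asset_key not in assets:
            raise KeyError(f"item has no asset {asset_key!r}")
        asset = assets[asset_key]
    href = asset.get("href")
    if not href:
        raise KeyError("asset has no href")
    return signer.sign_href(href) if signer is not None else href


item = {"assets": {"visual": {"href": "https://example.com/visual.tif"}}}
plain = resolve_href_sketch(item, "visual")
signed = resolve_href_sketch(item, "visual", signer=QueryTokenSigner("abc"))
```

Any object with a `sign_href` method fits this shape, which is what makes the signer argument duck-typed.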
Source code in src/pyramids/stac/_loader.py
load_asset(item_or_asset, asset_key=None, *, signer=None, vsi=None)
Open a STAC asset as a pyramids Dataset / NetCDF.
Resolves the asset href, optionally rewrites it through a signer
(a :class:~pyramids.stac.signers.Signer), then opens it with the
GDAL-backed reader chosen by media_type / extension. When a signer is
given, both of its hooks are applied: signer.sign_href rewrites the
href, and signer.gdal_env is installed as GDAL config for the duration
of the open (via :class:~pyramids.base.remote.CloudConfig), so the
underlying VSI handle is created with the right credentials / requester-pays
knobs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item_or_asset
|
Any
|
A STAC Item (pystac.Item or raw dict) or an Asset. |
required |
asset_key
|
str | None
|
Asset name when passing an Item; |
None
|
signer
|
Any
|
Optional signer. |
None
|
vsi
|
str | None
|
Optional explicit archive kind forwarded to the reader (e.g. a
GeoTIFF/GRIB inside a |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
A |
Dataset
|
class: |
Dataset
|
class: |
|
Dataset
|
NetCDF / Zarr / GRIB assets. |
Raises:
| Type | Description |
|---|---|
KeyError
|
The asset is missing or has no href. |
ValueError
|
The asset's type/extension matches no supported reader. |
Examples:
- Open a COG asset from a STAC Item (requires network access):
- Sign the href with an MPC/CDSE-style bearer signer before opening (the token is installed as a GDAL Authorization header for the open):
- Read a Requester-Pays bucket: the signer's gdal_env opts into AWS_REQUEST_PAYER=requester for the duration of the open:
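"For the duration of the open" describes a scoped setting: apply the signer's options, perform the open, then restore the previous state. A minimal sketch of that pattern follows, using `os.environ` as a stand-in because GDAL also reads its configuration options from environment variables. pyramids itself goes through CloudConfig, whose internals are not shown on this page, and `scoped_env` is a hypothetical helper.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_env(options):
    """Set the given variables for the body of the with-block, then restore them."""
    previous = {key: os.environ.get(key) for key in options}
    os.environ.update(options)
    try:
        yield
    finally:
        for key, old_value in previous.items():
            if old_value is None:
                os.environ.pop(key, None)  # was unset before: unset it again
            else:
                os.environ[key] = old_value  # was set before: put the old value back


with scoped_env({"AWS_REQUEST_PAYER": "requester"}):
    # The asset open would happen here, with the option visible to GDAL.
    inside = os.environ["AWS_REQUEST_PAYER"]
```

Restoring in `finally` means a failed open does not leak the requester-pays setting into later, unrelated reads.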
Source code in src/pyramids/stac/_loader.py
Extension metadata (proj / raster / eo)
pyramids.stac._extensions
Read STAC proj / raster / eo extension metadata (PB-1).
This module builds a cube/grid skeleton — CRS, geotransform, shape, per-band
nodata / scale / offset, band names — directly from the STAC Item JSON, without
opening any asset: pure dict reads (via :func:pyramids.stac._item.asset_field,
no pystac
dependency) that yield a grid/band-metadata dict downstream code can use to build
a VRT (PB-5), a multi-asset cube (PB-2), or a grid match (PC-2) without a
header open.
Scope note: these are readers only. They deliberately do not stamp the
metadata onto a :class:~pyramids.dataset.Dataset returned by
:func:pyramids.stac.load_asset, because that reader opens assets read-only
(remote /vsicurl COGs cannot be opened for write), and mutating a read-only
GDAL handle (SetProjection / SetNoDataValue / SetScale) raises under
gdal.UseExceptions(). Writable consumers (VRT/stack builders) apply the
metadata themselves from the dict this module returns.
parse_number(value, default=None)
Coerce a STAC numeric field to a float, honouring nan/inf strings.
The raster extension allows non-finite nodata values to be encoded as
the strings "nan" / "inf" / "-inf".
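Python's built-in `float` already accepts the strings "nan", "inf" and "-inf", so the coercion can be sketched in a few lines. This is a minimal sketch under that assumption, not the library's code; `parse_number_sketch` is a hypothetical name.

```python
import math


def parse_number_sketch(value, default=None):
    """Coerce a STAC numeric field to float, returning `default` when it cannot be parsed."""
    try:
        # float() handles ints, floats, numeric strings, and "nan" / "inf" / "-inf".
        return float(value)
    except (TypeError, ValueError):
        # TypeError covers None and other non-numeric types; ValueError covers bad strings.
        return default


nodata_values = [parse_number_sketch(v) for v in (0, "-9999", "nan", "-inf", None)]
```

Note that a NaN result never compares equal to itself, so downstream code should test it with `math.isnan` rather than `==`.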
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | Any | The raw field value (number, numeric string, nan/inf string, or | required |
| default | Any | Returned when | None |
Returns:
| Type | Description |
|---|---|
| Any | A float for numeric / nan-inf inputs, otherwise |
Examples:
- A plain number passes through as a float:
- The string "-inf" becomes negative infinity:
- An unparseable value falls back to the default:
Source code in src/pyramids/stac/_extensions.py
affine_to_geotransform(transform)
Convert a STAC proj:transform affine to a GDAL geotransform.
proj:transform is the affine ordering [a, b, c, d, e, f]
(mapping (col, row) to (x, y): x = a*col + b*row + c,
y = d*col + e*row + f). GDAL's geotransform is the reordering
(c, a, b, f, d, e) — i.e. (x_origin, x_res, x_rot, y_origin, y_rot,
y_res). A 9-element affine (with a trailing [0, 0, 1] row) is
accepted; only the first six coefficients are used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| transform | Any | A 6- or 9-element | required |
Returns:
| Type | Description |
|---|---|
| tuple[float, ...] | The 6-tuple GDAL geotransform. |
Raises:
| Type | Description |
|---|---|
| ValueError | When |
Examples:
- A north-up 30 m grid reorders to the GDAL geotransform:
- The trailing [0, 0, 1] row of a 9-element affine is ignored:
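The reordering from `[a, b, c, d, e, f]` to `(c, a, b, f, d, e)` is small enough to write out directly. The sketch below follows the definition given above; `affine_to_geotransform_sketch` is a hypothetical name, and raising `ValueError` on any length other than 6 or 9 is an assumption about the error condition.

```python
def affine_to_geotransform_sketch(transform):
    """Reorder a proj:transform affine [a, b, c, d, e, f] into GDAL's (c, a, b, f, d, e)."""
    coefficients = [float(value) for value in transform]
    if len(coefficients) not in (6, 9):
        raise ValueError(f"expected 6 or 9 coefficients, got {len(coefficients)}")
    # Only the first six are used; a trailing [0, 0, 1] row is dropped.
    a, b, c, d, e, f = coefficients[:6]
    # GDAL order: x_origin, x_res, x_rot, y_origin, y_rot, y_res
    return (c, a, b, f, d, e)


north_up = affine_to_geotransform_sketch([10.0, 0.0, 600000.0, 0.0, -10.0, 5300040.0])
with_last_row = affine_to_geotransform_sketch(
    [10.0, 0.0, 600000.0, 0.0, -10.0, 5300040.0, 0.0, 0.0, 1.0]
)
```

Both calls produce the same six-tuple, since the ninth-order form carries no extra information for a 2D affine.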
Source code in src/pyramids/stac/_extensions.py
geotransform_to_affine(geotransform)
Convert a GDAL geotransform to a STAC proj:transform affine.
The inverse of :func:affine_to_geotransform. GDAL's geotransform is
(c, a, b, f, d, e) — (x_origin, x_res, x_rot, y_origin, y_rot,
y_res) — and proj:transform is the affine ordering
[a, b, c, d, e, f].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| geotransform | Any | A 6-element GDAL geotransform. | required |
Returns:
| Type | Description |
|---|---|
| list[float] | The 6-element |
Raises:
| Type | Description |
|---|---|
| ValueError | When |
Examples:
- A north-up 30 m grid maps back to the affine order:
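The inverse mapping, plus a check that the resulting affine places a pixel where the formulas `x = a*col + b*row + c`, `y = d*col + e*row + f` say it should, can be sketched as follows. The function names are hypothetical, and rejecting any length other than 6 is an assumption about the error condition.

```python
def geotransform_to_affine_sketch(geotransform):
    """Reorder GDAL's (c, a, b, f, d, e) back into proj:transform [a, b, c, d, e, f]."""
    values = [float(value) for value in geotransform]
    if len(values) != 6:
        raise ValueError(f"expected 6 coefficients, got {len(values)}")
    x_origin, x_res, x_rot, y_origin, y_rot, y_res = values
    return [x_res, x_rot, x_origin, y_rot, y_res, y_origin]


def pixel_to_world(affine, col, row):
    """Apply the affine to a (col, row) pixel position."""
    a, b, c, d, e, f = affine
    return (a * col + b * row + c, d * col + e * row + f)


affine = geotransform_to_affine_sketch((600000.0, 10.0, 0.0, 5300040.0, 0.0, -10.0))
corner = pixel_to_world(affine, 1, 2)  # one column right, two rows down
```

With a negative `e` (north-up grid), moving down in rows decreases `y`, which is why the second coordinate of `corner` is smaller than the origin.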
Source code in src/pyramids/stac/_extensions.py
read_extension_metadata(item, asset_key=None)
Read proj / raster / eo extension fields for a STAC asset.
Item-level fields (under properties) are read first and an asset-level
value of the same key overrides them, matching the STAC convention that an
asset narrows item-level metadata. No asset file is opened.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| item | Any | A STAC Item (pystac object or raw dict). When | required |
| asset_key | str \| None | The asset key whose metadata to read, or | None |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | A dict with keys: |
Raises:
| Type | Description |
|---|---|
| StacAssetError | When |
Examples:
- Read a Sentinel-2-style asset's projection metadata from raw JSON:
    >>> from pyramids.stac._extensions import read_extension_metadata
    >>> item = {
    ...     "properties": {"proj:epsg": 32633},
    ...     "assets": {"B04": {
    ...         "href": "s3://b/B04.tif",
    ...         "proj:shape": [10980, 10980],
    ...         "proj:transform": [10.0, 0.0, 600000.0, 0.0, -10.0, 5300040.0],
    ...         "raster:bands": [{"nodata": 0, "scale": 0.0001}],
    ...         "eo:bands": [{"name": "B04", "common_name": "red"}],
    ...     }},
    ... }
    >>> meta = read_extension_metadata(item, "B04")
    >>> meta["crs"]
    'EPSG:32633'
    >>> meta["geotransform"]
    (600000.0, 10.0, 0.0, 5300040.0, 0.0, -10.0)
    >>> meta["band_names"]
    ['B04']
- An asset-level proj:epsg overrides the item-level value:
- A bare asset with no extension fields yields all-empty metadata:
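The item-then-asset precedence amounts to a dict merge in which asset-level keys win. A minimal sketch on raw STAC JSON, assuming only the `proj:` / `raster:` / `eo:` prefixes matter; `merged_extension_fields` is a hypothetical helper and returns the raw merged fields, not the grid/band dict that read_extension_metadata builds.

```python
EXTENSION_PREFIXES = ("proj:", "raster:", "eo:")


def merged_extension_fields(item, asset_key):
    """Collect extension fields from item properties, then let the asset override them."""
    properties = item.get("properties", {})
    asset = item.get("assets", {}).get(asset_key, {})
    merged = {
        key: value for key, value in properties.items()
        if key.startswith(EXTENSION_PREFIXES)
    }
    # Asset-level values of the same key replace the item-level ones.
    merged.update(
        {key: value for key, value in asset.items() if key.startswith(EXTENSION_PREFIXES)}
    )
    return merged


item = {
    "properties": {"proj:epsg": 32633, "datetime": "2023-01-01T00:00:00Z"},
    "assets": {
        "B04": {"href": "s3://b/B04.tif", "proj:epsg": 32634},
        "B03": {"href": "s3://b/B03.tif"},
    },
}
overridden = merged_extension_fields(item, "B04")
inherited = merged_extension_fields(item, "B03")
```

Here `B04` carries its own `proj:epsg`, so it replaces the item-level value, while `B03` inherits it unchanged.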
Source code in src/pyramids/stac/_extensions.py
VRT mosaic
pyramids.stac._vrt.build_vrt_from_stac(items, asset, *, signer=None, separate=False)
Mosaic one STAC asset across items into a lazy VRT-backed Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
Any
|
Iterable of STAC Items (pystac objects, raw JSON dicts, or any
duck-typed equivalent — same contract as
:meth: |
required |
asset
|
str
|
The asset key to mosaic (e.g. |
required |
signer
|
Any
|
Optional signer (e.g. a :class: |
None
|
separate
|
bool
|
When |
False
|
Returns:
| Name | Type | Description |
|---|---|---|
Dataset |
Dataset
|
A lazy |
Dataset
|
underlying sources on demand ( |
|
Dataset
|
hrefs). |
Raises:
| Type | Description |
|---|---|
ValueError
|
When |
RuntimeError
|
When |
Examples:
- Mosaic the visual asset of several items into one lazy Dataset (requires network for remote hrefs):
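Before any VRT is built, the asset hrefs have to be gathered and turned into paths GDAL can read lazily. The sketch below shows only that preparation step on raw STAC JSON and is not the library's code: `vsi_path` and `collect_sources` are hypothetical helpers, skipping items that lack the asset and raising `ValueError` when nothing is left are both assumptions, and only the http(s) → `/vsicurl/` case mentioned on this page is handled.

```python
def vsi_path(href):
    """Prefix http(s) hrefs with /vsicurl/ so GDAL reads them over HTTP on demand."""
    if href.startswith(("http://", "https://")):
        return "/vsicurl/" + href
    return href  # local paths (and schemes not covered here) pass through unchanged


def collect_sources(items, asset, signer=None):
    """Return one GDAL-readable path per item that carries the named asset."""
    sources = []
    for item in items:
        entry = item.get("assets", {}).get(asset)
        if not entry or not entry.get("href"):
            continue  # assumption: items without the asset are skipped
        href = entry["href"]
        if signer is not None:
            href = signer.sign_href(href)  # sign before adding the VSI prefix
        sources.append(vsi_path(href))
    if not sources:
        raise ValueError(f"no item has an asset named {asset!r}")
    return sources


items = [
    {"assets": {"visual": {"href": "https://example.com/a.tif"}}},
    {"assets": {"visual": {"href": "/data/b.tif"}}},
    {"assets": {}},
]
sources = collect_sources(items, "visual")
```

A list like `sources` is the kind of input GDAL's `gdal.BuildVRT` accepts as its source list, and its `separate` option stacks each source into its own band instead of mosaicking them.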
Source code in src/pyramids/stac/_vrt.py
Download to local files
pyramids.stac.download.download_item(item, directory, *, include=None, exclude=None, s3_requester_pays=False)
Download a STAC Item's assets to a local directory.
A thin, synchronous wrapper over stac_asset.blocking.download_item (the
async download_item cannot run inside a live event loop). The per-protocol
client is chosen by stac_asset from each asset href.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Any
|
A |
required |
directory
|
str | Path
|
Destination directory for the downloaded assets. |
required |
include
|
list[str] | None
|
Optional asset keys to include (others skipped). |
None
|
exclude
|
list[str] | None
|
Optional asset keys to exclude. |
None
|
s3_requester_pays
|
bool
|
Opt into Requester-Pays for |
False
|
Returns:
| Type | Description |
|---|---|
Any
|
The downloaded |
Any
|
paths), as returned by |
Raises:
| Type | Description |
|---|---|
OptionalPackageDoesNotExist
|
When |
Examples:
- Download an item's assets, then build a collection from the locals (requires the [stac] extra + network):
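The include / exclude parameters act as a filter over asset keys. A minimal sketch of one plausible reading follows; `select_asset_keys` is a hypothetical helper, and two points are assumptions rather than documented behaviour: an empty or missing list means "no filter", and exclude is applied after include.

```python
def select_asset_keys(asset_keys, include=None, exclude=None):
    """Return the asset keys that pass the include filter and are not excluded."""
    selected = []
    for key in asset_keys:
        if include and key not in include:
            continue  # an include list is given and this key is not in it
        if exclude and key in exclude:
            continue  # explicitly excluded
        selected.append(key)
    return selected


keys = ["visual", "B04", "thumbnail"]
only_bands = select_asset_keys(keys, include=["visual", "B04"])
no_thumbnail = select_asset_keys(keys, exclude=["thumbnail"])
```

The actual filtering is delegated to stac_asset, so its rules take precedence over this sketch wherever they differ.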
Source code in src/pyramids/stac/download.py
GeoParquet round-trip
pyramids.stac._geoparquet
Serialize STAC Items to/from GeoParquet (PD-3).
stac-geoparquet stores a STAC ItemCollection as one columnar GeoParquet file
(geometry as WKB, WGS84) for bulk transfer + fast spatial filtering, avoiding
thousands of per-item JSON requests. pyramids already has geopandas (core) and a
FeatureCollection (a GeoDataFrame subclass) with GeoParquet I/O, plus the
[parquet] extra (pyarrow) — so the round-trip needs no new dependency.
This is a lossless pyramids variant: each row carries the item geometry (so the
file is a valid, spatially-filterable GeoParquet) plus the full STAC Item as a
JSON column, so :func:from_geoparquet reconstructs the exact item dicts —
ready to feed :meth:pyramids.dataset.DatasetCollection.from_stac.
Requires the [parquet] extra (pyarrow) for the Parquet read/write itself.
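The lossless design described above, geometry for spatial filtering plus the whole Item as JSON, can be illustrated with plain dicts standing in for table rows. This is a sketch of the idea only: it writes no Parquet, `items_to_rows` / `rows_to_items` are hypothetical helpers, and the column name `stac_item` is an assumption (the real column name is not shown on this page).

```python
import json


def items_to_rows(items):
    """Build one row per item: its geometry plus the full item serialized as JSON."""
    return [
        {"geometry": item.get("geometry"), "stac_item": json.dumps(item)}
        for item in items
    ]


def rows_to_items(rows):
    """Rebuild the item dicts from the JSON column; the geometry column is not needed."""
    return [json.loads(row["stac_item"]) for row in rows]


items = [
    {
        "id": "a",
        "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
        "properties": {"datetime": "2023-01-01T00:00:00Z"},
        "assets": {},
    }
]
restored = rows_to_items(items_to_rows(items))
```

Keeping the geometry as its own column duplicates a little data, but it is what lets a reader filter rows spatially without parsing the JSON column first.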
to_geoparquet(items, path)
Write a sequence of STAC Items to a GeoParquet file.
Each item becomes a row carrying its geometry (a valid, spatially-filterable GeoParquet geometry in EPSG:4326) and the full item as a JSON column.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| items | Any | Iterable of STAC Items ( | required |
| path | str \| Path | Destination | required |
Raises:
| Type | Description |
|---|---|
| ValueError | When |
| OptionalPackageDoesNotExist | When pyarrow (the |
Examples:
- Round-trip a couple of item dicts through GeoParquet:
    >>> import tempfile, os
    >>> from pyramids.stac import to_geoparquet, from_geoparquet  # doctest: +SKIP
    >>> items = [{"id": "a", "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
    ...           "properties": {"datetime": "2023-01-01T00:00:00Z"}, "assets": {}}]
    >>> path = os.path.join(tempfile.mkdtemp(), "items.parquet")  # doctest: +SKIP
    >>> to_geoparquet(items, path)  # doctest: +SKIP
    >>> from_geoparquet(path)[0]["id"]  # doctest: +SKIP
    'a'
Source code in src/pyramids/stac/_geoparquet.py
from_geoparquet(path)
Read STAC Items back from a GeoParquet written by :func:to_geoparquet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Path to a | required |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | The list of STAC Item dicts (ready for |
| list[dict[str, Any]] | meth: |
Raises:
| Type | Description |
|---|---|
| OptionalPackageDoesNotExist | When pyarrow (the |
| KeyError | When the file lacks the |