
Cloud Optimized GeoTIFF (COG)

The COG surface lets you write, inspect, validate, and partially read Cloud Optimized GeoTIFFs. It is GDAL-native — no extra third-party dependency — and is split between a small subpackage of helpers (pyramids.dataset.cog) and the COG engine that owns the user-facing Dataset methods.

Module layout

classDiagram
    class Dataset {
        +to_cog(...)
        +to_cog_bytes(...)
        +cog_info()
        +is_cog
        +validate_cog(...)
        +read_part(...) preview(...) point(...) read_tile(...)
    }
    class COG {
        <<engine>>
    }
    class write_cog {
        <<facade>>
    }
    Dataset --> COG : ds.cog
    write_cog ..> COG : delegates

Dataset.to_cog(...) is the single owner of write policy; the multi-input write_cog(...) facade (NumPy array / xarray.DataArray / gdal.Dataset / path / Dataset) normalises its input and delegates to it, so both produce identical output.
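
The delegation described above can be pictured with a toy model. This is only a sketch of the pattern, not the library's code: `ToyDataset` and `toy_write_cog` are hypothetical stand-ins, and the "policy" is reduced to a default compression plus a dtype-based predictor.

```python
from dataclasses import dataclass


@dataclass
class ToyDataset:
    """Hypothetical stand-in for Dataset: holds values and a dtype name."""

    values: list
    dtype: str

    def to_cog(self, compress: str = "DEFLATE") -> dict:
        # Single owner of write policy: every default is decided here.
        predictor = 3 if self.dtype.startswith("float") else 2
        return {"COMPRESS": compress, "PREDICTOR": predictor}


def toy_write_cog(data, **overrides) -> dict:
    """Facade: normalise the input to a ToyDataset, then delegate."""
    if not isinstance(data, ToyDataset):
        dtype = "float32" if any(isinstance(v, float) for v in data) else "int32"
        data = ToyDataset(list(data), dtype)
    # No policy of its own: overrides are forwarded untouched.
    return data.to_cog(**overrides)


direct = ToyDataset([1.5, 2.5], "float32").to_cog(compress="ZSTD")
via_facade = toy_write_cog([1.5, 2.5], compress="ZSTD")
print(direct == via_facade)
```

Because the facade holds no defaults of its own, a change to the policy in `to_cog` reaches both entry points at once.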

Public API at a glance

| Concern | Symbol | Page |
| --- | --- | --- |
| Write a COG (typed kwargs) | Dataset.to_cog | this page |
| Write from array/DataArray/path | write_cog | this page |
| Encode to in-memory bytes | Dataset.to_cog_bytes | this page |
| Named compression profiles | PROFILES · profile_options · validate_profile | this page |
| Creation-option helpers | merge_options · to_gdal_options · validate_blocksize | this page |
| Structured inspection | cog_info · COGInfo · OverviewLevel | Read & inspect |
| Validate | validate · ValidationReport · Dataset.is_cog | Read & inspect |
| Overview-decimated reads | read_part · preview · point · read_tile | Read & inspect |
| Command line | pyramids cog create\|validate\|info | CLI |
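
As a rough picture of what the creation-option helpers are for, the sketch below merges a defaults mapping with an `extra` override and renders GDAL-style `KEY=VALUE` strings. The `to_cog` docstring states that `extra` may be a mapping or a legacy `KEY=VALUE` list and that it overrides conflicting kwargs. Everything else here is an assumption: `sketch_merge` is a hypothetical name, and dropping `None` values and rendering booleans as `YES`/`NO` are guesses, not the documented behaviour of `merge_options` or `to_gdal_options`.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def sketch_merge(defaults: Mapping[str, Any], extra=None) -> list[str]:
    """Merge defaults with extra and render KEY=VALUE strings."""
    merged = dict(defaults)
    if isinstance(extra, Mapping):
        merged.update(extra)  # extra wins on conflicting keys
    elif extra:
        # Legacy ['KEY=VALUE', ...] form.
        for item in extra:
            key, _, value = item.partition("=")
            merged[key] = value
    rendered = []
    for key, value in merged.items():
        if value is None:
            continue  # assumption: unset options are omitted
        if isinstance(value, bool):
            value = "YES" if value else "NO"  # assumption: GDAL-style booleans
        rendered.append(f"{key}={value}")
    return rendered


print(sketch_merge(
    {"COMPRESS": "DEFLATE", "LEVEL": None, "BLOCKSIZE": 512},
    extra={"COMPRESS": "ZSTD"},
))
```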

See the COG cookbook for an end-to-end walkthrough and the COG basics notebook for a runnable, offline example.

Defaults

to_cog resolves the two pixel-affecting options per source dtype: the predictor (2 for integer, 3 for float) and the overview resampling (mode for categorical sources — integer dtype or a colour table — and average for continuous float). Both can be overridden explicitly.
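
The rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: the source listing shows the real resolvers (`resolve_cog_predictor`, `default_cog_overview_resampling`) working on GDAL dtype codes, whereas the hypothetical `sketch_*` functions below take NumPy-style dtype names so the example runs without GDAL.

```python
# Hypothetical dtype sets for this sketch only.
INTEGER_DTYPES = {
    "uint8", "int8", "uint16", "int16", "uint32", "int32", "uint64", "int64",
}
FLOAT_DTYPES = {"float32", "float64"}


def sketch_predictor(dtype: str) -> int:
    """Return 2 (horizontal differencing) for integers, 3 for floats."""
    if dtype in INTEGER_DTYPES:
        return 2
    if dtype in FLOAT_DTYPES:
        return 3
    raise ValueError(f"dtype not covered by this sketch: {dtype}")


def sketch_overview_resampling(dtype: str, has_color_table: bool = False) -> str:
    """Return 'mode' for categorical sources, 'average' for continuous ones."""
    categorical = has_color_table or dtype in INTEGER_DTYPES
    return "mode" if categorical else "average"


for name in ("int16", "float32"):
    print(name, sketch_predictor(name), sketch_overview_resampling(name))
```

Passing `predictor=` or `overview_resampling=` explicitly to `to_cog` bypasses this resolution.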

Writing

pyramids.dataset.engines.cog.COG

Bases: _Engine

Cloud Optimized GeoTIFF read/write/validate operations for Dataset.

Owns the real implementations of to_cog, is_cog (property), and validate_cog. Dataset exposes a same-named facade for each so ds.to_cog(...) and ds.cog.to_cog(...) are equivalent.

to_cog is the single owner of COG write policy: it applies the house defaults, resolves the dtype-aware predictor and overview resampling, and runs the STATISTICS retry. The pyramids.dataset.cog.write_cog facade is a thin delegator that only normalises its input and forwards overrides here, so both entry points produce identical output for identical input. The categorical-raster resampling guardrail (_warn_if_categorical_with_averaging) lives here too.
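
The STATISTICS retry can be pictured with a GDAL-free sketch. It follows the logic of `_translate_with_statistics_retry` in the listing further down (retry once without `STATISTICS` when the error mentions "valid pixels"), but it is a simplification: `write_with_statistics_retry` and `fake_write` are hypothetical names, the writer is a plain callable, and only `RuntimeError` is caught, whereas the engine also catches `FailedToSaveError`.

```python
from __future__ import annotations

from typing import Any, Callable


def write_with_statistics_retry(
    write: Callable[[dict[str, Any]], None], options: dict[str, Any]
) -> dict[str, Any]:
    """Call write(options); retry once without STATISTICS on the known error.

    Returns the options used by the write that succeeded.
    """
    try:
        write(options)
        return options
    except RuntimeError as exc:
        statistics_on = str(options.get("STATISTICS", "")).upper() in ("YES", "TRUE")
        if statistics_on and "valid pixels" in str(exc).lower():
            retry = {k: v for k, v in options.items() if k != "STATISTICS"}
            write(retry)
            return retry
        raise  # any other failure propagates unchanged


def fake_write(opts: dict[str, Any]) -> None:
    # Hypothetical stand-in for the GDAL write: fails while STATISTICS is set.
    if "STATISTICS" in opts:
        raise RuntimeError("no valid pixels found in sampling")


used = write_with_statistics_retry(
    fake_write, {"COMPRESS": "DEFLATE", "STATISTICS": "YES"}
)
print(used)
```

The retry is keyed on the error message, so unrelated failures are not masked by a second attempt.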

Source code in src/pyramids/dataset/engines/cog.py
class COG(_Engine):
    """Cloud Optimized GeoTIFF read/write/validate operations for `Dataset`.

    Owns the real implementations of `to_cog`, `is_cog` (property),
    and `validate_cog`. `Dataset` exposes a same-named facade for each
    so `ds.to_cog(...)` and `ds.cog.to_cog(...)` are equivalent.

    `to_cog` is the **single owner of COG write policy**: it applies the
    house defaults, resolves the dtype-aware predictor and overview
    resampling, and runs the `STATISTICS` retry. The
    :func:`pyramids.dataset.cog.write_cog` facade is a thin delegator that
    only normalises its input and forwards overrides here, so both entry
    points produce identical output for identical input. The
    categorical-raster resampling guardrail
    (`_warn_if_categorical_with_averaging`) lives here too.
    """

    def to_cog(
        self,
        path: str | Path,
        *,
        profile: str | None = None,
        compress: str | None = None,
        level: int | None = None,
        quality: int | None = None,
        blocksize: int = 512,
        predictor: str | int | None = None,
        bigtiff: str = "IF_SAFER",
        num_threads: int | str = "ALL_CPUS",
        overview_resampling: str | None = None,
        overview_count: int | None = None,
        overview_compress: str | None = None,
        tiling_scheme: str | None = None,
        zoom_level: int | None = None,
        zoom_level_strategy: str = "auto",
        aligned_levels: int | None = None,
        resampling: str = "nearest",
        add_mask: bool = False,
        sparse_ok: bool = False,
        target_srs: int | str | None = None,
        statistics: bool = True,
        indexes: list[int] | None = None,
        out_dtype: str | None = None,
        nodata: float | int | None = None,
        band_tags: dict[int, dict[str, Any]] | None = None,
        colormap: dict[int, tuple[int, int, int, int]] | None = None,
        metadata: dict[str, Any] | None = None,
        config: dict[str, str] | None = None,
        extra: Mapping[str, Any] | list[str] | None = None,
    ) -> Path:
        """Save the dataset as a Cloud Optimized GeoTIFF.

        Args:
            path: Destination path. Parent directory must exist.
            profile: Named compression preset (case-insensitive) — one of
                `deflate`, `zstd`, `lzw`, `packbits`, `jpeg`, `webp`,
                `lerc`, `lerc_deflate`, `lerc_zstd`, `raw`. Seeds the
                compression options; explicit `compress`/`level`/`quality`
                and `extra` override it. `jpeg` and `webp` both require
                the Byte dtype and restrict the band count (`jpeg`: 1-3
                bands, `webp`: 3-4 bands).
            compress: Compression method. `DEFLATE`, `LZW`, and
                `NONE` are guaranteed by every GDAL build. `JPEG`
                is almost always available. `ZSTD`, `WEBP`,
                `LERC`, `LERC_DEFLATE`, and `LERC_ZSTD` require
                the GDAL build to have been compiled with the
                corresponding library (libzstd / libwebp / LERC); on
                a GDAL build lacking them, the COG driver will raise
                at write time. To probe what your GDAL supports:

                ```python
                from osgeo import gdal
                meta = gdal.GetDriverByName("GTiff").GetMetadataItem(
                    "DMD_CREATIONOPTIONLIST"
                )
                print("ZSTD" in (meta or ""))
                ```
            level: Compression level (e.g., 1-12 for DEFLATE, 1-22 for ZSTD).
            quality: Lossy-compression quality 1-100 (JPEG/WEBP).
            blocksize: Internal tile size; power of 2 in [64, 4096].
            predictor: `"YES"`/`"STANDARD"`/`"FLOATING_POINT"` or 1/2/3.
                Defaults to `None`, which auto-resolves per the source
                dtype: `2` (horizontal differencing) for integer rasters,
                `3` (floating-point predictor) for float rasters. Pass an
                explicit value to override.
            bigtiff: `"IF_SAFER"` (default), `"YES"`, `"NO"`,
                `"IF_NEEDED"`.
            num_threads: Worker threads; `"ALL_CPUS"` or an int.
            overview_resampling: `nearest`, `average`, `bilinear`,
                `cubic`, `cubicspline`, `lanczos`, `mode`,
                `rms`, `gauss`. Defaults to `None`, which auto-resolves
                per the source dtype: `mode` for categorical sources
                (integer dtype or a colour table) and `average` for
                continuous (float) sources. The categorical guardrail
                warns only when *you* explicitly pass an averaging method
                on categorical data — never for this auto-resolved default.
            overview_count: Number of overview levels (default: auto).
            overview_compress: Compression for overview IFDs.
            tiling_scheme: e.g., `"GoogleMapsCompatible"` for a
                web-optimized COG (EPSG:3857).
            zoom_level: Advanced tiling-scheme knob: pin the maximum zoom level.
            zoom_level_strategy: Advanced tiling-scheme knob: `auto` (default),
                `lower`, or `upper` zoom-level selection.
            aligned_levels: Advanced tiling-scheme knob: number of overview
                levels aligned to the tiling scheme.
            resampling: Warp resampling when `tiling_scheme` or
                `target_srs` reprojects.
            add_mask: Add an alpha band for transparency.
            sparse_ok: Allow sparse (unfilled) tiles.
            target_srs: Reproject before write. Int for EPSG or a WKT
                / PROJ string.
            statistics: Compute and embed band statistics.
            indexes: 0-based band indices to keep, in order (e.g. `[3, 2, 1]`
                to select and reorder bands). `None` keeps all bands. When
                set, the source is pre-processed through an in-memory
                `gdal.Translate` before the COG write.
            out_dtype: Output NumPy dtype name to cast to (e.g. `"uint8"`,
                `"int16"`). `None` keeps the source dtype. The dtype-aware
                predictor is resolved from the *post-cast* dtype.
            nodata: NoData value to set on the output. `None` keeps the
                source NoData.
            band_tags: Per-band metadata to stamp onto the output, keyed by
                0-based band index, e.g. `{0: {"name": "NDVI"}}`. Useful when
                the source is a bare array/DataArray that carries no band
                descriptions.
            colormap: Palette to attach to band 1, mapping pixel value to an
                `(R, G, B, A)` tuple, e.g. `{0: (0, 0, 0, 255), 1: (255, 0, 0, 255)}`.
            metadata: Dataset-level metadata items to stamp onto the output.
            config: GDAL config options (e.g. `{"GDAL_NUM_THREADS": "4"}`)
                applied via `gdal.config_options` for the duration of the
                write. `None` (default) applies no extra config.
            extra: Additional GDAL creation options as a mapping or
                legacy `['KEY=VALUE',...]` list. Overrides
                conflicting kwargs.

        Returns:
            Path: The resolved destination path.

        Raises:
            ValueError: Invalid blocksize or unknown option key.
            FileNotFoundError: Parent directory does not exist.
            FailedToSaveError: GDAL CreateCopy failed.
            DriverNotExistError: GDAL build lacks the COG driver.

        Warnings:
            UserWarning: When the source looks categorical (integer
                dtype or has a color table) and the caller explicitly
                passes an averaging method as `overview_resampling`.
                The auto-resolved default never triggers it.

        Note:
            Setting `tiling_scheme` (e.g., `GoogleMapsCompatible`)
            implies a specific SRS — `target_srs` is ignored in that
            case. A `UserWarning` is emitted if both are provided.

        Note:
            **Larger-than-RAM / parallel writes.** The GDAL COG driver does the
            two-pass overview layout internally and *streams* from the source
            dataset, so a raster bigger than RAM can be COG-encoded as long as
            the source is **on-disk** (or a `/vsi*` file) rather than a fully
            in-RAM array — anchor a MEM dataset with `to_file(path)` first if
            needed. There is no truly dask-parallel COG writer yet:
            `to_file(compute=False)` returns a `dask.delayed` that wraps the
            *synchronous* GDAL write (GeoTIFF writes are serialised by GDAL's
            own file lock), so it defers *scheduling*, not memory or per-tile
            parallelism. For parallel cloud writes use a Zarr-backed output.

        Examples:
            - Write a compressed COG from an in-memory Dataset:
                ```python
                >>> import numpy as np  # doctest: +SKIP
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> arr = np.random.rand(256, 256).astype("float32")  # doctest: +SKIP
                >>> ds = Dataset.create_from_array(  # doctest: +SKIP
                ...     arr, top_left_corner=(0, 0), cell_size=0.001, epsg=4326,
                ... )
                >>> out = ds.to_cog("out.tif", compress="ZSTD")  # doctest: +SKIP
                >>> out.name  # doctest: +SKIP
                'out.tif'

                ```
            - Produce a web-optimized COG for a tile server:
                ```python
                >>> web = ds.to_cog("web.tif", tiling_scheme="GoogleMapsCompatible")  # doctest: +SKIP
                >>> reopened = Dataset.read_file(web)  # doctest: +SKIP
                >>> reopened.epsg  # doctest: +SKIP
                3857

                ```
            - Forward additional GDAL options through `extra`:
                ```python
                >>> _ = ds.to_cog(  # doctest: +SKIP
                ...     "precise.tif",
                ...     compress="LERC",
                ...     extra={"MAX_Z_ERROR": 0.001},
                ... )

                ```
        """
        validate_blocksize(blocksize)
        if tiling_scheme is not None and target_srs is not None:
            warnings.warn(
                "Both tiling_scheme and target_srs provided; "
                "tiling_scheme wins and target_srs is ignored.",
                UserWarning,
                stacklevel=2,
            )
            target_srs = None

        # Build the effective source (PB-4): when band-subsetting, casting the
        # dtype, or (re)setting NoData, pre-process through an in-memory
        # gdal.Translate so the predictor/overview policy below — and the COG
        # write itself — see the *output* bands, not the original source.
        source_ds, source_band0 = self._effective_source(
            indexes, out_dtype, nodata, band_tags, colormap, metadata
        )

        # Resolve a named profile (PB-5): it seeds the compression options;
        # explicit kwargs and `extra` override it. jpeg/webp enforce dtype/band
        # constraints against the *effective* source.
        profile_opts: dict[str, Any] = {}
        if profile is not None:
            validate_profile(
                profile,
                gdal.GetDataTypeName(source_band0.DataType),
                source_ds.RasterCount,
            )
            profile_opts = profile_options(profile)
        eff_compress = (
            compress
            if compress is not None
            else profile_opts.get("COMPRESS", "DEFLATE")
        )
        eff_level = level if level is not None else profile_opts.get("LEVEL")
        eff_quality = quality if quality is not None else profile_opts.get("QUALITY")
        profile_extra = {
            k: v
            for k, v in profile_opts.items()
            if k not in ("COMPRESS", "LEVEL", "QUALITY")
        }

        # Single house policy lives here (ARC-1): `to_cog` resolves the
        # dtype-dependent defaults so a direct `ds.to_cog(...)` and the
        # `write_cog(...)` facade — which now just delegates here — produce
        # identical output for identical input.
        if predictor is None:
            # Per-dtype predictor (ARC-2): 2 for integer, 3 for float. GeoTIFF
            # bands share a dtype, so band 0 decides for the whole file. Pass an
            # explicit `predictor=` to override for an (atypical) mixed source.
            predictor = resolve_cog_predictor(source_band0.DataType)
        caller_chose_resampling = overview_resampling is not None
        if overview_resampling is None:
            # Category-safe default (ARC-3): `mode` for integer/colour-table
            # sources, `average` for continuous. Chosen so the default never
            # corrupts categorical rasters and never trips the guardrail below.
            overview_resampling = default_cog_overview_resampling(
                source_band0.DataType, source_band0.GetColorTable() is not None
            )
        if caller_chose_resampling:
            # Only warn when the *caller* explicitly asked for an averaging
            # resampler on categorical data — never for a default we picked.
            self._warn_if_categorical_with_averaging(
                overview_resampling, band=source_band0
            )

        num_threads_str = (
            num_threads if isinstance(num_threads, str) else str(num_threads)
        )
        defaults: dict[str, Any] = {
            "COMPRESS": eff_compress,
            "LEVEL": eff_level,
            "QUALITY": eff_quality,
            **profile_extra,
            "BLOCKSIZE": blocksize,
            "PREDICTOR": predictor,
            "BIGTIFF": bigtiff,
            "NUM_THREADS": num_threads_str,
            "OVERVIEW_RESAMPLING": overview_resampling,
            "OVERVIEW_COUNT": overview_count,
            "OVERVIEW_COMPRESS": overview_compress,
            "TILING_SCHEME": tiling_scheme,
            "ZOOM_LEVEL": zoom_level,
            "ZOOM_LEVEL_STRATEGY": zoom_level_strategy,
            "ALIGNED_LEVELS": aligned_levels,
            "WARP_RESAMPLING": (resampling if (tiling_scheme or target_srs) else None),
            "ADD_ALPHA": True if add_mask else None,
            "SPARSE_OK": True if sparse_ok else None,
            "STATISTICS": "YES" if statistics else None,
        }
        if target_srs is not None:
            defaults["TARGET_SRS"] = (
                f"EPSG:{target_srs}" if isinstance(target_srs, int) else target_srs
            )

        options = merge_options(defaults, extra)
        with config_context(config):
            self._translate_with_statistics_retry(path, options, src=source_ds)
        return Path(path)

    def _effective_source(
        self,
        indexes: list[int] | None,
        out_dtype: str | None,
        nodata: float | int | None,
        band_tags: dict[int, dict[str, Any]] | None = None,
        colormap: dict[int, tuple[int, int, int, int]] | None = None,
        metadata: dict[str, Any] | None = None,
    ) -> tuple[gdal.Dataset, Any]:
        """Return the source dataset (optionally pre-processed) and its band 0.

        When band-subsetting / dtype-casting / setting NoData (PB-4) the backing
        raster is run through an in-memory ``gdal.Translate``; when stamping
        band tags / colourmap / metadata (PC-2) it is copied to a MEM dataset
        first so the user's open dataset is **never mutated**. With neither, the
        backing raster is returned unchanged.

        Args:
            indexes: 0-based band indices to keep/reorder, or ``None``.
            out_dtype: Output NumPy dtype name to cast to, or ``None``.
            nodata: NoData value to set, or ``None``.
            band_tags: Per-band metadata keyed by 0-based band index, or ``None``.
            colormap: Palette for band 1 (value -> RGBA), or ``None``.
            metadata: Dataset-level metadata, or ``None``.

        Returns:
            A ``(dataset, band1)`` tuple where ``band1`` is GDAL band 1 of the
            returned dataset (used for predictor/resampling resolution).
        """
        needs_translate = (
            indexes is not None or out_dtype is not None or nodata is not None
        )
        needs_stamp = bool(band_tags or colormap or metadata)
        if not needs_translate and not needs_stamp:
            ds = self._ds._raster
            return ds, ds.GetRasterBand(1)

        if needs_translate:
            translate_kwargs: dict[str, Any] = {}
            if indexes is not None:
                # pyramids band indices are 0-based; GDAL bandList is 1-based.
                translate_kwargs["bandList"] = [i + 1 for i in indexes]
            if out_dtype is not None:
                translate_kwargs["outputType"] = numpy_to_gdal_dtype(out_dtype)
            if nodata is not None:
                translate_kwargs["noData"] = nodata
            mem = gdal.Translate(
                "", self._ds._raster, format="MEM", **translate_kwargs
            )
        else:
            # Stamp-only: copy so the user's dataset is not mutated.
            mem = gdal.GetDriverByName("MEM").CreateCopy("", self._ds._raster)
        if mem is None:
            raise FailedToSaveError(
                "failed to build the pre-processed COG source "
                f"(indexes={indexes}, out_dtype={out_dtype}, nodata={nodata})"
            )
        if needs_stamp:
            self._stamp_metadata(mem, band_tags, colormap, metadata)
        return mem, mem.GetRasterBand(1)

    @staticmethod
    def _stamp_metadata(
        ds: gdal.Dataset,
        band_tags: dict[int, dict[str, Any]] | None,
        colormap: dict[int, tuple[int, int, int, int]] | None,
        metadata: dict[str, Any] | None,
    ) -> None:
        """Stamp band tags / colourmap / dataset metadata onto a MEM dataset.

        Args:
            ds: The (copied) dataset to mutate.
            band_tags: Per-band metadata keyed by 0-based band index.
            colormap: Palette for band 1 (value -> RGBA tuple). GeoTIFF only
                supports a colour table on a single-band `Byte` / `UInt16`
                raster; a `ValueError` is raised up-front for other dtypes
                (GDAL would otherwise fail deep in `CreateCopy`).
            metadata: Dataset-level metadata items.

        Raises:
            ValueError: When `colormap` is applied to a band whose dtype is
                not `Byte`/`UInt16`.
        """
        if metadata:
            ds.SetMetadata({str(k): str(v) for k, v in metadata.items()})
        if colormap:
            band = ds.GetRasterBand(1)
            if band.DataType not in _PALETTE_GDAL_DTYPES:
                raise ValueError(
                    f"colormap is only supported on Byte/UInt16 rasters; got "
                    f"{gdal.GetDataTypeName(band.DataType)}. Cast first with "
                    f"to_cog(..., out_dtype='uint8'), or drop the colormap."
                )
            color_table = gdal.ColorTable()
            for value, rgba in colormap.items():
                color_table.SetColorEntry(int(value), tuple(rgba))
            band.SetColorTable(color_table)
            band.SetColorInterpretation(gdal.GCI_PaletteIndex)
        if band_tags:
            for index, tags in band_tags.items():
                # 0-based index -> GDAL 1-based band number.
                ds.GetRasterBand(index + 1).SetMetadata(
                    {str(k): str(v) for k, v in tags.items()}
                )

    def to_cog_bytes(self, **kwargs: Any) -> bytes:
        """Encode the dataset as a COG and return the file contents as bytes.

        Writes the COG to an in-memory GDAL ``/vsimem/`` file (no temp file on
        disk), reads the bytes back, and unlinks the virtual file. Useful for
        uploading a COG directly to an object store (S3 / GCS / Azure) without
        touching the local filesystem.

        Args:
            **kwargs: Forwarded verbatim to :meth:`to_cog` (e.g. ``compress``,
                ``blocksize``, ``predictor``, ``extra``). The same house
                defaults and dtype-aware resolution apply.

        Returns:
            bytes: The complete COG file contents.

        Raises:
            FailedToSaveError: GDAL failed to encode the COG.

        Examples:
            - Read a Dataset from disk and encode it to COG bytes, ready to upload:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene.tif")  # doctest: +SKIP
                >>> blob = ds.to_cog_bytes(compress="ZSTD")  # doctest: +SKIP
                >>> len(blob) > 0  # doctest: +SKIP
                True
                >>> blob[:2] in (b"II", b"MM")  # TIFF byte-order marker  # doctest: +SKIP
                True

                ```
        """
        vsi_path = f"/vsimem/{uuid.uuid4().hex}.tif"
        try:
            self.to_cog(vsi_path, **kwargs)
            handle = gdal.VSIFOpenL(vsi_path, "rb")
            if handle is None:
                raise FailedToSaveError(
                    f"could not reopen in-memory COG at {vsi_path}"
                )
            try:
                gdal.VSIFSeekL(handle, 0, 2)  # SEEK_END
                size = gdal.VSIFTellL(handle)
                gdal.VSIFSeekL(handle, 0, 0)  # SEEK_SET
                data = gdal.VSIFReadL(1, size, handle)
            finally:
                gdal.VSIFCloseL(handle)
        finally:
            gdal.Unlink(vsi_path)
        return bytes(data)

    def _translate_with_statistics_retry(
        self,
        path: str | Path,
        options: dict[str, Any],
        src: gdal.Dataset | None = None,
    ) -> None:
        """Write the COG, retrying once without STATISTICS on the known failure.

        Some GDAL builds abort the ``STATISTICS=YES`` sampling pass on float
        on-disk sources with "no valid pixels found in sampling". The COG
        itself is fine without embedded statistics, so on that specific error
        we retry once with ``STATISTICS`` dropped. Lives here (ARC-4) rather
        than in the :func:`write_cog` facade so a direct ``ds.to_cog(...)`` is
        equally robust.

        Args:
            path: Destination file path.
            options: Fully-merged COG creation options.
            src: Source :class:`gdal.Dataset` to encode. Defaults to the
                backing raster; a pre-processed in-memory dataset is passed
                when band-subsetting / casting / setting NoData (PB-4).
        """
        source = self._ds._raster if src is None else src

        def _run(opts: dict[str, Any]) -> None:
            dst: gdal.Dataset | None = None
            try:
                dst = translate_to_cog(source, path, opts)
                dst.FlushCache()
            finally:
                dst = None

        try:
            _run(options)
        except (RuntimeError, FailedToSaveError) as exc:
            # translate_to_cog wraps CreateCopy RuntimeErrors into
            # FailedToSaveError; a deferred STATISTICS failure at FlushCache
            # time surfaces as a raw RuntimeError — catch both.
            statistics_on = str(options.get("STATISTICS", "")).upper() in ("YES", "TRUE")
            if statistics_on and "valid pixels" in str(exc).lower():
                retry = {k: v for k, v in options.items() if k != "STATISTICS"}
                _run(retry)
            else:
                raise

    @property
    def is_cog(self) -> bool:
        """`True` iff the backing file on disk is a valid COG.

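        Backed by the cheap metadata-only heuristic in `_is_cog_cheap`, which
        can false-positive on a tiled GeoTIFF that is not in strict COG
        order; use :meth:`validate_cog` for the authoritative check.
        `False` for MEM datasets, `/vsimem/` paths, and unsaved
        datasets (empty :attr:`file_name`).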

        Examples:
            - Check the backing file of a newly-opened COG:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene.tif")  # doctest: +SKIP
                >>> ds.is_cog  # doctest: +SKIP
                True

                ```
            - Plain GeoTIFFs and MEM datasets return False:
                ```python
                >>> plain = Dataset.read_file("plain.tif")  # doctest: +SKIP
                >>> plain.is_cog  # doctest: +SKIP
                False

                ```
            - Use in a conditional pipeline:
                ```python
                >>> if not ds.is_cog:  # doctest: +SKIP
                ...     ds.to_cog("fixed.tif")

                ```
        """
        result: bool
        fn = self._on_disk_path()
        if fn is None:
            result = False
        else:
            result = self._is_cog_cheap(fn)
        return result

    @staticmethod
    def _is_cog_cheap(path: str) -> bool:
        """Fast, metadata-only heuristic for "is this file a COG?" (ARC-7).

        Avoids the full COG validator on every `is_cog` access (which reads the
        whole IFD/offset table — costly over `/vsicurl`). Checks: GTiff driver,
        no external `.ovr` sidecar, internally tiled (square blocks or a single
        tile), and internal overviews present when the image is larger than one
        tile. This can FALSE-POSITIVE on a tiled GeoTIFF that is not laid out in
        strict COG order — use :meth:`validate_cog` for the authoritative check.

        Args:
            path: On-disk or remote `/vsi*` path.

        Returns:
            bool: `True` when the file looks like a COG by the cheap heuristic.
        """
        cfg = _resolve_read_config(path, None)
        with config_context(cfg):
            try:
                ds = gdal.Open(path)
            except RuntimeError:
                return False
            if ds is None:
                return False
            try:
                if ds.GetDriver().ShortName != "GTiff":
                    return False
                files = ds.GetFileList() or []
                if any(str(f).lower().endswith(".ovr") for f in files):
                    return False
                band = ds.GetRasterBand(1)
                block_x, block_y = band.GetBlockSize()
                width, height = ds.RasterXSize, ds.RasterYSize
                single_tile = block_x >= width and block_y >= height
                tiled = block_x == block_y or single_tile
                if not tiled:
                    return False
                needs_overviews = max(width, height) > max(block_x, block_y)
                if needs_overviews and band.GetOverviewCount() == 0:
                    return False
                return True
            finally:
                ds = None

    def validate_cog(
        self, strict: bool = False, config: dict[str, str] | None = None
    ) -> ValidationReport:
        """Validate the backing file as a COG.

        Args:
            strict: If `True`, warnings are treated as errors.
            config: GDAL config options for the read; defaults to the remote
                read tuning for `/vsicurl` paths (see
                :func:`pyramids.dataset.cog.validate.validate`).

        Returns:
            ValidationReport with errors, warnings, and structural details.

        Raises:
            FileNotFoundError: Dataset has no on-disk backing file
                (MEM-only or `/vsimem/`).

        Examples:
            - Validate and branch on the result:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene.tif")  # doctest: +SKIP
                >>> report = ds.validate_cog()  # doctest: +SKIP
                >>> bool(report)  # doctest: +SKIP
                True

                ```
            - Strict mode promotes warnings to errors:
                ```python
                >>> strict = ds.validate_cog(strict=True)  # doctest: +SKIP
                >>> if not strict:  # doctest: +SKIP
                ...     for err in strict.errors: print(err)

                ```
            - Inspect structural details from the report:
                ```python
                >>> report.details.get("blocksize")  # doctest: +SKIP
                [512, 512]

                ```
        """
        fn = self._on_disk_path()
        if fn is None:
            raise FileNotFoundError(
                "Dataset has no on-disk backing file to validate "
                "(is this a MEM or /vsimem/ dataset?)"
            )
        return validate(fn, strict=strict, config=config)

    def info(self, config: dict[str, str] | None = None) -> COGInfo:
        """Return structured COG metadata for the backing file.

        Reads only headers/metadata (no pixels) and reports compression,
        predictor, blocksize, dtype, CRS/bounds/resolution, the overview
        pyramid, per-band tags, and colour-table presence. See
        :class:`pyramids.dataset.cog.inspect.COGInfo`.

        Args:
            config: GDAL config options for the read; defaults to the remote
                read tuning for `/vsicurl` paths.

        Returns:
            COGInfo: The structured metadata for the on-disk file.

        Raises:
            FileNotFoundError: Dataset has no on-disk backing file
                (MEM-only or `/vsimem/`).

        Examples:
            - Inspect a COG's compression and overview pyramid:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene_cog.tif")  # doctest: +SKIP
                >>> info = ds.cog_info()  # doctest: +SKIP
                >>> info.compression  # doctest: +SKIP
                'DEFLATE'
                >>> [o.decimation for o in info.overviews]  # doctest: +SKIP
                [2, 4, 8]

                ```
            - Read the tile size and band count:
                ```python
                >>> info.blocksize  # doctest: +SKIP
                (512, 512)
                >>> info.band_count  # doctest: +SKIP
                1

                ```
        """
        fn = self._on_disk_path()
        if fn is None:
            raise FileNotFoundError(
                "Dataset has no on-disk backing file to inspect "
                "(is this a MEM or /vsimem/ dataset?)"
            )
        return cog_info(fn, config=config)

    def _on_disk_path(self) -> str | None:
        """Return the validatable on-disk path of the backing raster, or None.

        A single predicate shared by :attr:`is_cog`, :meth:`validate_cog`, and
        :meth:`info` (ARC-5) so the definition of "has a real backing file to
        validate/inspect" cannot drift between them.

        Returns:
            str | None: The file path when the dataset is backed by a real
            on-disk (or remote `/vsi*`, but not in-memory `/vsimem/`) file;
            `None` for MEM datasets, `/vsimem/` paths, and unsaved datasets.
        """
        fn = self._ds.file_name
        if not fn or fn.startswith("/vsimem/"):
            return None
        return fn

    def read_part(
        self,
        bbox: tuple[float, float, float, float],
        *,
        dst_width: int | None = None,
        dst_height: int | None = None,
        bbox_crs: int = 4326,
        resampling: str = "bilinear",
        band: int | None = None,
    ) -> np.ndarray:
        """Read a geographic window, decimated from the nearest overview.

        Requesting a `dst_width`/`dst_height` smaller than the source window
        makes GDAL serve the data from the nearest overview level, so for a COG
        over `/vsicurl/` only the relevant byte ranges are fetched — the
        cloud-native partial-read pattern.

        Args:
            bbox: `(min_x, min_y, max_x, max_y)` window in `bbox_crs`.
            dst_width: Output width in pixels. Defaults to the source window
                width (no decimation).
            dst_height: Output height in pixels. Defaults to the source window
                height.
            bbox_crs: EPSG code of `bbox`. Reprojected to the dataset CRS
                when different. Defaults to 4326 (WGS84 lon/lat).
            resampling: One of `nearest`, `bilinear`, `cubic`,
                `cubicspline`, `lanczos`, `average`, `mode`.
            band: 0-based band index. `None` reads all bands.

        Returns:
            numpy.ndarray: `(rows, cols)` for a single band, or
            `(bands, rows, cols)` for all bands; always sized
            `dst_height x dst_width` (the requested output size). Pixel values
            only — no transform, bounds, or CRS is attached.

        Raises:
            ValueError: Unknown `resampling`.
            OutOfBoundsError: The window does not intersect the raster at all,
                or it collapses to zero pixel extent.

        Note:
            A window that only **partially** overlaps the raster is **not**
            stretched to fill the output: the intersection is read and placed
            at its correct offset inside a `dst_height x dst_width` buffer
            whose out-of-raster remainder is filled with NoData (the band's
            NoData value, else NaN for float / `0` for integer — see
            :meth:`_nodata_fill`). A fully-inside window is returned without
            padding. This keeps the result aligned to the requested window,
            which matters for edge tiles served by :meth:`read_tile`.

        Examples:
            - Read a 256x256 decimated thumbnail of a bbox:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene_cog.tif")  # doctest: +SKIP
                >>> arr = ds.read_part(  # doctest: +SKIP
                ...     (12.4, 41.8, 12.6, 42.0), dst_width=256, dst_height=256,
                ... )
                >>> arr.shape[-2:]  # doctest: +SKIP
                (256, 256)

                ```
        """
        if resampling not in _RESAMPLING_ALG:
            raise ValueError(
                f"unknown resampling {resampling!r}; "
                f"choose from {sorted(_RESAMPLING_ALG)}"
            )
        ds = self._ds._raster
        min_x, min_y, max_x, max_y = self._reproject_bbox(bbox, bbox_crs)
        inv = gdal.InvGeoTransform(ds.GetGeoTransform())
        px_tl, py_tl = gdal.ApplyGeoTransform(inv, min_x, max_y)
        px_br, py_br = gdal.ApplyGeoTransform(inv, max_x, min_y)

        # The full requested window, in source pixel coordinates (may extend
        # beyond the raster on any side).
        req_xoff = int(math.floor(min(px_tl, px_br)))
        req_yoff = int(math.floor(min(py_tl, py_br)))
        req_xsize = int(math.ceil(max(px_tl, px_br))) - req_xoff
        req_ysize = int(math.ceil(max(py_tl, py_br))) - req_yoff
        if req_xsize <= 0 or req_ysize <= 0:
            raise OutOfBoundsError(
                f"bbox {bbox} (crs {bbox_crs}) has zero pixel extent"
            )

        # Intersection of the requested window with the raster.
        ix0 = max(0, req_xoff)
        iy0 = max(0, req_yoff)
        ix1 = min(ds.RasterXSize, req_xoff + req_xsize)
        iy1 = min(ds.RasterYSize, req_yoff + req_ysize)
        if ix1 - ix0 <= 0 or iy1 - iy0 <= 0:
            raise OutOfBoundsError(
                f"bbox {bbox} (crs {bbox_crs}) does not intersect the raster"
            )

        out_w = dst_width if dst_width is not None else req_xsize
        out_h = dst_height if dst_height is not None else req_ysize
        alg = _RESAMPLING_ALG[resampling]
        source = ds if band is None else ds.GetRasterBand(band + 1)

        fully_inside = (
            ix0 == req_xoff
            and iy0 == req_yoff
            and ix1 == req_xoff + req_xsize
            and iy1 == req_yoff + req_ysize
        )
        if fully_inside:
            return np.asarray(
                source.ReadAsArray(
                    ix0,
                    iy0,
                    ix1 - ix0,
                    iy1 - iy0,
                    buf_xsize=out_w,
                    buf_ysize=out_h,
                    resample_alg=alg,
                )
            )

        # Partial overlap: read only the intersection, then place it at its
        # correct offset inside a full-size output buffer padded with NoData,
        # so the returned array stays aligned to the requested window.
        scale_x = out_w / req_xsize
        scale_y = out_h / req_ysize
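        # Clamp the start offsets to the last valid index so the destination
        # slice is never empty, even when rounding lands on the far edge.
        ox0 = max(0, min(out_w - 1, int(round((ix0 - req_xoff) * scale_x))))
        oy0 = max(0, min(out_h - 1, int(round((iy0 - req_yoff) * scale_y))))
        ox1 = max(ox0 + 1, min(out_w, int(round((ix1 - req_xoff) * scale_x))))
        oy1 = max(oy0 + 1, min(out_h, int(round((iy1 - req_yoff) * scale_y))))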
        sub = np.asarray(
            source.ReadAsArray(
                ix0,
                iy0,
                ix1 - ix0,
                iy1 - iy0,
                buf_xsize=ox1 - ox0,
                buf_ysize=oy1 - oy0,
                resample_alg=alg,
            )
        )
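        # Take the fill value from the band actually being read (band 1 when
        # all bands are requested), not unconditionally from band 1.
        fill = self._nodata_fill(ds.GetRasterBand(1 if band is None else band + 1))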
        if sub.ndim == 3:
            out = np.full((sub.shape[0], out_h, out_w), fill, dtype=sub.dtype)
            out[:, oy0:oy1, ox0:ox1] = sub
        else:
            out = np.full((out_h, out_w), fill, dtype=sub.dtype)
            out[oy0:oy1, ox0:ox1] = sub
        return out

    @staticmethod
    def _nodata_fill(band: Any) -> float:
        """Pick a fill value for padding partial reads.

        Args:
            band: The GDAL band whose NoData value (if any) to use.

        Returns:
            float: The band's NoData value, else NaN for floating-point bands
            and ``0`` for integer bands.
        """
        nodata = band.GetNoDataValue()
        if nodata is not None:
            return nodata
        return 0 if is_integer_gdal_dtype(band.DataType) else float("nan")

    def preview(
        self,
        *,
        max_size: int = 1024,
        resampling: str = "bilinear",
        band: int | None = None,
    ) -> np.ndarray:
        """Read a whole-image thumbnail downsampled to `max_size` on the long edge.

        Pulls from a coarse overview when one exists, so previewing a huge COG
        is cheap.

        Args:
            max_size: Maximum pixels on the longer edge. Defaults to 1024.
            resampling: Resampling method (see :meth:`read_part`).
            band: 0-based band index. `None` reads all bands.

        Returns:
            numpy.ndarray: The downsampled array, `(rows, cols)` or
            `(bands, rows, cols)`. Pixel values only — no transform, bounds,
            or CRS is attached to the returned array.

        Raises:
            ValueError: Unknown `resampling`.

        Examples:
            - Build a 128px thumbnail of a single band:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene_cog.tif")  # doctest: +SKIP
                >>> thumb = ds.preview(max_size=128, band=0)  # doctest: +SKIP
                >>> max(thumb.shape)  # doctest: +SKIP
                128

                ```
        """
        if resampling not in _RESAMPLING_ALG:
            raise ValueError(
                f"unknown resampling {resampling!r}; "
                f"choose from {sorted(_RESAMPLING_ALG)}"
            )
        width, height = self._ds.columns, self._ds.rows
        scale = max(width, height) / max_size
        if scale <= 1:
            out_w, out_h = width, height
        else:
            out_w, out_h = max(1, round(width / scale)), max(1, round(height / scale))
        alg = _RESAMPLING_ALG[resampling]
        ds = self._ds._raster
        source = ds if band is None else ds.GetRasterBand(band + 1)
        return np.asarray(
            source.ReadAsArray(buf_xsize=out_w, buf_ysize=out_h, resample_alg=alg)
        )

    def point(
        self,
        x: float,
        y: float,
        *,
        point_crs: int = 4326,
        band: int | None = None,
    ) -> np.ndarray:
        """Sample band value(s) at a single coordinate.

        Args:
            x: X / longitude / easting in `point_crs`.
            y: Y / latitude / northing in `point_crs`.
            point_crs: EPSG code of `(x, y)`. Reprojected to the dataset CRS
                when different. Defaults to 4326.
            band: 0-based band index. `None` samples all bands.

        Returns:
            numpy.ndarray: A scalar 0-d array for a single band, or a
            `(bands,)` array when `band` is `None`. Pixel values only — no
            coordinate metadata is attached.

        Raises:
            OutOfBoundsError: The point falls outside the raster extent.

        Examples:
            - Sample all bands at a lon/lat coordinate:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("scene_cog.tif")  # doctest: +SKIP
                >>> ds.point(12.5, 41.9)  # doctest: +SKIP
                array([1234.], dtype=float32)

                ```
        """
        col, row = self._world_to_pixel(x, y, point_crs)
        if not (0 <= col < self._ds.columns and 0 <= row < self._ds.rows):
            raise OutOfBoundsError(
                f"point ({x}, {y}) in crs {point_crs} is outside the raster extent"
            )
        ds = self._ds._raster
        source = ds if band is None else ds.GetRasterBand(band + 1)
        arr = np.asarray(source.ReadAsArray(col, row, 1, 1))
        return arr.reshape(-1) if band is None else arr.reshape(())

    def read_tile(
        self,
        z: int,
        x: int,
        y: int,
        *,
        tilesize: int = 256,
        resampling: str = "bilinear",
        band: int | None = None,
    ) -> np.ndarray:
        """Read a Web-Mercator XYZ/slippy-map tile.

        Computes the EPSG:3857 bounds of tile `(z, x, y)` from the closed-form
        Web-Mercator formula and delegates to :meth:`read_part` at `tilesize`
        resolution — no extra tiling dependency needed.

        Args:
            z: Zoom level.
            x: Tile column index.
            y: Tile row index (origin top-left / north-west).
            tilesize: Output tile size in pixels (square). Defaults to 256.
            resampling: Resampling method (see :meth:`read_part`).
            band: 0-based band index. `None` reads all bands.

        Returns:
            numpy.ndarray: A `(tilesize, tilesize)` or
            `(bands, tilesize, tilesize)` array. Pixel values only — the tile's
            georeferencing is defined by its `(z, x, y)`, not attached to the
            array; edge tiles are NoData-padded (see :meth:`read_part`).

        Raises:
            OutOfBoundsError: The tile does not intersect the raster.

        Examples:
            - Read the zoom-0 world tile of a global COG:
                ```python
                >>> from pyramids.dataset import Dataset  # doctest: +SKIP
                >>> ds = Dataset.read_file("global_cog.tif")  # doctest: +SKIP
                >>> tile = ds.read_tile(0, 0, 0)  # doctest: +SKIP
                >>> tile.shape[-2:]  # doctest: +SKIP
                (256, 256)

                ```
        """
        bounds = _xyz_bounds_3857(z, x, y)
        return self.read_part(
            bounds,
            dst_width=tilesize,
            dst_height=tilesize,
            bbox_crs=3857,
            resampling=resampling,
            band=band,
        )

    def _reproject_bbox(
        self, bbox: tuple[float, float, float, float], bbox_crs: int
    ) -> tuple[float, float, float, float]:
        """Reproject a bbox into the dataset CRS, returning its envelope.

        Args:
            bbox: `(min_x, min_y, max_x, max_y)` in `bbox_crs`.
            bbox_crs: EPSG code of `bbox`.

        Returns:
            `(min_x, min_y, max_x, max_y)` in the dataset CRS. When
            `bbox_crs` already matches the dataset EPSG the bbox is
            returned unchanged.
        """
        min_x, min_y, max_x, max_y = bbox
        if self._ds.epsg == bbox_crs:
            return min_x, min_y, max_x, max_y
        transformer = Transformer.from_crs(bbox_crs, self._ds.epsg, always_xy=True)
        corners = [
            transformer.transform(min_x, min_y),
            transformer.transform(min_x, max_y),
            transformer.transform(max_x, min_y),
            transformer.transform(max_x, max_y),
        ]
        xs = [c[0] for c in corners]
        ys = [c[1] for c in corners]
        return min(xs), min(ys), max(xs), max(ys)

    def _world_to_pixel(self, x: float, y: float, point_crs: int) -> tuple[int, int]:
        """Convert a world coordinate to integer `(col, row)` pixel indices.

        Args:
            x: X / longitude in `point_crs`.
            y: Y / latitude in `point_crs`.
            point_crs: EPSG code of `(x, y)`.

        Returns:
            `(col, row)` integer pixel indices (floored).
        """
        if self._ds.epsg != point_crs:
            transformer = Transformer.from_crs(point_crs, self._ds.epsg, always_xy=True)
            x, y = transformer.transform(x, y)
        inv = gdal.InvGeoTransform(self._ds._raster.GetGeoTransform())
        col, row = gdal.ApplyGeoTransform(inv, x, y)
        return int(math.floor(col)), int(math.floor(row))

    def _warn_if_categorical_with_averaging(
        self, overview_resampling: str, band: Any | None = None
    ) -> None:
        """Emit a `UserWarning` if an averaging resampler is used on categorical data.

        Args:
            overview_resampling: The resampling method requested by the
                caller. Case-insensitive. Only averaging-family methods
                (`average`, `bilinear`, `cubic`, `cubicspline`,
                `lanczos`) trigger the check.
            band: GDAL band whose dtype/colour-table decides "categorical".
                Defaults to band 1 of the backing raster; a pre-processed
                (cast/subset) band is passed when those options are used so
                the check reflects the *output* dtype (PB-4).

        Warns:
            UserWarning: When `overview_resampling` is an averaging
                method and the source has a color table OR integer
                dtype — both strong signals of categorical data.

        Note:
            Silent when `overview_resampling` is `nearest` or
            `mode` (both category-safe) or when the source is
            floating-point and has no color table (continuous data).

        Examples:
            - Integer dataset + averaging method emits a warning:
                ```python
                >>> import warnings  # doctest: +SKIP
                >>> with warnings.catch_warnings(record=True) as caught:  # doctest: +SKIP
                ...     warnings.simplefilter("always")
                ...     byte_ds.cog._warn_if_categorical_with_averaging("average")
                ...     [str(w.message) for w in caught if issubclass(w.category, UserWarning)]
                ['overview_resampling=\\'average\\' averages pixel values, ...']

                ```
            - Nearest resampling is always silent:
                ```python
                >>> with warnings.catch_warnings(record=True) as caught:  # doctest: +SKIP
                ...     warnings.simplefilter("always")
                ...     byte_ds.cog._warn_if_categorical_with_averaging("nearest")
                ...     len(caught)
                0

                ```
        """
        if overview_resampling.lower() not in _AVERAGING_RESAMPLERS:
            return
        first_band = band if band is not None else self._ds._raster.GetRasterBand(1)
        has_color_table = first_band.GetColorTable() is not None
        is_integer = is_integer_gdal_dtype(first_band.DataType)
        if has_color_table or is_integer:
            warnings.warn(
                f"overview_resampling={overview_resampling!r} averages pixel "
                "values, which corrupts categorical rasters (land cover, IDs). "
                "Use overview_resampling='nearest' or 'mode' instead.",
                UserWarning,
                stacklevel=3,
            )
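
The `read_tile` method above relies on a private `_xyz_bounds_3857` helper whose body is not part of this listing. As a rough sketch of the closed-form Web-Mercator computation its docstring describes, the function below derives the EPSG:3857 bounds of a tile from `(z, x, y)`. The name `xyz_bounds_3857`, the `ORIGIN_SHIFT` constant, and the range check are illustrative assumptions, not the library's actual implementation.

```python
import math

# Half the width of the Web-Mercator world square in metres:
# pi times the WGS84 semi-major axis (6378137 m).
ORIGIN_SHIFT = math.pi * 6378137.0


def xyz_bounds_3857(z: int, x: int, y: int) -> tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of XYZ tile (z, x, y) in EPSG:3857.

    Hypothetical stand-in for the private helper used by ``read_tile``.
    """
    n = 2**z  # number of tiles along each axis at this zoom level
    if not (0 <= x < n and 0 <= y < n):
        # Assumption: out-of-range indices are rejected rather than wrapped.
        raise ValueError(f"tile ({z}, {x}, {y}) is outside the {n}x{n} grid")
    tile_span = 2 * ORIGIN_SHIFT / n  # tile edge length in metres
    min_x = -ORIGIN_SHIFT + x * tile_span
    max_x = min_x + tile_span
    # XYZ rows count downwards from the north-west corner.
    max_y = ORIGIN_SHIFT - y * tile_span
    min_y = max_y - tile_span
    return min_x, min_y, max_x, max_y


# The single zoom-0 tile spans the whole Web-Mercator square.
world = xyz_bounds_3857(0, 0, 0)
```

The returned tuple follows the `(min_x, min_y, max_x, max_y)` order that `read_part` expects for its `bbox` argument, which is why `read_tile` can pass it through with `bbox_crs=3857`.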

to_cog(path, *, profile=None, compress=None, level=None, quality=None, blocksize=512, predictor=None, bigtiff='IF_SAFER', num_threads='ALL_CPUS', overview_resampling=None, overview_count=None, overview_compress=None, tiling_scheme=None, zoom_level=None, zoom_level_strategy='auto', aligned_levels=None, resampling='nearest', add_mask=False, sparse_ok=False, target_srs=None, statistics=True, indexes=None, out_dtype=None, nodata=None, band_tags=None, colormap=None, metadata=None, config=None, extra=None) #

Save the dataset as a Cloud Optimized GeoTIFF.

Parameters:

Name Type Description Default
path str | Path

Destination path. Parent directory must exist.

required
profile str | None

Named compression preset (case-insensitive): one of deflate, zstd, lzw, packbits, jpeg, webp, lerc, lerc_deflate, lerc_zstd, raw. Seeds the compression options; explicit compress/level/quality and extra override it. jpeg and webp enforce dtype/band constraints: Byte data only, with 1-3 bands for jpeg and 3-4 bands for webp.

None
compress str | None

Compression method. DEFLATE, LZW, and NONE are guaranteed by every GDAL build. JPEG is almost always available. ZSTD, WEBP, LERC, LERC_DEFLATE, and LERC_ZSTD require the GDAL build to have been compiled with the corresponding library (libzstd / libwebp / LERC); on a GDAL build lacking them, the COG driver will raise at write time. To probe what your GDAL supports:

from osgeo import gdal
meta = gdal.GetDriverByName("GTiff").GetMetadataItem(
    "DMD_CREATIONOPTIONLIST"
)
print("ZSTD" in (meta or ""))
None
level int | None

Compression level (e.g., 1-9 for DEFLATE, or 1-12 when GDAL is built with libdeflate; 1-22 for ZSTD).

None
quality int | None

Lossy-compression quality 1-100 (JPEG/WEBP).

None
blocksize int

Internal tile size; power of 2 in [64, 4096].

512
predictor str | int | None

"YES"/"STANDARD"/"FLOATING_POINT" or 1/2/3. Defaults to None, which auto-resolves per the source dtype: 2 (horizontal differencing) for integer rasters, 3 (floating-point predictor) for float rasters. Pass an explicit value to override.

None
bigtiff str

"IF_SAFER" (default), "YES", "NO", "IF_NEEDED".

'IF_SAFER'
num_threads int | str

Worker threads; "ALL_CPUS" or an int.

'ALL_CPUS'
overview_resampling str | None

nearest, average, bilinear, cubic, cubicspline, lanczos, mode, rms, gauss. Defaults to None, which auto-resolves per the source dtype: mode for categorical sources (integer dtype or a colour table) and average for continuous (float) sources. The categorical guardrail warns only when you explicitly pass an averaging method on categorical data — never for this auto-resolved default.

None
overview_count int | None

Number of overview levels (default: auto).

None
overview_compress str | None

Compression for overview IFDs.

None
tiling_scheme str | None

e.g., "GoogleMapsCompatible" for a web-optimized COG (EPSG:3857).

None
zoom_level int | None

Advanced tiling-scheme knob: pin the maximum zoom level.

None
zoom_level_strategy str

Advanced tiling-scheme knob: auto (default), lower, or upper zoom-level selection.

'auto'
aligned_levels int | None

Advanced tiling-scheme knob: number of overview levels aligned to the tiling scheme.

None
resampling str

Warp resampling when tiling_scheme or target_srs reprojects.

'nearest'
add_mask bool

Add an alpha band for transparency.

False
sparse_ok bool

Allow sparse (unfilled) tiles.

False
target_srs int | str | None

Reproject before write. Int for EPSG or a WKT / PROJ string.

None
statistics bool

Compute and embed band statistics.

True
indexes list[int] | None

0-based band indices to keep, in order (e.g. [3, 2, 1] to select and reorder bands). None keeps all bands. When set, the source is pre-processed through an in-memory gdal.Translate before the COG write.

None
out_dtype str | None

Output NumPy dtype name to cast to (e.g. "uint8", "int16"). None keeps the source dtype. The dtype-aware predictor is resolved from the post-cast dtype.

None
nodata float | int | None

NoData value to set on the output. None keeps the source NoData.

None
band_tags dict[int, dict[str, Any]] | None

Per-band metadata to stamp onto the output, keyed by 0-based band index, e.g. {0: {"name": "NDVI"}}. Useful when the source is a bare array/DataArray that carries no band descriptions.

None
colormap dict[int, tuple[int, int, int, int]] | None

Palette to attach to band 1, mapping pixel value to an (R, G, B, A) tuple, e.g. {0: (0, 0, 0, 255), 1: (255, 0, 0, 255)}.

None
metadata dict[str, Any] | None

Dataset-level metadata items to stamp onto the output.

None
config dict[str, str] | None

GDAL config options (e.g. {"GDAL_NUM_THREADS": "4"}) applied via gdal.config_options for the duration of the write. None (default) applies no extra config.

None
extra Mapping[str, Any] | list[str] | None

Additional GDAL creation options as a mapping or legacy ['KEY=VALUE',...] list. Overrides conflicting kwargs.

None
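
The `predictor` and `overview_resampling` entries above both auto-resolve from the source dtype when left as None. A minimal sketch of that rule, written against dtype names rather than GDAL bands, might look like the following. The function `resolve_defaults` and the two dtype sets are hypothetical names introduced for illustration; the library resolves these internally.

```python
# Dtype names treated as integer or float in this sketch (assumed sets).
INTEGER_DTYPES = {
    "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64",
}
FLOAT_DTYPES = {"float32", "float64"}


def resolve_defaults(dtype: str, has_color_table: bool = False) -> tuple[int, str]:
    """Return (predictor, overview_resampling) following the documented rule."""
    if dtype not in INTEGER_DTYPES | FLOAT_DTYPES:
        raise ValueError(f"unsupported dtype for this sketch: {dtype!r}")
    is_integer = dtype in INTEGER_DTYPES
    # Predictor 2 is horizontal differencing, 3 is the floating-point predictor.
    predictor = 2 if is_integer else 3
    # Integer data or a colour table is treated as categorical.
    categorical = is_integer or has_color_table
    overview_resampling = "mode" if categorical else "average"
    return predictor, overview_resampling


land_cover = resolve_defaults("uint8")    # integer source
elevation = resolve_defaults("float32")   # continuous float source
```

Passing an explicit `predictor` or `overview_resampling` to `to_cog` bypasses this resolution entirely.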

Returns:

Name Type Description
Path Path

The resolved destination path.

Raises:

Type Description
ValueError

Invalid blocksize or unknown option key.

FileNotFoundError

Parent directory does not exist.

FailedToSaveError

GDAL CreateCopy failed.

DriverNotExistError

GDAL build lacks the COG driver.
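
The ValueError for an invalid blocksize corresponds to the documented constraint that blocksize is a power of 2 in [64, 4096]. The check can be sketched as below. `check_blocksize` is a hypothetical name used here for illustration; the package exposes its own `validate_blocksize` helper, whose exact behaviour may differ.

```python
def check_blocksize(blocksize: int) -> int:
    """Return blocksize unchanged if it is a power of 2 in [64, 4096]."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(blocksize, bool) or not isinstance(blocksize, int):
        raise ValueError(f"blocksize must be an int, got {blocksize!r}")
    # A positive power of two has exactly one bit set.
    is_power_of_two = blocksize > 0 and (blocksize & (blocksize - 1)) == 0
    if not is_power_of_two or not 64 <= blocksize <= 4096:
        raise ValueError(
            f"blocksize must be a power of 2 in [64, 4096], got {blocksize}"
        )
    return blocksize


tile = check_blocksize(512)  # the to_cog default passes
```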

Warns:

Type Description
UserWarning

When the source looks categorical (integer dtype or has a color table) and overview_resampling is an averaging method.

Note

Setting tiling_scheme (e.g., GoogleMapsCompatible) implies a specific SRS — target_srs is ignored in that case. A UserWarning is emitted if both are provided.

Note

Larger-than-RAM / parallel writes. The GDAL COG driver performs the two-pass overview layout internally and streams from the source dataset, so a raster bigger than RAM can be COG-encoded as long as the source is on disk (or a /vsi* file) rather than a fully in-RAM array; anchor a MEM dataset with to_file(path) first if needed. There is no truly dask-parallel COG writer yet: to_file(compute=False) returns a dask.delayed that wraps the synchronous GDAL write (GeoTIFF writes are serialised by GDAL's own file lock), so it only defers when the write is scheduled; it neither lowers peak memory nor adds per-tile parallelism. For parallel cloud writes use a Zarr-backed output.

Examples:

  • Write a compressed COG from an in-memory Dataset:
    >>> import numpy as np  # doctest: +SKIP
    >>> from pyramids.dataset import Dataset  # doctest: +SKIP
    >>> arr = np.random.rand(256, 256).astype("float32")  # doctest: +SKIP
    >>> ds = Dataset.create_from_array(  # doctest: +SKIP
    ...     arr, top_left_corner=(0, 0), cell_size=0.001, epsg=4326,
    ... )
    >>> out = ds.to_cog("out.tif", compress="ZSTD")  # doctest: +SKIP
    >>> out.name  # doctest: +SKIP
    'out.tif'
    
  • Produce a web-optimized COG for a tile server:
    >>> web = ds.to_cog("web.tif", tiling_scheme="GoogleMapsCompatible")  # doctest: +SKIP
    >>> reopened = Dataset.read_file(web)  # doctest: +SKIP
    >>> reopened.epsg  # doctest: +SKIP
    3857
    
  • Forward additional GDAL options through extra:
    >>> _ = ds.to_cog(  # doctest: +SKIP
    ...     "precise.tif",
    ...     compress="LERC",
    ...     extra={"MAX_Z_ERROR": 0.001},
    ... )
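
The `extra` parameter accepts either a mapping or a legacy `['KEY=VALUE', ...]` list and overrides conflicting keyword arguments. A sketch of how such input could be normalised and merged is shown below. `merge_extra` is a hypothetical helper written for illustration (the package ships its own `merge_options` and `to_gdal_options`), and upper-casing the keys is an assumption of this sketch.

```python
from collections.abc import Mapping
from typing import Any, Optional, Union


def merge_extra(
    base: Mapping[str, Any],
    extra: Optional[Union[Mapping[str, Any], list]] = None,
) -> dict[str, Any]:
    """Merge `extra` creation options over `base`; `extra` wins on conflict."""
    merged = {key.upper(): value for key, value in base.items()}
    if extra is None:
        return merged
    if isinstance(extra, Mapping):
        pairs = list(extra.items())
    else:
        # Legacy form: a list of "KEY=VALUE" strings.
        pairs = []
        for entry in extra:
            key, sep, value = str(entry).partition("=")
            if not sep or not key.strip():
                raise ValueError(f"expected 'KEY=VALUE', got {entry!r}")
            pairs.append((key, value))
    for key, value in pairs:
        merged[key.strip().upper()] = value
    return merged


base = {"COMPRESS": "DEFLATE", "BLOCKSIZE": 512}
from_mapping = merge_extra(base, {"compress": "ZSTD"})
from_list = merge_extra(base, ["MAX_Z_ERROR=0.001"])
```

Values taken from the list form stay as strings, whereas values from a mapping keep their Python type.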
    
Source code in src/pyramids/dataset/engines/cog.py
def to_cog(
    self,
    path: str | Path,
    *,
    profile: str | None = None,
    compress: str | None = None,
    level: int | None = None,
    quality: int | None = None,
    blocksize: int = 512,
    predictor: str | int | None = None,
    bigtiff: str = "IF_SAFER",
    num_threads: int | str = "ALL_CPUS",
    overview_resampling: str | None = None,
    overview_count: int | None = None,
    overview_compress: str | None = None,
    tiling_scheme: str | None = None,
    zoom_level: int | None = None,
    zoom_level_strategy: str = "auto",
    aligned_levels: int | None = None,
    resampling: str = "nearest",
    add_mask: bool = False,
    sparse_ok: bool = False,
    target_srs: int | str | None = None,
    statistics: bool = True,
    indexes: list[int] | None = None,
    out_dtype: str | None = None,
    nodata: float | int | None = None,
    band_tags: dict[int, dict[str, Any]] | None = None,
    colormap: dict[int, tuple[int, int, int, int]] | None = None,
    metadata: dict[str, Any] | None = None,
    config: dict[str, str] | None = None,
    extra: Mapping[str, Any] | list[str] | None = None,
) -> Path:
    """Save the dataset as a Cloud Optimized GeoTIFF.

    Args:
        path: Destination path. Parent directory must exist.
        profile: Named compression preset (case-insensitive) — one of
            `deflate`, `zstd`, `lzw`, `packbits`, `jpeg`, `webp`,
            `lerc`, `lerc_deflate`, `lerc_zstd`, `raw`. Seeds the
            compression options; explicit `compress`/`level`/`quality`
            and `extra` override it. `jpeg`/`webp` enforce dtype/band
            constraints (Byte; 1-3 / 3-4 bands).
        compress: Compression method. `DEFLATE`, `LZW`, and
            `NONE` are guaranteed by every GDAL build. `JPEG`
            is almost always available. `ZSTD`, `WEBP`,
            `LERC`, `LERC_DEFLATE`, and `LERC_ZSTD` require
            the GDAL build to have been compiled with the
            corresponding library (libzstd / libwebp / LERC); on
            a GDAL build lacking them, the COG driver will raise
            at write time. To probe what your GDAL supports:

            ```python
            from osgeo import gdal
            meta = gdal.GetDriverByName("GTiff").GetMetadataItem(
                "DMD_CREATIONOPTIONLIST"
            )
            print("ZSTD" in (meta or ""))
            ```
        level: Compression level (e.g., 1-9 for DEFLATE, or up to 12 when
            GDAL's libtiff is built with libdeflate; 1-22 for ZSTD).
        quality: Lossy-compression quality 1-100 (JPEG/WEBP).
        blocksize: Internal tile size; power of 2 in [64, 4096].
        predictor: `"YES"`/`"STANDARD"`/`"FLOATING_POINT"` or 1/2/3.
            Defaults to `None`, which auto-resolves per the source
            dtype: `2` (horizontal differencing) for integer rasters,
            `3` (floating-point predictor) for float rasters. Pass an
            explicit value to override.
        bigtiff: `"IF_SAFER"` (default), `"YES"`, `"NO"`,
            `"IF_NEEDED"`.
        num_threads: Worker threads; `"ALL_CPUS"` or an int.
        overview_resampling: `nearest`, `average`, `bilinear`,
            `cubic`, `cubicspline`, `lanczos`, `mode`,
            `rms`, `gauss`. Defaults to `None`, which auto-resolves
            per the source dtype: `mode` for categorical sources
            (integer dtype or a colour table) and `average` for
            continuous (float) sources. The categorical guardrail
            warns only when *you* explicitly pass an averaging method
            on categorical data — never for this auto-resolved default.
        overview_count: Number of overview levels (default: auto).
        overview_compress: Compression for overview IFDs.
        tiling_scheme: e.g., `"GoogleMapsCompatible"` for a
            web-optimized COG (EPSG:3857).
        zoom_level: Advanced tiling-scheme knob: pin the maximum zoom level.
        zoom_level_strategy: Advanced tiling-scheme knob: `auto` (default),
            `lower`, or `upper` zoom-level selection.
        aligned_levels: Advanced tiling-scheme knob: number of overview
            levels aligned to the tiling scheme.
        resampling: Warp resampling when `tiling_scheme` or
            `target_srs` reprojects.
        add_mask: Add an alpha band for transparency.
        sparse_ok: Allow sparse (unfilled) tiles.
        target_srs: Reproject before write. Int for EPSG or a WKT
            / PROJ string.
        statistics: Compute and embed band statistics.
        indexes: 0-based band indices to keep, in order (e.g. `[3, 2, 1]`
            to select and reorder bands). `None` keeps all bands. When
            set, the source is pre-processed through an in-memory
            `gdal.Translate` before the COG write.
        out_dtype: Output NumPy dtype name to cast to (e.g. `"uint8"`,
            `"int16"`). `None` keeps the source dtype. The dtype-aware
            predictor is resolved from the *post-cast* dtype.
        nodata: NoData value to set on the output. `None` keeps the
            source NoData.
        band_tags: Per-band metadata to stamp onto the output, keyed by
            0-based band index, e.g. `{0: {"name": "NDVI"}}`. Useful when
            the source is a bare array/DataArray that carries no band
            descriptions.
        colormap: Palette to attach to band 1, mapping pixel value to an
            `(R, G, B, A)` tuple, e.g. `{0: (0, 0, 0, 255), 1: (255, 0, 0, 255)}`.
        metadata: Dataset-level metadata items to stamp onto the output.
        config: GDAL config options (e.g. `{"GDAL_NUM_THREADS": "4"}`)
            applied via `gdal.config_options` for the duration of the
            write. `None` (default) applies no extra config.
        extra: Additional GDAL creation options as a mapping or
            legacy `['KEY=VALUE',...]` list. Overrides
            conflicting kwargs.

    Returns:
        Path: The destination path, wrapped in a `Path`.

    Raises:
        ValueError: Invalid blocksize or unknown option key.
        FileNotFoundError: Parent directory does not exist.
        FailedToSaveError: GDAL CreateCopy failed.
        DriverNotExistError: GDAL build lacks the COG driver.

    Warnings:
        UserWarning: When the source looks categorical (integer
            dtype or has a color table) and `overview_resampling`
            is an averaging method.

    Note:
        Setting `tiling_scheme` (e.g., `GoogleMapsCompatible`)
        implies a specific SRS — `target_srs` is ignored in that
        case. A `UserWarning` is emitted if both are provided.

    Note:
        **Larger-than-RAM / parallel writes.** The GDAL COG driver does the
        two-pass overview layout internally and *streams* from the source
        dataset, so a raster bigger than RAM can be COG-encoded as long as
        the source is **on-disk** (or a `/vsi*` file) rather than a fully
        in-RAM array — anchor a MEM dataset with `to_file(path)` first if
        needed. There is no truly dask-parallel COG writer yet:
        `to_file(compute=False)` returns a `dask.delayed` that wraps the
        *synchronous* GDAL write (GeoTIFF writes are serialised by GDAL's
        own file lock), so it defers *scheduling*, not memory or per-tile
        parallelism. For parallel cloud writes use a Zarr-backed output.

    Examples:
        - Write a compressed COG from an in-memory Dataset:
            ```python
            >>> import numpy as np  # doctest: +SKIP
            >>> from pyramids.dataset import Dataset  # doctest: +SKIP
            >>> arr = np.random.rand(256, 256).astype("float32")  # doctest: +SKIP
            >>> ds = Dataset.create_from_array(  # doctest: +SKIP
            ...     arr, top_left_corner=(0, 0), cell_size=0.001, epsg=4326,
            ... )
            >>> out = ds.to_cog("out.tif", compress="ZSTD")  # doctest: +SKIP
            >>> out.name  # doctest: +SKIP
            'out.tif'

            ```
        - Produce a web-optimized COG for a tile server:
            ```python
            >>> web = ds.to_cog("web.tif", tiling_scheme="GoogleMapsCompatible")  # doctest: +SKIP
            >>> reopened = Dataset.read_file(web)  # doctest: +SKIP
            >>> reopened.epsg  # doctest: +SKIP
            3857

            ```
        - Forward additional GDAL options through `extra`:
            ```python
            >>> _ = ds.to_cog(  # doctest: +SKIP
            ...     "precise.tif",
            ...     compress="LERC",
            ...     extra={"MAX_Z_ERROR": 0.001},
            ... )

            ```
    """
    validate_blocksize(blocksize)
    if tiling_scheme is not None and target_srs is not None:
        warnings.warn(
            "Both tiling_scheme and target_srs provided; "
            "tiling_scheme wins and target_srs is ignored.",
            UserWarning,
            stacklevel=2,
        )
        target_srs = None

    # Build the effective source (PB-4): when band-subsetting, casting the
    # dtype, or (re)setting NoData, pre-process through an in-memory
    # gdal.Translate so the predictor/overview policy below — and the COG
    # write itself — see the *output* bands, not the original source.
    source_ds, source_band0 = self._effective_source(
        indexes, out_dtype, nodata, band_tags, colormap, metadata
    )

    # Resolve a named profile (PB-5): it seeds the compression options;
    # explicit kwargs and `extra` override it. jpeg/webp enforce dtype/band
    # constraints against the *effective* source.
    profile_opts: dict[str, Any] = {}
    if profile is not None:
        validate_profile(
            profile,
            gdal.GetDataTypeName(source_band0.DataType),
            source_ds.RasterCount,
        )
        profile_opts = profile_options(profile)
    eff_compress = (
        compress
        if compress is not None
        else profile_opts.get("COMPRESS", "DEFLATE")
    )
    eff_level = level if level is not None else profile_opts.get("LEVEL")
    eff_quality = quality if quality is not None else profile_opts.get("QUALITY")
    profile_extra = {
        k: v
        for k, v in profile_opts.items()
        if k not in ("COMPRESS", "LEVEL", "QUALITY")
    }

    # Single house policy lives here (ARC-1): `to_cog` resolves the
    # dtype-dependent defaults so a direct `ds.to_cog(...)` and the
    # `write_cog(...)` facade — which now just delegates here — produce
    # identical output for identical input.
    if predictor is None:
        # Per-dtype predictor (ARC-2): 2 for integer, 3 for float. GeoTIFF
        # bands share a dtype, so band 0 decides for the whole file. Pass an
        # explicit `predictor=` to override for an (atypical) mixed source.
        predictor = resolve_cog_predictor(source_band0.DataType)
    caller_chose_resampling = overview_resampling is not None
    if overview_resampling is None:
        # Category-safe default (ARC-3): `mode` for integer/colour-table
        # sources, `average` for continuous. Chosen so the default never
        # corrupts categorical rasters and never trips the guardrail below.
        overview_resampling = default_cog_overview_resampling(
            source_band0.DataType, source_band0.GetColorTable() is not None
        )
    if caller_chose_resampling:
        # Only warn when the *caller* explicitly asked for an averaging
        # resampler on categorical data — never for a default we picked.
        self._warn_if_categorical_with_averaging(
            overview_resampling, band=source_band0
        )

    num_threads_str = (
        num_threads if isinstance(num_threads, str) else str(num_threads)
    )
    defaults: dict[str, Any] = {
        "COMPRESS": eff_compress,
        "LEVEL": eff_level,
        "QUALITY": eff_quality,
        **profile_extra,
        "BLOCKSIZE": blocksize,
        "PREDICTOR": predictor,
        "BIGTIFF": bigtiff,
        "NUM_THREADS": num_threads_str,
        "OVERVIEW_RESAMPLING": overview_resampling,
        "OVERVIEW_COUNT": overview_count,
        "OVERVIEW_COMPRESS": overview_compress,
        "TILING_SCHEME": tiling_scheme,
        "ZOOM_LEVEL": zoom_level,
        "ZOOM_LEVEL_STRATEGY": zoom_level_strategy,
        "ALIGNED_LEVELS": aligned_levels,
        "WARP_RESAMPLING": (resampling if (tiling_scheme or target_srs) else None),
        "ADD_ALPHA": True if add_mask else None,
        "SPARSE_OK": True if sparse_ok else None,
        "STATISTICS": "YES" if statistics else None,
    }
    if target_srs is not None:
        defaults["TARGET_SRS"] = (
            f"EPSG:{target_srs}" if isinstance(target_srs, int) else target_srs
        )

    options = merge_options(defaults, extra)
    with config_context(config):
        self._translate_with_statistics_retry(path, options, src=source_ds)
    return Path(path)
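
The layering of compression settings in the body above (profile first, then explicit kwargs, then `extra`) may be easier to follow in isolation. The sketch below is a simplified stand-in, not the library code: `resolve_compression` and `DEMO_PROFILES` are hypothetical names, the two presets are copied from the documented `profile_options` output, and ignoring `None` entries in `extra` is an assumption.

```python
from typing import Any, Mapping, Optional

# Two presets copied from the documented `profile_options` results.
DEMO_PROFILES = {
    "zstd": {"COMPRESS": "ZSTD", "LEVEL": 9},
    "lzw": {"COMPRESS": "LZW"},
}


def resolve_compression(
    profile: Optional[str] = None,
    compress: Optional[str] = None,
    level: Optional[int] = None,
    extra: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Layer the sources in order: profile < explicit kwargs < extra."""
    seed = dict(DEMO_PROFILES[profile.lower()]) if profile is not None else {}
    options = {
        "COMPRESS": compress if compress is not None else seed.get("COMPRESS", "DEFLATE"),
        "LEVEL": level if level is not None else seed.get("LEVEL"),
    }
    # `extra` wins on conflict; None entries are ignored (an assumption).
    for key, value in (extra or {}).items():
        if value is not None:
            options[key.upper()] = value
    # None means "not set", so those keys never reach the driver.
    return {k: v for k, v in options.items() if v is not None}


print(resolve_compression())
print(resolve_compression(profile="zstd", level=18))
print(resolve_compression(profile="zstd", compress="LZW", extra={"compress": "DEFLATE"}))
```

With no arguments the sketch falls back to DEFLATE, mirroring the `profile_opts.get("COMPRESS", "DEFLATE")` default in the source.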

to_cog_bytes(**kwargs) #

Encode the dataset as a COG and return the file contents as bytes.

Writes the COG to an in-memory GDAL /vsimem/ file (no temp file on disk), reads the bytes back, and unlinks the virtual file. Useful for uploading a COG directly to an object store (S3 / GCS / Azure) without touching the local filesystem.

Parameters:

Name Type Description Default
**kwargs Any

Forwarded verbatim to to_cog (e.g. compress, blocksize, predictor, extra). The same house defaults and dtype-aware resolution apply.

{}

Returns:

Name Type Description
bytes bytes

The complete COG file contents.

Raises:

Type Description
FailedToSaveError

GDAL failed to encode the COG.

Examples:

  • Encode an in-memory Dataset to COG bytes and upload them:
    >>> from pyramids.dataset import Dataset  # doctest: +SKIP
    >>> ds = Dataset.read_file("scene.tif")  # doctest: +SKIP
    >>> blob = ds.to_cog_bytes(compress="ZSTD")  # doctest: +SKIP
    >>> len(blob) > 0  # doctest: +SKIP
    True
    >>> blob[:2] in (b"II", b"MM")  # TIFF byte-order marker  # doctest: +SKIP
    True
    
Source code in src/pyramids/dataset/engines/cog.py
def to_cog_bytes(self, **kwargs: Any) -> bytes:
    """Encode the dataset as a COG and return the file contents as bytes.

    Writes the COG to an in-memory GDAL ``/vsimem/`` file (no temp file on
    disk), reads the bytes back, and unlinks the virtual file. Useful for
    uploading a COG directly to an object store (S3 / GCS / Azure) without
    touching the local filesystem.

    Args:
        **kwargs: Forwarded verbatim to :meth:`to_cog` (e.g. ``compress``,
            ``blocksize``, ``predictor``, ``extra``). The same house
            defaults and dtype-aware resolution apply.

    Returns:
        bytes: The complete COG file contents.

    Raises:
        FailedToSaveError: GDAL failed to encode the COG.

    Examples:
        - Encode an in-memory Dataset to COG bytes and upload them:
            ```python
            >>> from pyramids.dataset import Dataset  # doctest: +SKIP
            >>> ds = Dataset.read_file("scene.tif")  # doctest: +SKIP
            >>> blob = ds.to_cog_bytes(compress="ZSTD")  # doctest: +SKIP
            >>> len(blob) > 0  # doctest: +SKIP
            True
            >>> blob[:2] in (b"II", b"MM")  # TIFF byte-order marker  # doctest: +SKIP
            True

            ```
    """
    vsi_path = f"/vsimem/{uuid.uuid4().hex}.tif"
    try:
        self.to_cog(vsi_path, **kwargs)
        handle = gdal.VSIFOpenL(vsi_path, "rb")
        if handle is None:
            raise FailedToSaveError(
                f"could not reopen in-memory COG at {vsi_path}"
            )
        try:
            gdal.VSIFSeekL(handle, 0, 2)  # SEEK_END
            size = gdal.VSIFTellL(handle)
            gdal.VSIFSeekL(handle, 0, 0)  # SEEK_SET
            data = gdal.VSIFReadL(1, size, handle)
        finally:
            gdal.VSIFCloseL(handle)
    finally:
        gdal.Unlink(vsi_path)
    return bytes(data)
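
Before uploading the returned bytes it can be useful to check the header. The sketch below extends the `blob[:2]` check from the example: a TIFF file starts with a two-byte byte-order marker followed by a 16-bit magic number, 42 for classic TIFF and 43 for BigTIFF. `sniff_tiff` is a hypothetical helper, not part of pyramids, and the headers it is fed here are synthetic stand-ins for real `to_cog_bytes` output.

```python
import struct


def sniff_tiff(blob: bytes) -> tuple:
    """Return (byte_order, flavour) read from the first four bytes of a TIFF."""
    if len(blob) < 4:
        raise ValueError("too short to be a TIFF")
    marker = blob[:2]
    if marker == b"II":
        byte_order, fmt = "little", "<H"
    elif marker == b"MM":
        byte_order, fmt = "big", ">H"
    else:
        raise ValueError(f"not a TIFF byte-order marker: {marker!r}")
    # The magic number is stored in the byte order the marker announces.
    (magic,) = struct.unpack(fmt, blob[2:4])
    flavours = {42: "classic", 43: "bigtiff"}
    if magic not in flavours:
        raise ValueError(f"unexpected TIFF magic number: {magic}")
    return byte_order, flavours[magic]


# Synthetic four-byte headers, not real COG output.
print(sniff_tiff(b"II" + struct.pack("<H", 42)))
print(sniff_tiff(b"MM" + struct.pack(">H", 43)))
```

Since `bigtiff` defaults to `"IF_SAFER"`, either flavour may come back depending on the raster size and compression.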

The write_cog facade#

pyramids.dataset.cog.facade.write_cog(data, output, *, crs=None, transform=None, nodata=None, options=None, validate=True, strict=False) #

Write raster data to disk as a Cloud Optimized GeoTIFF.

Thin convenience facade over pyramids.dataset.engines.cog.COG.to_cog. It accepts a wider range of inputs (NumPy array, xarray.DataArray, gdal.Dataset, path, or Dataset), normalises them into a Dataset, then delegates the entire write to to_cog, which owns all COG policy: the house defaults (DEFLATE, 512px tiles, BIGTIFF=IF_SAFER, NUM_THREADS=ALL_CPUS, embedded statistics), the dtype-aware predictor (2 for integer, 3 for float), the dtype-aware default overview resampling (mode for categorical, average for continuous), and the STATISTICS retry. Because policy lives in one place, write_cog and a direct ds.to_cog(...) produce identical output for identical input. Any caller-supplied options are forwarded as extra and override the defaults. By default the result is round-tripped through pyramids.dataset.cog.validate.validate.

Parameters:

Name Type Description Default
data Any

Source raster. Accepted forms:

  • pyramids.dataset.Dataset: used directly.
  • osgeo.gdal.Dataset: wrapped in a Dataset.
  • numpy.ndarray: requires crs and transform.
  • xarray.DataArray: geotransform is derived from the spatial coordinates; CRS from crs / DataArray metadata.
  • str / pathlib.Path: an existing raster.
required
output str | Path

Destination path. The parent directory must exist.

required
crs Any | None

CRS for the NumPy-array form (EPSG int, "EPSG:XXXX", WKT, or PROJ string). Also used as a fallback for DataArrays.

None
transform tuple[float, ...] | None

6-tuple GDAL geotransform; required for the array form.

None
nodata float | int | None

NoData scalar. Passed to array construction or set on a pre-built dataset.

None
options CreationOptions | None

Caller overrides merged on top of PYRAMIDS_COG_DEFAULTS. Keys are GDAL COG driver options (validated downstream). When PREDICTOR is absent it is auto-resolved from the raster dtype.

None
validate bool

When True (default), validate the written file and raise RuntimeError if it is not a valid COG.

True
strict bool

Promote validation warnings to errors.

False

Returns:

Type Description
tuple[Path, ValidationReport | None]

A (output_path, report) tuple. report is a pyramids.dataset.cog.validate.ValidationReport when validate is True, otherwise None.

Raises:

Type Description
ValueError

Required crs/transform missing for an array.

TypeError

data is an unsupported type.

RuntimeError

validate is True and the file failed COG validation.

Examples:

  • Write a COG from a NumPy array (predictor auto-resolves to 3 for float):
    >>> import numpy as np  # doctest: +SKIP
    >>> arr = np.random.rand(256, 256).astype("float32")  # doctest: +SKIP
    >>> path, report = write_cog(  # doctest: +SKIP
    ...     arr, "out.tif", crs=4326,
    ...     transform=(0.0, 0.01, 0.0, 10.0, 0.0, -0.01),
    ... )
    >>> report.is_valid  # doctest: +SKIP
    True
    
  • Re-encode an existing raster with overrides and skip validation:
    >>> path, report = write_cog(  # doctest: +SKIP
    ...     "plain.tif", "scene_cog.tif",
    ...     options={"COMPRESS": "ZSTD", "LEVEL": 18},
    ...     validate=False,
    ... )
    >>> report is None  # doctest: +SKIP
    True
    
Source code in src/pyramids/dataset/cog/facade.py
def write_cog(
    data: Any,
    output: str | Path,
    *,
    crs: Any | None = None,
    transform: tuple[float, ...] | None = None,
    nodata: float | int | None = None,
    options: CreationOptions | None = None,
    validate: bool = True,
    strict: bool = False,
) -> tuple[Path, ValidationReport | None]:
    """Write raster data to disk as a Cloud Optimized GeoTIFF.

    Thin convenience facade over
    :meth:`pyramids.dataset.engines.cog.COG.to_cog`. It accepts a wider
    range of inputs (NumPy array, ``xarray.DataArray``, ``gdal.Dataset``,
    path, or :class:`~pyramids.dataset.Dataset`), normalises them into a
    :class:`~pyramids.dataset.Dataset`, then **delegates the entire write
    to ``to_cog``** — which owns all COG policy: the house defaults
    (DEFLATE, 512px tiles, ``BIGTIFF=IF_SAFER``, ``NUM_THREADS=ALL_CPUS``,
    embedded statistics), the dtype-aware predictor (``2`` for integer,
    ``3`` for float), the dtype-aware default overview resampling
    (``mode`` for categorical, ``average`` for continuous), and the
    ``STATISTICS`` retry. Because policy lives in one place, ``write_cog``
    and a direct ``ds.to_cog(...)`` produce identical output for identical
    input. Any caller-supplied ``options`` are forwarded as ``extra`` and
    override the defaults. By default the result is round-tripped through
    :func:`pyramids.dataset.cog.validate.validate`.

    Args:
        data: Source raster. Accepted forms:

            - :class:`~pyramids.dataset.Dataset` — used directly.
            - :class:`osgeo.gdal.Dataset` — wrapped in a :class:`Dataset`.
            - :class:`numpy.ndarray` — requires ``crs`` and ``transform``.
            - :class:`xarray.DataArray` — geotransform is derived from the
              spatial coordinates; CRS from ``crs`` / DataArray metadata.
            - :class:`str` / :class:`~pathlib.Path` — an existing raster.
        output: Destination path. The parent directory must exist.
        crs: CRS for the NumPy-array form (EPSG int, ``"EPSG:XXXX"``, WKT,
            or PROJ string). Also used as a fallback for DataArrays.
        transform: 6-tuple GDAL geotransform; required for the array form.
        nodata: NoData scalar. Passed to array construction or set on a
            pre-built dataset.
        options: Caller overrides merged on top of
            :data:`PYRAMIDS_COG_DEFAULTS`. Keys are GDAL COG driver
            options (validated downstream). When ``PREDICTOR`` is absent it
            is auto-resolved from the raster dtype.
        validate: When ``True`` (default), validate the written file and
            raise :class:`RuntimeError` if it is not a valid COG.
        strict: Promote validation warnings to errors.

    Returns:
        A ``(output_path, report)`` tuple. ``report`` is a
        :class:`~pyramids.dataset.cog.validate.ValidationReport` when
        ``validate`` is ``True``, otherwise ``None``.

    Raises:
        ValueError: Required ``crs``/``transform`` missing for an array.
        TypeError: ``data`` is an unsupported type.
        RuntimeError: ``validate`` is ``True`` and the file failed COG
            validation.

    Examples:
        - Write a COG from a NumPy array (predictor auto-resolves to 3 for
          float):
            ```python
            >>> import numpy as np  # doctest: +SKIP
            >>> arr = np.random.rand(256, 256).astype("float32")  # doctest: +SKIP
            >>> path, report = write_cog(  # doctest: +SKIP
            ...     arr, "out.tif", crs=4326,
            ...     transform=(0.0, 0.01, 0.0, 10.0, 0.0, -0.01),
            ... )
            >>> report.is_valid  # doctest: +SKIP
            True

            ```
        - Re-encode an existing raster with overrides and skip validation:
            ```python
            >>> path, report = write_cog(  # doctest: +SKIP
            ...     "plain.tif", "scene_cog.tif",
            ...     options={"COMPRESS": "ZSTD", "LEVEL": 18},
            ...     validate=False,
            ... )
            >>> report is None  # doctest: +SKIP
            True

            ```
    """
    ds = _normalize_to_dataset(data, crs, transform, nodata)

    # Single write policy lives in COG.to_cog (ARC-1): house defaults, the
    # dtype-aware predictor (ARC-2), the category-safe default overview
    # resampling (ARC-3), and the STATISTICS retry (ARC-4) are all applied
    # there. write_cog only normalises the input and forwards the caller's
    # overrides as `extra`, so write_cog and a direct ds.to_cog(...) produce
    # identical output for identical input.
    output_path = ds.to_cog(output, extra=options or None)

    report: ValidationReport | None = None
    if validate:
        report = _validate_file(output_path, strict=strict)
        if not report.is_valid:
            raise RuntimeError(
                f"write_cog produced an invalid COG at {output_path}: "
                f"{report.errors}"
            )
    return output_path, report
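
The `transform` argument uses GDAL's six-element geotransform layout: `(x_origin, pixel_width, row_rotation, y_origin, column_rotation, -pixel_height)`. The sketch below builds one for a north-up raster and maps a pixel index to map coordinates; `make_geotransform` and `pixel_to_world` are hypothetical helper names introduced for illustration, not pyramids functions.

```python
def make_geotransform(top_left_x: float, top_left_y: float, cell_size: float) -> tuple:
    """Build a north-up GDAL geotransform with square cells and no rotation."""
    return (top_left_x, cell_size, 0.0, top_left_y, 0.0, -cell_size)


def pixel_to_world(gt: tuple, col: float, row: float) -> tuple:
    """Apply the GDAL affine: map a (col, row) pixel index to (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Same geotransform as the NumPy example: origin (0, 10), 0.01-degree cells.
gt = make_geotransform(0.0, 10.0, 0.01)
print(gt)
print(pixel_to_world(gt, 0, 0))      # top-left corner of the raster
print(pixel_to_world(gt, 256, 256))  # bottom-right corner of a 256x256 raster
```

The negative sixth element is what makes row indices increase southwards, which is why the example passes `-0.01`.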

Creation options & profiles#

pyramids.dataset.cog.options #

COG creation-option types, serialization, and validation.

Provides the CreationOptions alias (a Mapping[str, Any]), the named PROFILES, and pure-Python helpers used by pyramids.dataset.cog.write and pyramids.dataset.engines.cog.COG:

  • to_gdal_options: serialize a mapping into GDAL's ['KEY=VALUE',...] list form.
  • merge_options: merge defaults with user-supplied extras (dict or legacy list[str]).
  • validate_blocksize: enforce the COG driver's power-of-2-in-[64, 4096] constraint.
  • validate_option_keys: gate unknown keys against COG_DRIVER_OPTIONS.
  • profile_options / validate_profile: named compression presets.

The module has no GDAL dependency — all helpers operate on plain Python values. GDAL is invoked only at the write call site.
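
As an illustration of how such a pure-Python check can work, the power-of-2-in-[64, 4096] rule can be sketched with a single bit trick. `check_blocksize` below is a hypothetical stand-in written for this page; the real `validate_blocksize` may differ in its return value and error message.

```python
def check_blocksize(blocksize: int) -> int:
    """Return blocksize unchanged if it is a power of 2 in [64, 4096]."""
    # A positive power of two has exactly one bit set, so n & (n - 1) is 0.
    is_power_of_two = blocksize > 0 and (blocksize & (blocksize - 1)) == 0
    if not (is_power_of_two and 64 <= blocksize <= 4096):
        raise ValueError(
            f"blocksize must be a power of 2 in [64, 4096]; got {blocksize}"
        )
    return blocksize


# Enumerate every accepted value between 1 and 8192.
accepted = []
for candidate in range(1, 8193):
    try:
        accepted.append(check_blocksize(candidate))
    except ValueError:
        pass
print(accepted)
```

Only seven values pass, from 64 up to 4096, each double the previous one.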

COG_READ_DEFAULTS = {'GDAL_DISABLE_READDIR_ON_OPEN': 'EMPTY_DIR', 'CPL_VSIL_CURL_ALLOWED_EXTENSIONS': '.tif,.tiff', 'GDAL_HTTP_MERGE_CONSECUTIVE_RANGES': 'YES', 'VSI_CACHE': 'TRUE'} module-attribute #

GDAL config options that make remote /vsicurl/ COG reads efficient.

Without GDAL_DISABLE_READDIR_ON_OPEN a remote open issues a directory listing, often the single biggest latency hit. Applied by pyramids.dataset.cog.validate.validate and pyramids.dataset.cog.inspect.cog_info for remote paths when the caller passes no explicit config. Pure strings (no GDAL dependency here).

profile_options(name) #

Return a copy of the named profile's creation options.

Parameters:

Name Type Description Default
name str

Profile name (case-insensitive), e.g. "deflate", "zstd", "jpeg".

required

Returns:

Type Description
dict[str, Any]

A new dict of the profile's options.

Raises:

Type Description
ValueError

When name is not a known profile.

Examples:

  • Look up the zstd preset:
    >>> profile_options("zstd")
    {'COMPRESS': 'ZSTD', 'LEVEL': 9}
    
  • Names are case-insensitive:
    >>> profile_options("LZW")
    {'COMPRESS': 'LZW'}
    
  • Unknown names are rejected:
    >>> profile_options("bogus")  # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError: unknown COG profile 'bogus'...
    
Source code in src/pyramids/dataset/cog/options.py
def profile_options(name: str) -> dict[str, Any]:
    """Return a copy of the named profile's creation options.

    Args:
        name: Profile name (case-insensitive), e.g. ``"deflate"``, ``"zstd"``,
            ``"jpeg"``.

    Returns:
        A new dict of the profile's options.

    Raises:
        ValueError: When ``name`` is not a known profile.

    Examples:
        - Look up the zstd preset:
            ```python
            >>> profile_options("zstd")
            {'COMPRESS': 'ZSTD', 'LEVEL': 9}

            ```
        - Names are case-insensitive:
            ```python
            >>> profile_options("LZW")
            {'COMPRESS': 'LZW'}

            ```
        - Unknown names are rejected:
            ```python
            >>> profile_options("bogus")  # doctest: +IGNORE_EXCEPTION_DETAIL
            Traceback (most recent call last):
            ...
            ValueError: unknown COG profile 'bogus'...

            ```
    """
    key = name.lower()
    if key not in PROFILES:
        raise ValueError(
            f"unknown COG profile {name!r}; choose from {sorted(PROFILES)}"
        )
    return dict(PROFILES[key])

validate_profile(name, dtype_name, band_count) #

Raise ValueError if a source violates a profile's constraints.

Some profiles only accept specific dtypes / band counts (JPEG: Byte with 1-3 bands; WEBP: Byte with 3-4 bands). Other profiles are unconstrained and pass silently.

Parameters:

Name Type Description Default
name str

Profile name (case-insensitive).

required
dtype_name str

GDAL dtype name of the source (e.g. "Byte", "Float32").

required
band_count int

Number of bands in the source.

required

Raises:

Type Description
ValueError

When the source dtype or band count is incompatible.

Examples:

  • An unconstrained profile always passes:
    >>> validate_profile("deflate", "Float32", 4)
    
  • JPEG requires Byte with <= 3 bands:
    >>> validate_profile("jpeg", "Float32", 1)  # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError: jpeg profile requires dtype in...
    
Source code in src/pyramids/dataset/cog/options.py
def validate_profile(name: str, dtype_name: str, band_count: int) -> None:
    """Raise :class:`ValueError` if a source violates a profile's constraints.

    Some profiles only accept specific dtypes / band counts (JPEG: ``Byte``
    with 1-3 bands; WEBP: ``Byte`` with 3-4 bands). Other profiles are
    unconstrained and pass silently.

    Args:
        name: Profile name (case-insensitive).
        dtype_name: GDAL dtype name of the source (e.g. ``"Byte"``,
            ``"Float32"``).
        band_count: Number of bands in the source.

    Raises:
        ValueError: When the source dtype or band count is incompatible.

    Examples:
        - An unconstrained profile always passes:
            ```python
            >>> validate_profile("deflate", "Float32", 4)

            ```
        - JPEG requires Byte with <= 3 bands:
            ```python
            >>> validate_profile("jpeg", "Float32", 1)  # doctest: +IGNORE_EXCEPTION_DETAIL
            Traceback (most recent call last):
            ...
            ValueError: jpeg profile requires dtype in...

            ```
    """
    key = name.lower()
    if key not in _PROFILE_DTYPE_CONSTRAINTS:
        return
    allowed_dtypes, (min_bands, max_bands) = _PROFILE_DTYPE_CONSTRAINTS[key]
    if dtype_name not in allowed_dtypes:
        raise ValueError(
            f"{key} profile requires dtype in {sorted(allowed_dtypes)}; "
            f"got {dtype_name}. Use a different profile (e.g. 'deflate')."
        )
    if not (min_bands <= band_count <= max_bands):
        raise ValueError(
            f"{key} profile requires {min_bands}-{max_bands} bands; "
            f"got {band_count}."
        )
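
A caller that prefers a lossy profile but must also handle arbitrary rasters can test the constraints first and fall back instead of catching the exception. The sketch below assumes the constraint table has the shape the source unpacks (allowed dtypes plus a band range) and fills it with the documented JPEG and WEBP limits; `DEMO_CONSTRAINTS`, `profile_fits`, and `choose_profile` are hypothetical names.

```python
# Documented limits: JPEG is Byte with 1-3 bands, WEBP is Byte with 3-4 bands.
DEMO_CONSTRAINTS = {
    "jpeg": ({"Byte"}, (1, 3)),
    "webp": ({"Byte"}, (3, 4)),
}


def profile_fits(name: str, dtype_name: str, band_count: int) -> bool:
    """Return True when the source satisfies the profile's constraints."""
    key = name.lower()
    if key not in DEMO_CONSTRAINTS:
        return True  # unconstrained profiles accept any source
    allowed_dtypes, (min_bands, max_bands) = DEMO_CONSTRAINTS[key]
    return dtype_name in allowed_dtypes and min_bands <= band_count <= max_bands


def choose_profile(
    preferred: str, dtype_name: str, band_count: int, fallback: str = "deflate"
) -> str:
    """Use the preferred profile when it fits, otherwise the fallback."""
    return preferred.lower() if profile_fits(preferred, dtype_name, band_count) else fallback


print(choose_profile("jpeg", "Byte", 3))
print(choose_profile("jpeg", "Float32", 1))
print(choose_profile("webp", "Byte", 1))
```

The fallback defaults to `deflate`, the alternative the real error message suggests.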

to_gdal_options(opts) #

Serialize a mapping into GDAL's ['KEY=VALUE',...] list form.

Keys are uppercased; values are stringified via _stringify (booleans become "YES"/"NO"). None values are skipped so callers can pass optional kwargs through unchanged.

Parameters:

Name Type Description Default
opts CreationOptions | None

Mapping of option names to values, or None.

required

Returns:

Type Description
list[str]

List of "KEY=VALUE" strings. Empty list when opts is None.

Examples:

  • Serialize a compression config:
    >>> to_gdal_options({"COMPRESS": "DEFLATE", "LEVEL": 9})
    ['COMPRESS=DEFLATE', 'LEVEL=9']
    
  • Booleans become GDAL's YES/NO convention:
    >>> to_gdal_options({"STATISTICS": True, "SPARSE_OK": False})
    ['STATISTICS=YES', 'SPARSE_OK=NO']
    
  • None values are dropped so optional kwargs flow through unchanged:
    >>> to_gdal_options({"COMPRESS": "LZW", "LEVEL": None})
    ['COMPRESS=LZW']
    >>> to_gdal_options(None)
    []
    
Source code in src/pyramids/dataset/cog/options.py
def to_gdal_options(opts: CreationOptions | None) -> list[str]:
    """Serialize a mapping into GDAL's `['KEY=VALUE',...]` list form.

    Keys are uppercased; values are stringified via :func:`_stringify`
    (booleans become `"YES"`/`"NO"`). `None` values are skipped so
    callers can pass optional kwargs through unchanged.

    Args:
        opts: Mapping of option names to values, or `None`.

    Returns:
        List of `"KEY=VALUE"` strings. Empty list when `opts` is `None`.

    Examples:
        - Serialize a compression config:
            ```python
            >>> to_gdal_options({"COMPRESS": "DEFLATE", "LEVEL": 9})
            ['COMPRESS=DEFLATE', 'LEVEL=9']

            ```
        - Booleans become GDAL's YES/NO convention:
            ```python
            >>> to_gdal_options({"STATISTICS": True, "SPARSE_OK": False})
            ['STATISTICS=YES', 'SPARSE_OK=NO']

            ```
        - None values are dropped so optional kwargs flow through unchanged:
            ```python
            >>> to_gdal_options({"COMPRESS": "LZW", "LEVEL": None})
            ['COMPRESS=LZW']
            >>> to_gdal_options(None)
            []

            ```
    """
    result: list[str]
    if opts is None:
        result = []
    else:
        result = [
            f"{str(k).upper()}={_stringify(v)}"
            for k, v in opts.items()
            if v is not None
        ]
    return result
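
The `_stringify` helper is referenced above, but its source does not appear in this section. The sketch below reproduces the serialization in a self-contained form, assuming `_stringify` maps booleans to `YES`/`NO` and passes every other value through `str()`. The names `stringify_sketch` and `to_gdal_options_sketch` are hypothetical stand-ins, not part of pyramids.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def stringify_sketch(value: Any) -> str:
    """Hypothetical stand-in for `_stringify` (assumed behaviour)."""
    # Test bool first: bool is a subclass of int, so str(True) would give "True".
    if isinstance(value, bool):
        return "YES" if value else "NO"
    return str(value)


def to_gdal_options_sketch(opts: Mapping[str, Any] | None) -> list[str]:
    """Serialize a mapping into GDAL's KEY=VALUE list form."""
    if opts is None:
        return []
    return [
        f"{str(key).upper()}={stringify_sketch(value)}"
        for key, value in opts.items()
        if value is not None  # None means "option not set", so it is skipped
    ]


print(to_gdal_options_sketch({"compress": "DEFLATE", "sparse_ok": False, "level": None}))
# prints ['COMPRESS=DEFLATE', 'SPARSE_OK=NO']
```

Lowercase keys are accepted because they are uppercased on output, and dictionary insertion order is preserved in the resulting list.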

merge_options(defaults, extra) #

Merge default options with user-supplied extras; extras win.

Accepts extra as either a mapping {'KEY': value} or the legacy list form ['KEY=VALUE',...] used by pyramids.dataset.Dataset.to_file. All keys in the returned dict are uppercased; None values from either source are dropped.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| defaults | CreationOptions | Baseline options (typically derived from kwargs in pyramids.dataset.engines.COG.to_cog). | required |
| extra | CreationOptions \| list[str] \| None | User-provided overrides as a mapping, list[str], or None. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] | New dict with all keys uppercased and None values removed; extra entries override defaults on conflict. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | When a legacy list-form entry lacks =. |

Examples:

  • Dict extras override defaults on conflict:
    >>> merge_options({"COMPRESS": "DEFLATE"}, {"COMPRESS": "ZSTD"})
    {'COMPRESS': 'ZSTD'}
    
  • Legacy list-of-string form is also accepted for back-compat:
    >>> merge_options({"COMPRESS": "DEFLATE"}, ["LEVEL=9"])
    {'COMPRESS': 'DEFLATE', 'LEVEL': '9'}
    
  • Passing None as extra returns a copy of the defaults:
    >>> merge_options({"COMPRESS": "DEFLATE"}, None)
    {'COMPRESS': 'DEFLATE'}
    
Source code in src/pyramids/dataset/cog/options.py
def merge_options(
    defaults: CreationOptions,
    extra: CreationOptions | list[str] | None,
) -> dict[str, Any]:
    """Merge default options with user-supplied extras; extras win.

    Accepts `extra` as either a mapping `{'KEY': value}` or the legacy
    list form `['KEY=VALUE',...]` used by
    :meth:`pyramids.dataset.Dataset.to_file`. All keys in the returned
    dict are uppercased; `None` values from either source are dropped.

    Args:
        defaults: Baseline options (typically derived from kwargs in
            :meth:`pyramids.dataset.engines.COG.to_cog`).
        extra: User-provided overrides as a mapping, `list[str]`, or
            `None`.

    Returns:
        New :class:`dict` with all keys uppercased and `None` values
        removed; `extra` entries override `defaults` on conflict.

    Raises:
        ValueError: When a legacy list-form entry lacks `=`.

    Examples:
        - Dict extras override defaults on conflict:
            ```python
            >>> merge_options({"COMPRESS": "DEFLATE"}, {"COMPRESS": "ZSTD"})
            {'COMPRESS': 'ZSTD'}

            ```
        - Legacy list-of-string form is also accepted for back-compat:
            ```python
            >>> merge_options({"COMPRESS": "DEFLATE"}, ["LEVEL=9"])
            {'COMPRESS': 'DEFLATE', 'LEVEL': '9'}

            ```
        - None extras returns a copy of the defaults:
            ```python
            >>> merge_options({"COMPRESS": "DEFLATE"}, None)
            {'COMPRESS': 'DEFLATE'}

            ```
    """
    merged: dict[str, Any] = {
        str(k).upper(): v for k, v in defaults.items() if v is not None
    }
    if extra is None:
        pass
    elif isinstance(extra, list):
        merged.update(_parse_list_extra(extra))
    else:
        merged.update({str(k).upper(): v for k, v in extra.items() if v is not None})
    return merged
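
The list branch above delegates to `_parse_list_extra`, whose source is not shown in this section. The following is a minimal sketch of what such a parser could look like, assuming it splits each entry on the first `=`, uppercases the key, and raises `ValueError` when `=` is missing. The name `parse_list_extra_sketch` and the error wording are hypothetical.

```python
from __future__ import annotations


def parse_list_extra_sketch(entries: list[str]) -> dict[str, str]:
    """Hypothetical stand-in for `_parse_list_extra` (assumed behaviour)."""
    parsed: dict[str, str] = {}
    for entry in entries:
        # partition splits on the first "=" only, so a value may contain "=".
        key, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"Expected 'KEY=VALUE', got {entry!r}")
        parsed[key.upper()] = value
    return parsed


print(parse_list_extra_sketch(["level=9", "compress=ZSTD"]))
# prints {'LEVEL': '9', 'COMPRESS': 'ZSTD'}
```

Values parsed from the list form stay strings, which matches the documented example where `["LEVEL=9"]` yields `'LEVEL': '9'`; values passed through the mapping form keep their original type.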

validate_blocksize(value) #

Raise ValueError if value is not a valid COG tile size.

The GDAL COG driver requires BLOCKSIZE to be a power of 2 in the closed range [64, 4096].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| value | int | Proposed blocksize. | required |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If value is outside the allowed set. |

Examples:

  • Valid power-of-2 blocksizes return silently:
    >>> validate_blocksize(512)
    >>> validate_blocksize(256)
    
  • Non-power-of-2 is rejected:
    >>> validate_blocksize(500) # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError: blocksize must be a power of 2 in [64, 4096]; got 500...
    
  • Out-of-range values are rejected:
    >>> validate_blocksize(32) # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError: blocksize must be a power of 2 in [64, 4096]; got 32...
    
Source code in src/pyramids/dataset/cog/options.py
def validate_blocksize(value: int) -> None:
    """Raise :class:`ValueError` if `value` is not a valid COG tile size.

    The GDAL COG driver requires `BLOCKSIZE` to be a power of 2 in
    the closed range [64, 4096].

    Args:
        value: Proposed blocksize.

    Raises:
        ValueError: If `value` is outside the allowed set.

    Examples:
        - Valid power-of-2 blocksizes return silently:
            ```python
            >>> validate_blocksize(512)
            >>> validate_blocksize(256)

            ```
        - Non-power-of-2 is rejected:
            ```python
            >>> validate_blocksize(500) # doctest: +IGNORE_EXCEPTION_DETAIL
            Traceback (most recent call last):
            ...
            ValueError: blocksize must be a power of 2 in [64, 4096]; got 500...

            ```
        - Out-of-range values are rejected:
            ```python
            >>> validate_blocksize(32) # doctest: +IGNORE_EXCEPTION_DETAIL
            Traceback (most recent call last):
            ...
            ValueError: blocksize must be a power of 2 in [64, 4096]; got 32...

            ```
    """
    if value not in _VALID_BLOCKSIZES:
        raise ValueError(
            f"blocksize must be a power of 2 in [64, 4096]; got {value}. "
            f"Valid values: {sorted(_VALID_BLOCKSIZES)}"
        )
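
The check above relies on a module-level `_VALID_BLOCKSIZES` set whose definition is not shown in this section. As a sketch, assuming it holds every power of 2 from 64 to 4096, it could be built as below; `VALID_BLOCKSIZES_SKETCH` and `is_valid_blocksize` are hypothetical names used only for illustration.

```python
# Assumed construction: every power of 2 from 2**6 (64) to 2**12 (4096).
VALID_BLOCKSIZES_SKETCH = frozenset(2**exp for exp in range(6, 13))


def is_valid_blocksize(value: int) -> bool:
    """Arithmetic equivalent: within [64, 4096] and exactly one bit set."""
    return 64 <= value <= 4096 and (value & (value - 1)) == 0


print(sorted(VALID_BLOCKSIZES_SKETCH))
# prints [64, 128, 256, 512, 1024, 2048, 4096]
```

A set lookup keeps the error message simple, since the full list of valid values can be printed directly; the bit test gives the same answer without storing the set.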

validate_option_keys(opts) #

Raise ValueError for any key not in COG_DRIVER_OPTIONS.

Keys are compared case-insensitively.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| opts | CreationOptions | Mapping of option names to values. | required |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If any key is not a recognized COG driver option. |

Examples:

  • Known keys return silently:
    >>> validate_option_keys({"COMPRESS": "DEFLATE"})
    >>> validate_option_keys({"BLOCKSIZE": 512, "BIGTIFF": "IF_SAFER"})
    
  • Unknown keys raise ValueError naming the offender:
    >>> validate_option_keys({"NONSENSE": "x"}) # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError: Unknown COG driver option(s): ['NONSENSE']...
    
  • Empty mapping is accepted:
    >>> validate_option_keys({})
    
Source code in src/pyramids/dataset/cog/options.py
def validate_option_keys(opts: CreationOptions) -> None:
    """Raise :class:`ValueError` for any key not in :data:`COG_DRIVER_OPTIONS`.

    Keys are compared case-insensitively.

    Args:
        opts: Mapping of option names to values.

    Raises:
        ValueError: If any key is not a recognized COG driver option.

    Examples:
        - Known keys return silently:
            ```python
            >>> validate_option_keys({"COMPRESS": "DEFLATE"})
            >>> validate_option_keys({"BLOCKSIZE": 512, "BIGTIFF": "IF_SAFER"})

            ```
        - Unknown keys raise ValueError naming the offender:
            ```python
            >>> validate_option_keys({"NONSENSE": "x"}) # doctest: +IGNORE_EXCEPTION_DETAIL
            Traceback (most recent call last):
            ...
            ValueError: Unknown COG driver option(s): ['NONSENSE']...

            ```
        - Empty mapping is accepted:
            ```python
            >>> validate_option_keys({})

            ```
    """
    unknown = {str(k).upper() for k in opts.keys()} - COG_DRIVER_OPTIONS
    if unknown:
        raise ValueError(
            f"Unknown COG driver option(s): {sorted(unknown)}. "
            f"Valid options: {sorted(COG_DRIVER_OPTIONS)}"
        )
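
The case-insensitive comparison comes down to uppercasing the incoming keys and taking a set difference against the allowed names. The sketch below shows that step in isolation. `COG_DRIVER_OPTIONS_SKETCH` is a small illustrative subset, since the contents of the real `COG_DRIVER_OPTIONS` are not shown in this section, and `unknown_option_keys` is a hypothetical helper name.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any

# Illustrative subset only; the real COG_DRIVER_OPTIONS is assumed to be larger.
COG_DRIVER_OPTIONS_SKETCH = frozenset({"COMPRESS", "LEVEL", "BLOCKSIZE", "BIGTIFF"})


def unknown_option_keys(opts: Mapping[str, Any]) -> list[str]:
    """Return the sorted, uppercased keys that are not in the allowed set."""
    return sorted({str(key).upper() for key in opts} - COG_DRIVER_OPTIONS_SKETCH)


print(unknown_option_keys({"compress": "DEFLATE", "Nonsense": "x"}))
# prints ['NONSENSE']
```

Sorting the difference keeps the error message deterministic, which matters for doctests that match on the reported key list.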