# NBVAL_IGNORE_OUTPUT
import matplotlib
from IPython.display import HTML
from pyramids.dataset import DatasetCollection
%matplotlib inline
Read multiple files
Reading multiple files is done in two steps:
- First, use the read_multiple_files method to parse the file names and construct the array that will later hold the values.
- Second, use the open_multi_dataset method to open all the raster files and read a specific band from each file.
read_multiple_files
The given path points to a directory that contains all the rasters we want to read. The content of the directory is as follows:
# NBVAL_IGNORE_OUTPUT
import os
path = r"../../../examples/data/geotiff/rhine"
os.listdir(path)
['Qtot_1979-01-01.tif', 'Qtot_1979-01-02.tif', 'Qtot_1979-01-03.tif', 'Qtot_1979-01-04.tif', 'Qtot_1979-01-05.tif', 'Qtot_1979-01-06.tif', 'Qtot_1979-01-07.tif', 'Qtot_1979-01-08.tif', 'Qtot_1979-01-09.tif', 'Qtot_1979-01-10.tif']
The raster names need to follow a consistent pattern so that the files can be read in a defined order. In our case, each file name contains a date; the rasters are read by that date, and the values of each one are placed at the matching position in the array.
Regex pattern
The regex_string parameter accepts any regex string and applies it to every file name to extract the substring needed to order the files. This substring can be an integer or a date.
Here are some examples of how regex_string should look for different file names:
- fname = "MSWEP_YYYY.MM.DD.tif" regex_string = r"\d{4}.\d{2}.\d{2}"
- or fname = "MSWEP_YYYY_M_D.tif" regex_string = r"\d{4}_\d{1}_\d{1}"
- if there is a number at the beginning of the name: fname = "1_MSWEP_YYYY_M_D.tif" regex_string = r"\d+"
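The date-based ordering described in this section can be tried out with the standard library alone. The sketch below is not how pyramids does it internally (that is an assumption left unverified here); it only shows what the pattern and date format used in this notebook extract from the file names listed earlier. The helper extract_date is a hypothetical name introduced for illustration.

```python
import re
from datetime import datetime

# File names taken from the directory listing, deliberately shuffled.
file_names = [
    "Qtot_1979-01-10.tif",
    "Qtot_1979-01-02.tif",
    "Qtot_1979-01-01.tif",
]

regex_string = r"\d{4}-\d{2}-\d{2}"  # pattern used in this notebook
date_fmt = "%Y-%m-%d"  # date format used in this notebook


def extract_date(file_name):
    """Hypothetical helper: return the date embedded in a file name."""
    match = re.search(regex_string, file_name)
    if match is None:
        raise ValueError(f"no date found in {file_name!r}")
    return datetime.strptime(match.group(), date_fmt)


# Sort the files chronologically by the extracted date.
ordered = sorted(file_names, key=extract_date)
print(ordered)
```

Testing a pattern this way before passing it as regex_string makes it easy to spot a file name the pattern does not match.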
m_dataset = DatasetCollection.read_multiple_files(
    path,
    with_order=True,
    regex_string=r"\d{4}-\d{2}-\d{2}",
    date=True,
    file_name_data_fmt="%Y-%m-%d",
)
Now the DatasetCollection object is created, and we can check it by printing it.
print(m_dataset)
Files: 10
Cell size: 5000.0
EPSG: 4647
Dimension: 125 * 93
Mask: 2147483647.0
open_multi_dataset
To read a specific band from each file and assign it to its location in the array, pass the band index to the open_multi_dataset method (the default band value is 0).
m_dataset.open_multi_dataset()
# NBVAL_IGNORE_OUTPUT
print(m_dataset.values)
[[[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]] [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]] [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]] ... [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]] [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]] [[0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] ... [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.] [0. 0. 0. ... 0. 0. 0.]]]
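As a rough mental model of this step, the sketch below fills a pre-allocated three-dimensional container, one time slot per file, in date order. It uses plain Python lists and made-up 2 x 3 grids instead of real rasters. The function read_band is a hypothetical stand-in for reading one band from a file; it is not part of pyramids, and the library's actual implementation may differ.

```python
NO_DATA = 2147483647.0  # mask value reported when printing the object


def read_band(file_name, band=0):
    """Hypothetical stand-in for reading one band of a raster file.

    Returns a tiny 2 x 3 grid whose values encode the day of the month.
    The band argument is accepted only to mirror the idea of a band index.
    """
    day = int(file_name[-6:-4])
    return [[float(day)] * 3 for _ in range(2)]


files = ["Qtot_1979-01-03.tif", "Qtot_1979-01-01.tif", "Qtot_1979-01-02.tif"]
ordered = sorted(files)  # ISO-style dates also sort correctly as text

# Pre-allocate (time, rows, columns) filled with the no-data value.
cube = [[[NO_DATA] * 3 for _ in range(2)] for _ in ordered]

# Place each file's band in the time slot that matches its position.
for t, name in enumerate(ordered):
    cube[t] = read_band(name, band=0)

print(len(cube), len(cube[0]), len(cube[0][0]))
```

The first index of the container is time, so cube[0] holds the grid for the earliest date.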
plot
To animate the datacube, use the plot method.
# NBVAL_IGNORE_OUTPUT
cleo = m_dataset.plot(
    exclude_value=0, text_loc=(1, 3), color_scale="linear", vmin=1, vmax=100
)
print(cleo)
Matplotlib backend set to inline for static plots in Jupyter notebook
Min: 1.0
Max: 3375.0
Exclude values: [2147483647.0, 0]
RGB: False
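One plausible reading of the summary printed above is that the minimum and maximum are taken over the cells that remain after dropping the excluded values (the no-data mask 2147483647.0 and the exclude_value of 0). The sketch below illustrates that idea with plain Python and made-up cell values; it is an assumption about the concept, not the plotting library's implementation.

```python
NO_DATA = 2147483647.0
exclude_values = [NO_DATA, 0]

# Made-up cell values, for illustration only.
cells = [0.0, 1.0, 12.5, NO_DATA, 3375.0, 0.0]

# Keep only the cells that are not in the exclusion list.
valid = [v for v in cells if v not in exclude_values]

print("Min:", min(valid))
print("Max:", max(valid))
```

Without the exclusion step, the no-data value would dominate the maximum and flatten the color scale.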
# NBVAL_IGNORE_OUTPUT
HTML(cleo.anim.to_jshtml())