Stacking Two-Dimensional Arrays for Analyzing Earthquake Data with Dask in Python 2026
Two-dimensional arrays are commonly used in earthquake analysis for spectrograms, station × time matrices, or feature matrices per event. Stacking multiple 2D arrays into a higher-dimensional structure (e.g., events × time × stations) enables efficient parallel processing across many seismic events or recording stations.
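For intuition, here is a minimal, self-contained sketch of the stacking idea using toy NumPy data (the shapes are made up for illustration):

```python
import numpy as np
import dask.array as da

# Toy example: three "events", each a (time_samples, stations) 2D array
events = [da.from_array(np.random.rand(100, 5), chunks=(50, 5)) for _ in range(3)]

# Stack along a new leading axis -> (events, time_samples, stations)
stacked = da.stack(events, axis=0)
print(stacked.shape)  # (3, 100, 5)
```

Each input array becomes one slice along the new leading axis, and its original chunking along time and stations is preserved.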
1. Stacking 2D Arrays from Multiple Events
```python
import dask.array as da
import h5py

# Collect 2D arrays from multiple earthquake events
arrays = []
with h5py.File("earthquake_data.h5", "r") as f:
    for i in range(300):  # 300 events
        # Each event has a 2D array: (time_samples, stations)
        arr = f[f"/events/event_{i}/waveforms"][:]
        darr = da.from_array(arr, chunks=(1000, 50))
        arrays.append(darr)

# Stack along a new axis -> shape becomes (events, time, stations)
stacked = da.stack(arrays, axis=0)
print("Stacked shape:", stacked.shape)  # (300, time_samples, stations)
print("Chunks:", stacked.chunks)
```
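Note that reading each dataset with `[:]` pulls every event into memory before Dask sees it. If the dataset is larger than RAM, a common alternative is to defer the reads with `dask.delayed`; the sketch below uses a hypothetical `load_event` placeholder where the real HDF5 read would go:

```python
import numpy as np
import dask
import dask.array as da

def load_event(i):
    # Placeholder loader: in practice this would open the HDF5 file
    # and return f[f"/events/event_{i}/waveforms"][:]
    return np.random.rand(2000, 50)

# Build the graph lazily: nothing is read until compute() is called
lazy_arrays = [
    da.from_delayed(
        dask.delayed(load_event)(i),
        shape=(2000, 50),  # shape and dtype must be known up front
        dtype=np.float64,
    )
    for i in range(300)
]
stacked = da.stack(lazy_arrays, axis=0)  # (300, 2000, 50)
```

With this pattern, each event is read only when a computation actually needs its chunk.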
2. Practical Analysis After Stacking
```python
# Maximum amplitude per event across all stations and time
max_amplitudes = stacked.max(axis=(1, 2)).compute()

# Mean waveform per station across all events
mean_per_station = stacked.mean(axis=0).compute()

# Rolling maximum along the time dimension for each event.
# The windowed function must preserve the array shape so that
# map_overlap can trim the overlapping regions afterwards.
from scipy.ndimage import maximum_filter1d

rolling_max = da.map_overlap(
    lambda x: maximum_filter1d(x, size=1001, axis=1),
    stacked,
    depth=(0, 500, 0),  # overlap along the time dimension
    boundary="reflect",
)
```
3. Best Practices for Stacking 2D Arrays in Earthquake Analysis (2026)
- Stack along a new leading axis (events) to keep time and station dimensions intact
- Choose chunk sizes along the time and station dimensions based on your analysis needs
- Use da.stack() when creating a new dimension from multiple 2D arrays
- Rechunk after stacking if the resulting chunks are unbalanced
- Persist the stacked array if you will perform multiple downstream analyses
- Use map_overlap() for rolling window calculations along the time dimension
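The rechunk-then-persist advice can be sketched as follows (the shapes are synthetic; in practice you would start from the stacked array built in section 1):

```python
import numpy as np
import dask.array as da

# After da.stack, each chunk typically covers a single event
stacked = da.from_array(np.random.rand(30, 200, 5), chunks=(1, 100, 5))

# Rebalance so several events share a chunk and the time axis is unsplit
rebalanced = stacked.rechunk((10, 200, 5))

# Keep the rebalanced array in memory for repeated downstream analyses
rebalanced = rebalanced.persist()
print(rebalanced.chunks)  # ((10, 10, 10), (200,), (5,))
```

Persisting pays off only when the array fits in (distributed) memory and is reused by more than one computation.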
Conclusion
Stacking two-dimensional arrays is a fundamental operation when analyzing earthquake data with Dask. By stacking spectrograms, station × time matrices, or feature matrices from multiple events, you can perform parallel computations across hundreds or thousands of seismic records efficiently. In 2026, the standard approach for large-scale seismic analysis combines careful chunking with da.stack() to build the event dimension and map_overlap() for rolling operations along time.
Next steps:
- Try stacking 2D arrays (e.g., waveforms or spectrograms) from multiple earthquake events in your dataset