HDF5 Dimension Scales #1313
Comments
[email protected] brings support for reading dimension scales.
Hi team, we have some applications that store a waveform (time array + value array), and the time array is usually not uniformly sampled, to reduce the waveform size. It would be nice to have this feature so that we can visualize the waveform.
@zhqrbitee any chance you could share a sample file with us?
@axelboc: Here is code which produces an HDF5 file that uses a dimension scale to link two datasets, then reads it back and gives matplotlib the requisite metadata to understand the data structure:

```python
#!/usr/bin/env python3
import h5py
import numpy
import matplotlib.pyplot as plt
from math import pi as π


def chirp(t: float):
    f0 = 1e4
    c = 3e8
    φ0 = 0.0
    return numpy.sin(φ0 + 2 * π * (c * t * t / 2 + f0 * t))


def create_nonuniform_timeseries():
    # N.B.: This is to emulate a more realistic goal (plotting the output of an adaptive ODE stepper)
    # without a huge amount of code:
    times = numpy.random.uniform(0.0, 1e-3/2, 10000)
    times = numpy.sort(times)
    values = chirp(times)
    with h5py.File('chirp.h5', 'w') as f:
        # Create the time dataset and add dimension scale and units
        time_ds = f.create_dataset('times', data=times)
        time_ds.attrs['units'] = 'seconds'
        time_ds.make_scale('times')
        # Create the values dataset, and attach the time dataset as its dimension scale
        values_ds = f.create_dataset('values', data=values)
        values_ds.attrs['units'] = 'dimensionless'
        values_ds.dims[0].attach_scale(time_ds)
    print("Created 'chirp.h5' with times and values datasets.")


def read_and_plot_timeseries(filename='chirp.h5'):
    with h5py.File(filename, 'r') as f:
        # Find the 'values' dataset and check its dimension scale
        values_ds = f['values']
        values = values_ds[:]
        # Iterate through attached scales to find the time dataset.
        # Iterating yields scale names; here the scale name ('times')
        # is also the name of the dataset in the file.
        scales = values_ds.dims[0]
        for scale in scales:
            time_ds = f[scale]  # Get the time dataset by its name
            times = time_ds[:]  # Read the time values
            # Now plot the data
            plt.figure(figsize=(10, 6))
            plt.plot(times, values, label='Chirp Signal')
            plt.xlabel(f"Time ({time_ds.attrs['units']})")
            plt.ylabel(f"Values ({values_ds.attrs['units']})")
            plt.title('Chirp Signal vs. Time')
            plt.grid(True)
            plt.legend()
            plt.tight_layout()
            plt.show()
            break
        else:
            print("No dimension scale found for the 'values' dataset.")


if __name__ == '__main__':
    create_nonuniform_timeseries()
    read_and_plot_timeseries()
```
It would be nice to support HDF5's dimension scales. We've received multiple feature requests relating to those, including in a couple of recent emails. Dimension scales are apparently used quite extensively in NetCDF4 files, to describe how to plot datasets (axes, units, etc.).
In his email, Jean-Christophe describes, for instance, how to create a time-based dimension scale, which could then be "attached" to a dataset, e.g. `values.dims[0].attach_scale(time)`.

More reading:

`units` attribute above to indicate that a dimension scale dataset contains relative timestamps.)
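For illustration, here is a minimal sketch of what a reader might do with an attached scale when deciding what to put on a plot axis. It assumes h5py and NumPy; `axis_for_dimension` is a hypothetical helper (not part of h5py or of this project), and the fallback to plain sample indices and the `name (units)` label format are assumptions, not agreed behaviour.

```python
import h5py
import numpy as np


def axis_for_dimension(dataset, dim=0):
    """Return (coordinates, label) for one dimension of an h5py dataset.

    Hypothetical helper: uses the first dimension scale attached to `dim`
    if there is one, otherwise falls back to plain sample indices.
    """
    scales = dataset.dims[dim].values()  # list of attached scale datasets (may be empty)
    if not scales:
        return np.arange(dataset.shape[dim]), "index"
    scale = scales[0]
    name = scale.name.rsplit("/", 1)[-1]  # last path component, e.g. "times"
    units = scale.attrs.get("units")      # optional attribute, as in the example file
    label = f"{name} ({units})" if units is not None else name
    return scale[...], label


# In-memory file (nothing is written to disk), just to exercise the helper.
with h5py.File("scales_demo.h5", "w", driver="core", backing_store=False) as f:
    times = f.create_dataset("times", data=np.array([0.0, 0.1, 0.4, 1.0]))
    times.attrs["units"] = "seconds"
    times.make_scale("times")
    values = f.create_dataset("values", data=np.array([1.0, 2.0, 3.0, 4.0]))
    values.dims[0].attach_scale(times)

    x, label = axis_for_dimension(values)
    print(label, x)
```

A dataset with no attached scale would simply get `np.arange(n)` and the label `index`, so the same plotting code could handle both cases.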