Nibabel JSON header

A draft specification of the JSON header for Nibabel.

JSON, as y'all know, encodes strings, numbers, objects and arrays. An object is like a Python dict, with strings as keys, and an array is like a Python list.

In what follows, I will build dicts and arrays corresponding to the JSON of the header. The dicts correspond to JSON objects, and lists correspond to JSON arrays. In each case, the json.dumps of the given Python object gives the corresponding JSON string.

I'll use the term field to refer to a key, value pair from a Python dict / JSON object.
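
For example (a quick illustration of the correspondence, not part of the specification), json.dumps and json.loads convert between the Python objects and the JSON text:

>>> import json
>>> json.dumps(dict(axis_names=['frequency', 'phase']))
'{"axis_names": ["frequency", "phase"]}'
>>> json.loads('{"axis_names": ["frequency", "phase"]}') == dict(axis_names=['frequency', 'phase'])
True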

General principles

We'll try to follow some general principles from the NRRD format. In particular, we encourage you, gentle user, to remove all values from headers that correspond to the default, so that written headers only express deviation from default values. Call this the delete defaults principle.

We specify image axes by name in the header, and give the correspondence of the names to the image array axes by the order of the names. This is the axis_names field at the top level of the header.

If the user transposes or otherwise reorders the axes of the data array, the header should change only in the ordering of the axis names in axis_names. Call this the "axis transpose" principle.
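
For example (a sketch of the bookkeeping only, using the axis names from the examples below), reversing the axes of the data array by a transpose means reordering axis_names in the same way:

>>> axis_names = ['frequency', 'phase', 'slice', 'time']
>>> new_order = (3, 2, 1, 0)            # as in data = np.transpose(data, new_order)
>>> [axis_names[i] for i in new_order]  # axis_names reordered to match the new axes
['time', 'slice', 'phase', 'frequency']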

The JSON header should make sense as a key, value pair store for DICOM fields using a standard way of selecting DICOM fields -- the simple DICOM principle.

The header must contain the header version

>>> hdr = dict(nipy_hdr_version='1.0')

We chose the name "nipy_hdr_version" in the hope that this would not often occur in an unrelated JSON file.

  • First version will be "1.0".
  • Versioning will use Semantic Versioning of the form major.minor[.pico[-extra]], where major, minor and pico are all integers, extra may be a string, and both pico and extra are optional. Header versions with the same major value are forwards compatible -- that is, a reader that can read a header with a particular major version should be able to read any header with that major version. Versions with different major numbers do not need to be compatible in this sense (a minimal compatibility check is sketched after this list).
  • All fields other than nipy_hdr_version are optional. hdr above is therefore the minimal valid header.
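
As a minimal sketch of the compatibility rule above (the parsing code is illustrative, not part of the specification):

>>> def major_version(version_string):
...     """Return the integer major version from a 'major.minor[.pico[-extra]]' string."""
...     return int(version_string.split('.')[0])
>>> major_version('1.0')
1
>>> major_version(hdr['nipy_hdr_version']) == 1  # a reader for major version 1 can read this header
True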

The header will usually contain image metadata fields

The base level header will usually also have image metadata fields giving information about the whole image. A field is an "image metadata field" if it is defined at the top level of the header. For example:

>>> hdr = dict(nipy_hdr_version='1.0',
...            Manufacturer="SIEMENS")

All image metadata fields are optional.

As for all keys in this standard, image metadata (IM) keys are case sensitive. IM keys that begin with a capital letter must be short names from the DICOM data dictionary standard. Call these "DICOM IM keys". This is to conform to the simple DICOM principle.

Keys beginning with "extended" will be read and written, but not further processed by a header reader / writer. If you want to put extra fields into the header that are outside this standard, you could use a dict / object of the form:

>>> hdr = dict(nipy_hdr_version='1.0',
...            extended=dict(my_field1=0.1, my_field2='a string'))

or:

>>> hdr = dict(nipy_hdr_version='1.0',
...            extended_mysoft=dict(mysoft_one='expensive', mysoft_two=1000))

Values for DICOM IM keys are constrained by the DICOM standard. This standard constrains values for ("nipy_hdr_version", "axis_names", "axis_metadata"). Other values have no constraint.
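
For example, a reader might partition header keys into keys from this standard, DICOM IM keys, and extended keys (an illustrative sketch, reusing the example keys above):

>>> spec_keys = ('nipy_hdr_version', 'axis_names', 'axis_metadata')
>>> hdr = dict(nipy_hdr_version='1.0',
...            Manufacturer='SIEMENS',
...            extended_mysoft=dict(mysoft_one='expensive'))
>>> sorted(key for key in hdr if key in spec_keys)
['nipy_hdr_version']
>>> sorted(key for key in hdr if key[0].isupper())  # DICOM IM keys
['Manufacturer']
>>> sorted(key for key in hdr if key.startswith('extended'))  # read / written but not processed
['extended_mysoft']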

Questions:

  • Should all DICOM values be allowed?
  • Should DICOM values be allowed at this level that in fact refer to a particular axis, and therefore might go in the axis_metadata elements?
  • How should we relate the DICOM standard values to JSON? For example, how should we store dates and times?

The header will usually contain axis names

axis_names is a list of strings corresponding to the axes of the image data to which the header refers.

>>> hdr = dict(nipy_hdr_version='1.0',
...            axis_names=["frequency", "phase", "slice", "time"])

The names must be valid Python identifiers (they must not begin with a digit, nor contain spaces, etc.).

There must be the same number of names as axes in the image to which the header refers. For example, the header above is valid for a 4D image but invalid for a 3D or 5D image.

The names appear in the fastest-to-slowest order in which the image data is stored on disk. The first name in axis_names corresponds to the axis over which the data on disk varies fastest, and the last corresponds to the axis over which the data varies slowest.

For a Nifti image, nibabel (and nipy) will create an image where the axes have this same fastest to slowest ordering in memory. For example, let's say the read image is called img. img has shape (4, 5, 6, 10), and a 2-byte datatype such as int16. In the case of the nifti default fastest-slowest ordered array, the distance in memory between img[0, 0, 0, 0] and img[1, 0, 0, 0] is 2 bytes, and the distance between img[0, 0, 0, 0] and img[0, 0, 0, 1] is 4 * 5 * 6 * 2 = 240 bytes. The names in axis_names will then refer to the first, second, third and fourth axes respectively. In the example above, "frequency" is the first axis and "time" is the last.
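
To make the byte distances concrete, here is a sketch using a NumPy array with the same fastest-to-slowest memory layout as the example image (illustrative only; a real image array would come from nibabel):

>>> import numpy as np
>>> data = np.zeros((4, 5, 6, 10), dtype=np.int16, order='F')  # fastest-to-slowest layout
>>> data.strides  # byte step along each axis; the first ('frequency') axis varies fastest
(2, 8, 40, 240)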

axis_names is optional iff axis_metadata is empty or absent. Otherwise, the set() of axis_names must be a superset of the union of all axis names specified in the applies_to fields of axis_metadata elements.
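
A sketch of this superset rule (illustrative only):

>>> axis_names = ['frequency', 'phase', 'slice', 'time']
>>> axis_metadata = [dict(applies_to=['time']),
...                  dict(applies_to=['slice', 'time'])]
>>> used_names = set()
>>> for element in axis_metadata:
...     used_names.update(element['applies_to'])
>>> set(axis_names).issuperset(used_names)
True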

The header will often contain axis data

axis_metadata is a list of axis data elements.

Each axis data element in the axis_metadata list gives data that applies to a particular axis, or combination of axes. axis_metadata can be empty:

>>> hdr['axis_metadata'] = []

but we prefer you delete this section if empty, following the delete defaults principle.

The axis data element

An axis data element must contain a field applies_to, with a value that is a list that contains one or more values from axis_names. From the above example, the following would be valid axis data elements:

>>> element = dict(applies_to=['time'])
>>> element = dict(applies_to=['slice'])
>>> element = dict(applies_to=['slice', 'time'])

The element will usually also have axis metadata fields. For example:

>>> element = dict(applies_to=['time'],
...                TimeSliceVector=[0, 2, 4])

As for image metadata keys, keys that begin with a capital letter are DICOM standard short names.

A single axis name for applies_to specifies that any axis metadata values in the element apply to the named axis.

In this case, axis metadata values may be:

  • a scalar. The value applies to every point along the corresponding image axis.
  • a vector of length N (where N is the length of the corresponding image axis). Value $v_i$ in the vector $v$ corresponds to the image slice at point $i$ on the corresponding axis.
  • an array of shape (N, ...) or (1, ...), where "..." can be any further shape. The (N, ...) case gives N vectors or arrays, with one vector / array corresponding to each point on the image axis. The (1, ...) case gives a single vector or array corresponding to every point on the image axis (see the sketch after this list).
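
As a quick illustration of these cases, with a slice axis of length N == 6 as in the example image (the values themselves are arbitrary):

>>> import numpy as np
>>> np.array(2.0).shape                 # a scalar: applies to every slice
()
>>> np.array([0, 2, 4, 1, 3, 5]).shape  # one value per slice
(6,)
>>> np.zeros((6, 3)).shape              # one length-3 vector per slice
(6, 3)
>>> np.zeros((1, 3)).shape              # a single length-3 vector applying to every slice
(1, 3)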

More than one axis name for applies_to specifies that any values in the element apply to the combination of the given axes.

In the case of more than one axis for applies_to, the axis metadata values apply to the cartesian product of the image axis values. For example, if the values of applies_to == ['slice', 'time'], and the slice and time axes in the array are lengths (6, 10) respectively, then the values apply to all combinations of the 6 possible values for slice indices and the 10 possible values for the time indices. The axis metadata values in this case can be:

  • a scalar. The value applies to every combination of (slice, time)
  • an array of shape (S, T) (where S is the length of the slice axis and T is the length of the time axis). Value $a_{i,j}$ in the array $a$ corresponds to the image slice at point $i$ on the slice axis and $j$ on the time axis.
  • an array of shape (S, T, ...) or (1, 1, ...), where "..." can be any further shape. The (S, T, ...) case gives S * T vectors or arrays, with one vector / array corresponding to each combination of (slice, time) points in the image. The (1, 1, ...) case gives a single vector / array corresponding to every (slice, time) point in the image.
  • In the spirit of numpy array broadcasting, we also allow a value array for (slice, time) to be of shape (S, 1, ...) or (1, T, ...), where the arrays give (respectively): one vector / array for each point of the slice axis, but applying to every value of the time axis, or; one vector / array for each point of the time axis, but applying to every point of the slice axis. Of course these may better go as axis metadata for the slice and time axes respectively. See the q_vector section below for an example where this kind of specification can be useful.

In general, for a given value applies_to, we can take the corresponding axis lengths:

>>> shape_of_image = [4, 5, 6, 10]
>>> image_names = ['frequency', 'phase', 'slice', 'time']
>>> applies_to = ['slice', 'time']
>>> axis_indices = [image_names.index(name) for name in applies_to]
>>> axis_lengths = [shape_of_image[i] for i in axis_indices]
>>> axis_lengths
[6, 10]

The axis metadata value can therefore be of shape:

  • () (a scalar -- a single value applying to every combination of points)
  • axis_lengths (a scalar value for each combination of points)
  • [1 for i in axis_lengths] + any_other_list (a single array or vector for every combination of points, where the shape of the array or vector is given by any_other_list)
  • axis_lengths + any_other_list (an array or vector corresponding to each combination of points, where the shape of the array or vector is given by any_other_list)
  • axis_lengths_or_1 + any_other_list -- where axis_lengths_or_1 is a vector that, at each position i, contains either 1 or axis_lengths[i]. The array or vector implied by the any_other_list indices applies to every point along axis i where axis_lengths_or_1[i] == 1, and to each separate point along axis i where axis_lengths_or_1[i] == axis_lengths[i]. Obviously the case of [1 for i in axis_lengths] + any_other_list is a specific case in this category (a validation sketch follows this list).
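
Putting these rules together, here is a sketch of a shape check a reader might apply (illustrative only; the function is not part of the specification):

>>> def value_shape_ok(value_shape, axis_lengths):
...     """True if `value_shape` is valid for axes of length `axis_lengths`."""
...     if value_shape == ():  # a scalar applies to every combination of points
...         return True
...     leading = value_shape[:len(axis_lengths)]
...     if len(leading) < len(axis_lengths):
...         return False
...     # each leading entry must be 1 (broadcast) or the matching axis length
...     return all(v in (1, length) for v, length in zip(leading, axis_lengths))
>>> value_shape_ok((), [6, 10])          # a scalar
True
>>> value_shape_ok((6, 10), [6, 10])     # one scalar per (slice, time) combination
True
>>> value_shape_ok((1, 10, 3), [6, 10])  # one vector per time point, for every slice
True
>>> value_shape_ok((5, 10), [6, 10])     # wrong length for the slice axis
False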

The q_vector axis metadata field

We define an axis metadata field q_vector, which gives the q vector corresponding to the diffusion gradients applied.

The q_vector should apply to (applies_to) four axes, of which three are the 3D spatial dimensions, and the fourth corresponds to image volume.

For example:

>>> import numpy as np
>>> element = dict(applies_to=['frequency', 'phase', 'slice', 'time'],
...                q_vector = [[[[
...                              [0, 0, 0],
...                              [1000, 0, 0],
...                              [0, 1000, 0],
...                              [0, 0, 1000],
...                              [0, 0, 0],
...                              [1000, 0, 0],
...                              [0, 1000, 0],
...                              [0, 0, 1000],
...                              [0, 0, 0],
...                              [1000, 0, 0]
...                            ]]]])
>>> np.array(element['q_vector']).shape
(1, 1, 1, 10, 3)

We specify the spatial axes in applies_to because of the axis transpose principle -- we need to know which axes the vector directions correspond to, to be robust to the case of an image transpose.

An individual (3,) vector is the unit vector expressing the direction of the gradient, multiplied by the scalar b value of the gradient. In the example, there are three b == 0 scans (corresponding to volumes 0, 4, 8), with the rest having b value of 1000.

The first value corresponds to the direction along the first named image axis ('frequency'), the second value to direction along the second named axis ('phase'), and the third corresponds to 'slice'.

Note that the q_vector is always specified in the axes of the image. This is the same convention as FSL uses for the bvals, bvecs files.
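
For example, a sketch of recovering FSL-style bvals and bvecs from the q_vector element above (illustrative only; this recipe is not itself part of the specification):

>>> q = np.array(element['q_vector']).reshape((-1, 3))  # drop the length-1 spatial axes
>>> bvals = np.sqrt((q ** 2).sum(axis=1))               # the b value is the length of each q vector
>>> [int(b) for b in bvals]
[0, 1000, 1000, 1000, 0, 1000, 1000, 1000, 0, 1000]
>>> bvecs = np.zeros(q.shape)
>>> nonzero = bvals > 0
>>> bvecs[nonzero] = q[nonzero] / bvals[nonzero][:, None]  # unit gradient directions
>>> [int(v) for v in bvecs[1]]  # direction for the second volume
[1, 0, 0]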
