API Reference

MLArray Module

mlarray.mlarray.MLArray

MLArray(array: Optional[Union[ndarray, str, Path]] = None, spacing: Optional[Union[List, Tuple, ndarray]] = None, origin: Optional[Union[List, Tuple, ndarray]] = None, direction: Optional[Union[List, Tuple, ndarray]] = None, meta: Optional[Union[Dict, Meta]] = None, channel_axis: Optional[int] = None, num_threads: int = 1, copy: Optional[MLArray] = None)

Initializes an MLArray instance.

The MLArray file format (".mla") is a Blosc2-compressed container with standardized metadata support for N-dimensional medical images.

Parameters:

Name Type Description Default
array Optional[Union[ndarray, str, Path]]

Input data or file path. Use a numpy ndarray for in-memory arrays, or a string/Path to load a ".b2nd" or ".mla" file. If None, an empty MLArray instance is created.

None
spacing Optional[Union[List, Tuple, ndarray]]

Spacing per spatial axis. Provide a list/tuple/ndarray with length equal to the number of spatial dimensions (e.g., [sx, sy, sz]).

None
origin Optional[Union[List, Tuple, ndarray]]

Origin per axis. Provide a list/tuple/ndarray with length equal to the number of spatial dimensions.

None
direction Optional[Union[List, Tuple, ndarray]]

Direction cosine matrix. Provide a 2D list/tuple/ndarray with shape (ndims, ndims) for spatial dimensions.

None
meta Optional[Union[Dict, Meta]]

Free-form metadata dictionary or Meta instance. Must be JSON-serializable when saving. If meta is passed as a dict, it is converted internally into a Meta object, with the dict interpreted as meta.image metadata.

None
channel_axis Optional[int]

Axis index that represents channels in the array (e.g., 0 for CHW or -1 for HWC). If None, the array is treated as purely spatial.

None
num_threads int

Number of threads for Blosc2 operations.

1
copy Optional[MLArray]

Another MLArray instance to copy metadata fields from. If provided, its metadata overrides any metadata set via arguments.

None

affine property

affine: ndarray

Computes the affine transformation matrix for the image.

Returns:

Name Type Description
list ndarray

Affine matrix with shape (ndims + 1, ndims + 1), or None if no array is loaded.
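To make the (ndims + 1, ndims + 1) shape concrete, here is a minimal sketch of how a homogeneous affine is commonly composed from spacing, origin, and a direction cosine matrix. It assumes the usual medical-imaging convention (direction columns scaled by spacing, origin in the last column); the helper name build_affine is hypothetical and this is not taken from the MLArray implementation.

```python
def build_affine(spacing, origin, direction):
    """Compose a homogeneous affine from spacing, origin, and direction cosines.

    Assumed convention: affine[:n, :n] = direction @ diag(spacing),
    affine[:n, n] = origin. MLArray itself may use a different convention.
    """
    n = len(spacing)
    affine = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            # Column j of the direction matrix is scaled by the spacing of axis j.
            affine[i][j] = direction[i][j] * spacing[j]
        # The origin fills the last column (translation part).
        affine[i][n] = float(origin[i])
    # Bottom-right entry of a homogeneous matrix is 1.
    affine[n][n] = 1.0
    return affine


affine = build_affine(
    spacing=[1.0, 1.0, 2.5],
    origin=[10.0, -5.0, 0.0],
    direction=[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
)
```

For a 3D image this yields a 4 x 4 matrix whose diagonal carries the spacing and whose last column carries the origin.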

direction property

direction

Returns the image direction.

Returns:

Name Type Description
list

Direction cosine matrix with shape (ndims, ndims).

dtype property

dtype

Returns the dtype of the array.

Returns:

Type Description

np.dtype: Dtype of the underlying array, or None if no array is loaded.

ndim property

ndim: int

Returns the number of dimensions of the array.

Returns:

Name Type Description
int int

Number of dimensions, or None if no array is loaded.

origin property

origin

Returns the image origin.

Returns:

Name Type Description
list

Origin per spatial axis with length equal to the number of spatial dimensions.

rotation property

rotation

Extracts the rotation matrix from the affine matrix.

Returns:

Name Type Description
list

Rotation matrix with shape (ndims, ndims), or None if no array is loaded.

scale property

scale

Extracts the scaling factors from the affine matrix.

Returns:

Name Type Description
list

Scaling factors per axis with length equal to the number of spatial dimensions, or None if no array is loaded.

shape property

shape

Returns the shape of the array.

Returns:

Name Type Description
tuple

Shape of the underlying array, or None if no array is loaded.

shear property

shear

Computes the shear matrix from the affine matrix.

Returns:

Name Type Description
list

Shear matrix with shape (ndims, ndims), or None if no array is loaded.

spacing property

spacing

Returns the image spacing.

Returns:

Name Type Description
list

Spacing per spatial axis with length equal to the number of spatial dimensions.

translation property

translation

Extracts the translation vector from the affine matrix.

Returns:

Name Type Description
list

Translation vector with length equal to the number of spatial dimensions, or None if no array is loaded.
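The translation, scale, and rotation properties all derive from the affine matrix. The following is a rough sketch of one standard decomposition, assuming a shear-free affine with non-zero scale on every axis; decompose_affine is a hypothetical helper and may not match how MLArray computes these values.

```python
import math


def decompose_affine(affine):
    """Split a shear-free homogeneous affine into translation, scale, rotation."""
    n = len(affine) - 1
    # Translation is the last column, excluding the homogeneous row.
    translation = [affine[i][n] for i in range(n)]
    # Scale of axis j is the Euclidean norm of column j of the upper-left block.
    scale = [math.sqrt(sum(affine[i][j] ** 2 for i in range(n))) for j in range(n)]
    # Dividing each column by its scale leaves the rotation (direction) part.
    rotation = [[affine[i][j] / scale[j] for j in range(n)] for i in range(n)]
    return translation, scale, rotation


# 2D example: 90-degree rotation, spacing (3, 2), origin (5, 7).
affine = [
    [0.0, -2.0, 5.0],
    [3.0, 0.0, 7.0],
    [0.0, 0.0, 1.0],
]
translation, scale, rotation = decompose_affine(affine)
```

If the affine contains shear, the normalized columns are no longer orthogonal and a proper decomposition (for example via QR or polar factorization) would be needed instead.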

close

close()

Flush metadata and close the underlying store.

After closing, the MLArray instance no longer has an attached array.

comp_blosc2_params

comp_blosc2_params(image_size: Union[Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int]], patch_size: Union[Tuple[int, int], Tuple[int, int, int]], channel_axis: Optional[int] = None, bytes_per_pixel: int = 4, l1_cache_size_per_core_in_bytes: int = 32768, l3_cache_size_per_core_in_bytes: int = 1441792, safety_factor: float = 0.8)

Computes a recommended block and chunk size for saving arrays with Blosc2.

Blosc2 NDIM documentation: "Having a second partition allows for greater flexibility in fitting different partitions to different CPU cache levels. Typically, the first partition (also known as chunks) should be sized to fit within the L3 cache, while the second partition (also known as blocks) should be sized to fit within the L2 or L1 caches, depending on whether the priority is compression ratio or speed." (Source: https://www.blosc.org/posts/blosc2-ndim-intro/)

Our approach is not fully optimized for this yet. Currently, we aim to fit the uncompressed block within the L1 cache and accept that it may occasionally spill over into L2.

Note: This configuration is specifically optimized for nnU-Net data loading, where each read operation is performed by a single core, so multi-threading is not an option.

The default cache values are based on an older Intel 4110 CPU with 32KB L1, 128KB L2, and 1408KB L3 cache per core. We haven't further optimized for modern CPUs with larger caches, as our data must still be compatible with the older systems.

Parameters:

Name Type Description Default
image_size Union[Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int]]

Image shape. Use a 2D, 3D, or 4D size; 2D/3D inputs are internally expanded to 4D (with channels first).

required
patch_size Union[Tuple[int, int], Tuple[int, int, int]]

Patch size for spatial dimensions. Use a 2-tuple (x, y) or 3-tuple (x, y, z).

required
channel_axis Optional[int]

Axis index for channels in the original array. If set, the size is moved to channels-first for cache calculations.

None
bytes_per_pixel int

Number of bytes per element. Defaults to 4 for float32.

4
l1_cache_size_per_core_in_bytes int

L1 cache per core in bytes.

32768
l3_cache_size_per_core_in_bytes int

L3 cache per core in bytes.

1441792
safety_factor float

Safety factor to avoid filling caches.

0.8

Returns:

Type Description

Tuple[List[int], List[int]]: Recommended chunk size and block size.
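The cache-budget idea behind the block size can be illustrated with a simplified sketch: start from the patch size and halve the largest axis until the uncompressed block fits within the L1 budget (cache size times safety factor). This is an illustration only, with a hypothetical helper name; the actual comp_blosc2_params logic also handles channels and chunk sizes and may differ in detail.

```python
import math


def fit_block_to_cache(patch_size, bytes_per_pixel=4,
                       cache_size_in_bytes=32768, safety_factor=0.8):
    """Shrink a block until its uncompressed size fits the cache budget."""
    budget = cache_size_in_bytes * safety_factor
    block = list(patch_size)
    # Halve the largest axis (rounding up) until the block fits,
    # or until every axis has length 1.
    while math.prod(block) * bytes_per_pixel > budget and max(block) > 1:
        largest = block.index(max(block))
        block[largest] = (block[largest] + 1) // 2
    return block


block = fit_block_to_cache((192, 192, 192))
# With the defaults (32 KiB L1, factor 0.8, float32) the budget is
# 26214.4 bytes, and the loop stops at [12, 12, 24] (13824 bytes).
```

Halving the largest axis first keeps the block close to isotropic, which tends to suit patch-based reads.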

load classmethod

load(filepath: Union[str, Path], num_threads: int = 1)

Loads a Blosc2-compressed file. Both MLArray ('.mla') and Blosc2 ('.b2nd') files are supported.

WARNING

MLArray supports both ".b2nd" and ".mla" files. The MLArray format standard and standardized metadata are honored only for ".mla". For ".b2nd", metadata is ignored when loading.

Parameters:

Name Type Description Default
filepath Union[str, Path]

Path to the Blosc2 file to be loaded. The filepath needs to have the extension ".b2nd" or ".mla".

required
num_threads int

Number of threads to use for loading the file.

1

Raises:

Type Description
RuntimeError

If the file extension is not ".b2nd" or ".mla".
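The extension rule described above can be sketched as a small check. The helper below is hypothetical (it is not part of MLArray) and assumes a case-sensitive suffix comparison; it returns whether standardized metadata would be honored for the given path.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".mla", ".b2nd"}


def check_extension(filepath):
    """Return True if standardized metadata applies (".mla"), False for ".b2nd".

    Raises RuntimeError for any other extension, mirroring the documented
    behavior. The case-sensitive comparison is an assumption.
    """
    suffix = Path(filepath).suffix
    if suffix not in SUPPORTED_SUFFIXES:
        raise RuntimeError(
            f"Unsupported extension {suffix!r}; expected '.b2nd' or '.mla'."
        )
    return suffix == ".mla"


honors_metadata = check_extension("scan.mla")
```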

open classmethod

open(filepath: Union[str, Path], shape: Optional[Union[List, Tuple, ndarray]] = None, dtype: Optional[dtype] = None, channel_axis: Optional[int] = None, mmap: str = 'r', patch_size: Optional[Union[int, List, Tuple]] = 'default', chunk_size: Optional[Union[int, List, Tuple]] = None, block_size: Optional[Union[int, List, Tuple]] = None, num_threads: int = 1, cparams: Optional[Dict] = None, dparams: Optional[Dict] = None)

Open an existing Blosc2 file or create a new one with memory mapping.

This method supports both MLArray (".mla") and plain Blosc2 (".b2nd") files. When creating a new file, both shape and dtype must be provided.

WARNING

MLArray supports both ".b2nd" and ".mla" files. The MLArray format standard and standardized metadata are honored only for ".mla". For ".b2nd", metadata is ignored when loading.

Parameters:

Name Type Description Default
filepath Union[str, Path]

Target file path. Must end with ".b2nd" or ".mla".

required
shape Optional[Union[List, Tuple, ndarray]]

Shape of the array to create. If provided, a new file is created. Length must match the full array dimensionality (including channels if present).

None
dtype Optional[dtype]

Numpy dtype for a newly created array.

None
channel_axis Optional[int]

Axis index for channels in the array. Used for patch/chunk/block calculations.

None
mmap str

Blosc2 mmap mode. One of "r", "r+", "w+", "c".

'r'
patch_size Optional[Union[int, List, Tuple]]

Patch size hint for chunk/block optimization. Provide an int for isotropic sizes or a list/tuple with length equal to the number of spatial dimensions. Use "default" to use the default patch size of 192.

'default'
chunk_size Optional[Union[int, List, Tuple]]

Explicit chunk size. Provide an int or tuple/list with length equal to the array dimensions. Ignored when patch_size is not None.

None
block_size Optional[Union[int, List, Tuple]]

Explicit block size. Provide an int or tuple/list with length equal to the array dimensions. Ignored when patch_size is not None.

None
num_threads int

Number of threads for Blosc2 operations.

1
cparams Optional[Dict]

Blosc2 compression parameters.

None
dparams Optional[Dict]

Blosc2 decompression parameters.

None

Returns:

Name Type Description
MLArray

The current instance (for chaining).

Raises:

Type Description
RuntimeError

If the file extension is invalid, if shape/dtype are inconsistent, or if mmap mode is invalid for creation.
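The accepted forms of patch_size (an int, a list/tuple, "default", or None) can be normalized as in the sketch below. The helper name and the ValueError are assumptions for illustration; only the default of 192 and the accepted input forms come from the parameter description above.

```python
DEFAULT_PATCH_SIZE = 192  # default patch size stated in the parameter description


def normalize_patch_size(patch_size, num_spatial_dims):
    """Expand a patch-size hint into one value per spatial dimension."""
    if patch_size is None:
        # No hint: explicit chunk_size/block_size would apply instead.
        return None
    if patch_size == "default":
        patch_size = DEFAULT_PATCH_SIZE
    if isinstance(patch_size, int):
        # An int means the same (isotropic) size along every spatial axis.
        return [patch_size] * num_spatial_dims
    patch_size = list(patch_size)
    if len(patch_size) != num_spatial_dims:
        raise ValueError(
            f"Expected {num_spatial_dims} patch sizes, got {len(patch_size)}."
        )
    return patch_size


default_patch = normalize_patch_size("default", 3)
```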

save

save(filepath: Union[str, Path], patch_size: Optional[Union[int, List, Tuple]] = 'default', chunk_size: Optional[Union[int, List, Tuple]] = None, block_size: Optional[Union[int, List, Tuple]] = None, num_threads: int = 1, cparams: Optional[Dict] = None, dparams: Optional[Dict] = None)

Saves the array to a Blosc2-compressed file. Both MLArray ('.mla') and Blosc2 ('.b2nd') files are supported.

WARNING

MLArray supports both ".b2nd" and ".mla" files. The MLArray format standard and standardized metadata are honored only for ".mla". For ".b2nd", metadata is ignored when saving.

Parameters:

Name Type Description Default
filepath Union[str, Path]

Path to save the file. Must end with ".b2nd" or ".mla".

required
patch_size Optional[Union[int, List, Tuple]]

Patch size hint for chunk/block optimization. Provide an int for isotropic sizes or a list/tuple with length equal to the number of dimensions. Use "default" to use the default patch size of 192.

'default'
chunk_size Optional[Union[int, List, Tuple]]

Explicit chunk size. Provide an int or a tuple/list with length equal to the number of dimensions, or None to let Blosc2 decide. Ignored when patch_size is not None.

None
block_size Optional[Union[int, List, Tuple]]

Explicit block size. Provide an int or a tuple/list with length equal to the number of dimensions, or None to let Blosc2 decide. Ignored when patch_size is not None.

None
num_threads int

Number of threads to use for saving the file.

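1
cparams Optional[Dict]

Blosc2 compression parameters.

None
dparams Optional[Dict]

Blosc2 decompression parameters.

None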

Raises:

Type Description
RuntimeError

If the file extension is not ".b2nd" or ".mla".

to_numpy

to_numpy()

Return the underlying data as a NumPy array.

Returns:

Type Description

np.ndarray: A NumPy view or copy of the stored array data.

Raises:

Type Description
TypeError

If no array data is loaded.

Metadata Module

mlarray.meta.BaseMeta dataclass

BaseMeta()

Base class for metadata containers.

Subclasses should implement _validate_and_cast to coerce and validate fields after initialization or mutation.

copy_from

copy_from(other: T, *, overwrite: bool = False) -> None

Copy fields from another instance of the same class.

Parameters:

Name Type Description Default
other T

Source instance.

required
overwrite bool

When True, overwrite all fields. When False, only fill destination fields that are "unset" (None or empty containers). Nested BaseMeta fields are merged recursively unless the entire destination sub-meta is default, in which case it is replaced.

False

Raises:

Type Description
TypeError

If other is not the same class as self.
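The difference between overwrite=True and the default fill-only behavior can be sketched with a small stand-in dataclass. DemoMeta and _is_unset are hypothetical; the sketch skips the recursive merge of nested metadata and copies values by reference, so it only approximates the documented semantics.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


def _is_unset(value):
    """Treat None and empty containers as unset."""
    return value is None or (
        isinstance(value, (list, dict, tuple, set)) and len(value) == 0
    )


@dataclass
class DemoMeta:
    spacing: Optional[List[float]] = None
    origin: Optional[List[float]] = None
    tags: List[str] = field(default_factory=list)

    def copy_from(self, other, *, overwrite=False):
        if type(other) is not type(self):
            raise TypeError("other must be the same class as self")
        for f in fields(self):
            # Fill only unset fields unless overwrite is requested.
            if overwrite or _is_unset(getattr(self, f.name)):
                setattr(self, f.name, getattr(other, f.name))


dst = DemoMeta(spacing=[1.0, 1.0])
src = DemoMeta(spacing=[2.0, 2.0], origin=[0.0, 0.0], tags=["ct"])
dst.copy_from(src)  # spacing is kept, origin and tags are filled in
```

Calling dst.copy_from(src, overwrite=True) afterwards would also replace spacing with the source value.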

ensure classmethod

ensure(x: Any) -> T

Coerce x into an instance of cls.

Parameters:

Name Type Description Default
x Any

None, an instance of cls, or a mapping of fields.

required

Returns:

Type Description
T

An instance of cls.

Raises:

Type Description
TypeError

If x is not None, cls, or a mapping.

from_mapping classmethod

from_mapping(d: Mapping[str, Any]) -> T

Construct an instance from a mapping.

Parameters:

Name Type Description Default
d Mapping[str, Any]

Input mapping matching dataclass field names.

required

Returns:

Type Description
T

A new instance of cls.

Raises:

Type Description
TypeError

If d is not a Mapping.

KeyError

If unknown keys are present.
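A minimal sketch of this validation pattern, using a hypothetical stand-in dataclass rather than the real BaseMeta: non-mappings raise TypeError, and keys that do not match a dataclass field raise KeyError.

```python
from collections.abc import Mapping
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DemoStats:
    min: Optional[float] = None
    max: Optional[float] = None

    @classmethod
    def from_mapping(cls, d):
        if not isinstance(d, Mapping):
            raise TypeError(f"Expected a mapping, got {type(d).__name__}")
        known = {f.name for f in fields(cls)}
        unknown = set(d) - known
        if unknown:
            # Reject keys that do not correspond to a dataclass field.
            raise KeyError(f"Unknown keys: {sorted(unknown)}")
        return cls(**d)


stats = DemoStats.from_mapping({"min": 0.0, "max": 1.0})
```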

is_default

is_default() -> bool

Return True if this equals a default-constructed instance.

reset

reset() -> None

Reset all fields to their defaults, or to None where no default is defined.

to_mapping

to_mapping(*, include_none: bool = True) -> Dict[str, Any]

Serialize to a mapping, recursively expanding nested BaseMeta.

Parameters:

Name Type Description Default
include_none bool

Include fields with None values when True.

True

Returns:

Type Description
Dict[str, Any]

A dict of field names to serialized values.

to_plain

to_plain(*, include_none: bool = False) -> Any

Convert to plain Python objects recursively.

Parameters:

Name Type Description Default
include_none bool

Include fields with None values when True.

False

Returns:

Type Description
Any

A dict of field values, with nested BaseMeta expanded. SingleKeyBaseMeta overrides this to return its wrapped value.

mlarray.meta.SingleKeyBaseMeta dataclass

SingleKeyBaseMeta()

Bases: BaseMeta

BaseMeta subclass that wraps a single field as a raw value.

value property writable

value: Any

Return the wrapped value.

ensure classmethod

ensure(x: Any) -> SK

Coerce input into an instance of cls.

Parameters:

Name Type Description Default
x Any

None, instance of cls, mapping, or raw value.

required

Returns:

Type Description
SK

An instance of cls.

from_mapping classmethod

from_mapping(d: Any) -> SK

Construct from either schema-shaped mapping or raw value.

Parameters:

Name Type Description Default
d Any

None, mapping, or raw value.

required

Returns:

Type Description
SK

A new instance of cls.

set

set(v: Any) -> None

Set the wrapped value.

to_mapping

to_mapping(*, include_none: bool = True) -> Dict[str, Any]

Serialize to a mapping with the single key.

Parameters:

Name Type Description Default
include_none bool

Include the key when the value is None.

True

Returns:

Type Description
Dict[str, Any]

A dict with the single field name as the key, or an empty dict.

to_plain

to_plain(*, include_none: bool = False) -> Any

Return the wrapped value for plain output.

Parameters:

Name Type Description Default
include_none bool

Return None when the value is None.

False

Returns:

Type Description
Any

The wrapped value or None.
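The contrast between to_mapping (keyed output) and to_plain (raw value) for a single-field wrapper can be sketched as follows. DemoIsSeg is a hypothetical stand-in written for illustration, not the library's MetaIsSeg.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DemoIsSeg:
    is_seg: Optional[bool] = None

    def _key(self):
        # The single dataclass field doubles as the mapping key.
        return fields(self)[0].name

    def to_mapping(self, *, include_none=True):
        value = getattr(self, self._key())
        if value is None and not include_none:
            return {}
        return {self._key(): value}

    def to_plain(self, *, include_none=False):
        # Plain output is the wrapped value itself (None when unset),
        # so include_none does not change the result in this sketch.
        return getattr(self, self._key())


flag = DemoIsSeg(is_seg=True)
as_mapping = flag.to_mapping()  # {"is_seg": True}
as_plain = flag.to_plain()      # True
```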

mlarray.meta.Meta dataclass

Meta(original: 'MetaOriginal' = (lambda: MetaOriginal())(), extra: 'MetaExtra' = (lambda: MetaExtra())(), spatial: 'MetaSpatial' = (lambda: MetaSpatial())(), stats: 'MetaStatistics' = (lambda: MetaStatistics())(), bbox: 'MetaBbox' = (lambda: MetaBbox())(), is_seg: 'MetaIsSeg' = (lambda: MetaIsSeg())(), _blosc2: 'MetaBlosc2' = (lambda: MetaBlosc2())(), _has_array: 'MetaHasArray' = (lambda: MetaHasArray())(), _image_meta_format: 'MetaImageFormat' = (lambda: MetaImageFormat())(), _mlarray_version: 'MetaVersion' = (lambda: MetaVersion())())

Bases: BaseMeta

Top-level metadata container for mlarray.

Attributes:

Name Type Description
original 'MetaOriginal'

Image metadata from the original source (JSON-serializable dict).

extra 'MetaExtra'

Additional metadata (JSON-serializable dict).

spatial 'MetaSpatial'

Spatial metadata (spacing, origin, direction, shape).

stats 'MetaStatistics'

Summary statistics.

bbox 'MetaBbox'

Bounding boxes.

is_seg 'MetaIsSeg'

Segmentation flag.

_blosc2 'MetaBlosc2'

Blosc2 chunking/tiling metadata.

_has_array 'MetaHasArray'

Payload presence flag.

_image_meta_format 'MetaImageFormat'

Image metadata format identifier.

_mlarray_version 'MetaVersion'

Version string for mlarray.

to_plain

to_plain(*, include_none: bool = False) -> Any

Convert to plain values, suppressing default sub-metas.

Parameters:

Name Type Description Default
include_none bool

Include None values when True.

False

Returns:

Type Description
Any

A dict of field values where default child metas are represented as None and optionally filtered out.

mlarray.meta.MetaOriginal dataclass

MetaOriginal(data: Dict[str, Any] = dict())

Bases: SingleKeyBaseMeta

Image metadata from the original source, stored as a JSON-serializable dict.

Attributes:

Name Type Description
data Dict[str, Any]

Arbitrary JSON-serializable metadata.

mlarray.meta.MetaExtra dataclass

MetaExtra(data: Dict[str, Any] = dict())

Bases: SingleKeyBaseMeta

Generic extra metadata stored as JSON-serializable dict.

Attributes:

Name Type Description
data Dict[str, Any]

Arbitrary JSON-serializable metadata.

mlarray.meta.MetaSpatial dataclass

MetaSpatial(spacing: Optional[List] = None, origin: Optional[List] = None, direction: Optional[List[List]] = None, shape: Optional[List] = None, channel_axis: Optional[int] = None)

Bases: BaseMeta

Spatial metadata describing geometry and layout.

Attributes:

Name Type Description
spacing Optional[List]

Per-dimension spacing values. Length must match ndims.

origin Optional[List]

Per-dimension origin values. Length must match ndims.

direction Optional[List[List]]

Direction cosine matrix of shape [ndims, ndims].

shape Optional[List]

Array shape. Length must match ndims, or (ndims + 1) when channel_axis is set.

channel_axis Optional[int]

Index of the channel dimension, if any.
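The relationship between shape, channel_axis, and the number of spatial dimensions can be sketched with a small helper. The function name is hypothetical; it only illustrates how a channel axis (including a negative index such as -1 for HWC) is excluded when counting spatial dimensions.

```python
def spatial_shape(shape, channel_axis=None):
    """Return the shape with the channel axis removed, if one is set."""
    shape = list(shape)
    if channel_axis is None:
        # Purely spatial array: every axis is a spatial axis.
        return shape
    if not -len(shape) <= channel_axis < len(shape):
        raise ValueError(f"channel_axis {channel_axis} is out of range")
    # Normalize negative indices (e.g. -1 for channels-last layouts).
    axis = channel_axis % len(shape)
    return shape[:axis] + shape[axis + 1:]


chw = spatial_shape((3, 64, 64), channel_axis=0)    # [64, 64]
hwc = spatial_shape((64, 64, 3), channel_axis=-1)   # [64, 64]
```

In both layouts the array has two spatial dimensions, so spacing and origin would each need two entries while shape has three.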

mlarray.meta.MetaStatistics dataclass

MetaStatistics(min: Optional[float] = None, max: Optional[float] = None, mean: Optional[float] = None, median: Optional[float] = None, std: Optional[float] = None, percentile_min: Optional[float] = None, percentile_max: Optional[float] = None, percentile_mean: Optional[float] = None, percentile_median: Optional[float] = None, percentile_std: Optional[float] = None, percentile_min_key: Optional[float] = None, percentile_max_key: Optional[float] = None)

Bases: BaseMeta

Numeric summary statistics for an array.

Attributes:

Name Type Description
min Optional[float]

Minimum value.

max Optional[float]

Maximum value.

mean Optional[float]

Mean value.

median Optional[float]

Median value.

std Optional[float]

Standard deviation.

percentile_min Optional[float]

Minimum percentile value.

percentile_max Optional[float]

Maximum percentile value.

percentile_mean Optional[float]

Mean percentile value.

percentile_median Optional[float]

Median percentile value.

percentile_std Optional[float]

Standard deviation of percentile values.

percentile_min_key Optional[float]

Minimum percentile key used to determine percentile_min (for example 0.05).

percentile_max_key Optional[float]

Maximum percentile key used to determine percentile_max (for example 0.95).

mlarray.meta.MetaBbox dataclass

MetaBbox(bboxes: Optional[List[List[List[int]]]] = None)

Bases: SingleKeyBaseMeta

Bounding boxes represented as per-dimension min/max pairs.

Attributes:

Name Type Description
bboxes Optional[List[List[List[int]]]]

List of bounding boxes with shape [n_boxes, ndims, 2], where each inner pair is [min, max] for a dimension. Values must be ints.
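As an illustration of the [n_boxes, ndims, 2] layout, the sketch below validates a nested list of boxes. The helper is hypothetical, and the min <= max check is an assumption that the description above does not state explicitly.

```python
def validate_bboxes(bboxes, ndims):
    """Check that bboxes has shape [n_boxes, ndims, 2] with int entries."""
    for box in bboxes:
        if len(box) != ndims:
            raise ValueError(f"Expected {ndims} [min, max] pairs, got {len(box)}")
        for pair in box:
            if len(pair) != 2:
                raise ValueError("Each dimension needs exactly [min, max]")
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not all(isinstance(v, int) and not isinstance(v, bool) for v in pair):
                raise TypeError("Bounding box values must be ints")
            if pair[0] > pair[1]:
                raise ValueError("min must not exceed max (assumed constraint)")
    return len(bboxes)


# Two boxes in a 3D volume, one [min, max] pair per dimension.
bboxes = [
    [[10, 42], [0, 31], [5, 20]],
    [[50, 60], [12, 18], [0, 7]],
]
n_boxes = validate_bboxes(bboxes, ndims=3)
```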

mlarray.meta.MetaIsSeg dataclass

MetaIsSeg(is_seg: Optional[bool] = None)

Bases: SingleKeyBaseMeta

Flag indicating whether the array is a segmentation mask.

Attributes:

Name Type Description
is_seg Optional[bool]

True/False when known, None when unknown.

mlarray.meta.MetaBlosc2 dataclass

MetaBlosc2(chunk_size: Optional[list] = None, block_size: Optional[list] = None, patch_size: Optional[list] = None)

Bases: BaseMeta

Metadata for Blosc2 tiling and chunking.

Attributes:

Name Type Description
chunk_size Optional[list]

List of per-dimension chunk sizes. Length must match ndims.

block_size Optional[list]

List of per-dimension block sizes. Length must match ndims.

patch_size Optional[list]

List of per-dimension patch sizes. Length must match ndims, or (ndims - 1) when a channel axis is present.

mlarray.meta.MetaHasArray dataclass

MetaHasArray(has_array: bool = False)

Bases: SingleKeyBaseMeta

Flag indicating whether an array is present.

Attributes:

Name Type Description
has_array bool

True when array data is present.

mlarray.meta.MetaImageFormat dataclass

MetaImageFormat(image_meta_format: Optional[str] = None)

Bases: SingleKeyBaseMeta

String describing the image metadata format.

Attributes:

Name Type Description
image_meta_format Optional[str]

Format identifier, or None.

mlarray.meta.MetaVersion dataclass

MetaVersion(mlarray_version: Optional[str] = None)

Bases: SingleKeyBaseMeta

Version metadata for mlarray.

Attributes:

Name Type Description
mlarray_version Optional[str]

Version string, or None.