core

Main functions.

Functions

get_area(*coords)

Return areas of bins.

get_axes_from_specs(bins, ranges, data)

Check bins input and convert to boost objects.

get_coord(name, ax, dtype, flow)

Return bins coordinates for output.

get_edges(coord)

Return edges positions.

get_shape(axes, flow)

Return shape of histogram.

get_widths(coord)

Return bin widths.

histogram(x, /[, bins, range, weights, ...])

Compute histogram of a single variable.

histogram2d(x, y, /[, bins, range, weights, ...])

Compute 2-dimensional histogram.

histogramdd(*data[, bins, range, weights, ...])

Compute N-dimensional histogram.

is_any_dask(data)

Check if any of the variables are in dask format.

normalize(hist, bins_all, bins_normalize)

Normalize histogram along given variables.

histogram(x, /, bins=10, range=None, weights=None, density=False, dims=None, flow=False, storage=None)

Compute histogram of a single variable.

Parameters:
  • x (DataArray) – The array to compute the histogram from.

  • bins (Axis | int) –

    Bins specification that can be:

    • a Boost Axis object.

    • an int for the number of bins in a Regular axis where the minimum and maximum values are specified by range or computed from data.

  • range (tuple[float | None, float | None] | None) – The lower and upper bounds of the bins. If either bound is None, it is computed from the data with x.min() or x.max(), respectively.

  • weights (DataArray | None) – Array of weights, broadcastable against the input data. Each value in data only contributes its associated weight towards the bin count (instead of 1). If density is False, the values of the returned histogram are equal to the sum of the weights belonging to the samples falling into each bin.

  • density (bool) – If False (default), returns the number of samples in each bin. If True, returns the probability density in each bin, computed as bin_count / sample_count / bin_area.

  • dims (Collection[Hashable] | None) – Dimensions along which to compute the histogram. If None, the data is flattened along all axes.

  • flow (bool) – If True, include flow bins in the output.

  • storage (Storage) – Storage object used by the histogram. If None, the default one is used (Double). Currently, accumulator storages (with more than one value stored) are not supported.

Returns:

histogram – DataArray named <x name>_histogram. The bins coordinate is named <x name>_bins.

Return type:

DataArray
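The density formula above, bin_count / sample_count / bin_area, can be checked by hand. The sketch below uses plain NumPy rather than this module, and it assumes that sample_count means the number of samples falling inside the binned range (NumPy's convention); whether this library counts out-of-range samples is not stated here.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=1_000)

# Plain counts on 10 regular bins spanning [-3, 3].
counts, edges = np.histogram(samples, bins=10, range=(-3.0, 3.0))

# In one dimension the "bin area" is simply the bin width.
widths = np.diff(edges)

# density = bin_count / sample_count / bin_area
# (assumption: sample_count is the number of in-range samples).
density = counts / counts.sum() / widths

# A density integrates to 1 over the binned range.
total = float((density * widths).sum())
```

With these assumptions the result matches `np.histogram(..., density=True)`.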

histogram2d(x, y, /, bins=10, range=None, weights=None, density=False, dims=None, flow=False, storage=None)

Compute 2-dimensional histogram.

Parameters:
  • x (DataArray) – First array to compute the histogram from. x and y must be broadcastable against each other.

  • y (DataArray) – Second array to compute the histogram from. x and y must be broadcastable against each other.

  • bins (Axis | int | Sequence[Axis | int]) –

    Bins specification that can be:

    • a Boost Axis object.

    • an int for the number of bins in a Regular axis where the minimum and maximum values are specified by range or computed from data.

    If a single specification is passed, it will be reused for all variables. Otherwise a sequence of specifications must be passed, in the same order as x and y.

  • range (Sequence[tuple[float | None, float | None]] | None) – Sequence of lower and upper bounds of the bins, one pair per variable. If either bound is None, it is computed from the minimum or maximum of the corresponding variable.

  • weights (DataArray | None) – Array of weights, broadcastable against the input data. Each value in data only contributes its associated weight towards the bin count (instead of 1). If density is True, weights are normalized to 1. If density is False, the values of the returned histogram are equal to the sum of the weights belonging to the samples falling into each bin.

  • density (bool) – If False (default), returns the number of samples in each bin. If True, returns the probability density in each bin, computed as bin_count / sample_count / bin_area.

  • dims (Collection[Hashable] | None) – Dimensions along which to compute the histogram. If None, the data is flattened along all axes.

  • flow (bool) – If True, include flow bins in the output.

  • storage (Storage) – Storage object used by the histogram. If None, the default one is used (Double). Currently, accumulator storages (with more than one value stored) are not supported.

Returns:

histogram – DataArray named <x name>_<y name>_histogram. The bins coordinates are named <variable name>_bins.

Return type:

DataArray
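To illustrate the weights behaviour described above (with density left False, each bin holds the sum of the weights of the samples falling into it), here is a small sketch using plain NumPy instead of this module. The arrays and bin edges are made up for the illustration.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.7, 0.9])
y = np.array([0.1, 0.8, 0.3, 0.9])
w = np.array([1.0, 2.0, 3.0, 4.0])

# Two regular bins per variable on [0, 1]; one (bins, range) entry per
# variable, given in the same order as x and y.
hist, x_edges, y_edges = np.histogram2d(
    x, y, bins=[2, 2], range=[(0.0, 1.0), (0.0, 1.0)], weights=w
)

# Each sample lands in a different bin here, so every bin holds exactly
# one weight: hist[i, j] is indexed by (x bin, y bin).
```

Without weights, every entry of `hist` would be 1 in this example.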

histogramdd(*data, bins=10, range=None, weights=None, density=False, dims=None, flow=False, storage=None)

Compute N-dimensional histogram.

Parameters:
  • data (DataArray) – The arrays to compute the histogram from. To compute a multi-dimensional histogram supply a sequence of as many arrays as the histogram dimensionality. Arrays must be broadcastable against each other. If any underlying data is a dask array, other inputs will be transformed into a dask array of a single chunk.

  • bins (Sequence[Axis | int] | Axis | int) –

    Bins specification that can be:

    • a Boost Axis object.

    • an int for the number of bins in a Regular axis where the minimum and maximum values are specified by range or computed from data.

    If a single specification is passed, it will be reused for all variables. Otherwise a sequence of specifications must be passed, in the same order as the input data.

  • range (Sequence[tuple[float | None, float | None]] | None) – Sequence of lower and upper bounds of the bins, one pair per variable. If either bound is None, it is computed from the minimum or maximum of the corresponding variable.

  • weights (DataArray | None) – Array of weights, broadcastable against the input data. Each value in data only contributes its associated weight towards the bin count (instead of 1). If density is True, weights are normalized to 1. If density is False, the values of the returned histogram are equal to the sum of the weights belonging to the samples falling into each bin.

  • density (bool) – If False (default), returns the number of samples in each bin. If True, returns the probability density in each bin, computed as bin_count / sample_count / bin_area.

  • dims (Collection[Hashable] | None) – Dimensions along which to compute the histogram. If None, the data is flattened along all axes.

  • flow (bool) – If True, include flow bins in the output.

  • storage (Storage) – Storage object used by the histogram. If None, the default one is used (Double). Currently, accumulator storages (with more than one value stored) are not supported.

Returns:

histogram – DataArray named <variables names separated by an underscore>_histogram. The bins coordinates are named <variable name>_bins.

Return type:

DataArray
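The range parameter accepts None for either bound, in which case the bound is taken from the data. A minimal sketch of that rule follows; `resolve_range` is a hypothetical helper written for this illustration, not a function of this module.

```python
import numpy as np


def resolve_range(values, bounds):
    """Hypothetical helper: replace None bounds by the data min or max."""
    low, high = bounds
    if low is None:
        low = float(np.min(values))
    if high is None:
        high = float(np.max(values))
    return low, high


x = np.array([2.0, 5.0, 3.5])
y = np.array([-1.0, 4.0, 0.5])

# One (low, high) pair per variable, in the same order as the data.
ranges = [(0.0, None), (None, None)]
resolved = [resolve_range(v, r) for v, r in zip((x, y), ranges)]
# resolved == [(0.0, 5.0), (-1.0, 4.0)]
```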

get_shape(axes, flow)

Return shape of histogram.

Return type:

tuple[int, …]
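As a rough illustration of how flow bins change the output shape, the sketch below assumes every axis carries both an underflow and an overflow bin, so each dimension grows by two when flow is True. This is a simplification: Boost axes can be built with flow bins disabled. `shape_with_flow` is a hypothetical helper, not the implementation of get_shape.

```python
def shape_with_flow(n_bins, flow):
    """Hypothetical helper: histogram shape for axes with n_bins bins each.

    Assumes each axis has one underflow and one overflow bin.
    """
    extra = 2 if flow else 0
    return tuple(n + extra for n in n_bins)


shape_no_flow = shape_with_flow([10, 5], flow=False)  # (10, 5)
shape_flow = shape_with_flow([10, 5], flow=True)      # (12, 7)
```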

get_axes_from_specs(bins, ranges, data)

Check bins input and convert to boost objects.

Raises:

ValueError – If the number of bins specifications does not match the number of data arrays.

Return type:

tuple[Axis, …]
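The validation described here (a single specification is reused for every variable, a sequence must match the number of data arrays) could look roughly like the following. `expand_specs` is a hypothetical stand-in that skips the conversion to Boost axes and only shows the counting logic.

```python
def expand_specs(bins, n_variables):
    """Hypothetical helper: return one bins specification per variable."""
    if isinstance(bins, (list, tuple)):
        specs = tuple(bins)
    else:
        # A single specification is reused for all variables.
        specs = (bins,) * n_variables
    if len(specs) != n_variables:
        raise ValueError(
            f"Got {len(specs)} bins specifications for {n_variables} data arrays."
        )
    return specs


single = expand_specs(10, 3)           # (10, 10, 10)
per_variable = expand_specs([10, 20], 2)  # (10, 20)
```

Passing `[10, 20]` with three data arrays raises ValueError in this sketch.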

is_any_dask(data)

Check if any of the variables are in dask format.

Only returns True if Dask is imported.

Parameters:

data (Sequence[DataArray])

Return type:

bool
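One common way to make such a check depend on Dask being imported is to look in `sys.modules` before touching the library. The sketch below shows that pattern for a single object. `is_dask_array` is a hypothetical helper, and the actual check in this module may differ.

```python
import sys


def is_dask_array(obj):
    """Hypothetical helper: True only if dask.array is imported and obj is a dask array."""
    if "dask.array" not in sys.modules:
        # Dask was never imported, so obj cannot be a dask array.
        return False
    import dask.array as da

    return isinstance(obj, da.Array)


result = is_dask_array([1, 2, 3])  # a plain list is never a dask array
```

The early return avoids importing Dask just to perform the check.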

get_coord(name, ax, dtype, flow)

Return bins coordinates for output.

Will include attributes suited for accessor.

Return type:

DataArray

get_edges(coord)

Return edges positions.

Parameters:

coord (DataArray)

Return type:

DataArray

get_widths(coord)

Return bin widths.

Parameters:

coord (DataArray)

Return type:

DataArray

get_area(*coords)

Return areas of bins.

Parameters:

coords (DataArray)

Return type:

DataArray
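Edges, widths and areas are related in a simple way: widths are differences of consecutive edges, and the area of a multi-dimensional bin is the product of its widths along each variable. The sketch below shows this with plain NumPy arrays and made-up edges; the functions above work on DataArray coordinates instead, and their exact conventions are not reproduced here.

```python
import numpy as np

# Bin edges for two variables (irregular on purpose).
x_edges = np.array([0.0, 1.0, 3.0])
y_edges = np.array([0.0, 2.0, 4.0, 8.0])

# Widths: difference between consecutive edges.
x_widths = np.diff(x_edges)  # [1., 2.]
y_widths = np.diff(y_edges)  # [2., 2., 4.]

# Area of bin (i, j): product of the widths along each variable.
area = np.multiply.outer(x_widths, y_widths)
# area == [[2., 2., 4.],
#          [4., 4., 8.]]
```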

normalize(hist, bins_all, bins_normalize)

Normalize histogram along given variables.

Return type:

DataArray
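One plausible reading of normalizing along a subset of variables is sketched below with plain NumPy: a 2D histogram over (x, y) is normalized along y only, so that each x bin holds a density in y that integrates to 1. This is an interpretation for illustration, not a description of what normalize does internally.

```python
import numpy as np

# Counts for 2 x-bins and 2 y-bins.
hist = np.array([[1.0, 3.0], [2.0, 2.0]])
y_edges = np.array([0.0, 0.5, 1.0])
y_widths = np.diff(y_edges)

# Normalize along y only: divide by the total count in each x bin,
# then by the y bin widths.
normalized = hist / hist.sum(axis=1, keepdims=True) / y_widths

# Each x bin now integrates to 1 over y.
integral_per_x_bin = (normalized * y_widths).sum(axis=1)
```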