accessor
Accessor to manipulate histograms.
An accessor registered as hist is made available on xarray.DataArray for
various histogram manipulations.
Functions
Remove flow bins from a coordinate.
Classes
Histogram accessor for DataArrays.
- class HistDataArrayAccessor(obj)
Histogram accessor for DataArrays.
Important
Accessor registered under hist.
Validity
The coordinates of the bins must be named <variable>_bins.
The array must be named as <variable(s)_name>_<histogram or pdf>. Use histogram if the array is not normalized, and pdf if it is normalized as a probability density function. If the histogram is multi-dimensional, the variable names must be separated by underscores. For instance: Temp_Sal_histogram.
Each bins coordinate may contain attributes:
bin_type: the class name of the Boost axis type that was used. If not present, the accessor will assume the bins are regularly spaced and will try to infer the rightmost edge.
right_edge: the rightmost edge position, only necessary for Regular and Variable bins.
underflow and overflow: integers that indicate if the corresponding flow bins are present. If not present, the accessor will assume there are no flow bins.
Backend for computations
Statistics computations are delegated to scipy.stats.rv_histogram. Therefore, the accessor does not support chunking along the bins dimensions (which should not be a problem in most cases).
- Parameters:
obj (DataArray)
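As a rough illustration of the backend mentioned above, the following sketch calls scipy.stats.rv_histogram directly on a pair of counts and edges. The data are made up, and plain NumPy arrays are used instead of a DataArray; it only shows what kind of object the statistics are delegated to, not how the accessor builds it internally.

```python
import numpy as np
from scipy import stats

# Made-up data: counts in four regular bins spanning [0, 4].
counts = np.array([1.0, 3.0, 3.0, 1.0])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # N+1 edges for N bins

# rv_histogram takes a (counts, edges) pair, like the output of np.histogram.
dist = stats.rv_histogram((counts, edges))

mean = dist.mean()
median = dist.median()
std = dist.std()
cdf_at_2 = dist.cdf(2.0)
print(mean, median, std, cdf_at_2)
```

Because rv_histogram needs all counts of a distribution at once, it is consistent with the note above that the bins dimensions should not be chunked.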
- bins(variable=None, flow=True)
Return bins coordinates for a given variable.
- edges(variable=None, flow=True)
Return the edges of the bins (including the rightmost edge).
Not supported for bins of the discrete types “IntCategory” and “StrCategory”.
- centers(variable=None, flow=True)
Return the center of all bins.
Not supported for bin type “StrCategory”. The centers of IntCategory bins are bins+0.5. The centers of flow bins are the same as their position (np.inf for instance).
- widths(variable=None, flow=True)
Return the widths of all bins.
Widths of flow bins and StrCategory bins are 1.
- areas(variables=None, flow=True)
Return the areas of the bins.
The area is the product of the widths of all specified bins. The area of any point that corresponds to a flow bin in at least one dimension is equal to one.
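The relationship between edges, centers, widths and areas can be sketched with plain NumPy. This is a minimal illustration on made-up edges for two variables without flow bins; the helper centers_and_widths is hypothetical and is not part of the accessor.

```python
import numpy as np

# Made-up edges for two variables (N+1 edges for N bins), no flow bins.
temp_edges = np.array([0.0, 1.0, 2.0, 4.0])
sal_edges = np.array([30.0, 32.0, 36.0])


def centers_and_widths(edges):
    """Hypothetical helper: bin centers and widths from N+1 edges."""
    widths = np.diff(edges)
    centers = edges[:-1] + widths / 2
    return centers, widths


temp_centers, temp_widths = centers_and_widths(temp_edges)
sal_centers, sal_widths = centers_and_widths(sal_edges)

# The area of each 2D bin is the product of its widths along both variables.
areas = np.outer(temp_widths, sal_widths)
print(temp_centers, temp_widths)
print(areas)
```

Flow bins are left out of this sketch; as documented above, their widths, and the areas of any point touching them, are set to one.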
- normalize(variables=None)
Return a normalized histogram.
Raises an error if the histogram is already normalized.
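For reference, normalizing counts into a probability density function usually means dividing by the total count and by the bin widths, so that the density integrates to one. The sketch below shows that standard definition on made-up data with NumPy; it assumes a 1D histogram without flow bins and is not taken from the accessor's implementation.

```python
import numpy as np

# Made-up 1D histogram with irregular bins.
counts = np.array([2.0, 6.0, 2.0])
edges = np.array([0.0, 1.0, 2.0, 4.0])
widths = np.diff(edges)

# Divide by the total count and by each bin width.
pdf = counts / (counts.sum() * widths)

# The density integrates to one over the bins.
integral = np.sum(pdf * widths)
print(pdf, integral)
```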
- remove_flow(variables=None)
Remove flow bins.
- apply_func(func, variable=None, flow=True, **kwargs)
Apply a function to a bins coordinate.
- Parameters:
func (Callable[[DataArray], DataArray]) – Callable that must transform the N+1 edges. It does not need to take care of the right_edge attribute.
variable (str | None) – The variable to transform. (This is equivalent to computing a histogram of func(ds["variable"], **kwargs)). It can be omitted for 1D histograms.
kwargs – Passed to the function.
flow (bool)
- Return type:
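The equivalence stated for apply_func (transforming the edges is the same as computing a histogram of the transformed data) can be checked with NumPy alone. This sketch assumes a strictly increasing function; to_fahrenheit and the data are made up for illustration, and plain arrays stand in for the DataArray the accessor expects.

```python
import numpy as np

# Made-up data and edges, with values chosen away from the bin edges.
data = np.array([0.5, 1.5, 2.5, 2.7, 7.1, 9.9])
edges = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])


def to_fahrenheit(x):
    """Hypothetical strictly increasing transform."""
    return x * 1.8 + 32.0


# Histogram of the original data on the original edges.
counts, _ = np.histogram(data, bins=edges)

# Histogram of the transformed data on the transformed edges.
counts_transformed, _ = np.histogram(to_fahrenheit(data), bins=to_fahrenheit(edges))

same = np.array_equal(counts, counts_transformed)
print(counts, counts_transformed, same)
```

The counts are unchanged, so only the bins coordinate needs to be rewritten. A decreasing function would reverse the order of the edges and is outside the scope of this sketch.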
- scale(factor, variable=None, flow=True)
Transform a bins coordinate by scaling it.
- ppf(q, variable=None)
Return the percent point function at q.
Uses scipy.stats.rv_histogram for computation.
- median(variable=None)
Return the median value of the distribution.
Uses scipy.stats.rv_histogram for computation.
- mean(variable=None)
Return the mean value of the distribution.
Uses scipy.stats.rv_histogram for computation.
- cdf(x, variable=None)
Return the cumulative distribution function at x.
Uses scipy.stats.rv_histogram for computation.
- var(variable=None)
Return the variance of the distribution.
Uses scipy.stats.rv_histogram for computation.
- std(variable=None)
Return the standard deviation of the distribution.
Uses scipy.stats.rv_histogram for computation.
- moment(order, variable=None)
Return the moment of the given order of the distribution.
Uses scipy.stats.rv_histogram for computation.
- interval(confidence, variable=None)
Return the confidence interval with equal areas around the median.
The interval is computed as [ppf(p_tail); ppf(1-p_tail)] with p_tail = (1-confidence)/2.
Uses scipy.stats.rv_histogram for computation.
- Parameters:
- Returns:
dataset – Dataset with variables confidence_low and confidence_high, corresponding to the low and high values of the confidence interval.
- Return type:
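The interval formula above can be reproduced directly with scipy.stats.rv_histogram. This is a minimal sketch on made-up counts and edges; the variable names confidence_low and confidence_high mirror the documented output variables, but the code works on plain arrays and does not return a Dataset.

```python
import numpy as np
from scipy import stats

# Made-up symmetric histogram on four regular bins.
counts = np.array([1.0, 3.0, 3.0, 1.0])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
dist = stats.rv_histogram((counts, edges))

confidence = 0.75
p_tail = (1 - confidence) / 2  # probability left out in each tail

# Equal-tailed interval: [ppf(p_tail); ppf(1 - p_tail)].
confidence_low = dist.ppf(p_tail)
confidence_high = dist.ppf(1 - p_tail)
print(confidence_low, confidence_high)
```

Each tail holds the same probability p_tail, which is what "equal areas around the median" refers to.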