Statistics

Collection of useful statistics-related functions.

statistics.binned_mean_and_variance(x, y, bins, weights=None)

Calculates the mean and variance of y in the bins of x. This is effectively a ROOT.TProfile.

Parameters
  • x – data values that are used for binning (e.g. energies)

  • y – data values that should be averaged

  • bins – bin borders

  • weights – optional weights (default: None)

Returns

<y>_i, sigma(y)_i: mean and variance of y in the bins of x
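To illustrate what a per-bin mean and variance looks like, here is a minimal NumPy sketch. It is not the module's implementation: the helper name binned_mean_and_variance_sketch is hypothetical, and it assumes half-open bins [low, high), NaN for empty bins, and a variance normalised by the sum of weights.

```python
import numpy as np


def binned_mean_and_variance_sketch(x, y, bins, weights=None):
    """Hypothetical stand-in: weighted mean and variance of y per bin of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bins = np.asarray(bins, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)

    n_bins = len(bins) - 1
    # Bin index per x value; values outside [bins[0], bins[-1]) map to -1 or
    # n_bins and are therefore never selected below.
    idx = np.digitize(x, bins) - 1

    mean = np.full(n_bins, np.nan)  # assumption: empty bins are reported as NaN
    var = np.full(n_bins, np.nan)
    for i in range(n_bins):
        sel = idx == i
        w_sum = w[sel].sum()
        if w_sum > 0:
            mean[i] = np.sum(w[sel] * y[sel]) / w_sum
            var[i] = np.sum(w[sel] * (y[sel] - mean[i]) ** 2) / w_sum
    return mean, var


x = np.array([0.5, 0.7, 1.2, 1.8, 2.5])
y = np.array([1.0, 3.0, 2.0, 4.0, 5.0])
mean, var = binned_mean_and_variance_sketch(x, y, bins=[0, 1, 2, 3])
print(mean)  # per-bin means: 2, 3, 5
print(var)   # per-bin variances: 1, 1, 0
```

The explicit loop over bins keeps the sketch readable; for many bins a vectorised variant based on np.bincount may be preferable.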

statistics.mean_and_variance(y, weights=None)

Weighted mean and variance of array y, using the given weights.

Parameters
  • y – array for which to calculate mean and variance

  • weights – optional weights for weighted mean and variance

Returns

mean, variance (both with dimensions like y)
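As a reference for the quantities involved, the following sketch computes a weighted mean and the corresponding weighted variance of a flat array. The function name weighted_mean_and_variance is hypothetical, and the normalisation by the sum of weights (rather than an unbiased estimator) is an assumption, not something the documentation above specifies.

```python
import numpy as np


def weighted_mean_and_variance(y, weights=None):
    """Hypothetical sketch: weighted mean and variance of a 1d array."""
    y = np.asarray(y, dtype=float)
    # Without weights every entry counts equally
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    w_sum = w.sum()
    mean = np.sum(w * y) / w_sum
    # Assumption: variance normalised by the sum of weights (biased estimator)
    var = np.sum(w * (y - mean) ** 2) / w_sum
    return mean, var


m, v = weighted_mean_and_variance([1.0, 2.0, 3.0], weights=[1.0, 1.0, 2.0])
print(m, v)  # 2.25 0.6875
```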

statistics.median(data, weights)

Weighted median of an array with respect to the last axis. Alias for quantile(data, weights, 0.5). Source: https://github.com/nudomarinero/wquantiles/blob/master/weighted.py

Parameters
  • data – ndarray for which to calculate the weighted median

  • weights – ndarray with weights for data; it must have the same size as the last axis of data.

statistics.mid(x)

Midpoints of a given array

Parameters

x – array with more than one element

Returns

all midpoints as a numpy array (shape: x.size - 1)
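A typical use is turning bin edges into bin centers. The sketch below assumes that "midpoint" means the arithmetic mean of neighbouring elements; the helper name midpoints is hypothetical.

```python
import numpy as np


def midpoints(x):
    """Hypothetical sketch: arithmetic midpoints of neighbouring elements."""
    x = np.asarray(x, dtype=float)
    # Pair every element with its successor; the result has x.size - 1 entries
    return 0.5 * (x[1:] + x[:-1])


edges = np.array([0.0, 1.0, 2.0, 4.0])
centers = midpoints(edges)
print(centers)  # [0.5 1.5 3. ]
```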

statistics.quantile(data, weights, quant)

Weighted quantile of an array with respect to the last axis. Source: https://github.com/nudomarinero/wquantiles/blob/master/weighted.py

Parameters
  • data – ndarray for which to calculate weighted quantile

  • weights – ndarray with weights for data; it must have the same size as the last axis of data.

  • quant – quantile to compute; it must have a value between 0 and 1.

Returns

weighted quantiles with respect to last axis

statistics.quantile_1d(data, weights, quant)

Compute the weighted quantile of a 1D numpy array. Source: https://github.com/nudomarinero/wquantiles/blob/master/weighted.py

Parameters
  • data – 1d-array for which to calculate the weighted quantile

  • weights – 1d-array with weights for data (same shape as data)

  • quant – quantile to compute; it must have a value between 0 and 1.

Returns

quantiles
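One common way to define a weighted quantile of a 1D array is to sort the data, build the cumulative sum of the weights, and interpolate linearly at the requested level. The sketch below follows that idea; the function name weighted_quantile_1d and the choice of placing each sample at the center of its weight interval are assumptions and may differ in detail from the function documented above.

```python
import numpy as np


def weighted_quantile_1d(data, weights, quant):
    """Hypothetical sketch: weighted quantile of a 1d array via interpolation."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if data.shape != weights.shape:
        raise ValueError("data and weights must have the same shape")
    if not 0.0 <= quant <= 1.0:
        raise ValueError("quant must be between 0 and 1")

    order = np.argsort(data)
    sorted_data = data[order]
    sorted_weights = weights[order]

    # Cumulative weight up to the center of each sample, normalised to [0, 1]
    cum = np.cumsum(sorted_weights)
    levels = (cum - 0.5 * sorted_weights) / cum[-1]
    # Linear interpolation; levels outside the covered range are clamped
    return np.interp(quant, levels, sorted_data)


print(weighted_quantile_1d([1, 2, 3, 4], [1, 1, 1, 1], 0.5))  # 2.5
print(weighted_quantile_1d([1, 2, 3, 4], [1, 1, 1, 5], 0.5))  # 3.5
```

With equal weights the result at 0.5 is the ordinary median; increasing the weight of the largest sample pulls the median toward it.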

statistics.random_choice_multi(a, k, p)

Pull multiple sets of k random samples out of a given array or list using individual probabilities for each set.

Parameters
  • a – data values to sample from, 1d array

  • k – number of samples to draw per set (single value)

  • p – probability vectors, array of shape (N, len(a)), with N being the number of sets

Returns

array of shape (N, k)
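The shapes involved can be illustrated with a vectorised inverse-CDF sampler. This is a sketch under stated assumptions, not the module's code: the name random_choice_multi_sketch and the rng argument are hypothetical, sampling is done with replacement, and each row of p is normalised internally.

```python
import numpy as np


def random_choice_multi_sketch(a, k, p, rng=None):
    """Hypothetical sketch: draw k samples from a for each probability row in p."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    p = np.asarray(p, dtype=float)  # shape (N, len(a))

    # Cumulative distribution per set, normalised so the last entry is exactly 1
    cdf = np.cumsum(p, axis=1)
    cdf /= cdf[:, -1:]

    # One uniform number in [0, 1) per requested sample
    u = rng.random((p.shape[0], k))
    # Index of the first cdf entry that exceeds u (assumption: with replacement)
    idx = (u[:, :, None] >= cdf[:, None, :]).sum(axis=2)
    return a[idx]  # shape (N, k)


a = np.array([10, 20, 30])
p = np.array([[1.0, 0.0, 0.0],   # first set: always picks 10
              [0.0, 0.0, 1.0]])  # second set: always picks 30
samples = random_choice_multi_sketch(a, k=4, p=p)
print(samples.shape)  # (2, 4)
```

Comparing all uniform numbers against all CDF entries at once avoids a Python loop over the N sets, at the cost of a temporary array of shape (N, k, len(a)).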

statistics.sym_interval_around(x, xm, alpha=0.32)

In a distribution represented by a set of samples, find the interval that contains a fraction (1-alpha)/2 of the samples on each side of xm. If xm is too close to the edge of the distribution for both sides to contain (1-alpha)/2, the remaining fraction is added to the other side.

Parameters
  • x – data values in the distribution

  • xm – symmetric center value for which to find the interval

  • alpha – fraction that will be outside of the interval (default 0.32, corresponding to a 68 percent interval)

Returns

interval (lower, upper) that contains a fraction 1-alpha of the samples, symmetric around xm
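The procedure can be sketched in terms of quantile levels: locate the level of xm in the sample, step (1-alpha)/2 to either side, and shift the interval if it would leave the range [0, 1]. The helper name sym_interval_around_sketch is hypothetical, and the use of np.searchsorted and np.quantile (with its default linear interpolation) is an assumption about how such a function could be built.

```python
import numpy as np


def sym_interval_around_sketch(x, xm, alpha=0.32):
    """Hypothetical sketch: interval holding 1 - alpha of the samples around xm."""
    x = np.sort(np.asarray(x, dtype=float))
    half = (1.0 - alpha) / 2.0

    # Fraction of samples below xm, i.e. the quantile level of xm
    q_m = np.searchsorted(x, xm) / x.size
    q_lo, q_hi = q_m - half, q_m + half

    # If one side runs out of samples, give the remaining fraction to the other
    if q_lo < 0.0:
        q_lo, q_hi = 0.0, 1.0 - alpha
    elif q_hi > 1.0:
        q_lo, q_hi = alpha, 1.0

    lower, upper = np.quantile(x, [q_lo, q_hi])
    return lower, upper


x = np.arange(101, dtype=float)  # evenly spread samples 0, 1, ..., 100
lower, upper = sym_interval_around_sketch(x, xm=50.0)
edge_lower, edge_upper = sym_interval_around_sketch(x, xm=0.0)
print(lower, upper)            # roughly symmetric around 50
print(edge_lower, edge_upper)  # starts at the minimum, extends to about 68
```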