synthesizer.utils.stats
Statistical functions for weighted means, medians, and quantiles.
This module provides functions to calculate weighted means, medians, and quantiles. All of these are helper wrappers around existing numpy functionality.
- Example usage:

    from synthesizer.utils.stats import (
        weighted_mean,
        weighted_median,
        weighted_quantile,
        binned_weighted_quantile,
    )

    data = [1, 2, 3, 4, 5]
    weights = [0.1, 0.2, 0.3, 0.4, 0.5]

    mean = weighted_mean(data, weights)
    median = weighted_median(data, weights)
    quantiles = weighted_quantile(
        data, [0.25, 0.5, 0.75], sample_weight=weights,
    )
    binned_quantiles = binned_weighted_quantile(
        data, data, weights, bins=[0, 2, 4, 6], quantiles=[0.25, 0.5]
    )

Functions
- synthesizer.utils.stats.binned_weighted_quantile(x, y, weights, bins, quantiles)
Calculate the weighted quantiles of y in bins of x.
- Parameters:
x (np.ndarray or list) – The x values to bin by.
y (np.ndarray or list) – The y values to calculate the quantiles of.
weights (np.ndarray or list) – The weights to apply to the y values.
bins (np.ndarray or list) – The bins to use for the x values.
quantiles (np.ndarray or list) – The quantiles to calculate.
- Returns:
The weighted quantiles of y in the bins of x.
- Return type:
np.ndarray
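The binning idea can be sketched with plain numpy. This is not the library's implementation: the half-open bins `[lo, hi)`, the NaN fill for empty bins, and the `(n_bins, n_quantiles)` output shape are all assumptions, and `_wq` and `binned_weighted_quantile_sketch` are hypothetical names used only for illustration.

```python
import numpy as np


def _wq(values, quantiles, weights):
    """Hypothetical stand-in for a weighted quantile (midpoint interpolation)."""
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Cumulative weight at the centre of each sample, normalised by the total.
    cdf = (np.cumsum(weights) - 0.5 * weights) / np.sum(weights)
    return np.interp(quantiles, cdf, values)


def binned_weighted_quantile_sketch(x, y, weights, bins, quantiles):
    """Weighted quantiles of y in bins of x (assumed conventions, see above)."""
    x, y, weights = (np.asarray(a, dtype=float) for a in (x, y, weights))
    bins = np.asarray(bins, dtype=float)
    quantiles = np.asarray(quantiles, dtype=float)

    # One row per bin, one column per quantile; empty bins stay NaN.
    out = np.full((len(bins) - 1, len(quantiles)), np.nan)
    for i, (lo, hi) in enumerate(zip(bins[:-1], bins[1:])):
        mask = (x >= lo) & (x < hi)  # assumed half-open bin [lo, hi)
        if mask.any():
            out[i] = _wq(y[mask], quantiles, weights[mask])
    return out


result = binned_weighted_quantile_sketch(
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [0.1, 0.2, 0.3, 0.4, 0.5],
    bins=[0, 2, 4, 6, 8],
    quantiles=[0.25, 0.5],
)
```

Here the last bin `[6, 8)` contains no points, so its row is left as NaN under the assumed convention.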
- synthesizer.utils.stats.n_weighted_moment(values, weights, n)
Calculate the n-th weighted moment of the values.
- Parameters:
values (np.ndarray or list) – The values to calculate the moment of.
weights (np.ndarray or list) – The weights to apply to the values.
n (int) – The order of the moment to calculate.
- Returns:
The n-th weighted moment of the values.
- Return type:
float
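The description above does not say whether the moment is raw, central, or standardised. As a hedged illustration only, the sketch below shows the two most common textbook conventions; `raw_weighted_moment` and `central_weighted_moment` are hypothetical names, and neither is claimed to match what `n_weighted_moment` returns.

```python
import numpy as np


def raw_weighted_moment(values, weights, n):
    """Raw moment: sum(w * x**n) / sum(w)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights * values**n) / np.sum(weights)


def central_weighted_moment(values, weights, n):
    """Central moment: sum(w * (x - mean)**n) / sum(w)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(values, weights=weights)
    return np.sum(weights * (values - mean) ** n) / np.sum(weights)


values = [1.0, 2.0, 3.0]
weights = [1.0, 1.0, 1.0]
first_raw = raw_weighted_moment(values, weights, 1)  # the weighted mean
second_central = central_weighted_moment(values, weights, 2)  # the weighted variance
```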
- synthesizer.utils.stats.weighted_mean(data, weights)
Calculate the weighted mean.
This is a convenience alias for np.average, which computes a weighted mean more efficiently than combining np.sum and np.mean.
- Parameters:
data (list or np.ndarray) – The data to calculate the mean of.
weights (list or np.ndarray) – The weights to apply to the data.
- Returns:
The weighted mean.
- Return type:
float
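For reference, the quantity `np.average` computes when given weights is `sum(w * x) / sum(w)`. The short check below uses numpy directly (it does not call `weighted_mean`) to show that the two forms agree on the example data from the top of this page.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5], dtype=float)
weights = np.array([0.1, 0.2, 0.3, 0.4, 0.5])

# Weighted mean via np.average.
mean_avg = np.average(data, weights=weights)

# The same quantity written out explicitly.
mean_manual = np.sum(data * weights) / np.sum(weights)
```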
- synthesizer.utils.stats.weighted_median(data, weights)
Calculate the weighted median.
- Parameters:
data (list or np.ndarray) – The data to calculate the median of.
weights (list or np.ndarray) – The weights to apply to the data.
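One common definition of a weighted median is the smallest data value at which the cumulative weight reaches half of the total weight. The sketch below implements that definition with numpy; it is an assumption that this matches the library's behaviour, which may treat ties or interpolation differently, and `weighted_median_sketch` is a hypothetical name.

```python
import numpy as np


def weighted_median_sketch(data, weights):
    """Smallest value whose cumulative weight reaches half the total."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Sort the data and carry the weights along.
    order = np.argsort(data)
    data, weights = data[order], weights[order]

    # First index where the running weight is at least half the total.
    cumulative = np.cumsum(weights)
    half = 0.5 * cumulative[-1]
    idx = np.searchsorted(cumulative, half, side="left")
    return data[idx]


median = weighted_median_sketch([1, 2, 3, 4, 5], [0.1, 0.2, 0.3, 0.4, 0.5])
```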
- synthesizer.utils.stats.weighted_quantile(values, quantiles, sample_weight=None, values_sorted=False, old_style=False)
Calculate a weighted quantile.
Taken from https://stackoverflow.com/a/29677616/1718096.
Very close to numpy.percentile, but supports weights.
- Parameters:
values (np.ndarray or list) – The values to compute the quantiles of.
quantiles (np.ndarray or list) – The quantiles to compute. Must be in [0, 1].
sample_weight (np.ndarray or list) – The weights to apply to the values.
values_sorted (bool) – If True, then values will not be sorted before the calculation.
old_style (bool) – If True, then the computed quantiles will be returned in the same style as numpy.percentile.
- Returns:
The computed quantiles.
- Return type:
np.ndarray
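A sketch of the general approach behind this kind of weighted quantile: sort the values, build a weighted cumulative distribution, and interpolate. It is written against numpy only and is an illustration under assumptions, not the library's code; the handling of `old_style` here (rescaling so the first and last samples sit at 0 and 1) is one way to reproduce `numpy.percentile` for equal weights, and `weighted_quantile_sketch` is a hypothetical name.

```python
import numpy as np


def weighted_quantile_sketch(values, quantiles, sample_weight=None,
                             values_sorted=False, old_style=False):
    """Weighted quantiles via an interpolated weighted CDF (illustrative)."""
    values = np.asarray(values, dtype=float)
    quantiles = np.asarray(quantiles, dtype=float)
    if sample_weight is None:
        sample_weight = np.ones(len(values))
    sample_weight = np.asarray(sample_weight, dtype=float)
    if np.any(quantiles < 0) or np.any(quantiles > 1):
        raise ValueError("quantiles must be in [0, 1]")

    if not values_sorted:
        order = np.argsort(values)
        values, sample_weight = values[order], sample_weight[order]

    # Cumulative weight evaluated at the centre of each sample.
    cdf = np.cumsum(sample_weight) - 0.5 * sample_weight
    if old_style:
        # Rescale so the first sample maps to 0 and the last to 1.
        cdf -= cdf[0]
        cdf /= cdf[-1]
    else:
        cdf /= np.sum(sample_weight)
    return np.interp(quantiles, cdf, values)


unweighted = weighted_quantile_sketch(
    [5, 1, 4, 2, 3], [0.25, 0.5, 0.75], old_style=True
)
weighted = weighted_quantile_sketch(
    [1, 2, 3, 4, 5], [0.5], sample_weight=[0.1, 0.2, 0.3, 0.4, 0.5]
)
```

With equal weights and `old_style=True`, the sketch agrees with `np.percentile` on the same data; with unequal weights the median shifts toward the more heavily weighted values.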