Extension API

The core functions are wrapped using Pybind11 and are callable from mmu.lib._mmu_core.

Note

This section is mainly intended for contributors to MMU or those who know what they are doing. _mmu_core is not, strictly speaking, part of the public API, although we intend to keep it stable.

Warning

The functions in _mmu_core perform only the bare minimum of condition checks on the input arrays to prevent buffer overruns and segfaults. The function docstrings indicate which conditions are assumed; violating them is likely to result in incorrect results and/or poor performance.

Confusion Matrix

template <typename T1, typename T2> py::array_t<int64_t> confusion_matrix(const py::array_t<T1>& y, const py::array_t<T2>& yhat)

Where T1 and T2 are one of bool, int, int64_t, float or double.

Compute the confusion matrix given true labels y and estimated labels yhat.

Parameters:
  • y – true labels

  • yhat – predicted labels

Raises:

runtime_error – if an array is not aligned or not contiguous.

The arrays are assumed to be one-dimensional, contiguous in memory and of equal size; if they differ in size, the size of the smaller array is used.
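
A minimal usage sketch from Python, assuming the templated overloads are exposed under the documented name in mmu.lib._mmu_core and take positional arguments; the layout of the returned counts is illustrative, not prescribed by this signature.

    import numpy as np
    from mmu.lib import _mmu_core

    # one-dimensional, contiguous arrays of equal size
    y = np.array([0, 1, 1, 0, 1], dtype=np.int64)     # true labels
    yhat = np.array([0, 1, 0, 0, 1], dtype=np.int64)  # predicted labels

    conf_mat = _mmu_core.confusion_matrix(y, yhat)    # int64 array holding the four counts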

template <typename T1, typename T2> py::array_t<int64_t> confusion_matrix_score(const py::array_t<T1>& y, const py::array_t<T2>& score, const T2 threshold)

Where T1 is one of bool, int, int64_t, float, double and T2 is float or double.

Compute the confusion matrix given true labels y and classifier scores score.

Parameters:
  • y – true labels

  • score – classifier scores

  • threshold – inclusive classification threshold

Raises:

runtime_error – if an array is not aligned or not contiguous.

The arrays are assumed to be one-dimensional, contiguous in memory and of equal size; if they differ in size, the size of the smaller array is used.
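
A comparable sketch for the score-based overload; the scores are float64 and the threshold is inclusive, so scores equal to the threshold count as positive predictions. Names and values are illustrative.

    import numpy as np
    from mmu.lib import _mmu_core

    y = np.array([0, 1, 1, 0, 1], dtype=np.int64)
    score = np.array([0.1, 0.8, 0.5, 0.3, 0.9], dtype=np.float64)

    # scores >= 0.5 are classified as positive (inclusive threshold)
    conf_mat = _mmu_core.confusion_matrix_score(y, score, 0.5)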

template <typename T1, typename T2> py::array_t<int64_t> confusion_matrix_runs(const py::array_t<T1>& y, const py::array_t<T2>& yhat, const int obs_axis)

Where T1 and T2 are one of bool, int, int64_t, float or double.

Compute the confusion matrix given true labels y and estimated labels yhat.

Parameters:
  • y – true labels

  • yhat – predicted labels

  • obs_axis – the axis containing the observations belonging to the same run, e.g. 0 when a column contains the scores/labels for a single run.

Raises:

runtime_error – if an array is not aligned or not contiguous.

The arrays are assumed to be two-dimensional, contiguous in memory and of equal size; if they differ in size, the size of the smaller array is used.
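
A hypothetical sketch for the runs variant, assuming obs_axis is accepted as the third positional argument as listed under Parameters; with obs_axis=0 each column holds the labels of a single run.

    import numpy as np
    from mmu.lib import _mmu_core

    rng = np.random.default_rng(42)
    n_obs, n_runs = 100, 10
    y = rng.integers(0, 2, size=(n_obs, n_runs)).astype(np.int64)
    yhat = rng.integers(0, 2, size=(n_obs, n_runs)).astype(np.int64)

    # one confusion matrix per run
    conf_mats = _mmu_core.confusion_matrix_runs(y, yhat, 0)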

template <typename T1, typename T2> py::array_t<int64_t> confusion_matrix_score_runs(const py::array_t<T1>& y, const py::array_t<T2>& score, const T2 threshold, const int obs_axis)

Where T1 is one of bool, int, int64_t, float, double and T2 is float or double.

Compute the confusion matrix given true labels y and classifier scores score.

Parameters:
  • y – true labels

  • score – classifier scores

  • threshold – inclusive classification threshold

  • obs_axis – the axis containing the observations belonging to the same run, e.g. 0 when a column contains the scores/labels for a single run.

Raises:

runtime_error – if an array is not aligned or not contiguous.

The arrays are assumed to be two-dimensional, contiguous in memory and of equal size; if they differ in size, the size of the smaller array is used.
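
The same pattern with classifier scores, this time assuming obs_axis=1 so that each row holds the observations of a single run. Shapes and values are illustrative only.

    import numpy as np
    from mmu.lib import _mmu_core

    rng = np.random.default_rng(0)
    n_runs, n_obs = 10, 100
    y = rng.integers(0, 2, size=(n_runs, n_obs)).astype(np.int64)
    score = rng.random(size=(n_runs, n_obs))  # float64 scores in [0, 1)

    conf_mats = _mmu_core.confusion_matrix_score_runs(y, score, 0.5, 1)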

Binary Metrics

The binary metrics functions only operate on confusion matrices.

py::array_t<double> binary_metrics(const py::array_t<int64_t>& conf_mat, const double fill)

Computes the following metrics where [i] indicates the i’th value in the array.

  • [0] neg.precision aka Negative Predictive Value (NPV)

  • [1] pos.precision aka Positive Predictive Value (PPV)

  • [2] neg.recall aka True Negative Rate (TNR) aka Specificity

  • [3] pos.recall aka True Positive Rate (TPR) aka Sensitivity

  • [4] neg.f1 score

  • [5] pos.f1 score

  • [6] False Positive Rate (FPR)

  • [7] False Negative Rate (FNR)

  • [8] Accuracy

  • [9] MCC

Parameters:
  • conf_mat – confusion matrix

  • fill – value to set when a computed metric is undefined

Raises:

runtime_error – if an array is not aligned or not contiguous.

conf_mat should be aligned and contiguous.
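
A minimal sketch of the full pipeline, assuming the result is the length-10 double array indexed as listed above and that fill is passed positionally.

    import numpy as np
    from mmu.lib import _mmu_core

    y = np.array([0, 1, 1, 0, 1], dtype=np.int64)
    yhat = np.array([0, 1, 0, 0, 1], dtype=np.int64)

    conf_mat = _mmu_core.confusion_matrix(y, yhat)
    metrics = _mmu_core.binary_metrics(conf_mat, 0.0)  # 0.0 is used for undefined metrics

    pos_precision = metrics[1]  # [1] pos.precision (PPV)
    pos_recall = metrics[3]     # [3] pos.recall (TPR)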

py::array_t<double> binary_metrics_2d(const py::array_t<int64_t>& conf_mat, const double fill)

Computes the following metrics where [i] indicates the i’th column in the array.

  • [0] neg.precision aka Negative Predictive Value (NPV)

  • [1] pos.precision aka Positive Predictive Value (PPV)

  • [2] neg.recall aka True Negative Rate (TNR) aka Specificity

  • [3] pos.recall aka True Positive Rate (TPR) aka Sensitivity

  • [4] neg.f1 score

  • [5] pos.f1 score

  • [6] False Positive Rate (FPR)

  • [7] False Negative Rate (FNR)

  • [8] Accuracy

  • [9] MCC

Parameters:
  • conf_mat – confusion matrix

  • fill – value to set when a computed metric is undefined

Raises:

runtime_error – if an array is not aligned or not C-contiguous.

conf_mat should be aligned and C-contiguous and have shape (N, 4).
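
A sketch combining the runs variant with the 2D metrics function; the reshape to shape (N, 4) and the assumed (N, 10) result shape follow from the documented requirements and per-column indexing, not from guarantees in these signatures.

    import numpy as np
    from mmu.lib import _mmu_core

    rng = np.random.default_rng(1)
    n_obs, n_runs = 100, 10
    y = rng.integers(0, 2, size=(n_obs, n_runs)).astype(np.int64)
    score = rng.random(size=(n_obs, n_runs))

    conf_mats = _mmu_core.confusion_matrix_score_runs(y, score, 0.5, 0)
    # ensure shape (N, 4) and C-contiguity before handing it to binary_metrics_2d
    conf_mats = np.ascontiguousarray(conf_mats.reshape(-1, 4))
    metrics = _mmu_core.binary_metrics_2d(conf_mats, 0.0)  # assumed shape (n_runs, 10)

    mcc_per_run = metrics[:, 9]  # [9] MCC for each run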

py::array_t<double> binary_metrics_flattened(const py::array_t<int64_t>& conf_mat, const double fill)

Computes the following metrics where [i] indicates the i’th column in the array.

  • [0] neg.precision aka Negative Predictive Value (NPV)

  • [1] pos.precision aka Positive Predictive Value (PPV)

  • [2] neg.recall aka True Negative Rate (TNR) aka Specificity

  • [3] pos.recall aka True Positive Rate (TPR) aka Sensitivity

  • [4] neg.f1 score

  • [5] pos.f1 score

  • [6] False Positive Rate (FPR)

  • [7] False Negative Rate (FNR)

  • [8] Accuracy

  • [9] MCC

Parameters:
  • conf_mat – confusion matrix

  • fill – value to set when a computed metric is undefined

Raises:

runtime_error – if an array is not aligned or not contiguous.

conf_mat should be aligned and contiguous and have shape (N * 4).
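
A sketch for the flattened variant, where N confusion matrices are laid out consecutively in a one-dimensional array of length N * 4; the counts below are made up for illustration.

    import numpy as np
    from mmu.lib import _mmu_core

    # two confusion matrices of four counts each, flattened to length 2 * 4
    conf_mats = np.array(
        [40, 5, 3, 52,   # run 0
         38, 7, 6, 49],  # run 1
        dtype=np.int64,
    )
    metrics = _mmu_core.binary_metrics_flattened(conf_mats, 0.0)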