Uncertainty Eval
================

The quality of uncertainty estimates can be tricky to evaluate, since there are no gold labels as in a classification task; this module therefore provides methods for exactly this purpose. To measure the quality of general uncertainty estimates, we use the common evaluation method of defining a proxy OOD detection task, where we quantify how well in- and out-of-distribution inputs can be distinguished based on the uncertainty scores given by a model. This is realized using the area under the receiver operating characteristic curve (:py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.auroc`) and the area under the precision-recall curve (:py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.aupr`). Following `Ulmer et al. (2022) `_, we also evaluate how indicative the uncertainty score is of the potential loss of a model. To this end, the module also implements Kendall's tau correlation coefficient (:py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.kendalls_tau`). Instances of the :py:class:`nlp_uncertainty_zoo.models.model.Model` class can be evaluated with all of the above metrics in a single call to :py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.evaluate_uncertainty`. Illustrative sketches of the individual metrics are given in the examples at the end of this page.

Uncertainty Eval Module Documentation
=====================================

.. automodule:: nlp_uncertainty_zoo.utils.uncertainty_eval
    :members:
    :show-inheritance:
    :undoc-members:
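
Examples
========

The OOD detection proxy task can be illustrated with a minimal sketch: out-of-distribution inputs are treated as the positive class and the per-input uncertainty scores as detection scores. The snippet below uses the equivalent scikit-learn metrics rather than the module's own :py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.auroc` and :py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.aupr` functions, and the uncertainty scores are made-up placeholder values rather than outputs of an actual model.

.. code-block:: python

    import numpy as np
    from sklearn.metrics import average_precision_score, roc_auc_score

    # Placeholder uncertainty scores; in practice these come from a model,
    # e.g. predictive entropy or mutual information per input.
    id_scores = np.array([0.1, 0.3, 0.2, 0.4])   # in-distribution inputs
    ood_scores = np.array([0.7, 0.5, 0.9, 0.6])  # out-of-distribution inputs

    # Binary labels for the proxy task: 1 = OOD (positive class), 0 = ID.
    labels = np.concatenate([np.zeros_like(id_scores), np.ones_like(ood_scores)])
    scores = np.concatenate([id_scores, ood_scores])

    # Higher uncertainty should indicate OOD inputs; both metrics are 1.0 for
    # perfect separation, while random scores yield roughly 0.5 (AUROC) or the
    # positive-class ratio (AUPR).
    print(f"AUROC: {roc_auc_score(labels, scores):.2f}")
    print(f"AUPR:  {average_precision_score(labels, scores):.2f}")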
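
How indicative an uncertainty score is of a model's loss can be sketched in the same spirit with ``scipy.stats.kendalltau``, which measures the same rank correlation as :py:func:`nlp_uncertainty_zoo.utils.uncertainty_eval.kendalls_tau`. The per-input losses and uncertainty scores below are again placeholder values, not outputs of an actual model.

.. code-block:: python

    import numpy as np
    from scipy.stats import kendalltau

    # Placeholder values: per-input uncertainty scores and the corresponding
    # per-input losses of a model (e.g. token-averaged cross-entropy).
    uncertainties = np.array([0.2, 0.8, 0.5, 0.1, 0.9])
    losses = np.array([0.3, 1.2, 0.6, 0.2, 1.5])

    # Kendall's tau is a rank correlation in [-1, 1]: values close to 1 mean
    # that higher uncertainty tends to coincide with higher loss, i.e. the
    # uncertainty score is indicative of where the model makes mistakes.
    tau, p_value = kendalltau(uncertainties, losses)
    print(f"Kendall's tau: {tau:.2f} (p = {p_value:.3f})")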