Task Eval

This module currently contains a single, simple function for evaluating sequence and token classification tasks, computing accuracy and macro F1 score.
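As a rough illustration of what such an evaluation involves, the sketch below computes accuracy and macro F1 over token-level predictions while masking out ignored label IDs. The helper name and the use of scikit-learn are assumptions for illustration, not the module's actual implementation.

```python
# Illustrative sketch only: how accuracy and macro F1 could be computed
# while ignoring special label IDs such as -100. The helper name and the
# use of scikit-learn are assumptions, not this module's actual code.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score


def compute_metrics(labels: np.ndarray, predictions: np.ndarray,
                    ignore_token_ids=(-100,)) -> dict:
    # Flatten (batch, seq_len) arrays and drop positions whose gold label
    # is one of the ignored IDs (e.g. padding / special tokens).
    labels, predictions = labels.reshape(-1), predictions.reshape(-1)
    mask = ~np.isin(labels, list(ignore_token_ids))
    labels, predictions = labels[mask], predictions[mask]

    return {
        "accuracy": accuracy_score(labels, predictions),
        "macro_f1": f1_score(labels, predictions, average="macro"),
    }
```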

Task Eval Module Documentation

Implementation of evaluation logic.

nlp_uncertainty_zoo.utils.task_eval.evaluate_task(model, eval_split: DataLoader, ignore_token_ids: Tuple[int] = (-100,), verbose: bool = True) -> Dict[str, float]

Evaluate a model and save predictions (if applicable).

Parameters:
model: Model

Model to be evaluated.

eval_split: DataLoader

Data split the model is being evaluated on.

ignore_token_ids: Tuple[int]

IDs of tokens that should be ignored during evaluation.

verbose: bool

Whether to display information about the current progress.

Returns:
Dict[str, float]

Dictionary mapping metric names to scores on the evaluation split.
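A minimal usage sketch, assuming a trained Model instance from this package and a PyTorch DataLoader over the evaluation data; the dataset and model construction shown here are placeholders, not prescribed by the module.

```python
# Minimal usage sketch; `model` and `eval_dataset` are placeholders for a
# trained Model instance and a tokenized evaluation dataset, respectively.
from torch.utils.data import DataLoader
from nlp_uncertainty_zoo.utils.task_eval import evaluate_task

eval_loader = DataLoader(eval_dataset, batch_size=32)

scores = evaluate_task(
    model,                      # trained Model instance to evaluate
    eval_split=eval_loader,     # DataLoader over the evaluation data
    ignore_token_ids=(-100,),   # label IDs excluded from evaluation
    verbose=True,               # print progress information
)
print(scores)  # dictionary mapping metric names to scores
```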