class Evaluation

Calculates performance metrics for trained models.

Loads the best model (by validation accuracy) from the models directory in the job directory. All metrics and graphs are based on test_samples.json in the job directory. Plots are shown only if the number of classes is 20 or fewer.

  • image_dir: Path of image directory.

  • job_dir: Path to job directory with samples.

  • batch_size: Number of images per batch (default 64).

  • base_model_name: Name of pretrained CNN (default MobileNet).
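
The class-count rule for plots can be illustrated with a small standalone sketch. It does not use the Evaluation class itself. It assumes that test_samples.json holds a list of records with a "label" field; the record layout, the helper names, and the inline JSON string are hypothetical and only serve the example.

```python
import json
from collections import Counter

# Threshold stated in the class description: plots only for 20 classes or fewer.
MAX_CLASSES_FOR_PLOTS = 20


def class_distribution(samples):
    """Count how many samples belong to each label."""
    return Counter(sample["label"] for sample in samples)


def should_plot(distribution, max_classes=MAX_CLASSES_FOR_PLOTS):
    """Return True if the number of distinct classes allows plotting."""
    return len(distribution) <= max_classes


# Hypothetical stand-in for the content of test_samples.json (assumed layout).
raw = (
    '[{"image_id": "a.jpg", "label": "cat"},'
    ' {"image_id": "b.jpg", "label": "dog"},'
    ' {"image_id": "c.jpg", "label": "cat"}]'
)
samples = json.loads(raw)
distribution = class_distribution(samples)

print(dict(distribution))
print(should_plot(distribution))
```

With two classes in the sample data the check passes; a test set with 21 or more distinct labels would skip the plots.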


def __init__(image_dir, job_dir, batch_size, base_model_name, **kwargs)

Inits evaluation component.

Loads the best model from the job directory. Creates the evaluation directory if the app was started from the command line.


def get_correct_wrong_examples(label)

Gets correctly and wrongly predicted samples for a given label.

  • label: int or str (label for which the predictions should be considered).
  • Returns (correct, wrong): tuple of two image lists, the correctly predicted samples and the wrongly predicted samples.
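
A minimal sketch of the split described above, written as a standalone function rather than the method itself. It assumes that "for a given label" refers to the true label of a sample, and that each sample carries hypothetical "label" and "predicted" fields; neither assumption is confirmed by this reference.

```python
def split_correct_wrong(samples, label):
    """Split the samples whose true label is `label` into correct and wrong predictions.

    Assumes each sample is a dict with the hypothetical keys
    "label" (true class) and "predicted" (model output).
    """
    correct = []
    wrong = []
    for sample in samples:
        if sample["label"] != label:
            continue  # only consider samples of the requested class
        if sample["predicted"] == label:
            correct.append(sample)
        else:
            wrong.append(sample)
    return correct, wrong


# Illustrative data, not taken from any real job directory.
samples = [
    {"image_id": "a.jpg", "label": "cat", "predicted": "cat"},
    {"image_id": "b.jpg", "label": "cat", "predicted": "dog"},
    {"image_id": "c.jpg", "label": "dog", "predicted": "dog"},
]

correct, wrong = split_correct_wrong(samples, "cat")
print([s["image_id"] for s in correct])
print([s["image_id"] for s in wrong])
```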


def visualize_images(image_list, title, show_heatmap, n_plot)

Visualizes images in a sample list.

  • image_list: sample list.

  • show_heatmap: boolean (generates a gradient-based class activation map (Grad-CAM), default False).

  • n_plot: maximum number of plots to be shown (default 20).


def run(report_create, report_kernel_name, report_export_html, report_export_pdf)

Runs the evaluation pipeline on the best model found in the job directory for the specific test set:

  • Makes prediction on test set

  • Plots test set distribution

  • Plots classification report (accuracy, precision, recall)

  • Plots confusion matrix (on precision and on recall)

  • Plots correct and wrong examples
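
The confusion matrix step can be sketched in plain Python. This is an illustration under one assumption: that "on precision" means normalizing each predicted-class column and "on recall" means normalizing each true-class row. The function names and data are hypothetical and are not part of the Evaluation API.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a count matrix with rows = true class, columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its sum: the diagonal then holds per-class recall."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([value / total if total else 0.0 for value in row])
    return result


def normalize_columns(matrix):
    """Divide each column by its sum: the diagonal then holds per-class precision."""
    totals = [sum(column) for column in zip(*matrix)]
    return [
        [value / totals[j] if totals[j] else 0.0 for j, value in enumerate(row)]
        for row in matrix
    ]


# Illustrative predictions for a two-class test set.
labels = ["cat", "dog"]
y_true = ["cat", "cat", "cat", "dog", "dog"]
y_pred = ["cat", "cat", "dog", "dog", "dog"]

cm = confusion_matrix(y_true, y_pred, labels)
recall_cm = normalize_rows(cm)
precision_cm = normalize_columns(cm)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)

print(cm)
print(recall_cm)
print(precision_cm)
print(accuracy)
```

Reading the two normalized matrices side by side shows where errors come from: a low diagonal value in the row-normalized matrix means a class is often missed, while a low diagonal value in the column-normalized matrix means predictions of that class are often wrong.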

If not in IPython mode, an evaluation report is created.

  • report_create: boolean (create ipython kernel)

  • report_kernel_name: str (name of ipython kernel)

  • report_export_html: boolean (exports report to html).

  • report_export_pdf: boolean (exports report to pdf).