Evaluation

class Evaluation

Calculates performance metrics for trained models.

Loads the best model (by validation accuracy) from the models directory inside the job directory. All metrics and graphs are based on test_samples.json in the job directory. Plots are only shown if the number of classes is 20 or less.

Attributes
  • image_dir: Path of image directory.

  • job_dir: Path to job directory with samples.

  • batch_size: Number of images per batch (default 64).

  • base_model_name: Name of pretrained CNN (default MobileNet).

__init__

def __init__(image_dir, job_dir, batch_size, base_model_name, **kwargs)

Initializes the evaluation component.

Loads the best model from the job directory. Creates an evaluation directory if the app was started from the command line. A minimal construction sketch follows.
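The sketch below shows how the component might be instantiated; the import path and directory paths are assumptions, and batch_size and base_model_name are set to their documented defaults.

from yourpackage.evaluation import Evaluation  # hypothetical import path; adjust to your installation

evaluation = Evaluation(
    image_dir='data/images',       # directory containing the image files
    job_dir='data/job_dir',        # directory with test_samples.json and the models directory
    batch_size=64,                 # default
    base_model_name='MobileNet',   # default
)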

get_correct_wrong_examples

def get_correct_wrong_examples(label)

Gets correctly and wrongly predicted samples for a given label.

Args
  • label: int or str (label for which the predictions should be considered).
Returns
  • (correct, wrong): Tuple of two image lists.
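A minimal usage sketch, continuing from the evaluation object constructed above; the label value 0 is a placeholder.

# The label may be given as an int index or a str class name.
correct, wrong = evaluation.get_correct_wrong_examples(label=0)
print(len(correct), 'correct and', len(wrong), 'wrong predictions for label 0')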

visualize_images

def visualize_images(image_list, title, show_heatmap, n_plot)

Visualizes images in a sample list.

Args
  • image_list: list of samples to visualize.

  • title: str (title of the plot).

  • show_heatmap: boolean (generates a gradient-based class activation map (Grad-CAM), default False).

  • n_plot: maximum number of plots to be shown (default 20).
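A minimal usage sketch, reusing the wrong list returned by get_correct_wrong_examples above; the title string and plot count are placeholders.

# Show up to 10 wrongly classified samples with Grad-CAM heatmaps.
evaluation.visualize_images(wrong, title='Wrongly classified', show_heatmap=True, n_plot=10)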

run

def run(report_create, report_kernel_name, report_export_html, report_export_pdf)

Runs the evaluation pipeline on the best model found in the job directory for the given test set:

  • Makes prediction on test set

  • Plots test set distribution

  • Plots classification report (accuracy, precision, recall)

  • Plots confusion matrix (on precision and on recall)

  • Plots correct and wrong examples

If not run in IPython mode, an evaluation report is created.

Args
  • report_create: boolean (creates an IPython kernel for the report).

  • report_kernel_name: str (name of the IPython kernel).

  • report_export_html: boolean (exports the report to HTML).

  • report_export_pdf: boolean (exports the report to PDF).
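A minimal usage sketch for running the full pipeline; the kernel name is a placeholder.

# Run the evaluation pipeline and export the generated report to HTML.
evaluation.run(
    report_create=True,
    report_kernel_name='python3',   # placeholder kernel name
    report_export_html=True,
    report_export_pdf=False,
)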