Evaluation
class Evaluation
Calculates performance metrics for trained models.
Loads the best model (by validation accuracy) from the models directory in the job directory. All metrics and graphs are based on test_samples.json in the job directory. Plots are only shown if the number of classes is 20 or fewer.
Attributes
- image_dir: Path of image directory.
- job_dir: Path to job directory with samples.
- batch_size: Number of images per batch (default 64).
- base_model_name: Name of pretrained CNN (default MobileNet).
__init__
def __init__(image_dir, job_dir, batch_size, base_model_name, **kwargs)
Initializes the evaluation component.
Loads the best model from the job directory. Creates an evaluation directory if the app was started from the command line.
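To illustrate what "best model (by validation accuracy)" could mean in practice, here is a minimal sketch that picks a checkpoint from a list of file names. The file name pattern `model_<name>_<epoch>_<val_acc>.hdf5` and the helper `pick_best_checkpoint` are assumptions made for this example; they are not confirmed by this reference.

```python
import re
from typing import Iterable, Optional

# Hypothetical checkpoint naming scheme (an assumption for this sketch):
#   model_<base_model_name>_<epoch>_<val_acc>.hdf5
CHECKPOINT_PATTERN = re.compile(r"^model_.+_(\d+)_(\d+\.\d+)\.hdf5$")


def pick_best_checkpoint(filenames: Iterable[str]) -> Optional[str]:
    """Return the file name with the highest validation accuracy, or None."""
    best_name = None
    best_key = None
    for name in filenames:
        match = CHECKPOINT_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        # Compare by accuracy first; on ties the later epoch wins.
        key = (float(match.group(2)), int(match.group(1)))
        if best_key is None or key > best_key:
            best_name, best_key = name, key
    return best_name


checkpoints = [
    "model_mobilenet_01_0.710.hdf5",
    "model_mobilenet_02_0.845.hdf5",
    "model_mobilenet_03_0.802.hdf5",
    "notes.txt",
]
best = pick_best_checkpoint(checkpoints)
```

Encoding the metric in the file name keeps selection independent of any training log, at the cost of relying on a naming convention.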
get_correct_wrong_examples
def get_correct_wrong_examples(label)
Gets correctly and wrongly predicted samples for a given label.
Args
- label: int or str (label for which the predictions should be considered).
Returns
- (correct, wrong): Tuple of two image lists.
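The split into correct and wrong samples can be sketched as follows. This is a simplified illustration, not the library's implementation: it assumes that `label` refers to the true label of a sample, and the helper `split_correct_wrong` and its inputs are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

Label = Union[int, str]


def split_correct_wrong(
    samples: Sequence[str],
    y_true: Sequence[Label],
    y_pred: Sequence[Label],
    label: Label,
) -> Tuple[List[str], List[str]]:
    """Split samples whose true label equals `label` into correct and wrong."""
    correct: List[str] = []
    wrong: List[str] = []
    for sample, true, pred in zip(samples, y_true, y_pred):
        if true != label:
            continue  # only consider samples of the requested label
        if pred == true:
            correct.append(sample)
        else:
            wrong.append(sample)
    return correct, wrong


# Hypothetical test samples with true and predicted labels.
samples = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
y_true = ["cat", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "cat"]

correct, wrong = split_correct_wrong(samples, y_true, y_pred, "cat")
```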
visualize_images
def visualize_images(image_list, title, show_heatmap, n_plot)
Visualizes images in a sample list.
Args
- image_list: sample list.
- show_heatmap: boolean (generates a gradient-based class activation map (Grad-CAM), default False).
- n_plot: maximum number of plots to be shown (default 20).
run
def run(report_create, report_kernel_name, report_export_html, report_export_pdf)
Runs the evaluation pipeline on the best model found in the job directory for the specific test set:
- Makes predictions on the test set
- Plots the test set distribution
- Plots the classification report (accuracy, precision, recall)
- Plots the confusion matrix (on precision and on recall)
- Plots correct and wrong examples
If not in IPython mode, an evaluation report is created.
Args
- report_create: boolean (creates IPython kernel).
- report_kernel_name: str (name of IPython kernel).
- report_export_html: boolean (exports report to HTML).
- report_export_pdf: boolean (exports report to PDF).
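A confusion matrix "on precision" versus "on recall" can be read as the same count matrix normalized along different axes. The sketch below shows one way to compute both in plain Python. It assumes integer class indices, rows for true classes and columns for predicted classes; the function names are illustrative and not part of this class.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> List[List[int]]:
    """Count matrix with true classes as rows and predicted classes as columns."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(y_true, y_pred):
        matrix[true][pred] += 1
    return matrix


def normalize_on_recall(matrix: List[List[int]]) -> List[List[float]]:
    """Divide each row by its sum; the diagonal then holds per-class recall."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([count / total if total else 0.0 for count in row])
    return result


def normalize_on_precision(matrix: List[List[int]]) -> List[List[float]]:
    """Divide each column by its sum; the diagonal then holds per-class precision."""
    column_totals = [sum(column) for column in zip(*matrix)]
    return [
        [count / total if total else 0.0 for count, total in zip(row, column_totals)]
        for row in matrix
    ]


# Hypothetical predictions for a three-class test set.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]

counts = confusion_matrix(y_true, y_pred, n_classes=3)
recall_matrix = normalize_on_recall(counts)
precision_matrix = normalize_on_precision(counts)
accuracy = sum(counts[i][i] for i in range(3)) / len(y_true)
```

In this example class 0 has perfect precision but a recall of 2/3, which is the kind of asymmetry the two normalizations make visible.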