One way would be to compute a simple signal-to-noise estimate, using the pixel intensities and the ground-truth image as a mask. Here's an example of what I mean in Python:
import numpy as np
# Sample data
# Indices [0, 1, 2, 6] are background, [3, 4, 5] are object
original = np.array([1, 2, 2, 5, 6, 8, 2])
truth = np.array([0, 0, 0, 1, 1, 1, 0])
back_mask = 1 - truth
result = np.array([0, 1, 0, 4, 5, 7, 1]) # Result of background subtraction
def non_masked_mean(input_array, mask):
"""mean of non-masked elements in the array"""
return np.ma.masked_array(input_array, mask).mean()
def snr(input_array, back_mask):
    """Ratio of mean object intensity to mean background intensity"""
    with np.errstate(all='ignore'):
        ratio = non_masked_mean(input_array, back_mask) / non_masked_mean(input_array, 1 - back_mask)
    return 0 if np.isnan(ratio) else ratio
snr_before = snr(original, back_mask) # (mean of objects / mean of background), before
snr_after = snr(result, back_mask) # (mean of objects / mean of background), after
snr_ratio = snr_after / snr_before
if snr_after == np.inf:
    print("Perfect SNR in result! Either great or suspect...")
elif snr_after == 0:
    print("Image was flattened :(")
else:
    print("SNR changed by a multiple of %.2f" % snr_ratio)
The problem with this is that you get a perfect score by favouring a heavy-handed background subtraction: even if only one pixel of the object survives, the SNR will be infinite.
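Here's a minimal sketch of that failure mode, reusing the helpers above (the over-aggressive result array is hypothetical):

# Hypothetical heavy-handed result: everything zeroed except one object pixel
aggressive = np.array([0, 0, 0, 0, 6, 0, 0])
print(snr(aggressive, back_mask))  # inf, because the background mean is exactly 0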
Another approach would be to consider the background and the signal separately and weight the two results. Then you can decide whether it's more important to flatten the background or to preserve the objects:

def non_masked_mean_ratio(input_array, result_array, mask):
    """Ratio of the means of two arrays under the same mask"""
    return non_masked_mean(result_array, mask) / non_masked_mean(input_array, mask)
signal_ratio = non_masked_mean_ratio(original, result, back_mask) # (mean of objects after) / (mean of objects before)
background_ratio = non_masked_mean_ratio(original, result, truth) # (mean of background after) / (mean of background before)
weight = 0.5 # Higher weight favours preservation of objects
score = (weight*signal_ratio) + ((1-weight) * (1-background_ratio))
This gives the score a nice expected range of roughly 0 to 1, where 1 is the ideal result. It's still not perfect, though: you can inflate it by increasing the intensity of the objects in the result, as the snippet below shows.
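A quick sketch of that exploit, reusing the arrays and helpers above (the boosted result is hypothetical):

# Hypothetical "cheating" result: same background, object pixels boosted 10x
boosted = result * np.where(truth == 1, 10, 1)
boosted_signal_ratio = non_masked_mean_ratio(original, boosted, back_mask)
boosted_background_ratio = non_masked_mean_ratio(original, boosted, truth)
boosted_score = (weight * boosted_signal_ratio) + ((1 - weight) * (1 - boosted_background_ratio))
print("score: %.2f, boosted score: %.2f" % (score, boosted_score))  # 0.78 vs 4.57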
If you're expecting a binary result, you can use a similar strategy, but with the number of non-zero pixels instead of the mean:
def nonzero_count_ratio(result_array, ideal_array):
    """Fraction of the non-zero pixels in ideal_array that are also non-zero in result_array"""
    return np.count_nonzero(result_array * ideal_array) / np.count_nonzero(ideal_array)
true_obj_ratio = nonzero_count_ratio(result, truth) # Number of correct object pixels vs number of potential correct
false_back_ratio = nonzero_count_ratio(result, back_mask) # Number of false background pixels vs number of potential false
weight = 0.5
score = (weight*true_obj_ratio) + ((1-weight) * (1-false_back_ratio))
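For example, with a hypothetical thresholded result that misses one object pixel and keeps one stray background pixel:

# Hypothetical binary result: one object pixel missed, one background pixel kept
binary_result = np.array([0, 0, 0, 1, 1, 0, 1])
true_obj = nonzero_count_ratio(binary_result, truth)        # 2/3 of object pixels recovered
false_back = nonzero_count_ratio(binary_result, back_mask)  # 1/4 of background pixels survive
print("score: %.2f" % ((weight * true_obj) + ((1 - weight) * (1 - false_back))))  # score: 0.71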