The task performed by this benchmark is the recognition and tracking of an approximately specified 2½-dimensional "mobile" sculpture moving in a cluttered environment, given a series of synthetic images from simulated intensity and range sensors.
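To make the simulated-sensor setup concrete, the following is a minimal sketch of how a synthetic range image of one model surface might be produced under orthographic projection. It assumes an axis-aligned rectangular surface and a simple depth buffer; the function and variable names are illustrative and are not taken from the benchmark's actual source code.

```python
import numpy as np

def render_range_image(corners, grid=64):
    """Orthographically project one planar rectangle into a range image.

    corners: 4x3 array of (x, y, z) corner coordinates. Under orthographic
    projection, (x, y) map directly to image coordinates and z is the
    recovered range. For simplicity this sketch assumes the rectangle
    projects to an axis-aligned box in the image plane.
    """
    img = np.full((grid, grid), np.inf)  # background = no return
    # Plane through the first three corners: n . p = d
    p0, p1, p2 = corners[0], corners[1], corners[2]
    n = np.cross(p1 - p0, p2 - p0)
    d = n @ p0
    xs, ys = corners[:, 0], corners[:, 1]
    for i in range(grid):
        for j in range(grid):
            x, y = j + 0.5, i + 0.5  # pixel center
            if xs.min() <= x <= xs.max() and ys.min() <= y <= ys.max():
                if abs(n[2]) > 1e-9:  # surface not viewed edge-on
                    z = (d - n[0] * x - n[1] * y) / n[2]
                    img[i, j] = min(img[i, j], z)  # nearer surface wins
    return img
```

A full simulator would render every surface in the scene this way, add sensor noise, and produce a registered intensity image alongside the range data.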
The DARPA Image Understanding Benchmark was designed to evaluate parallel architectures as applied to knowledge-based machine vision. Previous vision benchmarks considered only execution times for isolated vision-related tasks, or a very simple image processing scenario. However, the performance of an image interpretation system depends upon a wide range of operations at different levels of representation, from processing arrays of pixels, through manipulation of extracted image events, to symbolic processing of stored models. Vision is also characterized by both bottom-up (image-based) and top-down (model-directed) processing, so the costs of interactions between tasks, input and output, and system overhead must all be taken into consideration. The static DARPA IU benchmark therefore addressed the issue of system performance on an integrated set of tasks.
The static benchmark consists of a model-based object recognition problem, given two sources of sensory input, intensity and range data, and a database of candidate models. The models consist of configurations of rectangular surfaces, floating in space, viewed under orthographic projection, in the presence of both noise and spurious non-model surfaces. A partially ordered sequence of operations is specified that solves the problem, along with a recommended algorithmic method for each step. In addition to the total time and the final solution, timings were requested for each component operation, and intermediate results were to be output as a check on accuracy. Other factors, such as programming time, language, code size, and machine configuration, were also reported. As a result, the benchmark could be used to gain insight into processor strengths and weaknesses, and to guide the development of future parallel vision architectures.
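The reporting protocol above (per-step timings plus intermediate results) can be sketched as a small harness that runs the component operations in their specified order. This is a hypothetical illustration, assuming each step is a function of the previous step's output; the step names and structure are not taken from the benchmark source.

```python
import time

def run_benchmark(steps, data):
    """Run an ordered list of component operations, timing each one and
    recording its intermediate result, in the spirit of the static
    benchmark's reporting requirements.

    steps: list of (name, function) pairs, where each function consumes
    the previous step's output. Returns the final solution, the per-step
    timings, and the total time.
    """
    timings, intermediates = {}, {}
    result = data
    for name, op in steps:
        t0 = time.perf_counter()
        result = op(result)
        timings[name] = time.perf_counter() - t0
        intermediates[name] = result  # kept for accuracy checks
    total = sum(timings.values())
    return result, timings, total

# Hypothetical usage with placeholder operations standing in for the
# real steps (e.g. smoothing, surface extraction, model matching):
steps = [("smooth", lambda x: x * 2), ("match", lambda x: x + 1)]
solution, timings, total = run_benchmark(steps, 5)
```

On a parallel architecture, each timed step would dispatch to the machine's native implementation, so the per-step timings expose which levels of representation the architecture handles well.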
The source code available here was used to run the benchmark on a Unix workstation with a sample set of test images and models. The bibliography provides pointers to more detailed information on the benchmark.