CAVIAR4REID dataset


Custom pictorial structures for re-identification
D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, V. Murino
In British Machine Vision Conference (BMVC), 2011

Details

CAVIAR4REID is a dataset for evaluating person re-identification algorithms. As the name suggests, it has been extracted from the CAVIAR dataset, which is best known for the evaluation of person tracking and detection (available here).

Download CAVIAR4REID

How we have built it

The original dataset, CAVIAR, consists of several sequences filmed in the entrance lobby of the INRIA Labs and in a shopping centre in Lisbon. We selected the shopping centre scenario because the recording is less controlled and the cameras are better placed (in the INRIA Labs scenario the camera is mounted overhead, which is not a typical setting for re-identification). The shopping centre dataset contains 26 sequences recorded from two different points of view at a resolution of 384 × 288 pixels. It includes people walking alone, meeting with others, window shopping, and entering and exiting shops. We used the ground truth to extract the bounding box of each pedestrian. We then manually selected a total of 72 pedestrians: 50 of them appear in both camera views and the remaining 22 in only one. For each pedestrian, we carefully selected a set of images for each camera view (where available) so as to maximize the variance with respect to resolution changes, lighting conditions, occlusions, and pose changes, in order to make the re-identification task challenging.

Why Yet Another Re-identification Dataset

There are many publicly available datasets, such as VIPeR, ETHZ and iLIDS. However, none of them mirrors a real re-identification scenario. Even though VIPeR is one of the most promising and challenging datasets in this field, it contains only one image of each individual per view. Nowadays we have powerful tools for person tracking, so we can accumulate images over time to make the problem easier. We therefore need a dataset that contains multiple images of each person. iLIDS and ETHZ seem to address this problem; however, their images are not captured from different cameras, and the images of the same person do not vary much in terms of resolution, lighting, pose, and so on.
Among these datasets, the only one that meets these requirements is CAVIAR4REID: 1) It has broad changes in resolution: the minimum and maximum image sizes are 17 × 39 and 72 × 144 pixels, respectively. 2) Unlike ETHZ and iLIDS, it is extracted from a real scenario where re-identification is necessary due to the presence of multiple cameras. 3) Pose variations are severe. 4) Unlike VIPeR, it contains more than one image for each view. 5) It covers the union of all the image variations found in the other datasets.

Technical info

The zip file contains the dataset as a set of images. For each person there is a set of 5 or 10 images. The filename identifies which person each image belongs to:
XXXXYYY.jpg
XXXX = identifier of the person
YYY = identifier of the image for that specific person
E.g.: 0003005.jpg is the 5th image of the 3rd person
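
The naming scheme above can be decoded with a few lines of Python. The following is only a minimal sketch, not part of the dataset distribution: the function names parse_filename and group_by_person are hypothetical, and it assumes the person and image identifiers are zero-padded decimal numbers, as in the example 0003005.jpg.

```python
import re
from collections import defaultdict
from pathlib import Path

# XXXXYYY.jpg: 4 digits for the person, 3 digits for the image.
# Assumption: both fields are zero-padded decimal numbers.
FILENAME_PATTERN = re.compile(r"([0-9]{4})([0-9]{3})\.jpg")


def parse_filename(name):
    """Return (person_id, image_id) for a name such as '0003005.jpg'."""
    match = FILENAME_PATTERN.fullmatch(Path(name).name)
    if match is None:
        raise ValueError(f"not a valid XXXXYYY.jpg filename: {name!r}")
    return int(match.group(1)), int(match.group(2))


def group_by_person(filenames):
    """Map each person identifier to the sorted list of its image files."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        person_id, _ = parse_filename(name)
        groups[person_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    person_id, image_id = parse_filename("0003005.jpg")
    print(person_id, image_id)
```

To group the images of an unpacked archive by person, one could pass the names found in the extraction folder (whatever its location) to group_by_person, for example the strings produced by iterating over Path(folder).glob("*.jpg").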