Advancing the state of the art in face recognition and bridging the gap between laboratory and real-world scenarios require the availability of challenging databases. One challenging application of face recognition is surveillance, where unconstrained video is captured during both day and night (in the visible and near-infrared spectra), with multiple subjects per frame, and matched against good-quality gallery images. Because no existing database supports such cross-spectral, cross-resolution, video-to-still face recognition, this remains an open research problem. This paper presents a video database that can be used to benchmark face recognition algorithms addressing cross-spectral, cross-resolution matching. The proposed Cross-Spectral Cross-Resolution Video dataset (CSCRV) contains videos of 160 subjects and follows an open-set protocol. We present baseline results with two commercial matchers for two experimental scenarios, and observe very low performance for both matchers. We expect this dataset to help researchers develop face recognition algorithms that handle real-world surveillance scenarios.
All videos are named in the following format: 'Time_LocationID_VideoID_SubjectID1...SubjectIDn'. Here, Time refers to the time of day at which the video was captured and takes one of two values, N (night) or D (day). LocationID corresponds to the location at which the video was captured and takes one of four values: S1, S2, S3 or S4 (S1 and S2 are day-time locations, while S3 and S4 are night-time locations). VideoID is a unique ID given to each video of a location, and SubjectID is a unique ID given to each subject. For example, consider the video name N_S4_V28_67_0: N indicates a night-time video, S4 indicates that the video was captured at the fourth location, V28 is the video's unique ID, and the remaining number(s) are the IDs of the subjects present in the video. Subject ID 0 denotes subjects belonging to the open set. This nomenclature ensures that every video has a unique and informative name. The high-resolution still images are named SubjectID_1, SubjectID_2 and SubjectID_3 for each subject.
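The naming scheme above can be parsed mechanically. The following is a minimal sketch, not part of the dataset release: it assumes the fields are separated by underscores as in the stated format, that any file extension has already been stripped, and that VideoID is the letter V followed by digits (inferred from the single example). The names `VideoName` and `parse_video_name` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Location IDs as described in the text: S1/S2 are day-time, S3/S4 night-time.
DAY_LOCATIONS = {"S1", "S2"}
NIGHT_LOCATIONS = {"S3", "S4"}


@dataclass
class VideoName:
    """Parsed fields of a video name (hypothetical helper type)."""
    time: str               # "D" or "N"
    location_id: str        # "S1" .. "S4"
    video_id: str           # e.g. "V28" (format assumed from the example)
    subject_ids: List[int]  # 0 marks open-set subjects

    @property
    def has_open_set_subject(self) -> bool:
        return 0 in self.subject_ids


def parse_video_name(name: str) -> VideoName:
    """Split 'Time_LocationID_VideoID_SubjectID1...SubjectIDn' into fields."""
    parts = name.split("_")
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(parts)}: {name!r}")
    time, location_id, video_id, *subjects = parts

    if time not in {"D", "N"}:
        raise ValueError(f"unknown time field: {time!r}")
    # Check that the location is consistent with the time of day.
    allowed = DAY_LOCATIONS if time == "D" else NIGHT_LOCATIONS
    if location_id not in allowed:
        raise ValueError(f"location {location_id!r} does not match time {time!r}")
    # Assumption: video IDs look like 'V' followed by digits.
    if not (video_id.startswith("V") and video_id[1:].isdigit()):
        raise ValueError(f"unexpected video ID: {video_id!r}")
    if not all(s.isdigit() for s in subjects):
        raise ValueError(f"non-numeric subject ID in: {name!r}")

    return VideoName(time, location_id, video_id, [int(s) for s in subjects])


if __name__ == "__main__":
    info = parse_video_name("N_S4_V28_67_0")
    print(info)
    print("contains open-set subject:", info.has_open_set_subject)
```

Validating the location against the time field is a design choice of this sketch; it catches mislabeled names early, but it can be dropped if only the raw fields are needed.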
The dataset also includes annotated frames with a bounding box for every face in each frame (68,410 faces in total), named according to the nomenclature described above. Along with the loosely cropped face images, each subject's three high-resolution still images are part of the release. A small set of non-overlapping videos acquired under the same setup is also provided as a training set for learning-based experiments.
NOTE: To obtain the database link and license agreement file for research purposes, please send an email to firstname.lastname@example.org with your credentials.