The first of many datasets of human faces created specifically for face detection (finding faces) rather than recognition:
- BioID Face Detection Database
1521 images of human faces, recorded under natural conditions, i.e. with varying illumination and complex backgrounds. The eye positions were set manually and are included in the set, so that the accuracy of a face detector can be calculated. A formula is presented to normalize the decision between a match and a mismatch. This is, to my knowledge, the first attempt to create a real test scenario with precise rules on how to calculate the accuracy of a face detector, open for all to compare their results in a scientific way!
- A complete revision of all eye position files was released on February 25, 2002. Visit https://www.bioid.com/facedb/ to update the dataset.
- The original article describing the database can be downloaded here.
- For comparison, the data (figure 5 of the article above) of the reference test is now available in RTF format for both the BioID-test and the XM2VTS-test.
A new addition: The BioID Face Detection Database is being used within the FGnet project of the European Working Group on face and gesture recognition. Therefore, several additional feature points have been marked up, which are very useful for facial analysis and gesture recognition. This data is also available for public download here.
- Face and Gesture Recognition Working Group FGnet
European FGnet encourages development of face and gesture recognition techniques. Among other contributions worth having a look at, they provide resources especially useful for face detection/recognition. Have a look at “Benchmark Data” to access the list of useful datasets!
- FaceScrub – A Dataset With Over 100,000 Face Images of 530 People
The FaceScrub dataset comprises a total of 107,818 face images of 530 celebrities, with about 200 images per person. As such, it is one of the largest public face detection datasets.
- WIDER FACE: A Face Detection Benchmark
The WIDER FACE dataset is a face detection benchmark. It consists of 32,203 images with 393,703 labelled faces that vary widely in scale, pose and occlusion.
- FDDB: Face Detection Data Set and Benchmark
This data set contains the annotations for 5171 faces in a set of 2845 images taken from the well-known Faces in the Wild data set.
- MALF: Multi-Attribute Labelled Faces
Contains 5,250 images with 11,931 annotated faces collected from the Internet.
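The BioID entry above mentions a formula that normalizes the match/mismatch decision. As a minimal sketch, assuming the relative eye-distance error usually cited for the BioID test (the larger of the two eye-position errors divided by the true distance between the eyes, with 0.25 as a commonly used acceptance threshold), it could look like this in Python. The function names and the coordinate convention (x, y in pixels) are illustrative assumptions, not part of the dataset.

```python
import math


def relative_eye_error(detected_left, detected_right, true_left, true_right):
    """Return max(d_left, d_right) / inter-eye distance.

    Each argument is an (x, y) pair in pixels. The ground-truth pair
    would come from the manually set eye positions of a dataset.
    """
    d_left = math.dist(detected_left, true_left)
    d_right = math.dist(detected_right, true_right)
    eye_distance = math.dist(true_left, true_right)
    if eye_distance == 0:
        raise ValueError("ground-truth eye positions must differ")
    return max(d_left, d_right) / eye_distance


def is_match(detected_left, detected_right, true_left, true_right,
             threshold=0.25):
    """Count a detection as a match if the relative error is below threshold.

    The default of 0.25 is an assumption based on common usage; adjust it
    to whatever the evaluation protocol you follow prescribes.
    """
    error = relative_eye_error(detected_left, detected_right,
                               true_left, true_right)
    return error < threshold


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    truth = ((100.0, 100.0), (160.0, 100.0))
    detection = ((103.0, 104.0), (160.0, 100.0))
    print(relative_eye_error(*detection, *truth))
    print(is_match(*detection, *truth))
```

Dividing by the inter-eye distance makes the score independent of how large the face appears in the image, which is what allows results from different detectors to be compared.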
Many other face databases are available nowadays. The current trend is to recognize faces from different views, under varying illumination, or across time (aging). Here are some that are especially useful for testing face detection performance:
- FERET Facial Recognition Technology Database
- Carnegie Mellon Test Images
- Vision Group of Essex University Face Database
- NIST Mug-shot Images Database
- ExtendedM2VTS Database
Valuable markup data for the XM2VTS database has been published by FGnet.
- VALID multimodal Database
This is an audio-visual database, supplementing the XM2VTS database, from University College Dublin. Eye centers of still face pictures are given!
- The AR-Face Database
Valuable markup data for the AR database has been published by FGnet.
- Yale Face Database
Concentrates on illumination variations.
- University Oulu Physics-Based Face Database
- Japanese Female Facial Expression (JAFFE) Database
- Caltech Faces 1999 Database
- Bao Face Database
Lots of face images, mostly of people from Asia. Single-face pictures are in the “one faces” subdirectory.
Researchers, I need your help with this Bao Face Dataset. I received it a long time ago, and many people who have used it in their work now need to contact the author for permission to use his material. Do you know the author? Can you please ask him to contact me? Thank you!
Sooner or later, you will feel the need for an average face model when trying different locating algorithms. Here are some averaged faces: