Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: Closing the gap to human-level performance in face verification. In: IEEE CVPR, 2014.
"... In modern face recognition, the conventional pipeline consists of four stages: detect ⇒ align ⇒ represent ⇒ clas-sify. We revisit both the alignment step and the representa-tion step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face represe ..."
Abstract
-
Cited by 103 (4 self)
- Add to MetaCart
(Show Context)
In modern face recognition, the conventional pipeline consists of four stages: detect ⇒ align ⇒ represent ⇒ classify. We revisit both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network. This deep network involves more than 120 million parameters using several locally connected layers without weight sharing, rather than the standard convolutional layers. Thus we trained it on the largest facial dataset to-date, an identity labeled dataset of four million facial images belonging to more than 4,000 identities. The learned representations coupling the accurate model-based alignment with the large facial database generalize remarkably well to faces in unconstrained environments, even with a simple classifier. Our method reaches an accuracy of 97.25% on the Labeled Faces in the Wild (LFW) dataset, reducing the error of the current state of the art by more than 25%, closely approaching human-level performance.
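The locally connected layers without weight sharing that the abstract contrasts with standard convolutions can be sketched in a few lines. The NumPy code below only illustrates that distinction (filter size and image size are arbitrary choices); it is not the DeepFace implementation.

# Hedged sketch: a convolutional layer shares one filter across all positions,
# while a locally connected layer has a distinct filter per output position.
# Illustrative only; shapes and names are assumptions, not the authors' code.
import numpy as np

def conv2d_valid(x, w):
    """Standard convolution: one (k, k) filter shared across the image."""
    k = w.shape[0]
    out = np.empty((x.shape[0] - k + 1, x.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def locally_connected2d(x, w_per_pos):
    """Locally connected layer: w_per_pos[i, j] is a separate (k, k) filter
    for output position (i, j), so no weights are shared spatially."""
    k = w_per_pos.shape[2]
    out = np.empty(w_per_pos.shape[:2])
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w_per_pos[i, j])
    return out

x = np.random.randn(32, 32)
shared = conv2d_valid(x, np.random.randn(5, 5))                 # 25 weights
local = locally_connected2d(x, np.random.randn(28, 28, 5, 5))   # 28*28*25 weights

The per-position filters explain why such layers drive the parameter count into the hundreds of millions on large aligned faces.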
Deep learning face representation by joint identification-verification. In: Advances in Neural Information Processing Systems, 2014.
"... The key challenge of face recognition is to develop effective feature repre-sentations for reducing intra-personal variations while enlarging inter-personal differences. In this paper, we show that it can be well solved with deep learning and using both face identification and verification signals a ..."
Abstract
-
Cited by 22 (3 self)
- Add to MetaCart
(Show Context)
The key challenge of face recognition is to develop effective feature representations for reducing intra-personal variations while enlarging inter-personal differences. In this paper, we show that it can be well solved with deep learning and using both face identification and verification signals as supervision. The Deep IDentification-verification features (DeepID2) are learned with carefully designed deep convolutional networks. The face identification task increases the inter-personal variations by drawing DeepID2 extracted from different identities apart, while the face verification task reduces the intra-personal variations by pulling DeepID2 extracted from the same identity together, both of which are essential to face recognition. The learned DeepID2 features can be well generalized to new identities unseen in the training data. On the challenging LFW dataset [11], 99.15% face verification accuracy is achieved. Compared with the best deep learning result [21] on LFW, the error rate has been significantly reduced by 67%.
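The two supervision signals the abstract describes are commonly realized as a softmax identification loss plus a contrastive verification loss on feature pairs. The sketch below follows that reading; the margin m and the weight lam are illustrative assumptions, not the paper's settings.

# Hedged sketch of joint identification + verification supervision on features.
import numpy as np

def identification_loss(feat, W, b, label):
    """Softmax cross-entropy over identities for one feature vector."""
    logits = W @ feat + b
    logits -= logits.max()                           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

def verification_loss(f_i, f_j, same_identity, m=1.0):
    """Contrastive loss: pull same-identity pairs together, push
    different-identity pairs at least margin m apart."""
    d = np.linalg.norm(f_i - f_j)
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, m - d) ** 2

def joint_loss(f_i, f_j, y_i, y_j, W, b, lam=0.05):
    ident = identification_loss(f_i, W, b, y_i) + identification_loss(f_j, W, b, y_j)
    verif = verification_loss(f_i, f_j, same_identity=(y_i == y_j))
    return ident + lam * verif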
Cao, X., Wipf, D., Wen, F., Duan, G., Sun, J.: A practical transfer learning algorithm for face verification. In: ICCV, 2013.
"... Face verification involves determining whether a pair of facial images belongs to the same or different subjects. This problem can prove to be quite challenging in many im-portant applications where labeled training data is scarce, e.g., family album photo organization software. Herein we propose a ..."
Abstract
-
Cited by 17 (0 self)
- Add to MetaCart
(Show Context)
Face verification involves determining whether a pair of facial images belongs to the same or different subjects. This problem can prove to be quite challenging in many important applications where labeled training data is scarce, e.g., family album photo organization software. Herein we propose a principled transfer learning approach for merging plentiful source-domain data with limited samples from some target domain of interest to create a classifier that ideally performs nearly as well as if rich target-domain data were present. Based upon a surprisingly simple generative Bayesian model, our approach combines a KL-divergence-based regularizer/prior with a robust likelihood function, leading to a scalable implementation via the EM algorithm. As justification for our design choices, we later use principles from convex analysis to recast our algorithm as an equivalent structured rank minimization problem, leading to a number of interesting insights related to solution structure and feature-transform invariance. These insights help both to explain the effectiveness of our algorithm and to elucidate a wide variety of related Bayesian approaches. Experimental testing with challenging datasets validates the utility of the proposed algorithm.
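The KL-divergence regularizer/prior mentioned in the abstract ties scarce target-domain parameter estimates to plentiful source-domain ones. The following is a generic illustration of such a term for Gaussian parameters; it does not reproduce the paper's actual generative model or EM updates.

# Hedged sketch: KL divergence between two multivariate Gaussians used as a
# regularizer that keeps target-domain parameters near source-domain ones.
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) )."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def regularized_objective(neg_log_lik, mu_t, cov_t, mu_s, cov_s, lam=1.0):
    """Target-domain fit plus a penalty for drifting from the source domain."""
    return neg_log_lik + lam * kl_gaussian(mu_t, cov_t, mu_s, cov_s)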
Surpassing human-level face verification performance on LFW with GaussianFace. 2014.
"... Face verification remains a challenging problem in very complex conditions with large variations such as pose, illumination, expression, and occlusions. This problem is exacerbated when we rely unrealistically on a single training data source, which is often insufficient to cover the intrinsically c ..."
Abstract
-
Cited by 11 (2 self)
- Add to MetaCart
(Show Context)
Face verification remains a challenging problem in very complex conditions with large variations such as pose, illumination, expression, and occlusions. This problem is exacerbated when we rely unrealistically on a single training data source, which is often insufficient to cover the intrinsically complex face variations. This paper proposes a principled multi-task learning approach based on the Discriminative Gaussian Process Latent Variable Model, named GaussianFace, to enrich the diversity of training data. In comparison to existing methods, our model exploits additional data from multiple source-domains to improve the generalization performance of face verification in an unknown target-domain. Importantly, our model can adapt automatically to complex data distributions, and therefore can well capture complex face variations inherent in multiple sources. Extensive experiments demonstrate the effectiveness of the proposed model in learning from diverse data sources and generalizing to an unseen domain. Specifically, our algorithm achieves an impressive accuracy of 98.52% on the well-known and challenging Labeled Faces in the Wild (LFW) benchmark [23]. For the first time, the human-level performance in face verification (97.53%) [28] on LFW is surpassed.
Eigen-PEP for Video Face Recognition
"... Abstract. To effectively solve the problem of large scale video face recognition, we argue for a comprehensive, compact, and yet flexible rep-resentation of a face subject. It shall comprehensively integrate the visual information from all relevant video frames of the subject in a compact form. It s ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
(Show Context)
To effectively solve the problem of large scale video face recognition, we argue for a comprehensive, compact, and yet flexible representation of a face subject. It shall comprehensively integrate the visual information from all relevant video frames of the subject in a compact form. It shall also be flexible to be incrementally updated, incorporating new or retiring obsolete observations. In search for such a representation, we present the Eigen-PEP that is built upon the recent success of the probabilistic elastic part (PEP) model. It first integrates the information from relevant video sources by a part-based average pooling through the PEP model, which produces an intermediate high dimensional, part-based, and pose-invariant representation. We then compress the intermediate representation through principal component analysis, and only a number of principal eigen dimensions are kept (as small as 100). We evaluate the Eigen-PEP representation both for video-based face verification and identification on the YouTube Faces Dataset and a new Celebrity-1000 video face dataset, respectively. On YouTube Faces, we further improve the state-of-the-art recognition accuracy. On Celebrity-1000, we lead the competing baselines by a significant margin while offering a scalable solution that is linear with respect to the number of subjects.
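The two stages named in the abstract, part-based average pooling over frames followed by PCA compression to around 100 dimensions, reduce to a few lines once per-frame descriptors exist. The sketch below assumes the PEP descriptors and the PCA basis are computed elsewhere; function and variable names are hypothetical.

# Hedged sketch of average pooling over per-frame descriptors followed by
# projection onto a precomputed PCA basis. The PEP model is not reproduced.
import numpy as np

def eigen_pep_like(frame_descriptors, mean, components, n_dims=100):
    """frame_descriptors: (n_frames, d) part-based descriptors of one subject.
    mean: (d,), components: (d, d') PCA basis learned offline.
    Returns a compact (n_dims,) representation."""
    pooled = frame_descriptors.mean(axis=0)               # average pooling over frames
    return components[:, :n_dims].T @ (pooled - mean)     # keep leading eigen dimensions

def update_pool(pooled, n_seen, new_descriptor):
    """Incremental update: fold one new frame into the running average,
    matching the 'incrementally updated' property claimed in the abstract."""
    return (pooled * n_seen + new_descriptor) / (n_seen + 1), n_seen + 1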
Unconstrained face recognition: Identifying a person of interest from a media collection. 2014.
"... Abstract—As face recognition applications progress from con-strained sensing and cooperative subjects scenarios (e.g., driver’s license and passport photos) to unconstrained scenarios with uncooperative subjects (e.g., video surveillance), new challenges are encountered. These challenges are due to ..."
Abstract
-
Cited by 11 (2 self)
- Add to MetaCart
(Show Context)
As face recognition applications progress from constrained sensing and cooperative-subject scenarios (e.g., driver's license and passport photos) to unconstrained scenarios with uncooperative subjects (e.g., video surveillance), new challenges are encountered. These challenges are due to variations in ambient illumination, image resolution, background clutter, facial pose, expression, and occlusion. In forensic investigations where the goal is to identify a "person of interest," often based on low quality face images and videos, we need to utilize whatever source of information is available about the person. This could include one or more video tracks, multiple still images captured by bystanders (using, for example, their mobile phones), 3D face models, and verbal descriptions of the subject provided by witnesses. These verbal descriptions can be used to generate a face sketch and provide ancillary information about the person of interest (e.g., gender, race, and age). While traditional face matching methods take a single medium (i.e., a still face image, video track, or face sketch) as input, our work considers using the entire gamut of the media collection as a probe to generate a single candidate list for the person of interest. We show that the proposed approach boosts the likelihood of correctly identifying the person of interest through the use of different fusion schemes, 3D face models, and incorporation of quality measures for fusion and video frame selection.
Index Terms: Unconstrained face recognition, uncooperative subjects, media collection, quality-based fusion, still face image, video track, 3D face model, face sketch, demographics.
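One simple way to combine the media in a probe collection into a single candidate list, in the spirit of the quality-based fusion mentioned above, is quality-weighted score-level fusion. The sketch below is a generic illustration; the normalization and weighting are assumptions, not the paper's specific fusion rules.

# Hedged sketch of quality-weighted score-level fusion across probe media.
import numpy as np

def fuse_media_scores(score_matrix, quality):
    """score_matrix: (n_media, n_gallery) match scores of each probe medium
    against every gallery subject. quality: (n_media,) quality measures.
    Returns gallery indices ranked best-first and the fused scores."""
    s = score_matrix.astype(float)
    # Min-max normalize each medium's scores so they are comparable.
    s = (s - s.min(axis=1, keepdims=True)) / (np.ptp(s, axis=1, keepdims=True) + 1e-12)
    w = quality / quality.sum()          # higher-quality media get larger weight
    fused = w @ s                        # weighted sum per gallery subject
    return np.argsort(-fused), fused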
PCANet: A simple deep learning baseline for image classification? arXiv preprint arXiv:1404.3606, 2014.
"... Abstract — In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal com-ponent analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets.
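The three components named in the abstract can be sketched for a single stage: PCA filters learned from image patches, binary hashing of the filter responses, and blockwise histograms. The patch size, filter count, and block size below are illustrative choices, not the paper's settings, and the multistage cascade is omitted.

# Hedged, single-stage sketch of PCA filters + binary hashing + block histograms.
import numpy as np

def extract_patches(img, k):
    h, w = img.shape
    return np.array([img[i:i + k, j:j + k].ravel()
                     for i in range(h - k + 1) for j in range(w - k + 1)])

def learn_pca_filters(images, k=7, n_filters=8):
    """Leading eigenvectors of the patch covariance become the filter bank."""
    patches = np.vstack([extract_patches(im, k) for im in images])
    patches -= patches.mean(axis=1, keepdims=True)         # remove each patch's mean
    cov = patches.T @ patches / len(patches)
    _, eigvecs = np.linalg.eigh(cov)                        # ascending eigenvalues
    return eigvecs[:, ::-1][:, :n_filters].T.reshape(n_filters, k, k)

def filter_responses(img, filters):
    k = filters.shape[1]
    patches = extract_patches(img, k)
    out_shape = (img.shape[0] - k + 1, img.shape[1] - k + 1)
    return [(patches @ f.ravel()).reshape(out_shape) for f in filters]

def hash_and_histogram(responses, block=8):
    """Binarize each response map, pack the bits into one integer code map,
    then histogram the codes over non-overlapping blocks."""
    n_bins = 2 ** len(responses)
    code = np.zeros(responses[0].shape, dtype=np.int64)
    for b, r in enumerate(responses):
        code += (r > 0).astype(np.int64) << b               # binary hashing
    feats = []
    for i in range(0, code.shape[0] - block + 1, block):
        for j in range(0, code.shape[1] - block + 1, block):
            hist, _ = np.histogram(code[i:i + block, j:j + block],
                                   bins=n_bins, range=(0, n_bins))
            feats.append(hist)
    return np.concatenate(feats)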
Cross-Age Reference Coding for Age-Invariant Face Recognition and Retrieval
"... Abstract. Recently, promising results have been shown on face recog-nition researches. However, face recognition and retrieval across age is still challenging. Unlike prior methods using complex models with strong parametric assumptions to model the aging process, we use a data-driven method to addr ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
(Show Context)
Recently, promising results have been shown in face recognition research. However, face recognition and retrieval across age is still challenging. Unlike prior methods using complex models with strong parametric assumptions to model the aging process, we use a data-driven method to address this problem. We propose a novel coding framework called Cross-Age Reference Coding (CARC). By leveraging a large-scale image dataset freely available on the Internet as a reference set, CARC is able to encode the low-level feature of a face image with an age-invariant reference space. In the testing phase, the proposed method only requires a linear projection to encode the feature and therefore it is highly scalable. To thoroughly evaluate our work, we introduce a new large-scale dataset for face recognition and retrieval across age called the Cross-Age Celebrity Dataset (CACD). The dataset contains more than 160,000 images of 2,000 celebrities with ages ranging from 16 to 62. To the best of our knowledge, it is by far the largest publicly available cross-age face dataset. Experimental results show that the proposed method can achieve state-of-the-art performance on both our dataset and the other widely used dataset for face recognition across age, the MORPH dataset.
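One plausible reading of "encoding the low-level feature with an age-invariant reference space" via a linear projection is to re-express a face by its similarity to each reference identity in each age group and then pool over age. The sketch below follows that reading only; it is not the exact CARC formulation, and pooling by max is an assumption.

# Hedged sketch of reference-based, age-pooled coding of a face feature.
import numpy as np

def reference_code(feat, reference, pool="max"):
    """feat: (d,) low-level face feature.
    reference: (n_ages, n_refs, d) one representation per reference identity
    per age group, assumed precomputed from the reference image set.
    Returns an (n_refs,) age-pooled code (a linear projection per age group)."""
    codes = np.einsum("ard,d->ar", reference, feat)   # similarity to each ref, per age
    return codes.max(axis=0) if pool == "max" else codes.mean(axis=0)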
Extensive facial landmark localization with coarse-to-fine convolutional network cascade. In: Proceedings of the 14th IEEE International Conference on Computer Vision Workshops (ICCVW '13), 2013.
"... We present a new approach to localize extensive facial landmarks with a coarse-to-fine convolutional network cas-cade. Deep convolutional neural networks (DCNN) have been successfully utilized in facial landmark localization for two-fold advantages: 1) geometric constraints among facial points are i ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
(Show Context)
We present a new approach to localize extensive facial landmarks with a coarse-to-fine convolutional network cascade. Deep convolutional neural networks (DCNN) have been successfully utilized in facial landmark localization for two-fold advantages: 1) geometric constraints among facial points are implicitly utilized; 2) a huge amount of training data can be leveraged. However, in the task of extensive facial landmark localization, a large number of facial landmarks (more than 50 points) are required to be located in a unified system, which poses great difficulty in the structure design and training process of traditional convolutional networks. In this paper, we design a four-level convolutional network cascade, which tackles the problem in a coarse-to-fine manner. In our system, each network level is trained to locally refine a subset of facial landmarks generated by previous network levels. In addition, each level predicts explicit geometric constraints (the position and rotation angles of a specific facial component) to rectify the inputs of the current network level. The combination of coarse-to-fine cascade and geometric refinement enables our system to locate extensive facial landmarks (68 points) accurately in the 300-W facial landmark localization challenge.
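The cascade control flow described above, a coarse prediction of all 68 points followed by per-component refinement on rectified crops, can be sketched independently of the networks themselves. The per-level predictors and cropping functions below are hypothetical placeholders for trained models, not the paper's components.

# Hedged sketch of a coarse-to-fine landmark refinement loop.
import numpy as np

def refine_landmarks(face_img, level1_net, refiner_nets):
    """level1_net: face image -> (68, 2) initial landmarks.
    refiner_nets: list of (point_indices, crop_fn, net) triples, one per
    facial component and cascade level."""
    landmarks = level1_net(face_img)                       # coarse estimate, (68, 2)
    for point_idx, crop_fn, net in refiner_nets:
        # crop_fn rectifies the region around the component using the current
        # estimate (position and in-plane rotation) and returns the patch plus
        # a function mapping predicted offsets back to image coordinates.
        patch, to_image_coords = crop_fn(face_img, landmarks[point_idx])
        offsets = net(patch)                               # local refinement
        landmarks[point_idx] = to_image_coords(landmarks[point_idx], offsets)
    return landmarks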
Liao, S., Lei, Z., Yi, D., Li, S.Z.: A benchmark study of large-scale unconstrained face recognition. In: IAPR/IEEE International Joint Conference on Biometrics.
"... Many efforts have been made in recent years to tackle the unconstrained face recognition challenge. For the bench-mark of this challenge, the Labeled Faces in the Wild (LFW) database has been widely used. However, the standard LFW protocol is very limited, with only 3,000 genuine and 3,000 impostor ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
Many efforts have been made in recent years to tackle the unconstrained face recognition challenge. For the benchmark of this challenge, the Labeled Faces in the Wild (LFW) database has been widely used. However, the standard LFW protocol is very limited, with only 3,000 genuine and 3,000 impostor matches for classification. Today a 97% accuracy can be achieved with this benchmark, leaving very limited room for algorithm development. However, we argue that this accuracy may be too optimistic because the underlying false accept rate may still be high (e.g. 3%). Furthermore, performance evaluation at low FARs is not statistically sound under the standard protocol due to the limited number of impostor matches. We therefore develop a new benchmark protocol to fully exploit all 13,233 LFW face images for large-scale unconstrained face recognition evaluation under both verification and open-set identification scenarios, with a focus on low FARs. Based on the new benchmark, we evaluate 21 face recognition approaches by combining 3 kinds of features and 7 learning algorithms. The benchmark results show that the best algorithm achieves a 41.66% verification rate at FAR=0.1%, and an 18.07% open-set identification rate at rank 1 and FAR=1%. Accordingly we conclude that the large-scale unconstrained face recognition problem is still largely unresolved, and further attention and effort are needed in developing effective feature representations and learning algorithms. We thereby release a benchmark tool to advance research in this field.
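Operating points such as "verification rate at FAR = 0.1%" are obtained by thresholding the impostor score distribution and reading off the genuine acceptance rate; with only 3,000 impostor pairs that threshold rests on just a handful of scores, which is the statistical weakness the abstract points out. The sketch below shows the standard computation; it is not the benchmark tool released by the authors.

# Hedged sketch: verification rate at a fixed false accept rate (FAR).
import numpy as np

def verification_rate_at_far(genuine_scores, impostor_scores, far=1e-3):
    """Pick the threshold so that the given fraction of impostor scores is
    accepted, then report the fraction of genuine scores accepted at it."""
    impostor = np.sort(np.asarray(impostor_scores))
    threshold = impostor[int(np.ceil((1.0 - far) * len(impostor))) - 1]
    vr = float(np.mean(np.asarray(genuine_scores) > threshold))
    return vr, threshold

For example, with 3,000 impostor scores and far=0.001 the threshold is set so that only 3 impostor scores exceed it, which illustrates how few samples support the estimate under the standard LFW protocol.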