Results 1 - 3 of 3
Learning Visual Clothing Style with Heterogeneous Dyadic Co-occurrences
Abstract
With the rapid proliferation of smart mobile devices, users now take millions of photos every day. These include large numbers of clothing and accessory images. We would like to answer questions like “What outfit goes well with this pair of shoes?” To answer these types of questions, one has to go beyond learning visual similarity and learn a visual notion of compatibility across categories. In this paper, we propose a novel learning framework to help answer these types of questions. The main idea of this framework is to learn a feature transformation from images of items into a latent space that expresses compatibility. For the feature transformation, we use a Siamese Convolutional Neural Network (CNN) architecture, where training examples are pairs of items that are either compatible or incompatible. We model compatibility based on co-occurrence in large-scale user behavior data; in particular co-purchase data from Amazon.com. To learn cross-category fit, we introduce a strategic method to sample training data, where pairs of items are heterogeneous dyads, i.e., the two elements of a pair belong to different high-level categories. While this approach is applicable to a wide variety of settings, we focus on the representative problem of learning compatible clothing style. Our results indicate that the proposed framework is capable of learning semantic information about visual style and is able to generate outfits of clothes, with items from different categories, that go well together.
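The Siamese training scheme the abstract describes — pulling compatible (co-purchased) pairs together in the latent space and pushing incompatible pairs apart — is commonly realized with a contrastive loss. A minimal numpy sketch of that loss (an illustration of the general technique, not the paper's exact objective; `margin` and the sampling of incompatible pairs are assumptions):

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, compatible, margin=1.0):
    """Contrastive loss over a batch of embedding pairs.

    compatible: 1 for compatible (e.g. co-purchased) pairs, 0 for sampled
    incompatible pairs. Compatible pairs are pulled together; incompatible
    pairs are pushed at least `margin` apart in the latent style space.
    """
    d = np.linalg.norm(emb_a - emb_b, axis=1)                 # pairwise Euclidean distance
    pos = compatible * d ** 2                                 # pull compatible pairs together
    neg = (1 - compatible) * np.maximum(0.0, margin - d) ** 2 # push incompatible pairs apart
    return float(np.mean(pos + neg))

# Example: an identical pair costs nothing if compatible,
# but the full margin penalty if labeled incompatible.
a = np.array([[0.0, 0.0]])
b = np.array([[0.0, 0.0]])
print(contrastive_loss(a, b, np.array([1])))  # → 0.0
print(contrastive_loss(a, b, np.array([0])))  # → 1.0
```

In the paper's setting, `emb_a` and `emb_b` would be the outputs of the two (weight-sharing) CNN branches on a heterogeneous dyad of item images.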
Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation
Abstract
Deep convolutional neural networks (DCNNs) trained on a large number of images with strong pixel-level annotations have recently significantly pushed the state-of-art in semantic image segmentation. We study the more challenging problem of learning DCNNs for semantic image segmentation from either (1) weakly annotated training data such as bounding boxes or image-level labels or (2) a combination of few strongly labeled and many weakly labeled images, sourced from one or multiple datasets. We develop methods for semantic image segmentation model training under these weakly supervised and semi-supervised settings. Extensive experimental evaluation shows that the proposed techniques can learn models delivering state-of-art results on the challenging PASCAL VOC 2012 image segmentation benchmark, while requiring significantly less annotation effort.
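A core step in training from the weak annotations the abstract mentions is turning them into pixel-level (pseudo-)labels. A crude numpy sketch of the simplest case — treating a bounding box as a coarse segmentation mask (a stand-in illustration, not the paper's estimation method; function and parameter names are assumptions):

```python
import numpy as np

def box_to_mask(h, w, box, label):
    """Turn a bounding-box annotation into a coarse pixel-label map:
    pixels inside the box get the object label, the rest background (0).
    Real weakly-supervised training would refine this initial estimate."""
    y0, x0, y1, x1 = box
    mask = np.zeros((h, w), dtype=int)
    mask[y0:y1, x0:x1] = label
    return mask

# A 4x4 image with a 2x2 box labeled as class 5.
mask = box_to_mask(4, 4, (1, 1, 3, 3), 5)
print(mask)
```

Such coarse maps can then serve as targets for the segmentation network, which in turn produces sharper estimates for the next training round.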
Under review as a conference paper at ICLR 2015
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Abstract
Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called “semantic image segmentation”). We show that responses at the final layer of DCNNs are not sufficiently localized for accurate object segmentation. This is due to the very invariance properties that make DCNNs good for high level tasks. We overcome this poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our “DeepLab” system is able to localize segment boundaries at a level of accuracy which is beyond previous methods. Quantitatively, our method sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 67.1% IOU accuracy in the test set. We show how these results can be obtained efficiently: Careful network re-purposing and a novel application of the ‘hole’ algorithm from the wavelet community allow dense computation of neural net responses at 8 frames per second on a modern GPU.
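The ‘hole’ (atrous) algorithm the abstract credits for efficient dense computation spaces the filter taps apart, enlarging the receptive field without adding parameters or downsampling. A simplified 1-D numpy sketch of the idea (an illustration, not the paper's 2-D implementation; the `rate` parameter plays the role of the hole spacing):

```python
import numpy as np

def atrous_conv1d(signal, kernel, rate):
    """1-D cross-correlation 'with holes': consecutive kernel taps are
    spaced `rate` samples apart, so a k-tap kernel covers a span of
    (k - 1) * rate + 1 input samples with the same k weights."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # effective receptive field
    out = np.zeros(len(signal) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * signal[i + j * rate] for j in range(k))
    return out

# A 2-tap averaging-style kernel at rate 2 sums samples two apart.
print(atrous_conv1d(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
                    np.array([1.0, 1.0]), rate=2))  # → [4. 6. 8.]
```

With `rate=1` this reduces to ordinary (dense) filtering; larger rates let the network compute responses densely on higher-resolution feature maps, which is the efficiency trick behind the 8-frames-per-second figure.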