Results 1 - 10 of 48
Joint incremental disfluency detection and dependency parsing
- Transactions of the Association for Computational Linguistics (TACL), 2014
"... We present an incremental dependency parsing model that jointly performs disfluency detection. The model handles speech repairs using a novel non-monotonic transition system, and includes several novel classes of features. For comparison, we evaluated two pipeline systems, using state-of-the-art ..."
Cited by 6 (0 self)
We present an incremental dependency parsing model that jointly performs disfluency detection. The model handles speech repairs using a novel non-monotonic transition system, and includes several novel classes of features. For comparison, we evaluated two pipeline systems, using state-of-the-art
Non-Monotonic Parsing of Fluent umm I Mean Disfluent Sentences
"... Parsing disfluent sentences is a challenging task which involves detecting disfluencies as well as identifying the syntactic structure of the sentence. While there have been several studies recently into solely detecting disfluencies at a high performance level, there has been relatively little ..."
Cited by 2 (0 self)
little work into joint parsing and disfluency detection that has reached that state-of-the-art performance in disfluency detection. We improve upon recent work in this joint task through the use of novel features and learning cascades to produce a model which performs at 82.6 F-score. It outper
Object detection using strongly-supervised deformable part models
- In ECCV, 2012
"... Abstract. Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of non-convex cost function. This paper investigates limitations of such an initialization and extends earlier method ..."
Cited by 42 (3 self)
Abstract. Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of non-convex cost function. This paper investigates limitations of such an initialization and extends earlier
and Literature
"... We present STIR (STrongly Incremental Repair detection), a system that detects speech repairs and edit terms on transcripts incrementally with minimal latency. STIR uses information-theoretic measures from n-gram models as its principal decision features in a pipeline of classifiers detecting th ..."
the different stages of repairs. Results on the Switchboard disfluency tagged corpus show utterance-final accuracy on a par with state-of-the-art incremental repair detection methods, but with better incremental accuracy, faster time-to-detection and less computational overhead. We evaluate its performance using
Information
"... The Visual Dependency Representation (VDR) is an explicit model of the spatial relationships between objects in an image. In this paper we present an approach to training a VDR Parsing Model without the extensive human supervision used in previous work. Our approach is to find the objects mentione ..."
mentioned in a given description using a state-of-the-art object detector, and to use successful detections to produce training data. The description of an unseen image is produced by first predicting its VDR over automatically detected objects, and then generating the text with a template
A lattice-based approach to automatic filled pause insertion
- DiSS, The 7th Workshop on Disfluency in Spontaneous Speech, 2015
"... This paper describes a novel method for automatically inserting filled pauses (e.g., UM) into fluent texts. Although filled pauses are known to serve a wide range of psychological and structural functions in conversational speech, they have not traditionally been modelled overtly by state-of-the-a ..."
Cited by 1 (1 self)
This paper describes a novel method for automatically inserting filled pauses (e.g., UM) into fluent texts. Although filled pauses are known to serve a wide range of psychological and structural functions in conversational speech, they have not traditionally been modelled overtly by state-of-the-art
Multichannel voice detection in adverse environments
- Proc. of EUSIPCO, 2002
"... Detecting when voice is or is not present is an outstanding problem for speech transmission, enhancement and recognition. Here we present a novel multichannel source activity detector that exploits the spatial localization of the target audio source. The detector uses an array signal processing te ..."
Cited by 4 (2 self)
improvements in error rates of 55-70% compared to the state-of-the-art adaptive multi-rate algorithm AMR2 used in present voice transmission technology.
LOST IN SEGMENTATION: THREE APPROACHES FOR SPEECH/NON-SPEECH DETECTION IN CONSUMER-PRODUCED VIDEOS
"... Traditional speech/non-speech segmentation systems have been designed for specific acoustic conditions, such as broadcast news or meetings. However, little research has been done on consumer-produced audio. This type of media is constantly growing and has complex characteristics such as low quality ..."
quality recordings, environmental noise and overlapping sounds. This paper discusses an evaluation of three different approaches for speech/non-speech detection on consumer-produced audio. The approaches are state-of-the-art speech/non-speech detectors–one based on Gaussian Mixture Models (GMM), another
Age regression from soft aligned face images using low computational resources
"... Abstract. The initial step in most facial age estimation systems consists of accurately aligning a model to the output of a face detector (e.g. an Active Appearance Model). This fitting process is very expensive in terms of computational resources and prone to get stuck in local minima. This makes ..."
it impractical for analysing faces in resource-limited computing devices. In this paper we build a face age regressor that is able to work directly on faces cropped using a state-of-the-art face detector. Our procedure uses K nearest neighbours (K-NN) regression with a metric based on a properly tuned Fisher
PixelTrack: a fast adaptive algorithm for tracking non-rigid objects
- In ICCV, 2013
"... In this paper, we present a novel algorithm for fast tracking of generic objects in videos. The algorithm uses two components: a detector that makes use of the generalised Hough transform with pixel-based descriptors, and a probabilistic segmentation method based on global models for foreground an ..."
Cited by 8 (0 self)
The proposed tracking method has been thoroughly evaluated on challenging standard videos, and outperforms state-of-the-art tracking methods designed for the same task. Finally, the proposed models allow for an extremely efficient implementation, and thus tracking is very fast.