
CiteSeerX

Results 1 - 10 of 48

Joint incremental disfluency detection and dependency parsing

by Matthew Honnibal, Mark Johnson - Transactions of the Association for Computational Linguistics (TACL), 2014
"... We present an incremental dependency parsing model that jointly performs disfluency detection. The model handles speech repairs using a novel non-monotonic transition system, and includes several novel classes of features. For comparison, we evaluated two pipeline systems, using state-of-the-art ..."
Cited by 6 (0 self)

Non-Monotonic Parsing of Fluent umm I Mean Disfluent Sentences

by Mohammad Sadegh Rasooli, Joel Tetreault
"... Parsing disfluent sentences is a challenging task which involves detecting disfluencies as well as identifying the syntactic structure of the sentence. While there have been several studies recently into solely detecting disfluencies at a high performance level, there has been relatively little work into joint parsing and disfluency detection that has reached that state-of-the-art performance in disfluency detection. We improve upon recent work in this joint task through the use of novel features and learning cascades to produce a model which performs at 82.6 F-score. It outper ..."
Cited by 2 (0 self)

Object detection using strongly supervised deformable part models

by Hossein Azizpour, Ivan Laptev - In ECCV, 2012
"... Abstract. Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of non-convex cost function. This paper investigates limitations of such an initialization and extends earlier method ..."
Cited by 42 (3 self)

and Literature

by Julian Hough, Matthew Purver
"... We present STIR (STrongly Incremental Repair detection), a system that detects speech repairs and edit terms on transcripts incrementally with minimal latency. STIR uses information-theoretic measures from n-gram models as its principal decision features in a pipeline of classifiers detecting the different stages of repairs. Results on the Switchboard disfluency tagged corpus show utterance-final accuracy on a par with state-of-the-art incremental repair detection methods, but with better incremental accuracy, faster time-to-detection and less computational overhead. We evaluate its performance using ..."

Information

by Desmond Elliott, Arjen P. De Vries, Centrum Wiskunde
"... The Visual Dependency Representation (VDR) is an explicit model of the spatial relationships between objects in an image. In this paper we present an approach to training a VDR Parsing Model without the extensive human supervision used in previous work. Our approach is to find the objects mentioned in a given description using a state-of-the-art object detector, and to use successful detections to produce training data. The description of an unseen image is produced by first predicting its VDR over automatically detected objects, and then generating the text with a template ..."

A lattice-based approach to automatic filled pause insertion

by Marcus Tomalin, Mirjam Wester, Rasmus Dall, Bill Byrne, Simon King - DiSS The 7th Workshop on Disfluency in Spontaneous Speech, 2015
"... This paper describes a novel method for automatically inserting filled pauses (e.g., UM) into fluent texts. Although filled pauses are known to serve a wide range of psychological and structural functions in conversational speech, they have not traditionally been modelled overtly by state-of-the-art ..."
Cited by 1 (1 self)

Multichannel voice detection in adverse environments

by J. Rosca, R. Balan, N. P. Fan, C. Beaugeant, V. Gilg - Proc. of EUSIPCO, 2002
"... Detecting when voice is or is not present is an outstanding problem for speech transmission, enhancement and recognition. Here we present a novel multichannel source activity detector that exploits the spatial localization of the target audio source. The detector uses an array signal processing te ..."
"... improvements in error rates of 55-70% compared to the state-of-the-art adaptive multi-rate algorithm AMR2 used in present voice transmission technology."
Cited by 4 (2 self)

LOST IN SEGMENTATION: THREE APPROACHES FOR SPEECH/NON-SPEECH DETECTION IN CONSUMER-PRODUCED VIDEOS

by Benjamin Elizalde, Gerald Friedl
"... Traditional speech/non-speech segmentation systems have been designed for specific acoustic conditions, such as broadcast news or meetings. However, little research has been done on consumer-produced audio. This type of media is constantly growing and has complex characteristics such as low quality recordings, environmental noise and overlapping sounds. This paper discusses an evaluation of three different approaches for speech/non-speech detection on consumer-produced audio. The approaches are state-of-the-art speech/non-speech detectors–one based on Gaussian Mixture Models (GMM), another ..."

Age regression from soft aligned face images using low computational resources

by Juan Bekios-Calfa, José M. Buenaposada, Luis Baumela
"... Abstract. The initial step in most facial age estimation systems consists of accurately aligning a model to the output of a face detector (e.g. an Active Appearance Model). This fitting process is very expensive in terms of computational resources and prone to get stuck in local minima. This makes it impractical for analysing faces in resource limited computing devices. In this paper we build a face age regressor that is able to work directly on faces cropped using a state-of-the-art face detector. Our procedure uses K nearest neighbours (K-NN) regression with a metric based on a properly tuned Fisher ..."

PixelTrack: a fast adaptive algorithm for tracking non-rigid objects

by Stefan Duffner, Christophe Garcia - In ICCV, 2013
"... In this paper, we present a novel algorithm for fast tracking of generic objects in videos. The algorithm uses two components: a detector that makes use of the generalised Hough transform with pixel-based descriptors, and a probabilistic segmentation method based on global models for foreground an ..."
"... The proposed tracking method has been thoroughly evaluated on challenging standard videos, and outperforms state-of-the-art tracking methods designed for the same task. Finally, the proposed models allow for an extremely efficient implementation, and thus tracking is very fast."
Cited by 8 (0 self)
