Results 1 - 10
of
36
Weighted finite-state transducers in speech recognition
- COMPUTER SPEECH & LANGUAGE
, 2002
"... We survey the use of weighted finite-state transducers (WFSTs) in speech recognition. We show that WFSTs provide a common and natural representation for hidden Markov models (HMMs), context-dependency, pronunciation dictionaries, grammars, and alternative recognition outputs. Furthermore, general tr ..."
Abstract
-
Cited by 211 (5 self)
- Add to MetaCart
(Show Context)
We survey the use of weighted finite-state transducers (WFSTs) in speech recognition. We show that WFSTs provide a common and natural representation for hidden Markov models (HMMs), context-dependency, pronunciation dictionaries, grammars, and alternative recognition outputs. Furthermore, general transducer operations combine these representations flexibly and efficiently. Weighted determinization and minimization algorithms optimize their time and space requirements, and a weight pushing algorithm distributes the weights along the paths of a weighted transducer optimally for speech recognition. As an example, we describe a North American Business News (NAB) recognition system built using these techniques that combines the HMMs, full cross-word triphones, a lexicon of 40 000 words, and a large trigram grammar into a single weighted transducer that is only somewhat larger than the trigram word grammar and that runs NAB in real-time on a very simple decoder. In another example, we show that the same techniques can be used to optimize lattices for second-pass recognition. In a third example, we show how general automata operations can be used to assemble lattices from different recognizers to improve recognition performance.
Sound and Precise Analysis of Web Applications for Injection Vulnerabilities
- PLDI'07
, 2007
"... Web applications are popular targets of security attacks. One common type of such attacks is SQL injection, where an attacker exploits faulty application code to execute maliciously crafted database queries. Both static and dynamic approaches have been proposed to detect or prevent SQL injections; w ..."
Abstract
-
Cited by 161 (5 self)
- Add to MetaCart
Web applications are popular targets of security attacks. One common type of such attacks is SQL injection, where an attacker exploits faulty application code to execute maliciously crafted database queries. Both static and dynamic approaches have been proposed to detect or prevent SQL injections; while dynamic approaches provide protection for deployed software, static approaches can detect potential vulnerabilities before software deployment. Previous static approaches are mostly based on tainted information flow tracking and have at least some of the following limitations: (1) they do not model the precise semantics of input sanitization routines; (2) they require manually written specifications, either for each query or for bug patterns; or (3) they are not fully automated and may require user intervention at various points in the analysis. In this paper, we address these limitations by proposing a precise, sound, and fully automated analysis technique for SQL injection. Our technique avoids the need for specifications by considering as attacks those queries for which user input changes the intended syntactic structure of the generated query. It checks conformance to this policy by conservatively characterizing the values a string variable may assume with a context free grammar, tracking the nonterminals that represent user-modifiable data, and modeling string operations precisely as language transducers. We have implemented the proposed technique for PHP, the most widely-used web scripting language. Our tool successfully discovered previously unknown and sometimes subtle vulnerabilities in real-world programs, has a low false positive rate, and scales to large programs (with approx. 100K loc).
Static Approximation of Dynamically Generated Web Pages
, 2005
"... Server-side programming is one of the key technologies that support today's WWW environment. It makes it possible to generate Web pages dynamically according to a user's request and to customize pages for each user. However, the flexibility obtained by server-side programming makes it much ..."
Abstract
-
Cited by 122 (3 self)
- Add to MetaCart
Server-side programming is one of the key technologies that support today's WWW environment. It makes it possible to generate Web pages dynamically according to a user's request and to customize pages for each user. However, the flexibility obtained by server-side programming makes it much harder to guarantee validity and security of dynamically generated pages.
Static Detection of Cross-Site Scripting Vulnerabilities
- ICSE'08
, 2008
"... ..."
(Show Context)
Parameter Estimation for Probabilistic Finite-State Transducers
- Proc. of the Annual Meeting of the Association for Computational Linguistics
, 2002
"... Weighted finite-state transducers suffer from the lack of a training algorithm. Training is even harder for transducers that have been assembled via finite-state operations such as composition, minimization, union, concatenation, and closure, as this yields tricky parameter tying. We formulate a &qu ..."
Abstract
-
Cited by 58 (4 self)
- Add to MetaCart
Weighted finite-state transducers suffer from the lack of a training algorithm. Training is even harder for transducers that have been assembled via finite-state operations such as composition, minimization, union, concatenation, and closure, as this yields tricky parameter tying. We formulate a "parameterized FST" paradigm and give training algorithms for it, including a general bookkeeping trick ("expectation semirings") that cleanly and efficiently computes expectations and gradients.
Probabilistic Finite-State Machines - Part I
"... Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translatio ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translation are some of them. In part I of this paper we survey these generative objects and study their definitions and properties. In part II, we will study the relation of probabilistic finite-state automata with other well known devices that generate strings as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent a current state of the art of these objects.
String analysis for x86 binaries
- in Proc. 6th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE
, 2005
"... 1 ..."
(Show Context)
A practical string analyzer by the widening approach
- In APLAS
, 2006
"... Abstract. The static determination of approximated values of string expressions has many potential applications. For instance, approximated string values may be used to check the validity and security of generated strings, as well as to collect the useful string properties. Previous string analysis ..."
Abstract
-
Cited by 19 (4 self)
- Add to MetaCart
(Show Context)
Abstract. The static determination of approximated values of string expressions has many potential applications. For instance, approximated string values may be used to check the validity and security of generated strings, as well as to collect the useful string properties. Previous string analysis efforts have been focused primarily on the maxmization of the precision of regular approximations of strings. These methods have not been completely satisfactory due to the difficulties in dealing with heap variables and context sensitivity. In this paper, we present an abstract-interpretation-based solution that employs a heuristic widening method. The presented solution is implemented and compared to JSA. In most cases, our solution gives results as precise as those produced by previ-ous methods, and it makes the additional contribution of easily dealing with heap variables and context sensitivity in a very natural way. We anticipate the employment of our method in practical applications. 1
Speech Recognition Grammar Compilation in Grammatical Framework
- In Proceedings of the Workshop on Grammar-Based Approaches to Spoken Language Processing
, 2007
"... This paper describes how grammar-based language models for speech recognition systems can be generated from Grammatical Framework (GF) grammars. Context-free grammars and finite-state models can be generated in several formats: GSL, SRGS, JSGF, and HTK SLF. In addition, semantic interpretation code ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
This paper describes how grammar-based language models for speech recognition systems can be generated from Grammatical Framework (GF) grammars. Context-free grammars and finite-state models can be generated in several formats: GSL, SRGS, JSGF, and HTK SLF. In addition, semantic interpretation code can be embedded in the generated context-free grammars. This enables rapid development of portable, multilingual and easily modifiable speech recognition applications. 1
Weighted Grammar Tools: The GRM Library
, 2000
"... We describe the algorithmic and software design principles of a general grammar library designed for use in spoken-dialogue systems, speech synthesis, and other speech processing applications. The library is a set of general-purpose software tools for constructing and modifying weighted finite-state ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
We describe the algorithmic and software design principles of a general grammar library designed for use in spoken-dialogue systems, speech synthesis, and other speech processing applications. The library is a set of general-purpose software tools for constructing and modifying weighted finite-state acceptors and transducers representing grammars. The tools can be used in particular to compile weighted contextdependent rewrite rules into weighted finite-state transducers, read and compile, when possible, weighted context-free grammars into weighted automata, and dynamically modify the compiled grammar automata. The dynamic modifications allowed include: grammar switching, dynamic modification of rules, dynamic activation or non-activation of rules, and the use of dynamic lists. Access to these features is essential in spoken-dialogue applications. 2.1 Motivation We describe the algorithmic and software design principles of a general grammar library (GRM library) designed for use in ...