T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22:442--450, 1973.
....and unit production matrices #L and #U, respectively, are required to provide the corrections in the prediction and completion stages. Booth and Thompson provide a rigorous proof that guarantees the existence of these matrices if the grammar is well behaved according to axioms provided by [27]. For further clarity, Stolcke also provides a simple example in Table D.3. 6.4.4 Viterbi Parse Motivated by the use of Viterbi parsing in HMMs, we can also apply a generalization of the Viterbi method for parsing a string # to retrieve the most likely probability among For more ....
T.L. Booth and R.A. Thompson, "Applying probability measures to abstract languages," IEEE Transactions on Computers, Vol. 22, pp. 442-450, 1973.
....necessarily generate a stochastic language. This is illustrated in the following example. Example 1.1 Consider the stochastic grammar $G$ with nonterminal set $V_N = \{S\}$ and terminal set $V_T = \{a\}$. The productions with their probabilities are given by: $S \to S\,S$ with probability $q$ and $S \to a$ with probability $1 - q$. Following the technique presented in [2] we find that the production generating function is given by $g(s) = q s^2 + 1 - q$, and that the first moment matrix $E$ is given by $[2q]$. We can conclude that the grammar is consistent if and only if $q \le 1/2$. For details we refer to [5]. Notice that all the different trees of string a have the ....
....defined analogous to the distribution language and stochastic language of an unrestricted 3 Consistency In this section consistency of weakly restricted stochastic grammars will be considered. The theory of multitype branching processes will be used to come to a similar theorem as is given in [2] for unrestricted stochastic grammars. Definition 3.1 For the $j$-th occurrence of nonterminal $A_i \in V_N$ the production generating function for weakly restricted stochastic grammars is defined where r(k) is 1 if nonterminal occurrence A appears in the right-hand side of the $k$-th production rule with ....
T.L. Booth, R.A. Thompson. Applying Probability Measures to Abstract Languages. In: IEEE Transactions on Computers Vol. C-22, No. 5, May 1973.
....Valencia Camino de Vera, s n. 46071 Valencia (Spain) e-mail: jandreu,jbenedi dsic.upv.es Abstract An important problem related to the probabilistic estimation of Stochastic Context-Free Grammars (SCFGs) is guaranteeing the consistency of the estimated model. This problem was considered in [3, 14] and studied in [10, 4] for unambiguous SCFGs only, when the probabilistic distributions were estimated by the relative frequencies in a training sample. In this work, we extend this result by proving that the property of consistency is guaranteed for all SCFGs without restrictions, when the ....
....A fundamental question which is related to these PE algorithms is to guarantee whether or not the learned SCFG generates a probabilistic language, that is to say if this SCFG is consistent. Moreover, the consistency of a SCFG determines the validity of various interesting probabilistic properties [3, 14]. For unambiguous SCFGs, it was proven in [10, 4] that when the probabilistic distributions are estimated by the relative frequencies in the sample, the obtained SCFG is consistent. In this work, this result is generalized by proving that the property of consistency is satisfied for SCFGs without ....
T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22(5):442--450, May 1973.
....study, two languages were chosen: the palindrome language with three terminal symbols (PAL3) and the arithmetic expression language with 5 terminal symbols (EXP). A SCFG was created for each language and was used only for generating a training sample. These grammars were consistent according to [3]. Each training sample had 5000 strings, but only 630 of them were different for PAL3, and 896 were different for EXP. For the training process, an initial characteristic grammar was created for each language. The number of nonterminal symbols (n) was chosen heuristically as is described in ....
....EXP). Each grammar had the maximum number of rules that could be created with the chosen number of nonterminal symbols and the given number of terminal symbols (v), that is, n n Delta v . The probabilities of the rules were attached randomly, but guaranteeing that the grammar was proper [3]. In order to avoid the problem of a bad initialization, ten different initializations were used for each task and for each algorithm. With this initial grammar, a reestimation process was carried out with the IO algorithm on the one hand and with the Viterbi algorithm on the other hand. At each ....
T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22(5):442--450, May 1973.
.... number of times rule $r$ is seen in a tree $T$, then the probability of a tree $T$ can be written as $P(T \mid Q) = \prod_{r \in R} p(r)^{c(T, r)}$ or equivalently $\log P(T \mid Q) = \sum_{r} c(T, r) \log p(r) = \phi(T) \cdot Q$, where we define $\phi(T)$ to be an $n$-dimensional vector whose $i$-th component is $c(T, r_i)$. [Booth and Thompson 1973] give conditions on the weights which ensure that $P(T \mid Q)$ is a valid probability distribution over the set $\mathcal{T}$, in other words that $\sum_{T \in \mathcal{T}} P(T \mid Q) = 1$, and $\forall T \in \mathcal{T}$, $P(T \mid Q) \ge 0$. The main condition is that the parameters define conditional distributions over the alternative ways of rewriting each ....
....trees $\{T_1, T_2, \ldots, T_m\}$. The log likelihood of the training set given parameters $Q$ is $L(Q) = \sum_j \log P(T_j \mid Q)$. The maximum likelihood estimates are to take $Q = \arg\max_{Q \in W} L(Q)$, where $W$ is the set of allowable parameter settings (i.e. the parameter settings which obey the constraints in [Booth and Thompson 1973]). It can be proved using constrained optimization techniques (i.e. using Lagrange multipliers) that the maximum-likelihood estimate for the weight of a rule $r = \alpha \to \beta$ is $p(\alpha \to \beta) = \frac{\sum_j c(T_j, \alpha \to \beta)}{\sum_j c(T_j, \alpha)}$; here we overload the notation $c$ so that $c(\alpha)$ is the number of times ....
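The relative-frequency estimate described in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree encoding (nested tuples whose first element is the node label and whose string children are terminals), the helper names `rules` and `relative_frequency_estimate`, and the toy treebank are all hypothetical and do not come from the cited paper.

```python
from collections import Counter
from fractions import Fraction


def rules(tree):
    """Yield every rule (lhs, rhs) used in a tree.

    Assumed encoding: a tree is a tuple (label, child, child, ...),
    where a child is either another tuple or a terminal string.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def relative_frequency_estimate(treebank):
    """Estimate p(lhs -> rhs) as count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {r: Fraction(n, lhs_counts[r[0]]) for r, n in rule_counts.items()}


# Toy treebank (hypothetical): one tree using S -> S S, one using only S -> a.
treebank = [
    ("S", ("S", "a"), ("S", "a")),
    ("S", "a"),
]
probs = relative_frequency_estimate(treebank)
for (lhs, rhs), p in sorted(probs.items()):
    print(lhs, "->", " ".join(rhs), p)
```

By construction the estimated weights of all rules sharing a left-hand side sum to 1, which is the "proper" condition mentioned in the excerpts; exact fractions are used so that this can be checked without rounding.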
Booth, T. L., and Thompson, R. A. 1973. Applying Probability Measures to Abstract Languages. IEEE Transactions on Computers, C-22(5), 442--450.
....to be fruitful, e.g. for the problem of ambiguity resolution. The simple but useful approximation adopted here is to assume the most plausible analysis of a string to be the most probable analysis of that string. An attempt to transfer the techniques of probabilistic context free grammars (see [3]) to CLGs was presented in [7]. In this approach the derivation process of CLGs is defined as a stochastic process by the following stochastic model: Each program clause gets assigned an application probability and the probabilities of all clauses defining one predicate have to sum to 1. The ....
....should easily be given a formal basis in terms of our quantitative CLP scheme. 5 This calculation scheme also could easily be captured by our quantitative CLP scheme by replacing min by a product accordingly in the relevant definitions of the declarative and procedural semantics of our scheme. 6 [3] discuss further conditions on consistency of probabilistic grammars which would have to be satisfied also by a probabilistic CLG model. be incorrect, in the sense that it makes an independency assumption for clause applications which is violated by the languages generated from such probabilistic ....
Taylor L. Booth and Richard A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22(5):442--450, 1973.
....frequency information with the components making up a grammar formalism. For example, just two of the options in the case of CFG are: (1) associating a single probability with each production that determines the probability of its use wherever it is applicable (i.e. Stochastic CFG; SCFG (Booth and Thompson, 1973)) or (2) associating different probabilities with a production depending on the particular nonterminal occurrence (on the RHS of a production) that is being rewritten (Chitrao and Grishman, 1990). In the latter case probabilities depend on the context (within a production) of the nonterminal ....
Booth, T. and Thompson, R. (1973). Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22(5):442--450.
.... [of his son] man] a man] proud] of his son] a [so tall] man] so tall] a man] a [six feet] tall man] six feet] tall] a six foot tall man] was [every three weeks] fixing] his bike [was frequently fixing] his bike ffl More precisely, F C selection must be in same chunk 131 General [2, 3, 4, 35, 36, 50, 61, 62, 81, 82, 84, 116, 117, 118, 129, 143, 144, 148, 200] Tagging [10, 19, 28, 56, 57, 66, 90, 91, 124, 125, 126, 131, 138, 153, 163, 168, 188] HMMs [21, 22, 23, 24, 25, 49, 64, 67, 78, 115, 119, 155, 157, 160, 161] Search [156] The Inside Outside Algorithm [85, 86, 136, 137] Regression [20, 30, 29, 38, 41, 42, 45, 46, 154, 162] Partial Parsing [6, 7, ....
T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Trans. Comput., C-22:442--450, 1973.
....the general case, infinite trees can be included in the sample space: infinite labeled trees are labeled trees with an infinite node set X. This requires an extension in the definition of the measure (not all subsets of the sample space are measurable) but does not affect the probabilities of finite trees. Booth and Thompson (1973) analyze the conditions under which a probability measure over finite trees is defined. yields of the ordered daughters of the node. A sentence is a finite sequence of words, i.e. an element of W . We have already defined the event monomial e( of a tree licensed by a ....
Booth, T. L. and R. A. Thompson (1973). Applying probability measures to abstract languages. IEEE Transactions on Computers C-22 (5), 442--450.
.... left corner and right corner probabilities, $P(X \Rightarrow_L w_1)$ and $P(Y \Rightarrow_R w_2)$, which can each be obtained from a single matrix inversion (Jelinek & Lafferty 1991). It should be mentioned that there are some technical conditions that have to be met for a SCFG to be well defined and consistent (Booth & Thompson 1973). These conditions are also sufficient to guarantee that the linear equations given by (3) have positive probabilities as solutions. The details of this are discussed in the Appendix. Finally, it is interesting to compare the relative ease with which one can solve the substring expectation problem ....
Booth, Taylor L., & Richard A. Thompson. 1973. Applying probability measures to abstract languages. IEEE Transactions on Computers C-22.442--450.
....language $(L, \phi)$ that cannot be generated by an unrestricted stochastic grammar $(G_c, D)$ such that $(L, \phi) = (L(G_c), p_u)$ ($p_u$ designates here the probability function as defined earlier). We will not give a full proof of this theorem. We only mention the counterexample that is used in [2] to prove it: let $L = \{a^i b^i \mid i \in \{0, 1, \ldots\}\}$. We know that $L$ can be generated by a context free grammar. The following probability function is associated with $L$: $\phi(a^i b^i) = e^{-a} a^i / i!$, where $a$ is any (nonzero) real number. The theorem can now be proved by proving ....
....E will satisfy this condition if the magnitude of all the characteristic roots of E are less than one: the grammar is consistent. Similarly if one or more of these roots has a magnitude greater than 1 the limit diverges: the grammar is inconsistent. 2 The proof of the theorem is taken from [2]. Note that the theorem does not decide consistency if the largest eigenvalue of the first moment matrix is equal to 1. We will discuss that special case later on. An unrestricted stochastic grammar is called strongly consistent if all the eigenvalues of the E matrix have magnitude less than 1. ....
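The eigenvalue test quoted above can be tried out numerically. The following Python sketch is illustrative only: the grammar encoding (a dict mapping each nonterminal to a list of `(probability, right-hand side)` pairs), the function names, and the choice of estimating the spectral radius through Gelfand's formula $\rho(E) = \lim_k \|E^k\|^{1/k}$ are assumptions made here, not the method of the cited papers. The estimate converges slowly for some matrices, so treat values very close to 1 with caution.

```python
import math


def first_moment_matrix(grammar, nonterminals):
    """E[i][j] = expected number of occurrences of nonterminal j
    produced by one rewriting of nonterminal i."""
    index = {nt: i for i, nt in enumerate(nonterminals)}
    n = len(nonterminals)
    E = [[0.0] * n for _ in range(n)]
    for lhs, productions in grammar.items():
        for prob, rhs in productions:
            for symbol in rhs:
                if symbol in index:  # terminals contribute nothing
                    E[index[lhs]][index[symbol]] += prob
    return E


def spectral_radius(E, steps=500):
    """Estimate rho(E) as ||E^steps||^(1/steps) (Gelfand's formula).

    The running product is renormalised at every step and the logs of
    the norms are accumulated, which avoids overflow and underflow.
    """
    n = len(E)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    log_norm = 0.0
    for _ in range(steps):
        M = [[sum(M[i][k] * E[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        norm = max(sum(abs(x) for x in row) for row in M)
        if norm == 0.0:  # E is nilpotent
            return 0.0
        M = [[x / norm for x in row] for row in M]
        log_norm += math.log(norm)
    return math.exp(log_norm / steps)


def grammar_for(q):
    """Illustrative one-nonterminal grammar: S -> S S (q) | a (1 - q)."""
    return {"S": [(q, ("S", "S")), (1.0 - q, ("a",))]}


for q in (0.4, 0.75):
    E = first_moment_matrix(grammar_for(q), ["S"])
    print(q, E, spectral_radius(E))
```

For this grammar the matrix is the single entry 2q, so the radius is below 1 for q = 0.4 and above 1 for q = 0.75. As the excerpt notes, a radius of exactly 1 is not decided by this test.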
T.L. Booth, R.A. Thompson. Applying Probability Measures to Abstract Languages. In: IEEE Transactions on Computers Vol. C-22, No. 5, May 1973.
....a stochastic language. This is illustrated in the following example. Example 1.1 Consider the stochastic grammar $G$ with nonterminal set $V_N = \{S\}$ and terminal set $V_T = \{a\}$. The productions with their probabilities are given by: $S \to S\,S$ with probability $q$ and $S \to a$ with probability $1 - q$. Following the technique presented in [2] we find that the production generating function is given by $g_1(s_1) = q s_1^2 + 1 - q$, and that the first moment matrix $E$ is given by $[2q]$. We can conclude that the grammar is consistent if and only if $q \le 1/2$. For details we refer to [5]. Notice that all the different trees of string a ....
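The threshold in this example can be checked with a short numerical sketch. It relies on a standard branching-process fact rather than on anything stated in the excerpt: iterating the production generating function from 0 gives, after n steps, the probability that a derivation from S terminates within n levels, so the limit is the total probability mass on finite trees. The function names below are hypothetical.

```python
def production_generating_function(s, q):
    """g(s) = q*s^2 + (1 - q) for the grammar S -> S S (q) | a (1 - q)."""
    return q * s * s + (1.0 - q)


def total_finite_tree_mass(q, iterations=10_000):
    """Iterate s <- g(s) starting from 0.

    After n iterations s is the probability that a derivation from S
    finishes within n levels; the limit is the smallest fixed point of
    g in [0, 1], i.e. the probability mass assigned to finite trees.
    """
    s = 0.0
    for _ in range(iterations):
        s = production_generating_function(s, q)
    return s


for q in (0.4, 0.75):
    print(q, total_finite_tree_mass(q))
```

For q = 0.4 the mass converges to 1 (consistent), while for q = 0.75 it converges to (1 - q)/q = 1/3, so two thirds of the probability is lost to non-terminating derivations. At q = 1/2 the iteration still tends to 1 but only very slowly, which mirrors the boundary case the eigenvalue test leaves undecided.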
....analogous to the distribution language and stochastic language of an unrestricted grammar. 3 Consistency In this section consistency of weakly restricted stochastic grammars will be considered. The theory of multitype branching processes will be used to come to a similar theorem as is given in [2] for unrestricted stochastic grammars. Definition 3.1 For the $j$-th occurrence of nonterminal $A_i \in V_N$ the production generating function for weakly restricted stochastic grammars is defined as: $g_{ij}(s_{1,1}, \ldots, s_{k,R(A_k)}) = \sum_{u=1}^{|C_{A_i}|} p_{iju} \prod_{m=1}^{k} \prod_{n=1}^{R(A_m)} s_{mn}^{r_{mn}(u)}$ ....
T.L. Booth, R.A. Thompson. Applying Probability Measures to Abstract Languages. In: IEEE Transactions on Computers Vol. C-22, No. 5, May 1973.
....to nonterminal X and the column corresponding to nonterminal Y is the expected number of times X will be replaced by Y in exactly one production rule. As the spectral radius $\rho(M)$, which is the modulus of the largest eigenvalue, is always less than 1, the probabilistic grammar is consistent [4]. That is, the sum over all the sentences generated from this grammar is 1. The matrix $M$ has rows and columns indexed by LP, C, B, L, T ....
T. L. Booth and R. A. Thompson. Applying probability measures to abstract languages. IEEE Trans. Comput., C-22:442--450, 1973.
No context found.
T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22:442--450, 1973.
No context found.
T. L. Booth and R. A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computers, 22(5):442-450, 1973.
No context found.
Taylor L. Booth and Richard A. Thompson, "Applying probability measures to abstract languages," IEEE Transactions on Computers, vol. 22, pp. 442-450, 1973.
No context found.
T. Booth and R. Thompson. 1973. "Applying Probability Measures to Abstract Languages". In IEEE Transactions on Computers, 22(5).
No context found.
T. Booth and R. Thompson. 1973. "Applying Probability Measures to Abstract Languages". In IEEE Transactions on Computers, 22(5).
No context found.
BOOTH, TAYLOR L., & RICHARD A. THOMPSON. 1973. Applying probability measures to abstract languages. IEEE Transactions on Computers C-22.442--450.
No context found.
Booth, T. L. and R. A. Thompson (1973). Applying probability measures to abstract languages. IEEE Transactions on Computers C-22 (5), 442--450.
No context found.
Taylor Booth and Richard Thompson. 1973. Applying probability measures to abstract languages.
No context found.
Taylor L. Booth & Richard A. Thompson [May 1973], "Applying Probability Measures to Abstract Languages," IEEE Transactions on Computers C-22, 442--450.