79 citations found.
E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, WA, 1994.


This paper is cited in the following contexts:


Compilation of Constraint-based Contextual - Rules For Part-Of-Speech (2002)   (Correct)

....interpolation of uni-, bi-, and trigrams as smoothing technique [2]. The aim of the present work is to describe the new formalism of contextual rules. This formalism is mainly inspired by CGs, but some aspects from other rule-based environments, such as transformation-based error-driven learning [1] or relaxation labelling [5], have also been considered. After this, we focus the discussion on the efficient execution of the rules, for which we design a strategy that compiles them into finite state transducers (FSTs). Finally, we make some reflections about time and space complexity of the FSTs ....

Brill, E. (1994). Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94).


Applying Machine Learning for High Performance.. - Baluja, Mittal.. (1999)   (11 citations)  (Correct)

....(POS) tags can be used by other modules to reason about the roles and relative importance of words tokens in various contexts. In this system, we used the Brill tagger for POS tagging . Brill reports approximately 97 to 98 overall accuracy for words in the WSJ corpus for the tagger [ Brill, 1994; Brill, 1995 ] Its performance is lower on the named entity task. On our training data, the tagger obtained an F score of only 83 (P = 81, Performance on the name detection task is typically measured by the F # score [van Rijsbergen, 1979] which is a combination of the Precision (P) and Recall ....
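The F score mentioned in the excerpt above combines precision and recall. A minimal sketch of van Rijsbergen's measure follows; the function name `f_score` and the `beta` default are assumptions made for illustration, and the example values are hypothetical rather than taken from the cited paper (the excerpt is cut off before it gives the recall).

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (van Rijsbergen's F).

    beta > 1 weights recall more heavily, beta < 1 weights precision more.
    Inputs are fractions in [0, 1]; the name and signature are illustrative.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values, not results from the cited paper.
balanced = f_score(0.8, 0.6)
```

With `beta = 1` this reduces to `2PR / (P + R)`, the balanced form most often reported.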

Eric Brill. Some advances in rule based part-of-speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 722--727, Seattle, WA, 1994. AAAI.


Intelligent Web Agents that Learn to Retrieve and Extract.. - Eliassi-Rad, Shavlik (2001)   (Correct)

....an exhaustive list of all possible candidate bindings. The first step W W IE takes (both during mining and after) is to generate all possible fillers for each individual slot for a given document. Fillers can be individual words or phrases. Individual words are collected by using Brill s tagger [3], which annotates each word in a document with its part of speech tag. For each slot, we collect every word in the document that has a POS tag that matches a tag assigned to this variable somewhere in the IE task s advice. For cases where a variable is associated with a phrase, we apply a sentence ....

Brill E. (1994). Some advances in rule-based part of speech tagging, Proc. of LI-94 Conference, 722-727.


Phrasal Parsing by Using Data-Driven PoS Taggers - Megyesi   (Correct)

....and applied with good results for analyzing natural languages on different linguistic levels. For example, Hidden Markov Modeling (Brants, 2000), Maximum Entropy (Ratnaparkhi, 1996), Memory Based Learning (Daelemans et al. 1996; Zavrel Daelemans, 1999) and Transformation Based Learning (Brill, 1994) have been successfully applied to PoS tagging of English with an average accuracy of between 95% and 97%. Recently some attempts also have been made to build data driven shallow parsers for English by finding syntactically related non-overlapping groups of words, so called chunks (Abney, 1991) ....

....parsers are based on three state-of-the-art data driven PoS taggers which will be described next. 2.1 Algorithms Three data driven state-of-the-art PoS taggers are included in the study: mxpost, based on the Maximum Entropy framework (Ratnaparkhi, 1996), Transformation-Based Learning (tbl) (Brill, 1994), and Trigrams'n'Tags (tnt) based on Hidden Markov Model (Brants, 2000). mxpost is a probabilistic classification-based approach based on a Maximum Entropy model where contextual information is represented as binary features that are simultaneously used in order to predict the PoS tags. The ....

Eric Brill. Some Advances in Rule-Based Part of Speech Tagging. In Proceedings of the 12th National Conference on Artificial Intelligence


Evaluation of Index Term Discovery in Medical Reference Text - Wollersheim, Rahayu, Reeve (2002)   (Correct)

....taxonomy, and subsequently classified text atoms. Solid lines shows taxonomic IS A links, while dotted lines denoted classification by a taxonomic term level indexing of medical text by UMLS [8] We compare three web available part of speech taggers: two statistical taggers, namely Brill Tagger [9], and Treetagger [10] and a more recent rule based constraint grammar tagger, the EngCG Tagger [11] A second method of index term generation uses the existing book index as a base. It gets candidate terms from the list of key words that was assigned to the text when it was in book form. The ....

Brill, E., Some Advances In Rule-Based Part of Speech Tagging. AAAI, 1994.


Text Chunking based on a Generalization of Winnow - Zhang, Damerau, Johnson (2001)   (2 citations)  (Correct)

....data are extracted from sections of the Penn Treebank. The training set consists of WSJ sections 15 18 of the Penn Treebank (211727 tokens) and the test set consists of WSJ sections 20 (47377 tokens) Additionally, a part of speech (POS) tag was assigned to each token by a standard POS tagger [2] that was trained on the Penn Treebank. These POS tags can be used as features in a machine learning based chunking algorithm. See Section 5 for detail. As an example, for the previous example sentence, the associated POS tags, given in the parenthesis following each token, are: Balcor (NNP) ....

Eric Brill. Some advances in rule-based part of speech tagging. In Proc. AAAI 94, pages 722-727, 1994.


DOrAM: Real Answers to Real Questions - Mahlin, Goldman, Rosenschein (2002)   (Correct)

....sentences supplied by the expert are analyzed by a link grammar parser [4] that recognizes the noun phrase (NP) verb phrase (VP) and prepositional phrases (PP) in the sentences. Further, to reduce the number of potential mistakes made by the link grammar parser, we apply a part of speech tagger [2] to the same sentences to carry out a morphological analysis (e.g. adj, nouns, Finally, we extract all verbs (which were recognized as such during the morphological analysis) from VP clauses, and all nouns (again identified by morphological analysis) from all NP and PP clauses. We always ....

E. Brill. Some advances in rule-based part of speech tagging. Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, Washington., pages 722-727, 1994.


Massively Parallel Distributed Feature Extraction in.. - Kuntraruk, Pottenger (2001)   (Correct)

....Tagging After identifying fields of interest, our feature extraction algorithms perform part of speech tagging. The part of speech tagger is a rule based system for tagging English parts of speech. This system is based on the SemanTag system developed in [7] which in turn is based on [3] 4] [5]. The tagger uses three levels of rule sets to determine the part of speech of each word, and tags words with their English part of speech tag, as specified in the Brown tagset [12] DT determiner IN preposition or subordinating conjunction NN noun singular or mass PP personal ....

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 1994.


Classification of Research Papers using Citation Links and .. - NANBA, KANDO, OKUMURA (2000)   (2 citations)  (Correct)

....all other papers in the same category. Our system then inspected papers from the database and returned ranked papers for each query. Search Engine We implemented the search engine based on a vector space model. Our system first extracts all nouns from passages using Brill s part of speech tagger[Brill, 1994]. Then the system calculates the similarity by cosine distance using extracted nouns. Alternatives We conducted experiments using the following eight methods. FULL , TITLE , ABST : using co occurrence of words in the full length text, title and abstract. METHOD , PURPOSE (our ....
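The noun-based vector space retrieval described in the excerpt above can be sketched roughly as follows. This is an illustration under stated assumptions, not the cited system: it assumes the tagger output is already available as `(word, tag)` pairs using Penn Treebank style tags, where noun tags start with `NN`, and the helper names `noun_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def noun_vector(tagged):
    """Bag-of-nouns vector from (word, tag) pairs; noun tags assumed to start with 'NN'."""
    return Counter(word.lower() for word, tag in tagged if tag.startswith("NN"))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical pre-tagged query and passage.
query = [("tagger", "NN"), ("learns", "VBZ"), ("rules", "NNS")]
passage = [("rules", "NNS"), ("improve", "VBP"), ("tagger", "NN"), ("accuracy", "NN")]
similarity = cosine_similarity(noun_vector(query), noun_vector(passage))
```

Raw counts are used here for brevity; whether the cited system applied any term weighting is not stated in the excerpt.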

Brill, E. Some advances in rule-based part of speech tagging. Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), pp. 722--727, 1994.


Scalable Browsing for Large Collections: A Case Study - Paynter, Witten.. (2000)   (4 citations)  (Correct)

....tried to identify noun phrases. The two approaches are equally accurate on the keyphrase extraction task, but we used stop words in the final system because it is significantly faster. The syntactic analysis first tags the input by assigning syntactic classes to each word. We use the Brill tagger [1,2]. Then we experimented with two heuristics for noun Figure 6: Browsing for information on poisson Figure 5: Browsing for information on dairy phrase identification. The first was suggested by Turney [18] as matching almost all of the keyphrases in the corpuses he used. It specifies zero or more ....

Brill, E. (1994) Some advances in rule-based part of speech tagging,Proc AAAI-94, pp. 722---727, Seattle.


Transformation-Based Learning of Danish Grammar Correction - Hardt   (Correct)

....the small size of the training corpus. The technique is quite general, and could be directly applied to a large number of grammar checking problems in Danish or other languages. 1 Introduction We describe a general technique for automatically deriving grammar checkers, using the Brill Tagger (Brill 94; Brill 95). For a given type of error, error occurrences are systematically generated, and special tags are used to identify the correct and incorrect forms. Then the tagger is trained to learn contexts where errors can be identified. The standard context rule learning system from the Brill Tagger ....

E. Brill. Some Advances in Rule-based Part of Speech Tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 722-727. AAAI Press/The MIT Press, 1994.


A Corpus-based Bootstrapping Algorithm for Semi-Automated.. - Riloff (1999)   (Correct)

....Introduction Natural language understanding requires both syntactic and semantic knowledge, yet there are surprisingly few resources available for lexical semantic information. In contrast, a variety of dictionaries and computational tools are available for acquiring syntactic information (e.g. (Brill 1994; Church 1989; Marcus, Santorini, Marcinkiewicz 1993; Weischedel et al. 1993) Ideally, one would like to have a semantic knowledge base that contains semantic representations of all words, phrases, and concepts in the language. Given the vast scope of human knowledge and the practical ....

....corpus and a handful of predefined seed words. The only additional knowledge used by our system is a part of speech dictionary for syntactic segmentation. We used a hand crafted part of speech dictionary for these experiments, but statistical and corpus based taggers are widely available (e.g. (Brill 1994; Church 1989; Weischedel et al. 1993) One other relevant piece of related research is Roark and Charniak s work (Roark Charniak 1998) which improves upon preliminary results that we reported in (Riloff Shepherd 1997) Roark and Charniak confirmed our intuition that conjunctions, ....

Brill, E. 1994. Some Advances in Rule-based Part of Speech Tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 722--727. AAAI Press/The MIT Press.


A Mutually Beneficial Integration of Data Mining and.. - Nahm, Mooney (2000)   (6 citations)  (Correct)

....a specific togeneral search. Constraints on patterns can specify the specific words, part of speech, or semantic classes of tokens. The hypernym links in WordNet (Fellbaum 1998) provide semantic class information, and documents are annotated with part of speech information using the tagger of Brill (1994). In this paper, we use the simpler version of RAPIER that employs only word and part of speech constraints since WordNet classes provide no additional advantage in this domain (Califf Mooney 1999) The learning algorithm of RAPIER was inspired by several inductive logic programming systems. ....

Brill, E. 1994. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 722--727.


An Evaluation of Linguistically-motivated Indexing.. - Arampatzis, van der.. (2000)   (4 citations)  (Correct)

....indexing terms from the dataset, we applied some preprocessing. The pre processing was performed in six steps: 1. Tokenization (script written in PERL) Detection of sentence boundaries followed by division of sentences into words. 2. Part of speech tagging: Brill s rule based tagger 2 [3] was employed to obtain POS information for the contents of the dataset. The tagger comes with a lexicon derived from both the Penn Treebank tagging of the Wall Street Journal (WSJ) and the Brown Corpus. Conveniently, the WSJ articles are, like the Reuters documents, about economic topics and ....

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa., 1994.


A Theory-Refinement Approach to Information Extraction - Eliassi-Rad, Shavlik (2001)   (Correct)

....that the trained network scores. 2.1 Individual Slot Candidate Generator The first step Wawa IE takes (both during training and after) is to generate all possible individual fillers for each slot on a page. Fillers can be individual words or phrases. Individual words are collected by using Brill s (1994) tagger to annotate each word in the document with its POS. For each slot, we collect every word in the current document that has a POS tag that matches a tag assigned to this variable somewhere in the IE task s advice. For cases where a variable is associated with a phrase, we apply a sentence ....

....is that we require the user to provide us with the POS tags or parse structures of the extraction slots. We assume that the Brill s tagger and Sundance are perfect (i.e. they tag words and parse sentences with 100 accuracy) Brill s tagger annotates the words on a document with 97.2 accuracy (Brill, 1994), so 2.9 error rate propagates into our results. We were not able to find accuracy estimates for Sundance, though do remember that we also consider all subphrases of the phrases Sundance produces. Our approach is computationally demanding, due to its use of a generate and test approach. However, ....

Brill, E. (1994). Some advances in rule-based part of speech tagging. Proc. AAAI94 (pp. 722--727). Seattle, WA.


Scalable Browsing for Large Collections: A Case Study - Gordon Paynter Ian (2000)   (4 citations)  (Correct)

....task, but we used stop words in the final system because it is significantly faster. The syntactic analysis first tags the input by assigning syntactic classes to each word. We use the Brill tagger Figure 5: Browsing for information on dairy Figure 6: Browsing for information on poisson 5 [1,2]. Then we experimented with two heuristics for noun phrase identification. The first was suggested by Turney (in press) as matching almost all of the keyphrases in the corpuses he used. It specifies zero or more nouns or adjectives, followed by one final noun or gerund: noun adjective) noun ....
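The noun phrase heuristic quoted above (zero or more nouns or adjectives, followed by one final noun or gerund) can be expressed as a regular expression over tag classes. The sketch below is an assumption-laden illustration rather than the cited system's code: it assumes Penn Treebank style tags (`NN*` for nouns, `JJ*` for adjectives, `VBG` for gerunds), and the names `tag_class` and `candidate_phrases` are hypothetical.

```python
import re


def tag_class(tag):
    """Collapse a POS tag to one letter: N noun, J adjective, G gerund, O other."""
    if tag.startswith("NN"):
        return "N"
    if tag.startswith("JJ"):
        return "J"
    if tag == "VBG":
        return "G"
    return "O"


# Zero or more nouns/adjectives, then one final noun or gerund.
PHRASE = re.compile(r"[NJ]*[NG]")


def candidate_phrases(tagged):
    """Return non-overlapping, left-to-right matches of the heuristic as strings."""
    classes = "".join(tag_class(tag) for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in PHRASE.finditer(classes)
    ]


# Hypothetical pre-tagged sentence.
tagged = [("digital", "JJ"), ("library", "NN"), ("supports", "VBZ"),
          ("interactive", "JJ"), ("browsing", "VBG")]
phrases = candidate_phrases(tagged)
```

Because `finditer` yields only the longest non-overlapping matches, shorter sub-phrases inside a match are not enumerated; a system that needs them would have to expand each match separately.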

Brill, E. (1994) "Some advances in rule-based part of speech tagging," Proc AAAI-94, pp. 722--727, Seattle.


Classification of Research Papers using Citation Links and .. - Nanba, Kando, Okumura (2000)   (2 citations)  (Correct)

....all other papers in the same category. Our system then inspected papers from the database and returned ranked papers for each query. Search Engine We implemented the search engine based on a vector space model. Our system first extracts all nouns from passages using Brill s part of speech tagger [Brill, 1994]. Then the system calculates the similarity by cosine distance using the extracted nouns. Alternatives We conducted experiments using the following eight methods. FULL , TITLE , ABST : using words in the full length text, title and abstract. METHOD , PURPOSE (our methods) using ....

Brill, E. Some advances in rule-based part of speech tagging. Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), pp. 722--727, 1994.


The Balancing Act: Combining Symbolic and Statistical Approaches .. - Pedersen (1999)   (Correct)

....Readers with an interest in machine learning should take particular note of Chapter 7, Exploring the Nature of Transformation Based Learning, by Lance A. Ramshaw and Mitchell P. Marcus. They provide a detailed explanation and analysis of transformation based learning (e.g. [Bri93], [Bri94]), a corpus based approach that has been applied to part of speech tagging, parsing, and prepositional phrase attachment. The focus of the chapter is on the resistance of transformation based learning to over training; this is investigated in depth and numerous comparisons are made to ....

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), Seattle, WA, 1994.


Feature Engineering for a Symbolic Approach to Text Classification - Scott (1998)   (4 citations)  (Correct)

....from a document requires two separate algorithms. The first is a tagging algorithm to assign part of speech tags (noun, verb, preposition, etc. to the individual words, and the second is an algorithm to group the tagged words into noun phrases. Eric Brill s rule based part of speech tagger [BRI92, BRI94] was used to assign the part of speech tags, and grouping was done with a simple regular expression based partly on the discussion in the previous section. The latter algorithm is implemented within the FX system. For simplicity, the combination of Brill s tagger with the new functionality in FX ....

....algorithms tried to determine the most likely tag for a given word using probabilities estimated from a manually tagged corpus. Eric Brill developed a transformation based part of speech tagger, based on a supervised machine learning model, that was comparable in accuracy to stochastic taggers [BRI92, BRI94]. The algorithm proceeds in two passes. On the first pass, the words are assigned their most likely tag, based on their frequency of appearance with that tag in a manually tagged training corpus. On the second pass, tags are transformed based on the application of a set of rules learned from the ....
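The two passes described in the excerpt above can be sketched as follows. This is a simplified illustration, not Brill's implementation: the rule format `(from_tag, to_tag, previous_tag)` is one hypothetical template, the lexicon and rule list are toy data, and the default tag for unknown words is an assumption.

```python
def initial_tags(words, lexicon, default="NN"):
    """Pass 1: give each word its most likely tag from a lexicon.

    `lexicon` maps a lowercased word to the tag it most often carries in a
    tagged training corpus; `default` for unknown words is an assumption.
    """
    return [lexicon.get(word.lower(), default) for word in words]


def apply_rules(tags, rules):
    """Pass 2: apply learned transformations in order.

    Each rule (from_tag, to_tag, prev_tag) means: change from_tag to to_tag
    when the preceding tag is prev_tag. Changes take effect immediately,
    left to right, which is a simplifying choice made for this sketch.
    """
    tags = list(tags)
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return tags


# Toy lexicon and a single illustrative rule.
lexicon = {"we": "PRP", "want": "VBP", "to": "TO", "run": "NN"}
rules = [("NN", "VB", "TO")]
words = ["We", "want", "to", "run"]
first_pass = initial_tags(words, lexicon)
second_pass = apply_rules(first_pass, rules)
```

Keeping the rules as an ordered list matters, since a later transformation may depend on tags rewritten by an earlier one.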

[Article contains additional citation context not shown here]

Eric Brill. Some Advances in Rule-Based Part of Speech Tagging. In Proc. AAAI-94. 1994. 722-727.


Instructable and Adaptive Web Agents that Learn to.. - Eliassi-Rad, Shavlik (2000)   (1 citation)  (Correct)

....headings. Moreover, bags for the words in the grandparent and great grandparent sections are kept, should the current window be nested that deeply. This knowledge of the context of words does not limit advice to only describing relations between nearby words. Wawa uses Brill's tagger (Brill, 1994) to annotate each word on a page with a part of speech (POS) tag (i.e. noun, proper noun, verb, etc.). This information is represented in the neural networks as input features for the words in the sliding window. By adding POS tags, we are able to distinguish between different grammatical uses of a ....

....of training an IE agent. We also generate an extracted list of candidates for each example in the testing set. The extracted lists of candidates associated with the testing set is used for evaluating the IE agent s performance. Brill s tagger annotates the words on a document with 97.2 accuracy (Brill, 1994). Note that in our experiments we assume that the Brill s tagger is perfect (i.e. it tags the words with 100 accuracy) This means that the 2.9 error rate in Brill s tagger propagates into our results. We were not able to find a tagger more accurate than Brill s tagger. 5.1.2. Training an IE ....

[Article contains additional citation context not shown here]

Brill, E.: 1994, `Some advances in rule-based part of speech tagging'. In: Proceedings of the Twelfth National Conference on Artificial Intelligence. Seattle, WA, pp. 722--727.


Feature Reduction for Document Clustering and Classification - Rüger, Gauch (2000)   (2 citations)  (Correct)

....or phrases, and a document collection can contain millions of different features. Even after applying standard feature reduction techniques, the number of features remains large: in our clustering experiments with 528,155 US American newspaper articles, we only kept nouns based on Brill's tagger (Brill 1994) with a medium document frequency: the noun had to appear in at least three documents and in no more than 33% of all documents. Additionally, a list of stop words was used to eliminate obvious function words of the language. This resulted in a vocabulary of around 280,000 so-called potentially ....
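The document-frequency filter described in the excerpt above can be sketched as below. It is an illustration only: the function name `select_vocabulary` and its parameters are hypothetical, the upper bound is read here as 33% of documents, and each document is assumed to be already reduced to its list of nouns.

```python
from collections import Counter


def select_vocabulary(docs, min_docs=3, max_fraction=0.33, stop_words=frozenset()):
    """Keep terms with a medium document frequency.

    `docs` is a list of documents, each a list of (already extracted) nouns.
    A term survives if it occurs in at least `min_docs` documents, in no more
    than `max_fraction` of all documents, and is not a stop word.
    """
    doc_freq = Counter()
    for nouns in docs:
        doc_freq.update(set(nouns))  # count each term once per document
    upper = max_fraction * len(docs)
    return {
        term
        for term, count in doc_freq.items()
        if min_docs <= count <= upper and term not in stop_words
    }


# Toy collection of ten documents.
docs = [["market", "year"]] * 3 + [["year"]] * 5 + [["rare"]] + [[]]
vocabulary = select_vocabulary(docs)
```

In this toy collection "year" is dropped for being too common, "rare" for being too infrequent, and only "market" remains.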

Brill, E. (1994). Some advances in rule-based part of speech tagging. In AAAI.


Feature Engineering for Text Classification - Scott, Matwin (1999)   (8 citations)  (Correct)

....from a document requires two separate algorithms. The first is a tagging algorithm to assign part of speech tags (noun, verb, preposition, etc. to the individual words, and the second is an algorithm to group the tagged words into noun phrases. Eric Brill s rule based part of speech tagger [Brill, 1992; 1994] is used in its default configuration to assign the part of speech tags, and grouping is done with a simple regular expression. The two algorithms together are referred to as the Noun Phrase Extractor (NoPE) 2 For our purposes a noun phrase is defined as a sequence of nouns or adjectives ....

Brill, Eric. 1994. Some advances in rule-based part of speech tagging. AAAI-94. 722-727.


Machine Learning and Natural Language Processing - Marquez (2000)   (1 citation)  (Correct)

....set of rules. It works iteratively by adding at each step the rule that best repairs the current errors. Concrete rules are acquired by instantiation of a predefined set of template rules. This algorithm has been applied to a number of natural language problems, including part of speech tagging [21, 22, 23, 6, 186], PP attachment disambiguation [20] parsing [18] spelling correction [133] and word sense disambiguation [67] One major drawback of TBL is its computational cost since all instantiations of templates are tested at each iteration to find the best rule. Recently, Samuel [191] presented an ....

....NLP tasks, such as speech processing, morphology and PoS tagging. NB DTs HMMs ME TBL NNs Speech recognition and synthesis [8, 9] 97] 187] 55, 56, 15] 206, 121, 155, 113, 229] Morphology [14] PoS tagging [194, 189] 200, 132, 140, 141, 164, 136, 138, 137] 45, 54, 144] 101, 177] [21, 22, 23, 6, 186] [155, 201, 70, 199, 131] IBL LSM EC PoS tagging [61, 60, 90, 58] 188] 90, 24, 139, 2, 136] Table 1: References corresponding to some low level NLP tasks Table 2 contains the references about parsing (either shallow or general) and structural ambiguity resolution. Table 3 groups the ....

Eric Brill. Some Advances in Rule--based Part--of--speech Tagging. In Proceedings of the 12th National Conference on Artificial Intelligence, AAAI, pages 722--727, 1994. http://www.cs.jhu.edu/brill/acadpubs.html.


Moving More Quickly toward Full Term Relations in Information Space - Newby   (Correct)

....utilized titles only and expanded by 25 or 50 terms (isa25t, isa50t) Results were considerably poorer than comparable AdHoc ISpace runs from TREC7, yielding average exact precision scores under 0.025. There are two likely explanations for this poor showing. One is that part of speech tagging (Brill, 1994) and query processing, used last year, are important. This is supported by the observation that expanded topics expanded all topic terms, including terms without much discriminatory value. This may have served to bring in useful terms, but it also increased the noise in topics. For query ....

Brill, Erik. (1994). "Some advances in rule-based part of speech tagging." Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94). Seattle, Washington.


YPA - An Intelligent Directory Enquiry Assistant - De Roeck, Kruschwitz, Neal.. (1998)   (3 citations)  (Correct)

....actually created. It will be seen that the same extraction techniques will result in significant differences in the results when applied to different parts of the input. However, it will be assumed some conditions hold in all further processing. Firstly, for part of speech tagging the Brill tagger [15, 16] is used without training (using the supplied lexical rule and contextual rule files of the Wall Street Journal Corpus and the lexicon of both the Wall Street Journal and Brown Corpus) This tagger is particularly appropriate as a contextual tagger is needed, especially one that is robust enough ....

....ENQUIRY ASSISTANT BT Technol J Vol 16 No 3 July 1998 152 handled similarly. The main problems can be summarised like this: there is no sentence structure, just a set of phrases, upper lower case distinctions are often used for lexical analysis (for example, Callan et al. [14] and Brill [16]) but in the source data file it is irrelevant usually the complete address entry is in one case, additional tagging errors (e.g. tagging a noun as a verb) result from the other points. One solution to the problems is to concentrate on nouns and compounds. It should be recalled that ....

Brill E: `Some advances in rule-based part of speech tagging', Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington (1994).


YPA - An Intelligent Directory Enquiry Assistant - De Roeck, Kruschwitz, Neal.. (1998)   (3 citations)  (Correct)

....It will be seen that the same extraction techniques will result in significant differences in the results when applied to different parts of the input. However, it will be assumed some conditions hold in all further processing. Firstly, for part of speech tagging the Brill tagger (Brill, 1992) (Brill, 1994) is used without training (using the supplied lexical rule and contextual rule files of the Wall Street Journal Corpus and the lexicon of both Wall Street Journal and Brown Corpus) This tagger is particularly appropriate as a contextual tagger is needed, especially one that is robust enough (see ....

....is true for both indexing the free text and the extracted company names, which are handled similarly. The main problems can be summarised like this: there is no sentence structure, just a set of phrases; upper/lower case distinctions are often used for lexical analysis (for example (Brill, 1994) and (Callan, Croft, and Broglio, 1995)) but in the source data file it is irrelevant: usually the complete address entry is in one case; additional tagging errors (e.g. tagging a noun as a verb) result from the other points. One solution to the problems is to concentrate on nouns and ....

Brill, E. 1994. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa.


Using a Trained Text Classifier to Extract Information - Eliassi-Rad, Shavlik   (Correct)

....the remaining 100 as our test set. In our previous work [10] we reported that the WAWA home page finder was more accurate than several other home page finders. Table 1: A Simple Information Extraction Algorithm 1. Retrieve the page s source text and tag it with Brill s part of speech tagger [3]. 2. If several proper nouns are adjacent to other, list each word separately and list all possible pairs and triples of these words (preserving the left to right order of the words) Discard all duplicate entries. The extracted list of single, double, and triple word phrases ( candidates ) ....
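Step 2 of the algorithm quoted above can be sketched as follows. This is one possible reading, labeled as an assumption: "all possible pairs and triples" is interpreted as every order-preserving combination within a run of adjacent proper nouns, not only contiguous ones. The input is assumed to be `(word, tag)` pairs with proper nouns tagged `NNP` or `NNPS`, and the function name `candidates` is hypothetical.

```python
from itertools import combinations, groupby


def candidates(tagged):
    """List single, double, and triple word phrases from runs of proper nouns.

    Words inside a run keep their left-to-right order, and duplicate
    entries are discarded while preserving first-seen order.
    """
    seen = set()
    result = []
    for is_proper, group in groupby(tagged, key=lambda wt: wt[1].startswith("NNP")):
        if not is_proper:
            continue
        run = [word for word, _ in group]
        for size in (1, 2, 3):
            for combo in combinations(run, size):
                phrase = " ".join(combo)
                if phrase not in seen:
                    seen.add(phrase)
                    result.append(phrase)
    return result


# Hypothetical pre-tagged text.
tagged = [("New", "NNP"), ("York", "NNP"), ("Times", "NNP"),
          ("quoted", "VBD"), ("York", "NNP")]
extracted = candidates(tagged)
```

If only contiguous pairs and triples are intended, the `combinations` call could be replaced by sliding windows over each run.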

Brill, E., Some advances in rule-based part of speech tagging, Proc. AAAI '94, pp. 722-727.


Feature Engineering for Text Classification - Scott, Matwin (1999)   (8 citations)  (Correct)

....from a document requires two separate algorithms. The first is a tagging algorithm to assign part of speech tags (noun, verb, preposition, etc. to the individual words, and the second is an algorithm to group the tagged words into noun phrases. Eric Brill s rule based part of speech tagger [Brill, 1992; 1994] was used in its default configuration to assign the part of speech tags, and grouping was done with a simple regular expression. The two algorithms together are referred to as the Noun Phrase Extractor (NoPE) 1 For our purposes a noun phrase was defined to be a sequence of nouns or adjectives ....

Brill, Eric. 1994. Some Advances in Rule-Based Part of Speech Tagging. AAAI-94. 722-727.


Improving Browsing in Digital Libraries with Keyphrase .. - Gutwin, Paynter.. (1998)   (6 citations)  (Correct)

....systems query engine Figure 2. Kea keyphrase extraction process Finding phrases The process used to find candidate phrases for Keyphind is similar to that used in earlier projects (e.g. 31,41] First, documents are cleaned and tokenised, and then tagged using the Brill part of speech tagger [4]. The tagger adds a part of speech indicator (e.g. NN for noun, VB for verb) to each word. Words are not currently stemmed, although we do fold plurals to singular forms (e.g. libraries to library ) and standardize words that have different spellings (e.g. labor to labour ) Second, all ....

E. Brill, Some Advances in Rule-Based Part of Speech Tagging, in: Proceedings of the Twelfth National Conference on Artificial Intelligence, (AAAI Press, 1994).


Feature Reduction for Document Clustering and Classification - Rüger, Gauch (2000)   (1 citation)  (Correct)

....or phrases, and a document collection can contain millions of different features. Even after applying standard feature reduction techniques, the number of features remains large: in our clustering experiments with 528,155 US American newspaper articles, we only kept nouns based on Brill's tagger (Brill 1994) with a medium document frequency: the noun had to appear in at least three documents and in no more than 33% of all documents. Additionally, a list of stop words was used to eliminate obvious function words of the language. This resulted in a vocabulary of around 280,000 so-called potentially ....

Brill, E. (1994). Some advances in rule-based part of speech tagging. In AAAI.


Unsupervised Part of Speech Tagging with Extended Templates - Becker (1998)   (Correct)

....Forschung und Technologie (BMBF) to the DFKI project paradime, FKZ ITW 9704. 1 Introduction Recently quite a number of methods for the training of part of speech taggers have emerged. Most of these are statistically based, employing the training of Hidden Markov Models. [Brill, 1992; Brill, 1994] has shown that rule based Part of Speech (PoS) Taggers may perform comparably, with the added advantage of the rules being both more human readable and less space consuming than probability matrices of HMMs. However, the described algorithm is supervised, needing as input annotated corpora of a ....

Eric Brill. 1994. Some advances in rule based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI94).


Term Selection for Filtering based on Distribution .. - Arampatzis, van.. (2000)   (Correct)

....of common function words, and morphological normalization of the remaining words. Tokenization consisted of detection of sentence boundaries, followed by division of sentences into words. Detection of sentence boundaries was necessary since we used a POS tagger. Brill's rule-based tagger 9 [Brill, 1994] was employed to obtain POS information for the words of the dataset. The tagger comes with a lexicon derived from both the Penn Treebank tagging of the Wall Street Journal (WSJ) and the Brown Corpus. Conveniently, the WSJ articles are, like the Reuters documents, about economic topics and this ....

Brill, E. (1994). Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa.


Unsupervised Learning of Disambiguation Rules for Part of Speech.. - Brill (1995)   (52 citations)  Self-citation (Brill)   (Correct)

....require manually tagging This work was funded in part by NSF grant IRI 9502312. 2 Some other approaches to tagging are described in [Hindle, 1989; Black et al. 1992] text each time the tagger is to be applied to a new language, and even when being applied to a new type of text. In [Brill, 1992; Brill, 1994] a rule-based part of speech tagger is described which achieves highly competitive performance compared to stochastic taggers, and captures the learned knowledge in a set of simple deterministic rules instead of a large table of statistics. In addition, the learned rules can be converted into a ....

....called transformation-based error-driven learning. Transformation-based error-driven learning has been applied to a number of natural language problems, including part of speech tagging, prepositional phrase attachment disambiguation, speech generation and syntactic parsing [Brill, 1992; Brill, 1994; Ramshaw and Marcus, 1994; Roche and Schabes, 1995; Brill and Resnik, 1994; Huang et al. 1994; Brill, 1993a; Brill, 1993b] Figure 1 illustrates the learning process. First, unannotated text is passed through an initial state annotator. The initial state annotator can range in complexity from ....

[Article contains additional citation context not shown here]

Brill, E. 1994. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle,


Guaranteed Pre-Tagging for the Brill Tagger - Mohammad, Pedersen (2002)   Self-citation (Brill)   (Correct)

....accuracy of tagging by providing a reliable anchor or seed around which to tag. 1 Introduction Part of speech tagging is a prerequisite task for many natural language processing applications, among them parsing, word sense disambiguation, machine translation, etc. The Brill Tagger (c.f. [1], [2], [3], [5]) is one of the most widely used tools for assigning parts of speech to words. It is a hybrid of machine learning and statistical methods that is based on transformation-based learning. The Brill Tagger has several virtues that we feel recommend it above other taggers. First, the source ....

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), Seattle, WA, 1994.


Dimacs At The Trec 2004 Genomics Track - Aynur Dayanik Dmitriy (2004)   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, WA, 1994.


Clause Identification - Erik Tjong Kim (2001)   (Correct)

No context found.

Eric Brill. 1994. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94). Seattle, Washington.


A Search Engine for Natural Language Applications - Cafarella, Etzioni (2005)   (Correct)

No context found.

E. Brill. Some Advances in Rule-Based Part of Speech Tagging. In AAAI, pages 722--727, 1994.


The Infocious Web Search Engine: Improving Web Searching.. - Ntoulas, Chao, Cho (2005)   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, 1994.


Web-Scale Information Extraction in KnowItAll - Etzioni, Cafarella, Downey.. (2004)   (11 citations)  (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 722--727, 1994.


Matching Index Expressions - For Information Retrieval (1998)   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa., 1994.


The Role of the HDDI Collection Builder in.. - Bader, Callahan..   (1 citation)  (Correct)

No context found.

E. Brill, "Some advances in rule-based part of speech tagging", Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), 1994.


Compact and Tractable Descriptors for Information Discovery - Wondergem   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa., 1994.


Using Multiple Sources of Information For Constraint-based.. - Tür (1996)   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, 1994.


Shallow Parsing with PoS Taggers and Linguistic Knowledge - A.. - Megyesi (2001)   (Correct)

No context found.

, Seattle, Washington, 1994.


Topic Change And Local Perplexity In Spoken Legal Dialogue - Kenne, O'Kane   (Correct)

No context found.

Brill E. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), 1994.


PROFILE - A Multi-Disciplinary Approach to.. - Simons, Arampatzis.. (2000)   (Correct)

No context found.

E. Brill. Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Wa., 1994.


Word Informativeness and Automatic Pitch Accent Modeling - Pan, McKeown (1999)   (3 citations)  (Correct)

No context found.

Eric Brill. 1994. Some advances in rule-based part of speech tagging. In Proceedings of the 12th National Conference on Artificial Intelligence.

