Chris Clack, Jonny Farringdon, Peter Lidwell, and Tina Yu. Autonomous document classification for business. In W. Lewis Johnson, editor, The First International Conference on Autonomous Agents (Agents '97), pages 201--208, Marina del Rey, California, USA, February 5-8 1997. ACM Press.
....index, link, and navigate captured material. The NSF grant proposal Automated Understanding of Captured Experience to the Experimental Software Systems Program by the PIs is meant to examine this problem. One approach we are using is to apply statistical measures for natural language understanding [58, 7, 12, 25, 32, 49, 15, 33, 11, 59]. Statistical approaches are usually based on matching word frequencies (vocabularies) in written or spoken text. The key insight in many statistical approaches to language is that concepts can be represented by vocabularies. Using vocabularies enables concepts to emerge from the data, and ....
Chris Clack, Jonny Farrington, Peter Lidwell, and Tina Yu. Autonomous document classification for business. In Proceedings of the 1997 International Conference on Autonomous Agents, February 1997.
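The idea in the excerpt above, that a concept can be represented by a vocabulary and matched through word frequencies, can be illustrated with a minimal sketch. It is not taken from the citing paper: the function names (`vocabulary`, `cosine_similarity`, `best_concept`), the choice of cosine similarity as the matching measure, and the sample texts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vocabulary(text):
    """Return a word-frequency table (a "vocabulary") for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-frequency tables (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_concept(text, concepts):
    """Pick the concept whose vocabulary best matches the text (hypothetical helper)."""
    doc = vocabulary(text)
    return max(concepts, key=lambda name: cosine_similarity(doc, concepts[name]))


# Illustrative concept vocabularies; the sample texts are invented.
concepts = {
    "finance": vocabulary("stock market price shares market"),
    "sports": vocabulary("goal match team player team"),
}
label = best_concept("the market price rose", concepts)
```

Here each concept is nothing more than a frequency table built from example text, so new concepts can be added by supplying more examples rather than hand-written rules.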
....present an empirical comparison between genetic programming and the naive Bayes approach. 5. Approach. We solve this problem using the novel approach of Genetic Programming. Genetic Programming has already been shown to be effective in classifying web pages [27] and in general document classification [23]. To provide a basis for comparison, we have also solved this problem using a traditional Naive Bayes classifier, the most common classifier used in practice today. After delving into the theory behind the Bayesian and Genetic approaches, we then examine the specifics of each implementation, which ....
....and the collection of non-spam documents stood at 102 documents. We passed these documents through a series of filters. The first removed the HTML tags embedded in some messages. The second removed the 60 most common words in the English language, a common practice in text learning [10] [23], since it is felt that these words occur too frequently to be of much discriminating value. Third, we applied stemming, a technique that attempts to reduce the many forms of a word to their root form. For example, an ideal stemming algorithm would convert the words runs, running, and ran ....
Clack, C. & Farrington, J., & Lidwell, P., & Yu, T., Autonomous Document Classification for Business, in Proceedings of The ACM Agents Conference, 1997.
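The three filters described in the excerpt above (HTML tag removal, common-word removal, stemming) could be sketched roughly as follows. This is an illustrative assumption, not the citing authors' implementation: `STOP_WORDS` is a tiny stand-in for their list of 60 common words, `naive_stem` is a crude suffix stripper rather than a real stemming algorithm, and all function names are hypothetical.

```python
import re

# Hypothetical stand-in for the "60 most common words" list; the real list is not given.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "this"}

TAG_RE = re.compile(r"<[^>]+>")   # matches simple HTML tags such as <b> or </p>
WORD_RE = re.compile(r"[a-z]+")   # lowercase alphabetic tokens


def strip_html(text):
    """Filter 1: replace HTML tags with spaces."""
    return TAG_RE.sub(" ", text)


def naive_stem(word):
    """Filter 3: crude suffix stripping (an assumption, not a full stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant: "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word


def preprocess(message):
    """Apply the three filters in order and return the remaining stems."""
    text = strip_html(message).lower()
    tokens = WORD_RE.findall(text)
    # Filter 2: drop the common words before stemming.
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess("<p>The dog is <b>running</b> and runs</p>")
```

Note that this sketch maps "runs" and "running" to "run" but leaves the irregular form "ran" unchanged; handling such forms would need a dictionary-based approach.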
....approach represents a balance between statistical classification and complete understanding. It may be suggested that the agent should be able to infer conceptual features automatically by choosing features which best predict the training data. Several researchers have considered this approach [3, 4, 8, 14]. For example, Latent Semantic Analysis uses singular value decomposition to find a set of orthogonal factors which best represents the semantic structure of the documents [6]. However, because any database of text examples will contain a small number of examples for each of a large number of ....
Chris Clack, Jonny Farrington, Peter Lidwell, and Tina Yu. Autonomous document classification for business. In Proceedings of the 1997 International Conference on Autonomous Agents, February 1997.