16 citations found.
H. Borko and M. Bernick. Automatic document classification. Journal of the ACM, 10(9):512--521, 1962.


This paper is cited in the following contexts:
Automatic Translation to Controlled Medical Vocabularies - Kornai, Stone   (Correct)

....be critically reevaluated for autocoding. Because of (1), techniques that are not linear or at least near linear in the number of classes are essentially useless even when they are provably superior on small data sets: this includes much of standard multivariate statistics, such as factor analysis [4] and regression methods [3]. One well-understood technique for dealing with this problem is divide and conquer: we can leverage the hierarchical structure of the classes by first classifying to the highest level, and building independent lower-level classifiers for each node. In [32] we built a ....

Harold Borko and Myrna Bernick 1963 Automatic document classification. JACM 10 151--161


Color-Spatial Image Indexing and Applications - Huang (1998)   (6 citations)  (Correct)

....of correlograms. We will see (Section 7.4) that the performance with correlograms is much better than with histograms. (Footnote 2: For the difference among SVD, KLT and PCA, see [Gerb81].) (Footnote 3: There is no good heuristic for choosing k. The rule of thumb is finding the k that gives the best performance [BB62].) 7.3 The Hierarchical Classification Scheme. Let C = {C_1, ..., C_c} be the image classes known a priori. We assume that we have a set S of training images whose class membership is known and T of images that need to be classified. We want to build a classification tree from training images. At ....

H. Borko and M. Bernick. Automatic document classification. Journal of the ACM 9, pp. 512--521, 1962.


Dictionary Requirements for Text Classification: A Comparison of.. - Riloff (1995)   (Correct)

....surrounding them do not usually change their meaning. As a result, many microelectronics concepts can be recognized by looking for individual words. Conclusions. Traditional information retrieval systems [e.g., Turtle and Croft, 1991; Salton, 1971] and text classification systems [Maron, 1961; Borko and Bernick, 1963; Hoyle, 1973] use isolated words or phrases to classify texts, but some classification tasks can benefit from using additional linguistic information. Our experiments indicate that both the domain and the task influence the kinds of knowledge that a system needs. In both the terrorism and joint ....

Borko, H. and Bernick, M. 1963. Automatic Document Classification.


Spotting Ontological Lacunae through Spectrum Analysis of.. - Hoenkamp (1998)   (3 citations)  (Correct)

....are presumably related. End of digress. Researchers have gone further than judging the closeness of documents solely by the words they contain. One direction is to compute the correlation coefficients between documents based on their word coordinates and then perform a factor analysis (Borko & Bernick, 1963). In this fashion, the initially high dimension (the number of words) may be reduced, and the factors thus represent some underlying structure in the collection of documents. A related direction is to find a new basis for the document space by applying principal component analysis. Especially ....

Borko, H., & Bernick, M. D. (1963). Automatic document classification. Journal of the ACM, 10, 151--162.


An Automatic Hierarchical Image Classification Scheme - Huang, Kumar, Zabih (1998)   (7 citations)  (Correct)

....normalized cuts [19]. Finally, the subclasses and those training images that were correctly classified with respect to the subclasses are worked upon recursively to obtain the hierarchy in the classification tree, with the hope of improving the classification performance. (Footnote 4: There is no good heuristic for choosing k. The rule of thumb is finding the k that gives the best performance [2].) 5.1 Confusion Matrix. We construct the matrix A_S as indicated in Section 4 and compute its SVD: A_S = U Σ V^T. Then we choose the best approximation A_k that gives the best classification of S on itself. ....

H. Borko and M. Bernick. Automatic document classification. Journal of the ACM, 9:512--521, 1962.


Indexing by Latent Semantic Analysis - Deerwester, Dumais, Furnas.. (1990)   (549 citations)  (Correct)

....Aiding information retrieval by discovering latent proximity structure has at least two lines of precedence in the literature. Hierarchical classification analyses are frequently used for term and document clustering [11], [12], [13]. Latent class analysis [14] and factor analysis [15], [16], [17] have also been explored before for automatic document indexing and retrieval. In document clustering, for example, a notion of distance is defined such that two documents are considered close to the extent that they contain the same terms. The matrix of document-to-document distances is then ....

....clustering model (a k-dimensional model for n points has nk parameters). However, previous attempts along these lines, too, had shortcomings. First, factor analysis is computationally expensive, and since most previous attempts were made 15-20 years ago, they were limited by processing constraints [16]. Second, most past attempts considered restricted versions of the factor analytic model, either by using very low dimensionality, or by converting the factor analysis results to a simple binary clustering [16]. Third, some attempts have relied on excessively tedious data gathering ....

[Article contains additional citation context not shown here]

Borko, H and Bernick, M.D. Automatic document classification. Journal of the ACM, April 1963, 10(3), 151-162.


Toward Large-Scale Information Retrieval Using Latent Semantic.. - Letsche (1996)   (1 citation)  (Correct)

....resolved. LSI was developed to solve many of the information retrieval problems Luhn anticipated in the 1950s. The LSI model will be discussed in Section 2.3. 2.2.2 Borko and Bernick on Reduced Space Document Classification. A short time after Luhn's ideas were published, H. Borko and M. Bernick [BB63] presented a method by which documents could automatically be classified into predefined categories. Although document classification has different goals than information retrieval, Borko and Bernick's approach to document classification can be viewed as a special case of information retrieval. ....

H. Borko and M. Bernick. Automatic document classification. Journal of the Association for Computing Machinery, 10:151--162, 1963.


Spotting Ontological Lacunae through Spectrum Analysis of.. - Edward Hoenkamp (1998)   (3 citations)  (Correct)

....subspace are presumably related. End of digress. Researchers have gone further than judging the closeness of documents solely by the words they contain. One direction is to compute the correlation coefficients between documents based on their word coordinates and then perform a factor analysis [6]. In this fashion, the initially high dimension (the number of words) may be reduced, and the factors thus represent some underlying structure in the collection of documents. A related direction is to find a new basis for the document space by applying principal component analysis. Especially ....

H. Borko and M. D. Bernick, `Automatic document classification', Journal of the ACM, 10, 151--162, (April 1963).


Using Latent Semantic Analysis To Improve Access.. - Dumais, Furnas.. (1988)   (33 citations)  (Correct)

....latent proximity structure has several lines of precedence in the information science literature. Hierarchical classification analyses have sometimes been used for term and document clustering [13], [15]. Factor analysis has also been explored previously for automatic indexing and retrieval [2]. Our latent structure method differs from these approaches in several important ways: (1) we use a high-dimensional representation which allows us to better represent semantic relations; (2) both terms and text objects are explicitly represented in the same space; and (3) objects can be retrieved ....

Borko, H. and Bernick, M.D. Automatic document classification. Journal of the ACM, April 1963, 10(3), 151-162.


Large-Scale Information Retrieval with Latent Semantic Indexing - Letsche, Berry (1997)   (11 citations)  (Correct)

....retrieval in several ways. Most notably, LSI represents documents in a high-dimensional space. Koll [16], for instance, used only seven dimensions to represent his semantic space. Secondly, both terms and documents are explicitly represented in the same space. Thirdly, unlike Borko and Bernick [6], no attempt is made to interpret the meaning of each dimension. Each dimension is merely assumed to represent one or more semantic relationships in the term-document space. Finally, because of limits imposed mostly by the computational demands of vector space approaches to information retrieval, ....

H. Borko and M. Bernick. Automatic document classification. Journal of the Association for Computing Machinery, 10:151--162, 1963.


Information Extraction as a Basis for High-Precision Text.. - Riloff, Lehnert (1994)   (48 citations)  (Correct)

....are highly experienced with the domain and the task. To achieve similar success in a new domain, the entire knowledge engineering process must be repeated. Both traditional IR techniques and knowledge-based techniques have been applied to the problem of text classification (e.g., see [Maron, 1961; Borko and Bernick, 1963; Hoyle, 1973] for traditional IR approaches and [Goodman, 1991; Hayes and Weinstein, 1991; Rau and Jacobs, 1991] for knowledge-based approaches). Text classification is an information retrieval task in which one or more category labels is assigned to a document. This task assumes a predefined, ....

....that may not have strong key phrases and for which context is crucial. Our work on text classification differs from previous work in several respects. Within the IR community, text classification has typically been approached using word-based techniques and statistical methods [e.g., Maron, 1961; Borko and Bernick, 1963; Hoyle, 1973]. As we explained in Section 1, word-based techniques have many limitations. Knowledge-based text classification systems have been developed that address some of these problems using rule bases or other knowledge sources (e.g., Goodman, 1991; Hayes and Weinstein, 1991; Rau and ....

Borko, H. and Bernick, M. 1963. Automatic Document Classification. J. ACM 10(2):151--162.


A Comparison of Two Learning Algorithms for Text Categorization - Lewis, Ringuette (1994)   (94 citations)  (Correct)

....or removing conclusions. The second strategy is to use existing bodies of manually categorized text in constructing categorizers by inductive learning. A wide variety of learning approaches have been used, including Bayesian classification [Mar61], decision trees [CFAT91], factor analysis [BB63], fuzzy sets [COL83], linear regression [BFL 88], and memory-based approaches [CMSW91]. Learning-based systems have been found to be cheaper and faster to build, as well as more accurate in some applications [CMSW91]. Text categorization applications nevertheless provide many challenges for ....

Borko, Harold and Bernick, Myrna. Automatic document classification. Journal of the Association for Computing Machinery, pages 151--161, 1963.


Loss Functions and Structured Domains for Support Vector Machines - Portera (2005)   (Correct)

No context found.

H. Borko and M. Bernick. Automatic document classification. Journal of the ACM, 10(9):512--521, 1962.


Combining Machine Learning and Hierarchical Structures for Text.. - Ruiz (2001)   (1 citation)  (Correct)

No context found.

H. Borko and M. Bernick. Automatic document classification. Journal of the Association for Computing Machinery, 10(2):151--161, 1963.


Signatures of 3D Models for Retrieval - Leifman, Katz, Tal, Meir   (Correct)

No context found.

H. Borko and M. Bernick. "Automatic document classification", J. of the ACM 9, 512-521, 1962.


Enhancing Performance in Latent Semantic Indexing (LSI) Retrieval - Dumais (1992)   (7 citations)  (Correct)

No context found.

Borko, H and Bernick, M.D. Automatic document classification. Journal of the ACM, April 1963, 10(3), 151-162.


CiteSeer.IST - Copyright Penn State and NEC