Enhanced Hypertext Categorization Using Hyperlinks
, 1998
Cited by 326 (8 self)
A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves the quality of search and profile-based routing and filtering. Therefore, an accurate classifier is an essential component of a hypertext database. Hyperlinks pose new problems not addressed in the extensive text classification literature. Links clearly contain high-quality semantic clues that are lost upon a purely term-based classifier, but exploiting link information is non-trivial because it is noisy. Naive use of terms in the link neighborhood of a document can even degrade accuracy. Our contribution is to propose robust statistical models and a relaxation labeling technique for better classification by exploiting link information in a small neighborhood around documents. Our technique also adapts gracefully to the fraction of neighboring documents having known topics. We experimented ...
Markov Random Field Models in Computer Vision
, 1994
Cited by 305 (18 self)
A variety of computer vision problems can be optimally posed as Bayesian labeling in which the solution of a problem is defined as the maximum a posteriori (MAP) probability estimate of the true labeling. The posterior probability is usually derived from a prior model and a likelihood model. The latter relates to how data is observed and is problem domain dependent. The former depends on how various prior constraints are expressed. Markov random field (MRF) theory is a tool to encode contextual constraints into the prior probability. This paper presents a unified approach for MRF modeling in low and high level computer vision. The unification is made possible due to a recent advance in MRF modeling for high level object recognition. Such unification provides a systematic approach for vision modeling based on sound mathematical principles. 1 Introduction Since its beginning in the early 1960s, computer vision research has been evolving from heuristic design of algorithms to syste...
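As a rough illustration of the Bayesian labeling described in this abstract, the MAP estimate can be written out as below. The symbols f (a labeling), d (the observed data), U (an energy function) and Z (a normalizing constant) are notation assumed here for the sketch; they do not appear in the abstract itself, and the Gibbs form of the prior is the standard MRF convention rather than something the abstract states.

```latex
% MAP labeling: maximize the posterior, which factors into likelihood and prior
f^{*} = \arg\max_{f} P(f \mid d),
\qquad
P(f \mid d) \propto p(d \mid f)\, P(f)

% With a Gibbs-form MRF prior P(f) = Z^{-1} e^{-U(f)} and U(d \mid f) = -\log p(d \mid f),
% maximizing the posterior is the same as minimizing the total energy
f^{*} = \arg\min_{f} \bigl[\, U(d \mid f) + U(f) \,\bigr]
```

Under this reading, the likelihood term carries the problem-specific observation model and the prior term carries the contextual constraints.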
Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
Cited by 171 (22 self)
Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.
Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields
- IN IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE
, 1999
Cited by 131 (1 self)
In a traditional classification problem, we wish to assign one of k labels (or classes) to each of n objects, in a way that is consistent with some observed data that we have about the problem. An active line of research in this area is concerned with classification when one has information about pairwise relationships among the objects to be classified; this issue is one of the principal motivations for the framework of Markov random fields, and it arises in areas such as image processing, biometry, and document analysis. In its most basic form, this style of analysis seeks a classification that optimizes a combinatorial function consisting of assignment costs --- based on the individual choice of label we make for each object --- and separation costs --- based on the pair of choices we make for two "related" objects. We formulate a general classification problem of this type, the metric labeling problem; we show that it contains as special cases a number of standard classification f...
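The combinatorial function described in this abstract, assignment costs plus separation costs, can be sketched on a toy instance. This is a minimal illustration and not the paper's approximation algorithm: the function names, the edge weights, the uniform distance between labels, and the exhaustive search are all assumptions made for the example.

```python
from itertools import product


def labeling_cost(labeling, assign_cost, edges, distance):
    """Total cost of a labeling: assignment costs plus separation costs."""
    # Assignment cost: paid once per object for the label it receives.
    assignment = sum(assign_cost[obj][lab] for obj, lab in labeling.items())
    # Separation cost: paid per related pair, scaled by a (hypothetical) edge weight.
    separation = sum(w * distance(labeling[p], labeling[q]) for p, q, w in edges)
    return assignment + separation


def brute_force_labeling(objects, labels, assign_cost, edges, distance):
    """Try every labeling; only feasible for tiny instances (k**n candidates)."""
    best = None
    for choice in product(labels, repeat=len(objects)):
        labeling = dict(zip(objects, choice))
        cost = labeling_cost(labeling, assign_cost, edges, distance)
        if best is None or cost < best[0]:
            best = (cost, labeling)
    return best


def uniform_metric(a, b):
    """Simplest label distance: 0 if the labels agree, 1 otherwise."""
    return 0 if a == b else 1


# Toy instance (all numbers are made up for illustration).
objects = ["a", "b", "c"]
labels = [0, 1]
assign_cost = {
    "a": {0: 0, 1: 5},
    "b": {0: 2, 1: 1},
    "c": {0: 5, 1: 0},
}
edges = [("a", "b", 3), ("b", "c", 1)]  # (object, object, weight)

cost, labeling = brute_force_labeling(objects, labels, assign_cost, edges, uniform_metric)
print(cost, labeling)
```

In this instance object b would prefer label 1 when considered alone, but the heavy edge to a pulls it to label 0, which is the kind of trade-off between individual and pairwise evidence the abstract describes.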
Region-based representations of image and video: Segmentation tools for multimedia services
, 1999
Cited by 57 (3 self)
This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition-based or homogeneity-based) and the decision spaces (spatial, spatio-temporal and temporal). The second part of the paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in...
A Bayesian Paradigm for Dynamic Graph Layout
, 1997
Cited by 40 (12 self)
Dynamic graph layout refers to the layout of graphs that change over time. These changes are due to user interaction, algorithms, or other underlying processes determining the graph. Typically, users spend a noteworthy amount of time to get familiar with a layout, i.e.
Link Mining: A Survey
- SigKDD Explorations Special Issue on Link Mining
, 2005
Cited by 31 (0 self)
Many datasets of interest today are best described as a linked collection of interrelated objects. These may represent homogeneous networks, in which there is a single-object type and link type, or richer, heterogeneous networks, in which there may be multiple object and link types (and possibly other semantic information). Examples of homogeneous networks include single mode social networks, such as people connected by friendship links, or the WWW, a collection of linked web pages. Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments and contacts, or in bibliographic domains describing publications, authors, and venues. Link mining refers to data mining techniques that explicitly consider these links when building predictive or descriptive models of the linked data. Commonly addressed link mining tasks include object ranking, group detection, collective classification, link prediction and subgraph discovery. While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities. This is an exciting, rapidly expanding area. In this article, we review some of the common emerging themes.
Exploring Texture Ensembles by Efficient Markov Chain Monte Carlo -- Towards a "Trichromacy" Theory of Texture
, 1999
Cited by 29 (12 self)
This article presents a mathematical definition of texture: the Julesz ensemble Ω(h), which is the set of all images (defined on Z²) that share identical statistics h. Then texture modeling is posed as an inverse problem: given a set of images sampled from an unknown Julesz ensemble Ω(h), we search for the statistics h which define the ensemble. A Julesz ensemble Ω(h) has an associated probability distribution q(I; h), which is uniform over the images in the ensemble and has zero probability outside. In a companion paper [32], q(I; h) is shown to be the limit distribution of the FRAME (Filter, Random Field, And Minimax Entropy) model [35] as the image lattice Λ → Z². This conclusion establishes the intrinsic link between the scientific definition of texture on Z² and the mathematical models of texture on finite lattices. It brings two advantages to computer vision. 1) The engineering practice of synthesizing texture images by matching statistics has been put on a mathematical fou...
Link Mining: A New Data Mining Challenge
- SIGKDD Explorations
, 2003
Cited by 25 (0 self)
A key challenge for data mining is tackling the problem of mining richly structured datasets, where the objects are linked in some way. Links among the objects may demonstrate certain patterns, which can be helpful for many data mining tasks and are usually hard to capture with traditional statistical models. Recently there has been a surge of interest in this area, fueled largely by interest in web and hypertext mining, but also by interest in mining social networks, security and law enforcement data, bibliographic citations and epidemiological records.

