TileBars: Visualization of Term Distribution Information in Full Text Information Access
, 1995
"... The field of information retrieval has traditionally focused on textbases consisting of titles and abstracts. As a consequence, many underlying assumptions must be altered for retrieval from full-length text collections. This paper argues for making use of text structure when retrieving from full te ..."
Cited by 341 (10 self)
text documents, and presents a visualization paradigm, called TileBars, that demonstrates the usefulness of explicit term distribution information in Boolean-type queries. TileBars simultaneously and compactly indicate relative document length, query term frequency, and query term distribution
Information-Theoretic Co-clustering
 In KDD
, 2003
"... Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingency table analysis is co-clustering: simultaneous clustering of the rows and columns. A novel theoretical formulation views ..."
Cited by 346 (12 self)
Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingency table analysis is co-clustering: simultaneous clustering of the rows and columns. A novel theoretical formulation
Algebraic Algorithms for Sampling from Conditional Distributions
 Annals of Statistics
, 1995
"... We construct Markov chain algorithms for sampling from discrete exponential families conditional on a sufficient statistic. Examples include generating tables with fixed row and column sums and higher dimensional analogs. The algorithms involve finding bases for associated polynomial ideals and so a ..."
Cited by 268 (20 self)
We construct Markov chain algorithms for sampling from discrete exponential families conditional on a sufficient statistic. Examples include generating tables with fixed row and column sums and higher dimensional analogs. The algorithms involve finding bases for associated polynomial ideals and so
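The simplest case named in this abstract, tables with fixed row and column sums, can be illustrated with a short random walk. The sketch below is not the paper's algebraic construction: it assumes a two-way table of non-negative integers, uses only the elementary +1/-1 move on a randomly chosen 2 × 2 minor, and targets the uniform distribution over tables with the given margins. The function names (`margins`, `swap_step`, `sample_table`) are hypothetical.

```python
import random


def margins(table):
    """Return (row sums, column sums) of a two-way table."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return rows, cols


def swap_step(table, rng):
    """Apply one elementary move in place; both margins are unchanged.

    Pick two distinct rows (i, j) and two distinct columns (k, l), then
    add +1 at (i, k) and (j, l) and -1 at (i, l) and (j, k). A move that
    would make an entry negative is rejected and the chain stays put.
    """
    i, j = rng.sample(range(len(table)), 2)
    k, l = rng.sample(range(len(table[0])), 2)
    if table[i][l] > 0 and table[j][k] > 0:
        table[i][k] += 1
        table[j][l] += 1
        table[i][l] -= 1
        table[j][k] -= 1


def sample_table(start, steps, seed=0):
    """Run the walk for `steps` moves from a copy of `start`."""
    rng = random.Random(seed)
    table = [row[:] for row in start]
    for _ in range(steps):
        swap_step(table, rng)
    return table


if __name__ == "__main__":
    start = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]
    result = sample_table(start, steps=1000, seed=1)
    print(result, margins(result) == margins(start))
```

Because the row pair and column pair are drawn in random order, the move and its reverse are proposed with equal probability, which is why rejecting invalid moves is enough for a uniform target. Sampling from a non-uniform conditional distribution would need an additional acceptance step that this sketch omits.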
Infinite Latent Feature Models and the Indian Buffet Process
, 2005
"... We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution ..."
Cited by 273 (45 self)
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution
Palmer LTER: Hydrogen peroxide in the Palmer LTER region: II. Water column distributions
 Antarctic Journal of the U.S
, 1993
"... tions range between 10 and 400 nanomolar (nM) decreasing with depth to undetectable levels (less than 1 nM) below the mixed layer. The two major suspected source terms for H2O2 are photochemical interactions with dissolved organic matter (DOM) and atmosphere-to-ocean transport (see Karl et al., Anta ..."
Cited by 5 (1 self)
tions range between 10 and 400 nanomolar (nM) decreasing with depth to undetectable levels (less than 1 nM) below the mixed layer. The two major suspected source terms for H2O2 are photochemical interactions with dissolved organic matter (DOM) and atmosphere-to-ocean transport (see Karl et al., Antarctic Journal, in this issue). In general, the global "pristine" ocean data demonstrate a strong latitudinal dependence with maximum H2O2 concentrations of 100-200 nM in low latitudes (15°S to 15°N) decreasing to approximately 30 nM at 62°S (Weiler and Schrems 1993). To our knowledge, these are the only data for oceanic samples collected south of 60°S. The data of Weiler and Schrems (1993), however, are limited to only a few samples in the region of the Bransfield Strait. Aside from short-term diel variations, temporal H2O2 variations on seasonal time scales have been reported only for the Caribbean Sea
Variability of Absorption and Optical Properties of Key Aerosol Types Observed in Worldwide Locations
 JOURNAL OF THE ATMOSPHERIC SCIENCES
, 2002
"... Aerosol radiative forcing is a critical, though variable and uncertain, component of the global climate. Yet climate models rely on sparse information of the aerosol optical properties. In situ measurements, though important in many respects, seldom provide measurements of the undisturbed aerosol in ..."
Cited by 263 (30 self)
in the entire atmospheric column. Here, 8 yr of worldwide distributed data from the AERONET network of ground-based radiometers were used to remotely sense the aerosol absorption and other optical properties in several key locations. Established procedures for maintaining and calibrating the global network
No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices
 ANNALS OF PROBABILITY 26
, 1998
"... We consider a class of matrices of the form C_n = (1/N)(R_n + σX_n)(R_n + σX_n)^*, where X_n is an n × N matrix consisting of independent standardized complex entries, R_n is an n × N nonrandom matrix, and σ > 0. Among several applications, C_n can be viewed as a sample correlation matrix, where information is ..."
Cited by 186 (35 self)
is contained in (1/N)R_n R_n^*, but each column of R_n is contaminated by noise. As n → ∞, if n/N → c > 0, and the empirical distribution of the eigenvalues of (1/N)R_n R_n^* converges to a proper probability distribution, then the empirical distribution of the eigenvalues of C_n converges a.s. to a nonrandom limit
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n³), where n is the number of training examples. We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an n × n Gram matrix G such that compu ..."
Cited by 188 (11 self)
and the corresponding c rows of G. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we will use a judiciously-chosen and data-dependent nonuniform probability distribution. Let ‖·‖_2 and ‖·‖_F denote the spectral norm and the Frobenius norm, respectively, of a matrix
The Distributed Computing Column
"... The Distributed Computing Column covers the theory of systems that are composed of a number of interacting computing elements. These include problems of communication and networking, databases, distributed shared memory, multiprocessor architectures, operating systems, verification, internet, and th ..."
The Distributed Computing Column covers the theory of systems that are composed of a number of interacting computing elements. These include problems of communication and networking, databases, distributed shared memory, multiprocessor architectures, operating systems, verification, internet
The Distributed Computing Column by
"... Advances in Distributed Computing have been simply astonishing during the past few decades. The Distributed Computing Column of BEATCS aims to expose the community to the most challenging and inspirational results of the field. The column has been edited for the last eight years by Marios Mavronic ..."
Advances in Distributed Computing have been simply astonishing during the past few decades. The Distributed Computing Column of BEATCS aims to expose the community to the most challenging and inspirational results of the field. The column has been edited for the last eight years by Marios
