This directory is created automatically, and some papers may be mislabeled. Only documents within the CiteSeer database are listed. The directory is intended to provide entry points for browsing the database and is not intended to be authoritative. Papers may not appear in all relevant categories. For example, papers in a sub-category may not appear in higher-level categories.
1245.7 Authoritative Sources in a Hyperlinked Environment - Kleinberg (1998)
The link structure of a hypermedia environment can be a rich source of information about the
content of the environment, provided we have effective means for understanding it. Versions of
this princip... / Much classical work in information retrieval has focused on this type br versions of classical information retrieval notions such as
1131.9 Querying Semi-Structured Data - Abiteboul (1997)
The amount of data of all kinds available electronically has increased dramatically in recent years. The data resides in different forms, ranging from unstructured data in file systems to highly struc... / e.g.unstructured text Information retrieval tools may provide a br a set of HTML documents information retrieval packages to obtain
979.7 KQML as an agent communication language - Finin, Labrou, Mayfield (1995)
One of the defining characteristics of an information agent is its ability for flexible interaction
and interoperation with other, similar software agents. This focus on interoperability has been
the ... / intelligent entities to information retrieval systems anything might
954.2 Text Categorization with Support Vector Machines: Learning with Many.. - Joachims (1998)
This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are ... / and the classification task. Information Retrieval research suggests that br popular learning method from information retrieval a distance weighted
834.0 Querying the World Wide Web - Mendelzon, Mihaila, Milo (1997)
The World Wide Web is a large, heterogeneous,
distributed collection of documents connected by
hypertext links. The most common technology currently
used for searching the Web depends on sending... / the Web depends on sending information retrieval requests to index br and searching by sending information retrieval requests to index
680.8 WebWatcher: A Learning Apprentice for the World Wide Web - Armstrong, Freitag, Joachims.. (1997)
We describe an information seeking assistant
for the world wide web. This
agent, called WebWatcher, interactively
helps users locate desired information by
employing learned knowledge about which
hype... / learning agents for information retrieval. We focus in particular br This idea is common within information retrieval retrieval systems Salton
672.3 Multidimensional Access Methods - Gaede, Günther (1997)
ing with credit is permitted. To copy otherwise, to republish, to post on
servers, to redistribute to lists, or to use any component of this work in other works, requires prior
specific permission and... / Information Storage and Retrieval Information Storage-File
605.7 Database Techniques for the World-Wide Web: A Survey - Florescu, Levy, Mendelzon (1998)
The primary goal of this survey is to classify the different tasks to which database concepts have been applied, and to emphasize the technical innovations that are required to do so. We focus on thre... / other technologies such as Information Retrieval Artificial Intelligence br that go beyond the basic information retrieval paradigm supported by
523.4 Semistructured Data - Buneman (1997)
In semistructured data, the information that is normally associated with a schema is contained within the data, which is sometimes called "self-describing". In some forms of semistructured data there ... / Most web queries exploit information retrieval techniques to retrieve
484.0 Letizia: An Agent That Assists Web Browsing - Lieberman (1995)
Letizia is a user interface agent that assists a
user browsing the World Wide Web. As the
user operates a conventional Web browser such
as Netscape, the agent tracks user behavior and
attempts to anti... / perspectives of information retrieval and information br Sheth and Maes Information retrieval suggests the image of a
472.7 Learning Information Extraction Rules for Semi-structured and Free.. - Soderland (1999)
A wealth of on-line text information can be made available to automatic processing
by information extraction (IE) systems. Each IE application needs a separate set of rules tuned to
the domain and w... / front end for high precision information retrieval or text routing as a br extraction are borrowed from information retrieval recall and
417.0 Agent Tcl: A flexible and secure mobile-agent system - Gray (1997)
A mobile agent is an autonomous program that can migrate under its own control from machine to machine in a heterogeneous network. In other words, the program can suspend its execution at an arbitrary... / . . . Distributed information retrieval . br primarily in distributed information-retrieval applications including
382.6 Broadcast Disks: Data Management for Asymmetric Communication.. - Acharya (1995)
This paper proposes the use of repetitive broadcast as a way of augmenting the memory hierarchy of clients in an asymmetric communication environment. We describe a new technique called "Broadcast Dis... / For example an information retrieval system in which the number br servers. • Information retrieval systems with large client
378.7 A Comparative Study on Feature Selection in Text Categorization - Yang, Pedersen (1997)
This paper is a comparative study of feature
selection methods in statistical learning of
text categorization. The focus is on aggressive
dimensionality reduction. Five methods
were evaluated, includi... / received assumption in information retrieval. That is low-DF terms br and a typical information retrieval approach named Rocchio
347.8 Using Linear Algebra for Intelligent Information Retrieval - Berry, Dumais, O'Brien (1995)
Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. B... / Algebra for Intelligent Information Retrieval M.W. Berry S.T. Dumais br Algebra For Intelligent Information Retrieval Michael W. Berry
336.2 Searching Distributed Collections With Inference Networks - Callan, Lu, Croft (1995)
The use of information retrieval systems in networked environments
raises a new set of issues that have received little
attention. These issues include ranking document collections
for relevance to a ... / Abstract The use of information retrieval systems in networked br Research and Development in Information Retrieval. Copyright ©
331.9 An Evaluation of Statistical Approaches to Text Categorization - Yang (1997)
This paper is a comparative study of text categorization methods. Fourteen methods are investigated, based on
previously published results and newly obtained results from additional experiments. Corpu... / and Development in Information Retrieval . - . R.H. br Research and Development in Information Retrieval SIGIR' pages
327.6 A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text .. - Joachims (1997)
The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorithm is presented in a t... / learning methods from information retrieval. Here a probabilistic br developed in information retrieval. Originally designed for
318.1 Blobworld: A System for Region-Based Image Indexing and Retrieval.. - Chad Carson (1999)
Blobworld is a system for image retrieval based on finding
coherent image regions which roughly correspond to objects. Each image
is automatically segmented into regions ("blobs") with associated co... / view the performance of an information retrieval system can be measured by br Query analysis in a visual information retrieval context. J. Doc. and Text
318.1 Blobworld: A system for region-based image indexing and retrieval - Carson, Thomas, Belongie.. (1999)
Blobworld is a system for image retrieval based on finding
coherent image regions which roughly correspond to objects. The image
is segmented into regions by fitting a mixture of Gaussians to the pi... / a user's point of view an information retrieval system can be measured by br Query analysis in a visual information retrieval context. J. Doc. and Text
303.7 Visual Information Seeking: Tight Coupling of Dynamic Query Filters.. - Ahlberg, Shneiderman (1994)
This paper offers new principles for visual information seeking (VIS). A key concept is to support browsing, which is distinguished from familiar query composition and information retrieval because of... / query composition and information retrieval because of its emphasis on br of query composition and information retrieval because of its emphasis
295.6 Affective Computing - Picard (1995)
Recent neurological studies indicate that the role of
emotion in human cognition is essential; emotions are
not a luxury. Instead, emotions play a critical role
in rational decision-making, in percept... / interaction perceptual information retrieval creative arts and br learning perceptual information retrieval creative arts and
278.2 W3QS: A Query System for the World-Wide Web - Konopnicki (1995)
The World-Wide Web (WWW) is an ever
growing, distributed, non-administered, global
information resource. It resides on the worldwide
computer network and allows access
to heterogeneous information: te... / as a Wide-area hypermedia information retrieval initiative aiming to give br does not provide a powerful information retrieval facility. A graph
274.2 Integration of Heterogeneous Databases Without Common Domains Using.. - Cohen (1998)
Most databases contain "name constants" like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of heterogeneous databases has ... / adopted in statistical information retrieval. We describe an efficient br been developed in the information retrieval IR community. As in
268.5 Web Document Clustering: A Feasibility Demonstration - Zamir, Etzioni (1998)
Users of Web search engines are often forced
to sift through the long ordered list of document "snippets"
returned by the engines. The IR community has explored
document clustering as an alternative ... / . Effectiveness for Information Retrieval As we did not use a br their effectiveness for information retrieval. Specifically we used
263.6 Computationally Private Information Retrieval with Polylogarithmic.. - Cachin, Micali, Stadler (1999)
We present a single-database computationally private information retrieval scheme with polylogarithmic communication complexity. Our construction is based on a new, but reasonable intractability assum... / Computationally Private Information Retrieval with Polylogarithmic br computationally private information retrieval scheme with
251.4 A Language Modeling Approach to Information Retrieval - Ponte, Croft (1998)
Models of document indexing and document
retrieval have been extensively studied. The integration
of these two classes of models has been the
goal of several researchers but it is a very difficult pro... / Modeling Approach to Information Retrieval Jay M. Ponte and W. br the word model' is used in information retrieval in two senses. The first
245.4 Automatic Discovery of Language Models for Text Databases - Callan, Connell, Du (1999)
The proliferation of text databases within large organizations
and on the Internet makes it difficult for a person to
know which databases to search. Given language models
that describe the contents ... / Du Center for Intelligent Information Retrieval Computer Science br to ordinary full text Information Retrieval systems Their
239.9 Relevance Feedback: A Power Tool for Interactive Content-Based Image.. - Rui (1998)
Content-Based Image Retrieval (CBIR) has become
one of the most active research areas in the past few
years. Many visual feature representations have been explored
and many systems built. While these ... / in traditional text-based Information Retrieval systems. It is the process br and O's. Following the Information Retrieval theories
238.2 Querying Documents in Object Databases - Abiteboul, Cluet, Christophides.. (1997)
We consider the problem of storing and accessing documents (SGML and HTML, in particular) using database technology. To specify the database image of documents, we use structuring schemas that consi... / or OQL navigational and information retrieval styles of accessing data. br search is well-managed by information retrieval systems and involves
234.7 Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies - Gravano, Garcia-Molina (1995)
As large numbers of text databases have become
available on the Internet, it is harder to
locate the right sources for given queries. In
this paper we present gGlOSS, a generalized
Glossary-Of-Server... / inference networks from information retrieval to the text-database br Introduction to modern information retrieval. McGraw-Hill .
234.5 Scalable Internet Resource Discovery: Research Problems and Approaches - Bowman, Danzig (1994)
Over the past several years, a number of information discovery and access tools have been
introduced in the Internet, including Archie, Gopher, Netfind, and WAIS. These tools have
become quite popular... / variety of wide-area filing information retrieval publishing and library br Netfind using the Z . information retrieval protocol to seamlessly
234.0 The MetaCrawler Architecture for Resource Aggregation on the Web - Selberg, Etzioni (1997)
this article, we briefly outline the motivation for MetaCrawler and highlight previous work, and then discuss the architecture of MetaCrawler and how it enables MetaCrawler to perform well and to scal... / and all use reasonable Information Retrieval IR techniques to map a br Artificial Intelligence and Information Retrieval especially as it pertains
228.5 Greedy strikes back: Improved Facility Location Algorithms - Guha, Khuller (1998)
A fundamental facility location problem is to choose the location of facilities, such as industrial plants and warehouses, to minimize the cost of satisfying the demand for some commodity. There are a... / machine scheduling and information retrieval communication networks
227.1 Experience With a Learning Personal Assistant - Mitchell, Caruana, Freitag.. (1994)
Personal software assistants that help users with tasks like finding information, scheduling calendars, or managing work-flow will require significant customization to each individual user. For exampl... / use of machine learning for information retrieval filtering and
225.5 Two Algorithms for Nearest-Neighbor Search in High Dimensions - Kleinberg (1997)
Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as ... / to a number of areas such as information retrieval pattern recognition and br example in algorithms for information retrieval pattern
224.7 Efficient Similarity Search In Sequence Databases - Agrawal (1993)
We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation b... / clustering algorithms in information retrieval and library science br Introduction to Modern Information Retrieval McGraw-Hill .
218.1 Transductive Inference for Text Classification using Support Vector.. - Joachims (1999)
This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, ... / technique in free-text information retrieval. The user marks some br the classification task. Information Retrieval research suggests that
217.3 SIFT - A Tool for Wide-Area Information Dissemination - Yan (1995)
The dissemination model is becoming increasingly
important in wide-area information system. In this
model, the user subscribes to an information dissemination
service by submitting profiles that descr... / filtering using well-known information retrieval models. The SIFT filtering br relational rulebased information retrieval IR and artificial
211.5 Itinerant Agents for Mobile Computing - Chess, Grosof, Harrison, Levine.. (1995)
This paper describes an abstract framework for itinerant agents that can be used to implement secure, remote applications in large, public networks such as the Internet or the IBM Global Network. Itin... / Electronic Commerce Information Retrieval Knowledge Representation br of an agent to perform both information retrieval and filtering at a server
208.6 Training Algorithms for Linear Text Classifiers - Lewis, Schapire, Callan, Papka (1996)
Systems for text retrieval, routing, categorization and other IR tasks rely heavily on linear classifiers. We propose that two machine learning algorithms, the Widrow-Hoff and EG algorithms, be used i... / Center for Intelligent Information Retrieval Department of Computer br appear on the TIPSTER Information Retrieval Text Research Collection
204.1 Accurate Methods for the Statistics of Surprise and Coincidence - Dunning (1993)
Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have no... / doing good work in information retrieval and natural language br by virtually all of the information retrieval literature. Even recent
203.5 Client Requirements For Real-Time Communication Services - Ferrari (1990)
A real-time communication service provides its clients with the ability to specify their performance requirements and to obtain guarantees about the satisfaction of those requirements. In this paper, ... / e.g.data base queries information retrieval requests remote
195.7 Selection of Relevant Features and Examples in Machine Learning - Blum, Langley (1997)
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant f... / of filtering systems for information retrieval electronic mail netnews br Koller and Sahami's work on information retrieval attributes but
195.7 Web Mining: Information and Pattern Discovery on the World Wide Web - Cooley, Mobasher, Srivastava (1997)
Application of data mining techniques to the World Wide Web, referred to as Web mining, has been the focus of several recent research projects and papers. However, there is no established vocabulary, ... / more intelligent tools for information retrieval such as intelligent Web br of Web agents use various information retrieval techniques and
195.7 Distance-Based Indexing For High-Dimensional Metric Spaces - Bozkaya (1997)
In many database applications, one of the common queries is to
find approximate matches to a given query item from a
collection of data items. For example, given an image database,
one may want to ret... / time series analysis information retrieval etc. In genetics the
195.0 GLIMPSE: A Tool to Search Through Entire File Systems - Manber (1994)
GLIMPSE, which stands for GLobal IMPlicit SEarch, provides indexing and query schemes for file systems. The
novelty of glimpse is that it uses a very small index --- in most cases 2-4% of the size of ... / many types of documents. An information retrieval system for personal br data structure used in information retrieval IR systems is an
190.9 Digital Libraries and Autonomous Citation Indexing - Lawrence, Giles, Bollacker (1999)
The World Wide Web is revolutionizing the way that researchers access scientific information. Articles are increasingly being made available on the homepages of authors or institutions, at journal Web... / libraries bibliometrics information retrieval scientific literature br designed mainly for information retrieval and allow navigating
185.1 A Sequential Algorithm for Training Text Classifiers - Lewis, Gale (1994)
The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully ... / Research and Development in Information Retrieval Springer-Verlag London br is critical to their use in information retrieval content analysis
179.7 Dissemination-based Data Delivery Using Broadcast Disks - Acharya, Franklin, Zdonik (1995)
Mobile computers and wireless networks are emerging technologies which promise to make ubiquitous computing a reality. One challenge that must be met in order to truly realize this potential is that o... / For example an information retrieval system in which the number br servers. • Information retrieval systems with large client
177.1 Coordination Models and Languages - Papadopoulos, Arbab (1998)
A new class of models, formalisms and mechanisms has recently evolved for describing concurrent and distributed computations based on the concept of "coordination". The purpose of a coordination model... / ARBAB -intelligent information retrieval and multimedia-based
177.1 Boosting and Rocchio Applied to Text Filtering - Schapire, Singer, Singhal (1998)
We discuss two learning algorithms for text filtering: modified
Rocchio and a boosting algorithm called AdaBoost. We show
how both algorithms can be adapted to maximize any general
utility matrix that... / machine learning ML and information retrieval IR Many algorithms for br problem in the field of information retrieval. In its early days
177.1 WebMate: A Personal Agent for Browsing and Searching - Chen, Sycara (1998)
The World-Wide Web is developing very fast. Currently, finding useful information
on the Web is a time consuming process. In this paper, we present WebMate, an
agent that helps users to effectively br... / of the art in Web-based information retrieval in many ways. First it br applications as diverse as information retrieval user interface design
173.9 Learning Information Retrieval Agents: Experiments with Automated Web .. - Balabanovic, Shoham (1995)
The current exponential growth of the Internet precipitates
a need for new tools to help people cope with the
volume of information. To complement recent work on
creating searchable indexes of the Wor... / Learning Information Retrieval Agents Experiments with br uses the vector space information retrieval paradigm where documents
171.4 Towards Mobile Cryptography - Sander, Tschudin (1998)
Mobile code technology has become a driving force for recent advances in distributed
systems. The concept of mobility of executable code raises major security problems.
In this paper we deal with the ... / device and the network. Information retrieval is also the basic scenario
171.0 TileBars: Visualization of Term Distribution Information in Full Text .. - Hearst (1995)
The field of information retrieval has traditionally focused on textbases consisting of titles and abstracts. As a consequence, many underlying assumptions must be altered for retrieval from full-leng... / ABSTRACT The field of information retrieval has traditionally focused br documents. KEYWORDS Information retrieval Full-length text
171.0 Specification Matching of Software Components - Zaremski, Wing (1995)
ing with credit is permitted. To copy otherwise, to republish, to post on
servers, to redistribute to lists, or to use any component of this work in other works, requires prior
specific permission and... / a close match as in other information retrieval contexts Salton and br three categories. Text-based information retrieval Frakes and Nejmeh
170.3 A Comparison of Two Learning Algorithms for Text Categorization - Lewis, Ringuette (1994)
This paper examines the use of inductive learning to categorize natural language documents into predefined content categories. Categorization of text is of increasing importance in information retriev... / of increasing importance in information retrieval and natural language br to documents to support information retrieval or to aid human indexers
170.2 WEBSOM - Self-Organizing Maps of Document Collections - Honkela (1997)
Searching for relevant text documents has traditionally been based on keywords and Boolean expressions of them. Often the search results show high recall and low precision, or vice versa. Considerable... / have been developed for information retrieval based on the br is an explorative full-text information retrieval method and a browsing tool
168.1 Context-sensitive learning methods for text categorization - Cohen, Singer (1996)
Two recently implemented machine learning algorithms, RIPPER
and sleeping experts for phrases, are evaluated on a
number of large text categorization problems. These algorithms
both construct classifi... / Research and Development in Information Retrieval . Cesa-Bianchi br on Document Analysis and Information Retrieval pages - Las
168.1 Designing a Family of Coordination Algorithms - Decker, Lesser (1995)
Many researchers have shown that there is no single best organization or coordination mechanism for all environments. This paper discusses the design and implementation of an extendable family of coor... / management distributed information retrieval pilot's associate local
159.9 MindReader: Querying databases through multiple examples - Ishikawa, Subramanya, Faloutsos (1998)
Users often can not easily express their queries. For example, in a multimedia/image by content setting, the user might want photographs with sunsets; in current systems, like QBIC, the user has to gi... / Keywords Databases Information Retrieval Access Methods br is often found in the information retrieval field as relevance
159.4 Pivoted Document Length Normalization - Singhal, Buckley, Mitra (1996)
Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths. In this ... / Abstract Automatic information retrieval systems have to deal with br techniques are used in information retrieval systems. Following is a
159.4 Near Neighbor Search in Large Metric Spaces - Brin (1995)
Given user data, one often wants to find approximate matches in a large database. A good example of such a task is finding images similar to a given image in a large collection of images. We focus on ... / year long temperature Information Retrieval Finding documents br video compression information retrieval and possibly data mining
154.2 JRes: A Resource Accounting Interface for Java - Czajkowski, von Eicken (1998)
With the spread of the Internet the computing model on
server systems is undergoing several important changes.
Recent research ideas concerning dynamic operating
system extensibility are finding their... / of various tasks from information retrieval to accessing proprietary
154.2 Pitfalls of Agent-Oriented Development - Wooldridge, Jennings (1998)
While the theoretical and experimental foundations of agent-based
systems are becoming increasingly well understood, comparatively
little effort has been devoted to understanding the pragmatics of
(mu... / to Internet-based information retrieval and management systems
150.7 Intelligent Agents: An Emerging Technology for Next Generation.. - Magedanz, Rothermel, Krause (1996)
The telecommunications environment is changing its face towards an open market of information services where the vision is "information any time, at any place, in any form". Within this electronic mar... / interfaces PDA connection information retrieval information filtering br virtually unlimited local information retrieval and filtering local mail
148.9 Semistructured and Structured Data in the Web: Going Back and Forth - Paolo Atzeni (1997)
this paper, we present the approach to the management of Web data as attacked in the Araneus... / is based on browsing and information retrieval techniques. Due to their br in Araneus based on information retrieval techniques. Moreover the
148.9 An Adaptive Web Page Recommendation Service - Balabanovic (1997)
An adaptive recommendation service seeks to adapt
to its users, providing increasingly personalized recommendations
over time. In this paper we introduce
the "Fab" adaptive web page recommendation ser... / component which is an information retrieval IR system to match web br The vector-space model of information retrieval Salton McGill
148.5 Distributional Clustering of Words for Text Classification - Baker, McCallum (1998)
This paper describes the application of Distributional Clustering [20] to document classification. This approach clusters words into groups based on the distribution of class labels associated with ea... / reduction technique for information retrieval that explicitly accounts br on Document Analysis and Information Retrieval pages - .
148.5 Latent Semantic Indexing: A Probabilistic Analysis - Papadimitriou, Raghavan, Tamaki.. (1998)
Latent semantic indexing (LSI) is an information retrieval technique based on the spectral
analysis of the term-document matrix, whose empirical success had heretofore been without
rigorous prediction... / indexing LSI is an information retrieval technique based on the br Introduction The field of information retrieval has traditionally been
147.8 Private Information Retrieval - Chor, Goldreich, Kushilevitz, Sudan (1996)
Publicly accessible databases are an indispensable resource for retrieving up to date information. But they also pose a significant risk to the privacy of the user, since a curious database operator c... / Private Information Retrieval Benny Chor y br transformed into a private information retrieval scheme with an additional
145.4 Statistical Models for Text Segmentation - Beeferman, Berger, Lafferty (1999)
This paper introduces a new statistical approach to automatically partitioning text
into coherent segments. The approach is based on a technique that incrementally builds an
exponential model to ext... / inspired by a problem in information retrieval given a large br an approach based on information retrieval methods such as local
145.4 Embedding Knowledge in Web Documents - Martin, Eklund (1999)
The paper argues for the use of general and intuitive knowledge representation languages (and simpler notational variants, e.g. subsets of natural languages) for indexing the content of Web documents ... / Modeling Precision-oriented Information Retrieval Knowledge-based br Precision-oriented information retrieval is performed by Web robots
142.8 Efficient Crawling Through URL Ordering - Cho, Garcia-Molina, Page (1998)
In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more "important" pages first. Obtaining important pages rapidly can be very useful when a crawler c... / been well studied in the Information Retrieval IR community Salton br Research and Development in Information Retrieval. Melbourne Australia
142.8 Analysis of a Very Large AltaVista Query Log - Silverstein (1998)(Correct)
In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents approxi... / user assumed in the standard information retrieval literature. Specifically br suggests that traditional information retrieval techniques might not work
142.0 Distributed Intelligent Agents - Sycara, Decker, Pannu, Williamson.. (1996)(Correct)
We are investigating techniques for developing distributed and
adaptive collections of agents that coordinate to retrieve, filter and
fuse information relevant to the user, task and situation, as well... / as well as coordinating information retrieval and problem solving br to perform goal-directed information retrieval and information
142.0 Real Time Video and Audio in the World Wide Web - Chen (1995)(Correct)
The architecture of World Wide Web (WWW) browsers and servers supports full file transfer for
document retrieval. TCP is used for data transfers by Web browsers and their associated Hypertext
Transfer ... / in the arena of traditional information retrieval and navigation it
142.0 A Trainable Document Summarizer - Kupiec, Pedersen, Chen (1995)(Correct)
To summarize is to reduce in complexity, and hence in length, while retaining some of the essential qualities of the original. This paper focusses on document extracts, a particular kind of computed d... / Research and Development in Information Retrieval pages - July br and Development in Information Retrieval pages - . ACM
140.7 Passage-Level Evidence in Document Retrieval - Callan (1994)(Correct)
The increasing lengths of documents in full-text collections encourage renewed interest in the ranking
and retrieval of document passages. Past research showed that evidence from passages can improve... / in INQUERY a probabilistic information retrieval system. Experiments were br and Development in Information Retrieval July Dublin
140.0 The INQUERY Retrieval System - Callan, Croft, Harding(Correct)
As larger and more heterogeneous text
databases become available, information
retrieval research will depend on
the development of powerful, efficient
and flexible retrieval engines. In this paper,
we... / databases become available information retrieval research will depend on br interest in sophisticated information retrieval IR techniques has led to
137.1 A Web-based Information System that Reasons with Structured.. - Cohen (1998)(Correct)
The degree to which information sources are pre-processed by Web-based information systems varies greatly. In search engines like Altavista, little pre-processing is done, while in "knowledge integrat... / retrieval methods from information retrieval. WHIRL allows queries br which might be called the information retrieval IR model is
136.1 SenseMaker: An Information-Exploration Interface Supporting the. . . - Baldonado et al. (1997)(Correct)
We describe the design, implementation, and pilot study for
SenseMaker, an interface for information exploration across
heterogeneous sources. We propose supporting the context-driven
evolution of a us... / information seeking information retrieval INTRODUCTION The br relational database and Information Retrieval IR clustering
135.8 Multi-Paragraph Segmentation of Expository Texts - Hearst (1994)(Correct)
We present a method for partitioning expository texts into coherent multi-paragraph units which reflect the subtopic structure of the texts. Using Chafe's Flow Model of discourse, we observe that subt... / the use of such segments in information retrieval applications. br boundaries instead. Information retrieval algorithms can use
135.8 Multi-Paragraph Segmentation of Expository Text - Hearst (1994)(Correct)
This paper describes TextTiling, an algorithm for partitioning
expository texts into coherent multi-paragraph
discourse units which reflect the subtopic structure of
the texts. The algorithm uses doma... / boundaries instead. Information retrieval algorithms can use
123.4 Question Answering from Frequently-Asked Question Files: Experiences.. - Robin Burke (1997)(Correct)
This technical report describes FAQ Finder, a natural language question-answering
system that uses files of frequently-asked questions as its knowledge base. Unlike AI
question-answering systems that ... / question files. Unlike information retrieval approaches that rely on a br need to have a testbed for information retrieval and natural language
119.9 Minimum Cuts in Near-Linear Time - Karger (1998)(Correct)
We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a "semi-duality" between minimum cuts and maximum spanning tree packings combined with our p... / items in a database. In information retrieval minimum cuts have been br Research and Development in Information Retrieval pages - June
119.9 Computing Iceberg Queries Efficiently - Min Fang (1998)(Correct)
Many applications compute aggregate functions (such as COUNT, SUM) over an attribute
(or set of attributes) to find aggregate values above some specified threshold. We call such
queries iceberg querie... / including data warehousing information-retrieval market basket analysis br queries also arise in many information retrieval IR problems. For
119.1 Experiences with Selecting Search Engines using Meta-Search - Dreilinger (1997)(Correct)
Search engines are among the most useful and high profile resources on the Internet. The
problem of finding information on the Internet has been replaced with the problem of knowing
where search engin... / approach to the problem of information retrieval on the Web by focusing on br review related ideas from information retrieval and Web search engines. We
119.1 Part-of-Speech Tagging Using Progol - Cussens (1997)(Correct)
A system for `tagging' words with their part-of-speech (POS)
tags is constructed. The system has two components: a lexicon containing
the set of possible POS tags for a given word, and rules which u... / stage to parsing in information retrieval in text to speech
119.1 Computationally Private Information Retrieval (Extended Abstract) - Chor, Gilboa (1997)(Correct)
Private information retrieval (PIR) schemes enable a user to access k replicated copies of a database (k ≥ 2), and privately retrieve one of the n bits of data stored in the databases. This means that t... / Computationally Private Information Retrieval extended abstract br y Abstract Private information retrieval PIR schemes enable a
118.8 Neuro-Fuzzy Modeling and Control - Jang, Sun (1995)(Correct)
Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networ... / time-series prediction information retrieval database management
118.1 Grouper: A Dynamic Clustering Interface to Web Search Results - Zamir, Etzioni (1999)(Correct)
Users of Web search engines are often forced to sift through the long ordered list of document "snippets"
returned by the engines. The IR community has explored document clustering as an alternative m... / . Related Work The Information Retrieval community has long br Research and Development in Information Retrieval SIGIR' pp
116.0 A Learning Approach to Personalized Information Filtering - Sheth (1994)(Correct)
A personalized information filtering system must specialize to current interests of the user
and adapt as they change over time. It must also explore newer domains for potentially
interesting informat... / . . Information Retrieval and Filtering br and differences with Information Retrieval IR Filtering contexts
114.8 Information Extraction Using Hidden Markov Models - Leek (1997)(Correct)
This thesis shows how to design and tune a hidden Markov model to extract factual information from a corpus of machine-readable English prose. In particular, the thesis presents a HMM that classifies ... / The state of the art in information retrieval technology is of limited br performance metrics of information retrieval precision and recall
114.8 A Decision-Theoretic Approach to Database Selection in Networked IR - Fuhr (1997)(Correct)
In networked IR, a client submits a query to a broker, which is in contact with a large number
of databases. In order to yield a maximum number of relevant documents at minimum cost, the
broker has to... / Introduction Networked information retrieval NIR is a new research br a major problem in networked information retrieval. In contrast to other
114.2 A Study on Retrospective and On-Line Event Detection - Yang, Pierce, Carbonell (1998)(Correct)
This paper investigates the use and extension
of text retrieval and clustering techniques for event
detection. The task is to automatically detect novel
events from a temporally-ordered stream of news... / raising new challenges for information retrieval technology. Although br groups the UMass information retrieval group and the Dragon
114.2 Machine Learning for Information Extraction in Informal Domains - Freitag (1998)(Correct)
Information extraction, the problem of generating structured summaries of human-oriented text documents, has been studied for over a decade now, but the primary emphasis has been on document collectio... / information extraction information retrieval multistrategy learning br decades-old discipline of information retrieval has developed automatic
113.5 Heterogeneous Uncertainty Sampling for Supervised Learning - Lewis, Catlett (1994)(Correct)
Uncertainty sampling methods iteratively request class labels for training instances whose classes are uncertain despite the previous labeled instances. These methods can greatly reduce the number of ... / its class is high. In the information retrieval application described br sets are widely used in information retrieval We used this type of
110.6 Phrasal Translation and Query Expansion Techniques for Cross-Language .. - Ballesteros, Croft (1997)(Correct)
Dictionary methods for cross-language information retrieval
give performance below that for mono-lingual retrieval.
Failure to translate multi-term phrases has been shown to
be one of the factors resp... / for Cross-Language Information Retrieval Lisa Ballesteros and W. br Center for Intelligent Information Retrieval Computer Science
110.1 Demand-based Document Dissemination to Reduce Traffic and Balance.. - Bestavros (1995)(Correct)
Research on replication techniques to reduce traffic and minimize the latency of information retrieval in a distributed system has concentrated on client-based caching, whereby recently/frequently acc... / and minimize the latency of information retrieval in a distributed system
109.0 The Design of a Multicast-based Distributed File System - Bjorn Gronvall (1999)(Correct)
JetFile is a distributed file system designed to support
shared file access in a heterogeneous environment such
as the Internet. It uses multicast communication and optimistic
strategies for synchroniz... / document preparation information retrieval and programming.
109.0 Similarity Measures - Santini, Jain (1999)(Correct)
With complex multimedia data, we see the emergence of database systems in which the
fundamental operation is similarity assessment. Before database issues can be addressed, it is
necessary to give a d... / operation for many Visual Information Retrieval systems. In most systems
108.5 Effective Retrieval with Distributed Collections - Xu, Callan (1998)(Correct)
This paper evaluates the retrieval effectiveness
of distributed information retrieval systems in realistic
environments. We find that when a large number
of collections are available, the retrieval ef... / Center for Intelligent Information Retrieval Computer Science br of distributed information retrieval systems in realistic
108.5 An Efficient Boosting Algorithm for Combining Preferences - Freund, Iyer, Schapire, Singer (1998)(Correct)
The problem of combining preferences arises in several
applications, such as combining the results of different search
engines. This work describes an efficient algorithm for combining
multiple pref... / problems and in information-retrieval problems In br common in the field of information retrieval. Here a set of documents
107.2 A Neural Network Approach to Topic Spotting - Wiener, Pedersen, Weigend (1995)(Correct)
This paper presents an application of
nonlinear neural networks to topic spotting.
Neural networks allow us to model higher-order
interaction between document terms
and to simultaneously predict multip... / to prescreen documents in information retrieval and data extraction br vector space model of information retrieval seeks to transform
106.1 Application Design for Wireless Computing - Watson (1994)(Correct)
As mobile computing becomes more prevalent, systems
and applications must deal with a growing disparity
in resource types and availability at the user interface
device. Network characteristics, displ... / accessible there. . Information retrieval and management are two of br obvious choice for extending information retrieval over the wireless
104.4 Inference Networks for Document Retrieval - Turtle (1991)(Correct)
Information retrieval is concerned with selecting documents from a collection that will be of interest to a user with a stated information need or query. Research aimed at improving the performance of... / Professor W. Bruce Croft Information retrieval is concerned with br within which disparate information retrieval research results can be
102.1 ARACHNID: Adaptive Retrieval Agents Choosing Heuristic Neighborhoods.. - Menczer (1997)(Correct)
ARACHNID is a distributed algorithm for information discovery in large, dynamic, distributed environments such as the World Wide Web. The approach is based on a distributed, adaptive population of int... / to disciplines like information retrieval that have provided useful br life have been applied to information retrieval and filtering Sheth and
101.4 Experiments in Multilingual Information Retrieval using the SPIDER.. - Sheridan, Ballerini (1996)(Correct)
We introduce a new approach to multilingual information retrieval
based on the use of thesaurus-based query expansion
techniques applied over a collection of comparable multilingual
documents. This ap... / Experiments in Multilingual Information Retrieval using the SPIDER system br new approach to multilingual information retrieval based on the use of
101.2 Information Extraction as a Basis for High-Precision Text.. - Riloff, Lehnert (1994)(Correct)
We describe an approach to text classification that represents a compromise between
traditional word-based techniques and in-depth natural language processing.
Our approach uses a natural language pro... / the Center for Intelligent Information Retrieval at the University of br Traditional approaches to information retrieval use keyword searches and
99.9 Results and Challenges in Web Search Evaluation - Hawking, Craswell, Thistlewaite (1999)(Correct)
A frozen 18.5 million page snapshot of part of the Web has been created to enable and encourage meaningful and reproducible evaluation of Web search systems and techniques. This collection is being us... / as being in the area of Information Retrieval and in the area of br and so on. As the Information Retrieval IR research community
99.9 Let's Browse: A Collaborative Web Browsing Agent - Lieberman et al. (1999)(Correct)
Web browsing, like most of today's desktop applications, is
usually a solitary activity. Other forms of media, such as
watching television, are often done by groups of people,
such as families or frie... / justified by results from information retrieval that show that retaining
98.5 Natural Language Processing for Information Retrieval - David Lewis (1996)(Correct)
The paper summarizes the essential properties of document retrieval and reviews both conventional
practice and research findings, the latter suggesting that simple statistical techniques can be effect... / language processing for information retrieval David D. Lewis AT T br text. We will however use information retrieval IR sometimes taken to
98.5 Mobile Agents: Are They a Good Idea? - Chess, Harrison, Kershenbaum (1995)(Correct)
Mobile agents are programs, typically written in a script language, which may be
dispatched from a client computer and transported to a remote server computer for
execution. Several authors have sugge... / performing transactions and information retrieval in networks. Other br of an agent to perform both information retrieval and filtering at a server
97.8 Server Ranking for Distributed Text Retrieval Systems on the Internet - Budi Yuwono (1997)(Correct)
Keyword-based search services have become necessary
tools for finding information resources on the
Internet today. The rapid growth of information
on the Internet renders centralized keyword index
ser... / researchers. Keywords information retrieval internet data-bases. br most widely used text-based information retrieval models are the Boolean
97.1 Reactive Tuple Spaces for Mobile Agent Coordination - Cabri, Leonardi, Zambonelli (1998)(Correct)
Mobile active computational entities introduce peculiar problems in the
coordination of distributed application components. The paper surveys several
coordination models for mobile agent applications ... / Tuple Spaces Java WWW Information Retrieval . Introduction br systems by using a WWW information retrieval application as an example.
97.1 Evaluating Database Selection Techniques: A Testbed and Experiment - James French (1998)(Correct)
We describe a testbed for database selection
techniques and an experiment conducted using this testbed.
The testbed is a decomposition of the TREC/TIPSTER
data that allows analysis of the data along m... / sites in our distributed information retrieval test environment. Specific br on issues in distributed information retrieval systems and much of this
97.1 Implicit Rating and Filtering - Nichols (1998)(Correct)
Social filtering systems that use explicit ratings require a large number of ratings to remain viable. The effort involved for a user to rate a document may outweigh any benefit received, leading to a... / searching. The users of information retrieval IR systems are faced br Research and Development in Information Retrieval SIGIR' Dublin
97.1 Context and Page Analysis for Improved Web Search - Lawrence, Giles (1998)(Correct)
NEC Research Institute has developed a metasearch engine that improves the efficiency of Web searches by downloading and analyzing each document and then displaying results that show the query terms i... / all possible ways precise information retrieval would be simple A search br filtered out by traditional information retrieval systems. Web search
96.9 Subtopic Structuring for Full-Length Document Access - Hearst (1993)(Correct)
We argue that the advent of large volumes of full-length text, as opposed to short texts
like abstracts and newswire, should be accompanied by corresponding new approaches to
information access. Towar... / Research and Development in Information Retrieval pp. - br better results on a typical information retrieval task than does a standard
96.2 Dynamic Queries for Visual Information Seeking - Shneiderman (1994)(Correct)
Dynamic queries are a novel approach to information seeking that may enable users to cope with
information overload. They allow users to see an overview of the database, rapidly (100 msec
updates) exp... / queries database search information retrieval direct manipulation user br of the cognitive load of information retrieval to the perceptual system
95.6 Experiments on Using Semantic Distances Between Words in Image.. - Smeaton, Quigley (1996)(Correct)
Traditional approaches to information retrieval are based
upon representing a user's query as a bag of query terms
and a document as a bag of index terms and computing a
degree of similarity between ... / Traditional approaches to information retrieval are based upon br intended word sense but in information retrieval applications automatic
94.8 Internet Resource Discovery Services - Obraczka, Danzig, Li (1993)(Correct)
The exponentially growing global Internet is a virtual infinite fountain of information, yet it can seem like an information labyrinth when you try to find any specific fact. Recently, a number of too... / Resource Discovery Information Retrieval Internet Directory br WAIS is a full text information retrieval architecture whose clients
93.6 Incremental Clustering and Dynamic Information Retrieval - Charikar, Chekuri, Motwani (1997)(Correct)
Motivated by applications such as document and image
classification in information retrieval, we consider the problem
of clustering dynamic point sets in a metric space. We
propose a model called incr... / Clustering And Dynamic Information Retrieval Moses Charikar br and image classification in information retrieval we consider the problem
93.6 Cat-a-Cone: An Interactive Interface for Specifying Searches and.. - Hearst (1997)(Correct)
This paper introduces a novel user interface that integrates
search and browsing of very large category hierarchies
with their associated text collections. A key component
is the separate but simultan... / categories. We view information retrieval as a complex task that br a large thesaurus for information retrieval. In Proceedings of RIAO
92.7 A Data Transformation System for Biological Data Sources - Buneman, Davidson, Hart, Overton.. (1995)(Correct)
Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases, but in structured files maintained in a number of different formats (e.g. ASN.1 and ... / is accessed through an information retrieval package called Entrez br and is accessed through the information retrieval system Entrez. It is
91.4 Natural-Sounding Speech Synthesis Using Variable-Length Units - Yi (1998)(Correct)
The goal of this work was to develop a speech synthesis system which concatenates
variable-length units to create natural sounding speech. Our initial work in this area
showed that by careful design o... / used in a conversational information retrieval system in two application br a conversational information retrieval system. The domain is
91.4 Combining Textual and Visual Cues for Content-based Image Retrieval.. - Cascia, Sethi, Sclaroff (1998)(Correct)
A system is proposed that combines textual and visual
statistics in a single index vector for content-based search
of a WWW image database. Textual statistics are captured
in vector form using latent ... / using a method known in the information retrieval community as Latent br Algebra For Intelligent Information Retrieval. Tr Ut-Cs- - U.
91.4 Indexing with WordNet synsets can improve text retrieval - Gonzalo (1998)(Correct)
The classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. T... / general feeling within the information retrieval community is that dealing br knowledge-based approach to information retrieval. In Proceedings of the
90.9 Balanced Aspect Ratio Trees: Combining the Advantages of k-d Trees.. - Duncan, Goodrich, Kobourov (1999)(Correct)
Given a set S of n points in R^d, we show, for fixed d,
how to construct in O(n log n) time a data structure we call
the Balanced Aspect Ratio (BAR) tree. A BAR tree is a
binary space partition t... / GIS computer graphics information retrieval pattern recognition br They are also used in information retrieval for finding nearest
89.8 TREC and TIPSTER Experiments With INQUERY - James Callan (1995)(Correct)
INQUERY is a probabilistic information retrieval system based upon a Bayesian
inference network model. This paper describes recent improvements to the system as
a result of participation in the TIPSTER... / INQUERY is a probabilistic information retrieval system based upon a br The effectiveness of an information retrieval IR system depends upon
89.3 BioKleisli: A Digital Library for Biomedical Researchers - Davidson, Overton, Tannen, Wong (1997)(Correct)
Data of interest to biomedical researchers
associated with the Human Genome Project (HGP) is
stored all over the world in a number of different electronic
data formats and accessible through a varie... / of format and limited information retrieval facilities pose a br developed their own query information-retrieval systems around
89.3 Agents for Information Gathering - Knoblock, Ambite (1997)(Correct)
With the vast number of information resources available today, a critical problem is how to locate, retrieve and process information. It is impractical to build a single unified system that combines a... / dynamically construct information retrieval plans and learn about br Hsu. Cooperating agents for information retrieval. In Proceedings of the
89.3 Mistake-Driven Learning in Text Categorization - Dagan (1997)(Correct)
Learning problems in the text processing
domain often map the text to a space
whose dimensions are the measured features
of the text, e.g., its words. Three
characteristic properties of this domain ar... / categorization systems in information retrieval IR are often ad-hoc and br of ACM-SIGIR Conference on Information Retrieval. Blum A. . Learning
88.8 Growth Trends in Wide-Area TCP Connections - Paxson (1994)(Correct)
We analyze the growth of a large research laboratory's wide-area
TCP connections over a period of three years. Our
data consisted of eight month-long traces of all TCP connections
made between the site... / datasets relatively new information-retrieval protocols such as Gopher br limitations. Use of information-retrieval protocols such as
88.2 SunOS 5.0 Multithread Architecture - Powell, Kleiman, Barton, Shah.. (1991)(Correct)
this document are identified by the trademarks of the
companies who market those products.... / taping or storage in an information retrieval system-without prior
86.9 An HTTP-based Infrastructure for Mobile Agents - Lingnau, Drobnik, Dömel (1995)(Correct)
Mobile agents are an emerging technology attracting interest from the fields of distributed systems, information retrieval, electronic commerce and artificial intelligence. We present an infrastruct... / of distributed systems information retrieval electronic commerce and br of this idea is in information retrieval where it is easy to
86.9 Description of the UMass System as Used for MUC-6 - Fisher, Soderland, McCarthy, Feng.. (1995)(Correct)
INTRODUCTION Information extraction research at the University of Massachusetts is based on portable, trainable language processing components. Some components are more effective than others, some hav... / modified developed in the Information Retrieval Laboratory at UMass. The br Research on Intelligent Information Retrieval. BIBLIOGRAPHY
85.7 Learning to Order Things - Cohen, Schapire, Singer (1998)(Correct)
There are many applications in which it is desirable to order rather than classify instances.
Here we consider the problem of learning how to order instances given feedback in the form of
preference j... / it is common practice in information retrieval to rank documents br theory the social sciences information retrieval and mathematical economics
85.1 Discovering Trends in Text Databases - Lent, Agrawal, Srikant (1997)(Correct)
We describe a system we developed for identifying trends in text documents collected over a period of time. Trends can be used, for example, to discover that a company is shifting interests from one d... / has been studied by the information retrieval community. The work on br and structured queries in information retrieval. In th International
84.0 Steps Toward Formalizing Context - Akman, Surav (1996)(Correct)
situations are the constructs which are more amenable to mathematical
manipulation. An abstract situation is defined as a (possibly non-well-founded [11])
set of infons. Given a real situation s, the ... / processing and intelligent information retrieval. Although the word br Context in Intelligent Information Retrieval A formal notion of
84.0 Person identification using multiple cues - Brunelli, Falavigna (1995)(Correct)
This paper presents a person identification system based on acoustic and visual features. The system is organized as a set of nonhomogeneous classifiers whose outputs are integrated after a normalizat... / systems in the area of information retrieval automatic banking
84.0 Bag of words and word positions - Cohen (1995)(Correct)
Text categorization is the task of classifying text into one of several predefined categories. In this paper we will evaluate the effectiveness of several ILP methods for text categorization, and also... / words However many modern information retrieval systems support queries br problems described from the information retrieval literature Lewis and