CiteSeerX
Results 1 - 10 of 108,139

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

by Antony Rowstron, Peter Druschel, 2001
"... This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internetbased overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache a ..."
Cited by 803 (23 self)
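The snippet names the replica scheme without showing it; the placement rule PAST uses is to store a file's replicas on the k live nodes whose Pastry identifiers are numerically closest to the file's identifier. A minimal sketch of that rule, assuming SHA-1 identifiers, illustrative node names, and k = 3:

```python
import hashlib

def node_id(name: str) -> int:
    """160-bit identifier from the SHA-1 of a name, as in Pastry/PAST."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def replica_set(file_name: str, node_ids: dict, k: int = 3):
    """Return the k nodes whose identifiers are numerically closest
    to the file's identifier; PAST stores one replica on each."""
    file_id = node_id(file_name)
    return sorted(node_ids, key=lambda n: abs(node_ids[n] - file_id))[:k]

nodes = {f"node{i}": node_id(f"node{i}") for i in range(10)}
print(replica_set("report.pdf", nodes, k=3))
```

Because identifiers are hashes, files and replicas spread roughly uniformly over nodes without central coordination, which is what makes the scheme self-organizing.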

Bigtable: A distributed storage system for structured data

by Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber - In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI '06), 2006
"... Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications ..."
Cited by 1028 (4 self)
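The paper describes Bigtable's data model as a sparse, distributed, persistent multi-dimensional sorted map indexed by row key, column key, and timestamp. A toy, single-process sketch of that (row, column, timestamp) → value model; the row and column names are illustrative only:

```python
import time
from collections import defaultdict

class ToyTable:
    """In-memory toy of Bigtable's data model: a sparse map from
    (row key, column key, timestamp) to an uninterpreted value."""
    def __init__(self):
        self.cells = defaultdict(list)   # (row, col) -> [(ts, value), ...]

    def put(self, row, col, value, ts=None):
        ts = time.time() if ts is None else ts
        self.cells[(row, col)].append((ts, value))
        self.cells[(row, col)].sort(reverse=True)  # newest version first

    def get(self, row, col):
        """Return the most recent version of a cell, or None if unset."""
        versions = self.cells.get((row, col))
        return versions[0][1] if versions else None

t = ToyTable()
t.put("com.cnn.www", "contents:", "<html>...</html>")
print(t.get("com.cnn.www", "contents:"))
```

The real system adds column families, range-partitioned tablets spread across servers, and persistence, none of which this toy attempts.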

Instance-based learning algorithms

by David W. Aha, Dennis Kibler, Marc K. Albert - Machine Learning, 1991
"... Abstract. Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to ..."
"... This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world ..."
Cited by 1389 (18 self)
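The storage reduction the excerpt mentions is, in the paper's IB2 variant, achieved by keeping a training instance only when the instances saved so far would misclassify it. A minimal sketch of that idea with 1-nearest-neighbor lookup; the toy data and Euclidean metric are assumptions:

```python
import math

def nearest(memory, x):
    """1-NN lookup: return the stored label closest to x."""
    _, label = min(memory, key=lambda m: math.dist(m[0], x))
    return label

def ib2_fit(stream):
    """IB2-style storage reduction (after Aha et al.): store an
    instance only if current memory would misclassify it."""
    memory = []
    for x, y in stream:
        if not memory or nearest(memory, x) != y:
            memory.append((x, y))
    return memory

train = [((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.9, 0.8), "b"),
         ((0.8, 0.9), "b"), ((0.15, 0.25), "a")]
mem = ib2_fit(train)
print(len(mem), "of", len(train), "instances stored")
```

Instances far from the decision boundary are never stored, which is where the memory savings come from; the paper's IB3 variant additionally filters noisy instances.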

Wide-area cooperative storage with CFS

by Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, 2001
"... The Cooperative File System (CFS) is a new peer-to-peer readonly storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Cited by 999 (53 self)
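CFS builds on Chord: files are split into blocks, and each block is stored at the server whose identifier is the Chord successor of the block's content hash. A rough sketch of that placement under an assumed toy identifier space and hypothetical server names:

```python
import hashlib

M = 2 ** 16                        # toy identifier space (real Chord uses 2^160)

def chord_id(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % M

def successor(ring, key):
    """First server identifier that follows `key` clockwise on the ring."""
    ids = sorted(ring)
    return next((i for i in ids if i >= key), ids[0])

def place_blocks(file_bytes: bytes, ring, block_size=4096):
    """CFS-style placement: hash each block and store it at the
    Chord successor of its hash."""
    blocks = [file_bytes[i:i + block_size]
              for i in range(0, len(file_bytes), block_size)]
    return [(chord_id(b), successor(ring, chord_id(b))) for b in blocks]

servers = {chord_id(f"server-{i}".encode()) for i in range(8)}
print(place_blocks(b"x" * 10000, servers))
```

Spreading blocks, rather than whole files, over the ring is what gives CFS its load-balance guarantee for large, popular files.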

Query evaluation techniques for large databases

by Goetz Graefe - ACM Computing Surveys, 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Cited by 767 (11 self)
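The "algorithms for accessing and manipulating large sets" that the survey covers include joins, sorting, and aggregation; as one representative, here is a minimal in-memory hash join. The survey itself treats many more variants, including external-memory ones, which this sketch does not attempt:

```python
from collections import defaultdict

def hash_join(build, probe, build_key, probe_key):
    """Canonical in-memory hash join: build a hash table on the
    (smaller) build input, then probe it with each probe-input row."""
    table = defaultdict(list)
    for row in build:
        table[row[build_key]].append(row)
    for row in probe:
        for match in table.get(row[probe_key], []):
            yield {**match, **row}

users = [{"uid": 1, "name": "ada"}, {"uid": 2, "name": "alan"}]
orders = [{"uid": 1, "item": "tape"}, {"uid": 1, "item": "card"}]
print(list(hash_join(users, orders, "uid", "uid")))
```

The build-then-probe structure makes the cost linear in both input sizes, which is why hash-based operators dominate for large unsorted inputs.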

Making Large-Scale SVM Learning Practical

by Thorsten Joachims, 1998
"... Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Despite the fact that this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large lea ..."
Abstract - Cited by 1861 (17 self) - Add to MetaCart
learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. SV M light1 is an implementation of an SVM learner which addresses the problem of large tasks. This chapter presents algorithmic
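Joachims' actual contribution is a working-set decomposition of the dual QP, which is too involved to sketch here. The loop below is instead a Pegasos-style stochastic subgradient method, a different and later technique for the same primal problem, shown only to illustrate why per-step cost must stay independent of the training-set size; the data and hyperparameters are made up:

```python
import random

def hinge_sgd(data, lam=0.01, epochs=20):
    """Pegasos-style stochastic subgradient descent on the primal
    SVM objective: lam/2 * ||w||^2 + mean hinge loss."""
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        random.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)               # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1 - eta * lam) for wi in w]   # regularizer shrinkage
            if margin < 1:                      # hinge loss is active
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

data = [((1.0, 2.0), 1), ((2.0, 1.5), 1), ((-1.0, -1.0), -1), ((-2.0, -0.5), -1)]
print(hinge_sgd(data))
```

Each step touches one example, so memory stays O(dimensions) no matter how many training examples there are, which is the property that off-the-shelf QP solvers lack.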

MapReduce: Simplified data processing on large clusters

by Jeffrey Dean, Sanjay Ghemawat - In Proceedings of the Sixth Symposium on Operating System Design and Implementation (OSDI '04), 2004
"... Abstract MapReduce is a programming model and an associated implementation for processing and generating large data sets. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of ..."
Abstract - Cited by 3439 (3 self) - Add to MetaCart
of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large
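The programming model is easiest to see on the paper's canonical word-count example: the user writes only a map and a reduce function. The single-process driver below only mimics the shuffle-and-group data flow; the real runtime distributes both phases across machines and handles the failures the abstract mentions:

```python
from itertools import groupby

def map_fn(document):
    """User-supplied map: emit (word, 1) for every word in the document."""
    for word in document.split():
        yield word, 1

def reduce_fn(word, counts):
    """User-supplied reduce: sum all counts emitted for one word."""
    return word, sum(counts)

def run(documents):
    # Toy driver: sorting the intermediate pairs stands in for the
    # distributed shuffle that groups values by key.
    pairs = sorted(kv for doc in documents for kv in map_fn(doc))
    return [reduce_fn(k, [c for _, c in group])
            for k, group in groupby(pairs, key=lambda kv: kv[0])]

print(run(["the quick fox", "the lazy dog", "the fox"]))
```

Because map calls are independent and reduce calls see disjoint keys, both phases parallelize trivially, which is the whole point of the model.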

A density-based algorithm for discovering clusters in large spatial databases with noise

by Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, 1996
"... Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases rises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clu ..."
Cited by 1786 (70 self)
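The algorithm the paper introduces is DBSCAN; a compact sketch of the textbook procedure, which grows a cluster from each core point (a point with at least min_pts neighbors within eps). The eps and min_pts values, and the brute-force neighbor search, are illustrative choices:

```python
import math

def dbscan(points, eps=0.5, min_pts=3):
    """Textbook DBSCAN: expand a cluster from every unvisited core point;
    points reachable from no core point are labeled noise (-1)."""
    labels = {}                                  # point index -> cluster id

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if i in labels:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                       # noise, may be reclaimed later
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels.get(j, -1) == -1:          # unvisited or noise
                labels[j] = cluster
                more = neighbors(j)
                if len(more) >= min_pts:         # j is itself a core point
                    seeds.extend(more)
        cluster += 1
    return labels

pts = [(0, 0), (0, 0.3), (0.3, 0), (5, 5), (5, 5.2), (5.2, 5), (9, 0)]
print(dbscan(pts, eps=0.5, min_pts=3))
```

Note how the algorithm meets the requirements the abstract lists: it needs only eps and min_pts as parameters, finds arbitrarily shaped clusters, and explicitly labels noise.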

A Field Study of the Software Design Process for Large Systems

by Bill Curtis, Herb Krasner, Neil Iscoe - Communications of the ACM, 1988
"... The problems of designing large software systems were studied through interviewing personnel from 17 large projects. A layered behavioral model is used to analyze how three lgf these problems-the thin spread of application domain knowledge, fluctuating and conflicting requirements, and communication ..."
Cited by 685 (2 self)

A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood

by Stéphane Guindon, Olivier Gascuel, 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The ..."
Cited by 2182 (27 self)
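PhyML's hill-climbing over tree topologies is beyond a short sketch, but the maximum-likelihood principle the abstract invokes can be shown in its smallest phylogenetic instance: the ML branch length between two aligned sequences under the Jukes-Cantor model. The alignment counts below are hypothetical:

```python
import math

def jc_loglik(t, n_sites, n_diff):
    """Log-likelihood of n_diff mismatches in n_sites under Jukes-Cantor,
    as a function of branch length t (expected substitutions per site)."""
    p = 0.75 * (1 - math.exp(-4 * t / 3))   # P(observe a different base)
    return n_diff * math.log(p / 3) + (n_sites - n_diff) * math.log(1 - p)

def jc_mle(n_sites, n_diff):
    """Closed-form ML branch length: solve p(t) = n_diff / n_sites for t."""
    return -0.75 * math.log(1 - 4 * n_diff / (3 * n_sites))

n, k = 500, 80                               # hypothetical alignment summary
t_hat = jc_mle(n, k)
print(t_hat, jc_loglik(t_hat, n, k))
```

PhyML applies the same principle to whole trees, where no closed form exists, by iteratively improving topology and branch lengths until the likelihood stops increasing.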