Results 1 - 10 of 171

Pig Latin: A Not-So-Foreign Language for Data Processing

by Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins
"... There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products, e.g., Teradata, offer a solution, but are usually prohibitively e ..."
Cited by 607 (13 self)
-level, procedural style of map-reduce. The accompanying system, Pig, is fully implemented, and compiles Pig Latin into physical plans that are executed over Hadoop, an open-source, map-reduce implementation. We give a few examples of how engineers at Yahoo! are using Pig to dramatically reduce the time required

Google’s MapReduce Programming Model — Revisited

by Ralf Lämmel
"... Google’s MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google’s domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce a ..."
Cited by 82 (1 self)

MapReduce Framework

by Sangwon Seo, Edward J. Yoon, Jaehong Kim, Seongwook Jin, Seungryoul Maeng, 2010
"... have become so complex, and thus computation tools play an important role. In this paper, we explore the state-of-the-art framework providing high-level matrix computation primitives with MapReduce through the case study approach, and demonstrate these primitives with different computation engines ..."

Behavioral Simulations in MapReduce

by Guozhang Wang, Marcos Vaz Salles, Benjamin Sowell, Xun Wang, Tuan Cao, Alan Demers, Johannes Gehrke
"... In many scientific domains, researchers are turning to large-scale behavioral simulations to better understand real-world phenomena. While there has been a great deal of work on simulation tools from the high-performance computing community, behavioral simulations remain challenging to program and a ..."
Cited by 5 (4 self)
and automatically scale in parallel environments. In this paper we present BRACE (Big Red Agent-based Computation Engine), which extends the MapReduce framework to process these simulations efficiently across a cluster. We can leverage spatial locality to treat behavioral simulations as iterated spatial joins

Map-Reduce Examples

by Amit Jain
"... tf-idf weight (term frequency-inverse document frequency) is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. The importance increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word ..."
of the word in the corpus. Variations of the tf-idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document’s relevance given a user query. TF-IDF: The term frequency (tf) for a given term t_i within a particular document d_j is defined as follows, where n_i
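
The snippet above breaks off before giving its formula, so the following is only a minimal sketch of one common textbook form of tf-idf, not necessarily the definition used in the listed paper. It assumes tf is a term's count divided by the document length, and idf is the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed (textbook) definitions, not taken from the listed paper:
      tf(t, d) = occurrences of t in d / number of terms in d
      idf(t)   = ln(number of documents / documents containing t)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of three tokenized documents.
docs = [
    ["map", "reduce", "map"],
    ["reduce", "cluster"],
    ["map", "cluster", "cluster"],
]
weights = tf_idf(docs)
```

A term that is frequent within one document gets a higher weight there, while a term that occurs in every document gets weight zero because its idf is ln(1) = 0. This is the "offset by the frequency of the word in the corpus" behaviour the snippet describes.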

MapReduce on the Cell Broadband Engine Architecture

by Marc De Kruijf, 2007
"... In this paper, we propose the evaluation of MapReduce on the Cell processor by way of the Marching Cubes application. We argue that the Cell architecture and the MapReduce parallel programming model complement each other well, and that the Marching Cubes application is a good application through whi ..."
Cited by 7 (0 self)

Disco: Distributed co-clustering with map-reduce. ICDM

by Spiros Papadimitriou, Jimeng Sun, 2008
"... Huge datasets are becoming prevalent; even as researchers, we now routinely have to work with datasets that are up to a few terabytes in size. Interesting real-world applications produce huge volumes of messy data. The mining process involves several steps, starting from pre-processing the raw data ..."
Cited by 53 (1 self)
to estimating the final models. As data become more abundant, scalable and easy-to-use tools for distributed processing are also emerging. Among those, Map-Reduce has been widely embraced by both academia and industry. In database terms, Map-Reduce is a simple yet powerful execution engine, which can

Cogset: A High-Performance MapReduce Engine

by Steffen Viken Valvåg, 2011
"... MapReduce has become a widely employed programming model for large-scale data-intensive computations. Traditional MapReduce engines employ dynamic routing of data as a core mechanism for fault tolerance and load balancing. An alternative mechanism is static routing, which reduces the need to store ..."
Cited by 2 (1 self)

Soren: Adaptive MapReduce for Programmable

by Reza Mokhtari, Amin Abbasi, Farshad Khunjush, Reza Azimi
"... Abstract. In recent years the MapReduce programming model has been widely used for developing parallel data-intensive applications. As a result of its popularity, there exist many implementations of the MapReduce model on different parallel architectures including on massively parallel programmable GPUs. A ..."
which is capable of monitoring key characteristics of applications and dynamically executing them efficiently in one of the three variations of the MapReduce engine it implements. Our preliminary results show that our adaptive method can significantly improve performance for many MapReduce applications

Versatile XQuery Processing in MapReduce

by Caetano Sauer, Sebastian Bächle, Theo Härder
"... Abstract. The MapReduce (MR) framework has become a standard tool for performing large batch computations—usually of aggregative nature—in parallel over a cluster of commodity machines. A significant share of typical MR jobs involves standard database-style queries, where it becomes cumbersome to sp ..."
