Results 1 - 10 of 52
Peer and Self Assessment in Massive Online Classes
Cited by 19 (1 self)
Peer and self assessment offer an opportunity to scale both assessment and learning to global classrooms. This paper reports our experiences with two iterations of the first large online class to use peer and self assessment. In this class, peer grades correlated highly with staff-assigned grades. In the second iteration, 42.9% of students' grades were within 5% of the staff grade, and 65.5% within 10%. On average, students assessed their own work 7% higher than staff did. Students also rated peers' work from their own country 3.6% higher than work from elsewhere. We performed three experiments to improve grading accuracy. We found that giving students feedback about their grading bias increased subsequent accuracy. We introduce short, customizable feedback snippets that cover common issues with assignments, providing students more qualitative peer feedback. Finally, we introduce a data-driven approach that highlights high-variance rubric items for improvement. We find that rubrics that use a parallel sentence structure, unambiguous wording, and well-specified dimensions have lower variance. After revising rubrics, median grading error decreased from 12.4% to 9.9%.
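The bias-feedback idea can be illustrated with a small sketch (hypothetical code, not the authors' system): compute each grader's mean signed deviation from staff grades on shared calibration submissions, then tell the grader which direction they lean.

```python
# Hypothetical sketch of bias feedback for peer graders (not the paper's code).
# A grader's bias is the mean signed difference between their peer grades and
# the staff grades on the same submissions.

def grading_bias(peer_grades, staff_grades):
    """peer_grades, staff_grades: parallel lists of scores on the same scale."""
    diffs = [p - s for p, s in zip(peer_grades, staff_grades)]
    return sum(diffs) / len(diffs)

def feedback_message(bias, tolerance=5.0):
    """Turn a signed bias into a short corrective message (threshold assumed)."""
    if bias > tolerance:
        return f"You grade about {bias:.1f} points higher than staff; be stricter."
    if bias < -tolerance:
        return f"You grade about {-bias:.1f} points lower than staff; be more generous."
    return "Your grading closely matches staff grades."

print(feedback_message(grading_bias([82, 90, 75], [70, 80, 68])))
```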
Dirt Cheap Web-Scale Parallel Text from the Common Crawl
Cited by 15 (4 self)
Parallel text is the fuel that drives modern machine translation systems. The Web is a comprehensive source of preexisting parallel text, but crawling the entire web is impossible for all but the largest companies. We bring web-scale parallel text to the masses by mining the Common Crawl, a public Web crawl hosted on Amazon's Elastic Cloud. Starting from nothing more than a set of common two-letter language codes, our open-source extension of the STRAND algorithm mined 32 terabytes of the crawl in just under a day, at a cost of about $500. Our large-scale experiment uncovers large amounts of parallel text in dozens of language pairs across a variety of domains and genres, some previously unavailable in curated datasets. Even with minimal cleaning and filtering, the resulting data boosts translation performance across the board for five different language pairs in the news domain, and on open domain test sets we see improvements of up to 5 BLEU. We make our code and data available for other researchers seeking to mine this rich new data resource.
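The core URL-matching heuristic used by STRAND-style systems can be sketched as follows: two URLs that differ only in a language-code path segment are candidate translation pairs. This is an illustrative sketch only; the language-code set and regex are assumptions, not the paper's implementation.

```python
# Group URLs by a language-stripped template; buckets containing more than
# one language yield candidate parallel-page pairs (STRAND-style heuristic).
import re
from collections import defaultdict

LANG_CODES = {"en", "fr", "de", "es"}
# Match a two-letter language code appearing as its own path segment.
SEGMENT = re.compile(r"/(%s)(?=/|$)" % "|".join(LANG_CODES))

def candidate_pairs(urls):
    buckets = defaultdict(list)
    for url in urls:
        m = SEGMENT.search(url)
        if m:
            # Replace the language segment with a placeholder to form the key.
            key = url[:m.start()] + "/<LANG>" + url[m.end():]
            buckets[key].append((m.group(1), url))
    # Keep only templates observed in more than one language.
    return [b for b in buckets.values() if len({lang for lang, _ in b}) > 1]

pairs = candidate_pairs([
    "http://example.com/en/about",
    "http://example.com/fr/about",
    "http://example.com/en/contact",
])
```

Here only the `/about` pages form a candidate pair; the `/en/contact` page has no counterpart in another language and is dropped.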
Learning Whom to Trust with MACE
Cited by 13 (5 self)
Non-expert annotation services like Amazon's Mechanical Turk (AMT) are cheap and fast ways to evaluate systems and provide categorical annotations for training data. Unfortunately, some annotators choose bad labels in order to maximize their pay. Manual identification is tedious, so we experiment with an item-response model. It learns in an unsupervised fashion to a) identify which annotators are trustworthy and b) predict the correct underlying labels. We match the performance of more complex state-of-the-art systems and perform well even under adversarial conditions. We show considerable improvements over standard baselines, both for predicted label accuracy and trustworthiness estimates. The latter can be further improved by introducing a prior on model parameters and using Variational Bayes inference. Additionally, we can achieve even higher accuracy by focusing on the instances our model is most confident in (trading off some recall), and by incorporating annotated control instances. Our system, MACE (Multi-Annotator Competence Estimation), is available for download.
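A much-simplified sketch in the spirit of such item-response aggregation models (this is NOT the MACE algorithm itself, just an illustrative fixed-point scheme): alternate between estimating labels by trust-weighted vote and re-estimating each annotator's trust as their agreement with the current labels.

```python
# Toy annotator-trust aggregation: iterate between (a) trust-weighted
# majority vote per item and (b) per-annotator trust = fraction of items
# on which the annotator agrees with the current consensus.
from collections import Counter

def aggregate(annotations, iters=10):
    """annotations: dict item -> dict annotator -> label."""
    workers = {w for votes in annotations.values() for w in votes}
    trust = {w: 1.0 for w in workers}
    labels = {}
    for _ in range(iters):
        for item, votes in annotations.items():
            tally = Counter()
            for w, lab in votes.items():
                tally[lab] += trust[w]
            labels[item] = tally.most_common(1)[0][0]
        for w in workers:
            seen = [(item, votes[w]) for item, votes in annotations.items() if w in votes]
            agree = sum(labels[item] == lab for item, lab in seen)
            trust[w] = agree / len(seen) if seen else 0.5
    return labels, trust

# Annotators "a" and "b" are consistent; "c" always disagrees.
data = {
    "i1": {"a": "x", "b": "x", "c": "y"},
    "i2": {"a": "x", "b": "x", "c": "y"},
    "i3": {"a": "y", "b": "y", "c": "x"},
}
labels, trust = aggregate(data)
```

After convergence the adversarial annotator "c" receives zero trust, so its votes no longer influence the consensus.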
Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing
- In Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012
Cited by 12 (2 self)
Recent work has established the efficacy of Amazon's Mechanical Turk for constructing parallel corpora for machine translation research. We apply this to building a collection of parallel corpora between English and six languages from the Indian subcontinent: Bengali, Hindi, Malayalam, Tamil, Telugu, and Urdu. These languages are low-resource, under-studied, and exhibit linguistic phenomena that are difficult for machine translation. We conduct a variety of baseline experiments and analysis, and release the data to the community.
Crowdsourcing Research Opportunities: Lessons from Natural Language Processing
Cited by 9 (3 self)
Although the field has led to promising early results, the use of crowdsourcing as an integral part of science projects is still regarded with skepticism by some, largely due to a lack of awareness of the opportunities and implications of utilizing these new techniques. We address this lack of awareness, firstly by highlighting the positive impacts that crowdsourcing has had on Natural Language Processing research. Secondly, we discuss the challenges of more complex methodologies, quality control, and the necessity to deal with ethical issues. We conclude with future trends and opportunities of crowdsourcing for science, including its potential for disseminating results, making science more accessible, and enriching educational programs.
Crisis MT: Developing a cookbook for MT in crisis situations
- In Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT), 2011
Cited by 6 (2 self)
In this paper, we propose that MT is an important technology in crisis events, something that can and should be an integral part of a rapid-response infrastructure. By integrating MT services directly into a messaging infrastructure (whatever the type of messages being serviced, e.g., text messages, Twitter feeds, blog postings, etc.), MT can be used to provide first-pass translations into a majority language, which can be more effectively triaged and then routed to the appropriate aid agencies. If done right, MT can dramatically increase the speed with which relief can be provided. To ensure that MT is a standard tool in the arsenal of tools needed in crisis events, we propose a preliminary Crisis Cookbook, the contents of which could be translated into the relevant language(s) by volunteers immediately after a crisis event occurs. The resulting data could then be made available to relief groups on the ground, as well as to providers of MT services. We also note that there are significant contributions that our community can make to relief efforts through continued work on our research, especially that research which makes MT more viable for under-resourced languages.
Aggregating ordinal labels from crowds by minimax conditional entropy
Cited by 5 (1 self)
We propose a method to aggregate noisy ordinal labels collected from a crowd of workers or annotators. Eliciting ordinal labels is important in tasks such as judging web search quality and rating products. Our method is motivated by the observation that workers usually have difficulty distinguishing between two adjacent ordinal classes, whereas distinguishing between two classes which are far away from each other is much easier. We formulate our method as minimax conditional entropy subject to constraints which encode this observation. Empirical evaluations on real datasets demonstrate significant improvements over existing methods.
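The adjacent-class observation can be seen in a simple baseline (this is a hypothetical point of contrast, not the paper's minimax method): taking the per-item median of ordinal ratings is more robust to off-by-one noise between neighboring classes than a plain majority vote.

```python
# Median aggregation of ordinal labels: off-by-one votes around the true
# class pull the median only slightly, unlike mode-based majority voting.
import statistics

def median_label(ratings):
    """ratings: list of integer ordinal labels, e.g. 1 (worst) .. 5 (best)."""
    return round(statistics.median(ratings))

print(median_label([3, 4, 3, 2, 5]))  # adjacent-class noise around class 3
```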
Sentence Simplification as Tree Transduction
Cited by 5 (0 self)
In this paper, we introduce a syntax-based sentence simplifier that models simplification using a probabilistic synchronous tree substitution grammar (STSG). To improve the STSG model specificity we utilize a multi-level backoff model with additional syntactic annotations that allow for better discrimination over previous STSG formulations. We compare our approach to T3 (Cohn and Lapata, 2009), a recent STSG implementation, as well as two state-of-the-art phrase-based sentence simplifiers on a corpus of aligned sentences from English and Simple English Wikipedia. Our new approach performs significantly better than T3, similarly to human simplifications for both simplicity and fluency, and better than the phrase-based simplifiers for most of the evaluation metrics.
Crowdsourcing the Acquisition of Natural Language Corpora: Methods and Observations
Cited by 4 (2 self)
We study the opportunity for using crowdsourcing methods to acquire language corpora for use in natural language processing systems. Specifically, we empirically investigate three methods for eliciting natural language sentences that correspond to a given semantic form. The methods convey frame semantics to crowd workers by means of sentences, scenarios, and list-based descriptions. We discuss various performance measures of the crowdsourcing process, and analyze the semantic correctness, naturalness, and biases of the collected language. We highlight research challenges and directions in applying these methods to acquire corpora for natural language processing applications.
Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines
- In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), 2014, 859–866
Cited by 4 (1 self)
Crowdsourcing is an emerging collaborative approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources. Although the use of this approach is intensifying in all its key genres (paid-for crowdsourcing, games with a purpose, volunteering-based approaches), the community still lacks a set of best-practice guidelines similar to the annotation best practices for traditional, expert-based corpus acquisition. In this paper we focus on the use of crowdsourcing methods for corpus acquisition and propose a set of best practice guidelines based on our own experiences in this area and an overview of related literature. We also introduce GATE Crowd, a plugin of the GATE platform that relies on these guidelines and offers tool support for using crowdsourcing in a more principled and efficient manner.