Results 1 - 10 of 2,166

Supervised and unsupervised discretization of continuous features

by James Dougherty, Ron Kohavi, Mehran Sahami - in A. Prieditis & S. Russell, eds, Machine Learning: Proceedings of the Twelfth International Conference, 1995
"... Many supervised machine learning algorithms require a discrete feature space. In this paper, we review previous work on continuous feature discretization, identify defining characteristics of the methods, and conduct an empirical evaluation of several methods. We compare binning, an unsupervised dis ..."
Cited by 540 (11 self)

Unsupervised Learning of the Morphology of a Natural Language

by John Goldsmith - Computational Linguistics, 2001
"... This study reports the results of using minimum description length (MDL) analysis to model unsupervised learning of the morphological segmentation of European languages, using corpora ranging in size from 5,000 words to 500,000 words. We develop a set of heuristics that rapidly develop a probabilist ..."
Cited by 355 (12 self)

Induction of pluripotent stem cells from adult human fibroblasts by defined factors

by Kazutoshi Takahashi, Koji Tanabe, Mari Ohnuki, Megumi Narita, Tomoko Ichisaka, Kiichiro Tomoda - Cell, 2007
"... Successful reprogramming of differentiated human somatic cells into a pluripotent state would allow creation of patient- and disease-specific stem cells. We previously reported generation of induced pluripotent stem (iPS) cells, capable of germline transmission, from mouse somatic cells by transduction of four defined transcription factors. Here, we demonstrate the generation of iPS cells from adult human dermal fibroblasts with the same four factors: Oct3/4, Sox2, Klf4, and c-Myc. Human iPS cells were similar to human embryonic stem (ES) cells in morphology, proliferation, surface antigens ..."
Cited by 446 (3 self)

Bootstrapping morphological analysis of Gikuyu using unsupervised maximum entropy learning

by Guy De Pauw, Peter Waiganjo Wagacha - In Proceedings of the eighth INTERSPEECH conference, 2007
"... This paper describes a proof-of-the-principle experiment in which maximum entropy learning is used for the automatic induction of shallow morphological features for the resource-scarce Bantu language of Gĩkũyũ. This novel approach circumvents the limitations of typical unsupervised morphological indu ..."
Cited by 10 (6 self)

Unsupervised models for morpheme segmentation and morphology learning

by Mathias Creutz, Krista Lagus - ACM Trans. Speech Lang. Process., 2007
"... We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequence ..."
Cited by 108 (8 self)

A Hybrid Approach to the Induction of Underlying Morphology

by Fei Xia
"... We present a technique for refining a baseline segmentation and generating a plausible underlying morpheme segmentation by integrating hand-written rewrite rules into an existing state-of-the-art unsupervised morphological induction procedure. Performance on measures which consider surface-boundary ..."

Unsupervised Induction of Natural Language Morphology Inflection Classes

by Christian Monson, Alon Lavie, Jaime Carbonell, Lori Levin - In Proceedings of the Seventh Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON ’04), 2004
"... We propose a novel language-independent framework for inducing a collection of morphological inflection classes from a monolingual corpus of full form words. Our approach involves two main stages. In the first stage, we generate a large data structure of candidate inflection classes and their interr ..."
"... baseline techniques applied to induction of major inflection classes of Spanish. The preliminary results on an initial training corpus already surpass an F1 of 0.5 against ideal Spanish inflectional morphology classes."
Cited by 11 (2 self)

Corpus-based induction of syntactic structure: Models of dependency and constituency

by Dan Klein - In Proceedings of the 42nd Annual Meeting of the ACL, 2004
"... We present a generative model for the unsupervised learning of dependency structures. We also describe the multiplicative combination of this dependency model with a model of linear constituency. The product model outperforms both components on their respective evaluation metrics, giving the best pu ..."
Cited by 229 (9 self)

A Framework for Unsupervised Natural Language Morphology Induction

by unknown authors
"... This paper presents a framework for unsupervised natural language morphology induction wherein candidate suffixes are grouped into candidate inflection classes, which are then arranged in a lattice structure. With similar candidate inflection classes placed near one another in the lattice, I propose ..."

Multiple sequence alignment for morphology induction

by Tzvetan Tchoukalov, Christian Monson, Brian Roark - Working notes for the CLEF 2009 Workshop, Corfu, 2009
"... MetaMorph is a novel application of multiple sequence alignment (MSA) to natural language morphology induction. Given a text corpus in any language, we sequentially align a subset of the words of the corpus to form an MSA using a probabilistic scoring scheme. We then segment the MSA to produce outpu ..."
"... segmentation. This suggests that MSA is an effective algorithm for unsupervised morphology induction and may yet outperform the state-of-the-art morphology induction algorithms. Future research directions are discussed."
Cited by 1 (0 self)