Results 1  10
of
17
A Matrix Factorization Method for Mapping Items to Skills and for Enhancing ExpertBased Qmatrices
"... Abstract. Uncovering the right skills behind question items is a difficult task. It requires a thorough understanding of the subject matter and of the cognitive factors that determine student performance. The skills definition, and the mapping of item to skills, require the involvement of experts. W ..."
Abstract

Cited by 9 (3 self)
 Add to MetaCart
(Show Context)
Abstract. Uncovering the right skills behind question items is a difficult task. It requires a thorough understanding of the subject matter and of the cognitive factors that determine student performance. The skills definition, and the mapping of item to skills, require the involvement of experts. We investigate means to assist experts for this task by using a data driven, matrix factorization approach. The two mappings of items to skills, the expert on one side and the matrix factorization on the other, are compared in terms of discrepancies, and in terms of their performance when used in a linear model of skills assessment and item outcome prediction. Visual analysis shows a relatively similar pattern between the expert and the factorized mappings, although differences arise. The prediction comparison shows the factorization approach performs slightly better than the original expert Qmatrix, giving supporting evidence to the belief that the factorization mapping is valid. Implications for the use of the factorization to design better item to skills mapping are discussed.
Tagaware ordinal sparse factor analysis for learning and content analytics
 In Proc. 6th Intl. Conf. on Educational Data Mining
, 2013
"... Machine learning offers novel ways and means to design personalized learning systems wherein each student’s educational experience is customized in real time depending on their background, learning goals, and performance to date. SPARse Factor Analysis (SPARFA) is a novel framework for machine lear ..."
Abstract

Cited by 4 (4 self)
 Add to MetaCart
(Show Context)
Machine learning offers novel ways and means to design personalized learning systems wherein each student’s educational experience is customized in real time depending on their background, learning goals, and performance to date. SPARse Factor Analysis (SPARFA) is a novel framework for machine learningbased learning analytics, which estimates a learner’s knowledge of the concepts underlying a domain, and content analytics, which estimates the relationships among a collection of questions and those concepts. SPARFA jointly learns the associations among the questions and the concepts, learner concept knowledge profiles, and the underlying question difficulties, solely based on the correct/incorrect graded responses of a population of learners to a collection of questions. In this paper, we extend the SPARFA framework significantly to enable: (i) the analysis of graded responses on an ordinal scale (partial credit) rather than a binary scale (correct/incorrect); (ii) the exploitation of tags/labels for questions that partially describe the question–concept associations. The resulting Ordinal SPARFATag framework greatly enhances the interpretability of the estimated concepts. We demonstrate using real educational data that Ordinal SPARFATag outperforms both SPARFA and existing collaborative filtering techniques in predicting missing learner responses.
Timevarying learning and content analytics via sparse factor analysis
 In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
, 2014
"... We propose SPARFATrace, a new machine learningbased framework for timevarying learning and content analytics for educational applications. We develop a novel message passingbased, blind, approximate Kalman filter for sparse factor analysis (SPARFA) that jointly traces learner concept knowledge ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
(Show Context)
We propose SPARFATrace, a new machine learningbased framework for timevarying learning and content analytics for educational applications. We develop a novel message passingbased, blind, approximate Kalman filter for sparse factor analysis (SPARFA) that jointly traces learner concept knowledge over time, analyzes learner concept knowledge state transitions (induced by interacting with learning resources, such as textbook sections, lecture videos, etc., or the forgetting effect), and estimates the content organization and difficulty of the questions in assessments. These quantities are estimated solely from binaryvalued (correct/incorrect) graded learner response data and the specific actions each learner performs (e.g., answering a question or studying a learning resource) at each time instant. Experimental results on two online course datasets demonstrate that SPARFATrace is capable of tracing each learner’s concept knowledge evolution over time, analyzing the quality and content organization of learning resources, and estimating the question–concept associations and the question difficulties. Moreover, we show that SPARFATrace achieves comparable or better performance in predicting unobserved learner responses compared to existing collaborative filtering and knowledge tracing methods.
Informative prediction based on ordinal questionnaire data,” submitted
"... Abstract—Supporting human decision making is a major goal of data mining. The more decision making is critical, the more interpretability is required in the predictive model. This paper proposes a new framework to build a fully interpretable predictive model for questionnaire data, while maintaining ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
(Show Context)
Abstract—Supporting human decision making is a major goal of data mining. The more decision making is critical, the more interpretability is required in the predictive model. This paper proposes a new framework to build a fully interpretable predictive model for questionnaire data, while maintaining high prediction accuracy with regards to the final outcome. Such a model has applications in project risk assessment, in health care, in sentiment analysis and presumably in any real world application that relies on questionnaire data for informative and accurate prediction. Our framework is inspired by models in Item Response Theory (IRT), which were originally developed in psychometrics with applications to standardized tests such as SAT. We first extend these models, which are essentially unsupervised, to the supervised setting. We then derive a distance metric from the trained model to define the informativeness of individual question items. On realworld questionnaire data obtained from information technology projects, we demonstrate the power of this approach in terms of interpretability as well as predictability. To the best of our knowledge, this is the first work that leverages the IRT framework to provide informative and accurate prediction on ordinal questionnaire data. Index Terms—psychometrics, questionnaire data, item response theory, metric learning I.
Noisy Matrix Completion under Sparse Factor Models
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2014
"... ..."
Quantized Matrix Completion for Personalized Learning
"... The recently proposed SPARse Factor Analysis (SPARFA) framework for personalized learning performs factor analysis on ordinal or binaryvalued (e.g., correct/incorrect) graded learner responses to questions. The underlying factors are termed “concepts ” (or knowledge components) and are used for le ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
(Show Context)
The recently proposed SPARse Factor Analysis (SPARFA) framework for personalized learning performs factor analysis on ordinal or binaryvalued (e.g., correct/incorrect) graded learner responses to questions. The underlying factors are termed “concepts ” (or knowledge components) and are used for learning analytics (LA), the estimation of learner conceptknowledge profiles, and for content analytics (CA), the estimation of question–concept associations and question difficulties. While SPARFA is a powerful tool for LA and CA, it requires a number of algorithm parameters (including the number of concepts), which are difficult to determine in practice. In this paper, we propose SPARFALite, a convex optimizationbased method for LA that builds on matrix completion, which only requires a single algorithm parameter and enables us to automatically identify the required number of concepts. Using a variety of educational datasets, we demonstrate that SPARFALite (i) achieves comparable performance in predicting unobserved learner responses to existing methods, including item response theory (IRT) and SPARFA, and (ii) is computationally more efficient.
Matrix recovery from quantized and corrupted measurements
 In IEEE Intl. Conf. on Acoustics, Speech and Signal Processing
, 2014
"... This paper deals with the recovery of an unknown, lowrank matrix from quantized and (possibly) corrupted measurements of a subset of its entries. We develop statistical models and corresponding (multi)convex optimization algorithms for quantized matrix completion (QMC) and quantized robust prin ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
(Show Context)
This paper deals with the recovery of an unknown, lowrank matrix from quantized and (possibly) corrupted measurements of a subset of its entries. We develop statistical models and corresponding (multi)convex optimization algorithms for quantized matrix completion (QMC) and quantized robust principal component analysis (QRPCA). In order to take into account the quantized nature of the available data, we jointly learn the underlying quantization bin boundaries and recover the lowrank matrix, while removing potential (sparse) corruptions. Experimental results on synthetic and two realworld collaborative filtering datasets demonstrate that directly operating with the quantized measurements—rather than treating them as real values—results in (often significantly) lower recovery error if the number of quantization bins is less than about 10. Index Terms — Quantization, convex optimization, matrix completion, robust principal component analysis.
Joint Topic Modeling and Factor Analysis of Textual Information and Graded Response Data
"... Modern machine learning methods are critical to the development of largescale personalized learning systems that cater directly to the needs of individual learners. The recently developed SPARse Factor Analysis (SPARFA) framework jointly estimates learner’s knowledge of the latent concepts unde ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
(Show Context)
Modern machine learning methods are critical to the development of largescale personalized learning systems that cater directly to the needs of individual learners. The recently developed SPARse Factor Analysis (SPARFA) framework jointly estimates learner’s knowledge of the latent concepts underlying a domain and the relationships among a collection of questions and the latent concepts, solely from the graded responses to a collection of questions. To better interpret the estimated latent concepts, SPARFA relies on a postprocessing step that utilizes userdefined tags (e.g., topics or keywords) available for each question. In this paper, we relax the need for userdefined tags by extending SPARFA to jointly process both graded learner responses and the text of each question and its associated answer(s) or other feedback. Our purely datadriven approach (i) enhances the interpretability of the estimated latent concepts without the need of explicitly generating a set of tags or performing a postprocessing step, (ii) improves the prediction performance of SPARFA, and (iii) scales to large test/assessments where human annotation would prove burdensome. We demonstrate the efficacy of the proposed approach on two real educational datasets. 1.
SPARSE PROBIT FACTOR ANALYSIS FOR LEARNING ANALYTICS
"... We develop a new model and algorithm for machine learningbased learning analytics, which estimate a learner’s knowledge of the concepts underlying a domain. Our model represents the probability that a learner provides the correct response to a question in terms of three factors: their understanding ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
(Show Context)
We develop a new model and algorithm for machine learningbased learning analytics, which estimate a learner’s knowledge of the concepts underlying a domain. Our model represents the probability that a learner provides the correct response to a question in terms of three factors: their understanding of a set of underlying concepts, the concepts involved in each question, and each question’s intrinsic difficulty. We estimate these factors given the graded responses to a set of questions. We develop a biconvex algorithm to solve the resulting SPARse Factor Analysis (SPARFA) problem. We also incorporate userdefined tags on questions to facilitate the interpretability of the estimated factors. Experiments with synthetic and realworld data demonstrate the efficacy of our approach. Index Terms — biconvex optimization, content analytics, learning analytics, personalized learning, factor analysis 1.
Bayesian Pairwise Collaboration Detection in Educational Datasets
 In Proc. IEEE Global Conf. on
, 2013
"... Abstract—Online education affords the opportunity to revolutionize learning by providing access to highquality educational resources at low costs. The recent popularity of socalled MOOCs (massive open online courses) further accelerates this trend. However, these exciting advancements result in s ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
(Show Context)
Abstract—Online education affords the opportunity to revolutionize learning by providing access to highquality educational resources at low costs. The recent popularity of socalled MOOCs (massive open online courses) further accelerates this trend. However, these exciting advancements result in several challenges for the course instructors. Among these challenges is the detection of collaboration between learners on online tests or takehome exams which, depending on the courses ’ rules, can be considered cheating. In this work, we propose new models and algorithms for detecting pairwise collaboration between learners. Under a fully Bayesian setting, we infer the probability of learners ’ succeeding on a series of test items solely based on their response data. We then use this information to estimate the likelihood that two learners were collaborating. We demonstrate the efficacy of our methods on both synthetic and realworld educational data; for the latter, we find strong evidence of collaboration for a certain pair of learners in a noncollaborative takehome exam. Index Terms—Bayesian methods, cheating, collaboration detection, hypothesis testing, online education, sparse factor analysis. I.