Results 1–5 of 5
Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests
Abstract

Cited by 2 (0 self)
We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. We obtain an efficient learning algorithm for a family of Bayesian networks that we call quartet-learnable. For each latent variable, the existence of a singly-coupled quartet allows us to uniquely identify and learn all parameters involving that latent variable. We give a proof of the polynomial sample complexity of our learning algorithm, and experimentally compare it to variational EM.
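The noisy-or conditional distribution at the core of this model family has a simple closed form: the child stays off only if every active parent independently "fails" to activate it. A minimal sketch (the function name and parameter names are illustrative, not from the paper):

```python
def noisy_or_prob(parent_states, fail_probs, leak=0.0):
    """P(child = 1 | parents) under a noisy-or CPD.

    parent_states[i] is 1 if parent i is active, else 0.
    fail_probs[i] is the probability that an active parent i
    fails to switch the child on. `leak` is the probability the
    child switches on with no active parents.
    """
    p_off = 1.0 - leak
    for active, q in zip(parent_states, fail_probs):
        if active:
            p_off *= q  # each active parent must independently fail
    return 1.0 - p_off
```

For example, with one active parent whose failure probability is 0.2 and no leak, the child is on with probability 0.8.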
Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data
Abstract

Cited by 1 (0 self)
We propose a family of efficient algorithms for learning the parameters of a Bayesian network from incomplete data. Our approach is based on recent theoretical analyses of missing data problems, which utilize a graphical representation called the missingness graph. In the case of MCAR and MAR data, this graph need not be explicit, and yet we can still obtain closed-form, asymptotically consistent parameter estimates, without the need for inference. When this missingness graph is explicated (based on background knowledge), even partially, we can obtain even more accurate estimates with less data. Empirically, we illustrate how we can learn the parameters of large networks from large datasets, which are beyond the scope of algorithms like EM (which require inference).
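In the MCAR case, the intuition behind a closed-form, inference-free estimate is that the observed entries are a random subsample of the data, so simply ignoring the missing values already gives a consistent estimator. A minimal sketch of that available-case idea for a single binary variable (the function name is ours, not the paper's):

```python
def available_case_estimate(samples):
    """Closed-form estimate of P(X = 1) from incomplete binary data.

    `samples` is a list of 0/1 values with missing entries as None.
    Under MCAR, the non-missing entries are a random subsample, so
    the fraction of ones among them is an asymptotically consistent
    estimate -- no EM iterations or inference required.
    """
    observed = [x for x in samples if x is not None]
    return sum(observed) / len(observed)
```

The paper's contribution extends this kind of reasoning to full network parameters and to MAR data; the sketch only illustrates why no inference is needed in the simplest setting.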
Estimating Latent-Variable Graphical Models using Moments and Likelihoods
Abstract

Cited by 1 (0 self)
Recent work on the method of moments enables consistent parameter estimation, but only for certain types of latent-variable models. On the other hand, pure likelihood objectives, though more universally applicable, are difficult to optimize. In this work, we show that using the method of moments in conjunction with composite likelihood yields consistent parameter estimates for a much broader class of discrete directed and undirected graphical models, including loopy graphs with high treewidth. Specifically, we use tensor factorization to reveal information about the hidden variables. This allows us to construct convex likelihoods which can be globally optimized to recover the parameters.
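The moments these methods factorize have a characteristic low-rank structure: when two observed variables are conditionally independent given a hidden variable h, their joint distribution factors as M = A1 diag(π) A2ᵀ, where π is the prior over h and A1, A2 are the conditional tables. A small illustrative construction of that population moment matrix (all names are ours, not from the paper):

```python
def pairwise_moment_matrix(prior, cond1, cond2):
    """Population moments M[a][b] = P(x1 = a, x2 = b) for two observed
    variables that are conditionally independent given hidden h.

    prior[h]    = P(h)
    cond1[h][a] = P(x1 = a | h),  cond2[h][b] = P(x2 = b | h)

    The result equals A1 diag(prior) A2^T -- the factorization that
    tensor/moment methods invert to recover the hidden parameters.
    """
    n_a, n_b = len(cond1[0]), len(cond2[0])
    M = [[0.0] * n_b for _ in range(n_a)]
    for h, p_h in enumerate(prior):
        for a in range(n_a):
            for b in range(n_b):
                M[a][b] += p_h * cond1[h][a] * cond2[h][b]
    return M
```

With deterministic conditionals (identity tables) and a uniform prior over two states, the moment matrix is diagonal with entries 0.5, making the rank-equals-number-of-hidden-states structure easy to see.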
Early Detection of Diabetes from Health Claims
Abstract
Early detection of Type 2 diabetes poses challenges to both the machine learning and medical communities. Current clinical practices focus on narrow patient-specific courses of action, whereas electronic health records and insurance claims data give us the ability to generalize that knowledge across large populations. Advances in population health care have the potential to improve the quality of health of the patient as well as decrease future medical costs, at least in part by prevention of long-term complications accruing during undiagnosed diabetes. Based on patient data from insurance claims, we present the results of our initial experiments into identification of patients who will develop diabetes. We motivate future work in this area by considering the need to develop machine learning algorithms that can effectively deal with the depth and the variety of the data.
Unsupervised Learning of Disease Progression Models
Abstract
Chronic diseases, such as Alzheimer's Disease, Diabetes, and Chronic Obstructive Pulmonary Disease, usually progress slowly over a long period of time, causing increasing burden to the patients, their families, and the healthcare system. A better understanding of their progression is instrumental in early diagnosis and personalized care. Modeling disease progression based on real-world evidence is a very challenging task due to the incompleteness and irregularity of the observations, as well as the heterogeneity of the patient conditions. In this paper, we propose a probabilistic disease progression model that addresses these challenges. As compared to existing disease progression models, the advantage of our model is threefold: 1) it learns a continuous-time progression model from discrete-time observations with non-equal intervals; 2) it learns the full progression trajectory from a set of incomplete records that only cover short segments of the progression; 3) it learns a compact set of medical concepts as the bridge between the hidden progression process and the observed medical evidence, which are usually extremely sparse and noisy. We demonstrate the capabilities of our model by applying it to a real-world COPD patient cohort and deriving some interesting clinical insights.