Computational tractability of machine learning
Abstract
Huge data sets containing millions of training examples (tall data) with a large number of attributes (fat data) are relatively easy to gather. However, one of the bottlenecks to successful inference of useful information from such data is the computational complexity of machine learning algorithms. Most state-of-the-art nonparametric machine learning algorithms have a computational complexity of either O(N^2) or O(N^3), where N is the number of training examples. In my thesis I propose to explore the scalability of machine learning algorithms with respect to both the number of training examples and the number of attributes per training example, and to propose linear O(N) time ε-exact approximation algorithms. The area lies at the interface of computational statistics, machine learning, approximation theory, and computational geometry.

[Dissertation proposal]: May 4, 2005

1. THE EVILS OF LEARNING WITH MASSIVE DATA SETS

During the past few decades it has become relatively easy to gather huge amounts of data, apprehensively called massive data sets. A few such examples include genome sequencing, astronomical databases, internet databases, experimental data from particle physics, medical databases, financial records, weather reports, audio
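To make the quadratic bottleneck concrete, consider the weighted Gaussian kernel sum (the discrete Gauss transform) that sits at the core of many nonparametric methods such as kernel density estimation: evaluating it directly at N targets costs O(N^2). The sketch below is illustrative only; the function name, the bandwidth parameter h, and the NumPy implementation are my own assumptions, not taken from the proposal. It shows the naive computation and one common formalization of the ε-exact accuracy guarantee that a linear-time approximation would have to satisfy.

import numpy as np

def naive_gauss_transform(x, y, q, h):
    """Direct discrete Gauss transform: G(y_j) = sum_i q_i * exp(-(y_j - x_i)^2 / h^2).

    x: (N,) source points, y: (M,) target points, q: (N,) weights, h: bandwidth.
    Direct evaluation costs O(N*M) time and memory -- quadratic when M ~ N,
    which is the bottleneck the abstract describes.
    """
    d2 = (y[:, None] - x[None, :]) ** 2   # (M, N) pairwise squared distances
    return np.exp(-d2 / h**2) @ q         # weighted sum over all sources

# An epsilon-exact approximation G_hat replaces the direct sum with an
# O(N + M) computation while guaranteeing, for every target j,
#     |G_hat(y_j) - G(y_j)| <= eps * Q,   where Q = sum_i |q_i|.
# (Stated here as an illustration of "epsilon-exact"; the proposal's own
# algorithms are not shown.)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1000)   # source points
    y = rng.standard_normal(1000)   # target points
    q = rng.random(1000)            # weights
    print(naive_gauss_transform(x, y, q, h=0.5)[:3])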