Linear Discriminant Text Classification in High Dimension
Cited by 2 (0 self)
Linear Discriminant (LD) techniques are typically used in pattern recognition tasks when there are many (n >> 10^4) datapoints in low-dimensional (d < 10^2) space. In this paper we argue on theoretical grounds that LD is in fact more appropriate when training data is sparse, and the dimension of the space is extremely high. To support this conclusion we present experimental results on a medical text classification problem of great practical importance, autocoding of adverse event reports. We trained and tested LD-based systems for a variety of classification schemes widely used in the clinical drug trial process (COSTART, WHOART, HARTS, and MedDRA) and obtained significant reduction in the rate of misclassification compared both to generic Bayesian machine-learning techniques and to the current generation of domain-specific autocoders based on string matching.
A Dynamical Study of the Generalised Delta Rule
2000
The generalised delta rule is a powerful non-linear distributed learning procedure capable of learning arbitrary mappings for artificial neural networks of any topology. Yet, the learning procedure is poorly specified in that it cannot specifically guarantee a solution for all solvable problems. This study focuses on developing a benchmarking procedure for the generalised delta rule that provides a visualisation of the complete dynamics of the procedure and allows optimisation of all the variables that comprise the system functions together. A number of dynamical modes of convergence for the procedure are shown, in particular universal convergence to global error minima. In a number of experiments with small networks, the procedure was found to exhibit regions of universal global convergence for particular system parameters. With each problem examined, a particular value or range of values for the learning rate parameter was found that tunes the network for optimal learning success. In conclusion, it was found that small values of the learning rate parameter are not necessarily optimal for obtaining global convergence. It is further conjectured that feedforward generalised delta rule networks have enough representational capacity to map any combinatorial Boolean logic function, that a convergence proof should exist for these problems, and that the long term behaviour of the procedure should be tractable under universality theory.

Acknowledgements: Thanks to Han, Ian and Nigel for starting it all, and to everyone at Nottingham University for their expert help. I would especially like to thank my examiners, Mark Plumbley and Frank Ritter, for their extensive positive feedback, which was incorporated into this thesis.

List of Symbols: net x: Total input to unit x. T ...

