Deterministic Annealing for Clustering, Compression, Classification, Regression, and Related Optimization Problems
- Proceedings of the IEEE, 1998
Cited by 193 (4 self)
this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, physics, biology, control and signal processing, information theory, complexity theory, and psychology (see [45]). Neural networks have provided a fertile soil for the infusion (and occasionally confusion) of ideas, as well as a meeting ground for comparing viewpoints, sharing tools, and renovating approaches. It is within the ill-defined boundaries of the field of neural networks that researchers in traditionally distant fields have come to the realization that they have been attacking fundamentally similar optimization problems.
Anomaly Detection: A Survey
- 2007
Cited by 69 (1 self)
Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.
Probabilistic Model-Based Clustering of Multivariate and Sequential Data
- In Proceedings of Artificial Intelligence and Statistics, 1999
Cited by 20 (1 self)
Probabilistic model-based clustering, based on finite mixtures of multivariate models, is a useful framework for clustering data in a statistical context. This general framework can be directly extended to clustering of sequential data, based on finite mixtures of sequential models. In this paper we consider the problem of fitting mixture models where both multivariate and sequential observations are present. A general EM algorithm is discussed and experimental results demonstrated on simulated data. The problem is motivated by the practical problem of clustering individuals into groups based on both their static characteristics and their dynamic behavior.
1 Introduction and Motivation
Consider the following problem. We have a set of individuals (a random sample from a larger population) whom we would like to cluster into groups based on observational data. For each individual we can measure characteristics which are relatively static (e.g., their height, weight, income, age, sex, etc.)...
A Comparison Between Neural-Network Forecasting Techniques - Case Study: River Flow Forecasting
- IEEE Transactions on Neural Networks, 1999
Cited by 19 (1 self)
Estimating the flows of rivers can have significant economic impact, as this can help in agricultural water management and in protection from water shortages and possible flood damage. The first goal of this paper is to apply neural networks to the problem of forecasting the flow of the River Nile in Egypt. The second goal of the paper is to utilize the time series as a benchmark to compare between several neural-network forecasting methods. We compare between four different methods to preprocess the inputs and outputs, including a novel method proposed here based on the discrete Fourier series. We also compare between three different methods for the multistep ahead forecast problem: the direct method, the recursive method, and the recursive method trained using a backpropagation through time scheme. We also include a theoretical comparison between these three methods. The final comparison is between different methods to perform longer horizon forecast, and that includes ways to partition the problem into the several subproblems of forecasting K steps ahead. Index Terms: Backpropagation, Fourier series, multistep ahead prediction, neural networks, Nile River, river flow forecasting, seasonal time series, time series prediction.
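The direct and recursive multistep strategies named in this abstract can be illustrated with a minimal sketch. It assumes a one-input linear least-squares predictor as a stand-in for the paper's neural networks; `fit_linear`, `recursive_forecast`, and `direct_forecast` are hypothetical names, and the backpropagation-through-time variant is not shown.

```python
def fit_linear(xs, ys):
    """Least-squares fit of y = a * x + b (assumed stand-in for a neural network)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def recursive_forecast(series, k):
    """Fit a one-step-ahead model and feed its output back in k times."""
    a, b = fit_linear(series[:-1], series[1:])
    value = series[-1]
    for _ in range(k):
        value = a * value + b
    return value


def direct_forecast(series, k):
    """Fit a separate model that maps x[t] straight to x[t + k]."""
    a, b = fit_linear(series[:-k], series[k:])
    return a * series[-1] + b
```

The recursive method reuses one model and so compounds its one-step errors, while the direct method needs one model per horizon k. Both functions take a plain list of floats and a horizon `k >= 1`.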
Structurally Adaptive Modular Networks for Non-Stationary Environments
- IEEE Transactions on Neural Networks
Cited by 18 (5 self)
This paper introduces a neural network capable of dynamically adapting its architecture to realize time variant non-linear input-output maps. This network has its roots in the mixture of experts framework but uses a localized model for the gating network. Modules or experts are grown or pruned depending on the complexity of the modeling problem. The structural adaptation procedure addresses the model selection problem and typically leads to much better parameter estimation. Batch mode learning equations are extended to obtain on-line update rules enabling the network to model time varying environments. Simulation results are presented throughout the paper to support the proposed techniques. This research was supported in part by ARO contracts DAAH04-94-G-0417 and 04-95-10494 and NSF grant ECS 9307632.
Mixture of Experts Regression Modeling by Deterministic Annealing
- IEEE Transactions on Signal Processing, 1997
Cited by 18 (3 self)
We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) approaches, we directly minimize the (squared) regression error. We use the probabilistic framework as means to define an optimization method that avoids many shallow local minima on the complex cost surface. Our method is based on deterministic annealing (DA), where the entropy of the system is gradually reduced, with the expected regression cost (energy) minimized at each entropy level. The corresponding Lagrangian is the system's "free-energy," and this annealing process is controlled by variation of the Lagrange multiplier, which acts as a "temperature" parameter. The new method consistently and substantially outperformed the com...
Nonlinear trading models through Sharpe Ratio maximization
- International Journal of Neural Systems, 1997
Cited by 15 (1 self)
www.stern.nyu.edu / aweigend While many trading strategies are based on price prediction, traders in financial markets are typically interested in risk-adjusted performance such as the Sharpe Ratio, rather than price predictions themselves. This paper introduces an approach which generates a nonlinear strategy that explicitly maximizes the Sharpe Ratio. It is expressed as a neural network model whose output is the position size between a risky and a risk-free asset. The iterative parameter update rules are derived and compared to alternative approaches. The resulting trading strategy is evaluated and analyzed on both computer-generated data and real world data (DAX, the daily German equity index). Trading based on Sharpe Ratio maximization compares favorably to both profit optimization and probability matching (through cross-entropy optimization). The results show that the goal of optimizing out-of-sample risk-adjusted profit can be achieved with this nonlinear approach.
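As a rough illustration of the quantity being maximized, the sketch below computes a Sharpe Ratio for a strategy whose position size splits capital between a risky and a risk-free asset. It assumes the common per-period definition (mean excess return divided by the population standard deviation of excess returns); the abstract does not state the paper's exact formula, and the function names and sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def strategy_returns(positions, risky_returns, risk_free_rate=0.0):
    """Per-period return when a fraction p is held in the risky asset
    and the remainder (1 - p) earns the risk-free rate (assumed setup)."""
    return [
        p * r + (1.0 - p) * risk_free_rate
        for p, r in zip(positions, risky_returns)
    ]


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return over its standard deviation (assumed definition,
    per period and not annualized)."""
    excess = [r - risk_free_rate for r in returns]
    spread = pstdev(excess)
    if spread == 0.0:
        raise ValueError("Sharpe Ratio is undefined for constant returns")
    return mean(excess) / spread


# Hypothetical data: two periods, fully invested in the risky asset.
example = sharpe_ratio(strategy_returns([1.0, 1.0], [0.01, 0.03]))
```

With a zero risk-free rate, scaling every position by the same constant leaves this ratio unchanged, which is one reason it differs from plain profit as a training objective.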
Calibrated probabilistic forecasting at the Stateline wind energy center: The regime-switching space-time (RST) method
- Journal of the American Statistical Association, 2004
Cited by 14 (10 self)
With the global proliferation of wind power, accurate short-term forecasts of wind resources at wind energy sites are becoming paramount. Regime-switching space-time (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind speed and wind power. The model formulation is parsimonious, yet takes account of all the salient features of wind speed: alternating atmospheric regimes, temporal and spatial correlation, diurnal and seasonal non-stationarity, conditional heteroscedasticity, and non-Gaussianity. The RST method identifies forecast regimes at the wind energy site and fits a conditional predictive model for each regime. Geographically dispersed meteorological observations in the vicinity of the wind farm are used as off-site predictors. The RST technique was applied to 2-hour ahead forecasts of hourly average wind speed at the Stateline wind farm in the US Pacific Northwest. In July 2003, for instance, the RST forecasts had root-mean-square error (RMSE) 28.6 % less than the persistence forecasts. For each month in the test period, the RST forecasts had lower RMSE than forecasts using state-of-the-art vector time series techniques. The RST method provides probabilistic forecasts in the form of
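The comparison against persistence forecasts reported in this abstract can be sketched as follows. This is a minimal illustration, assuming that the persistence forecast for time t + lead is simply the observation at time t and that the improvement is expressed as one minus the ratio of the two RMSE values; the function names and any data passed in are hypothetical, not taken from the paper.

```python
from math import sqrt


def rmse(forecasts, observations):
    """Root-mean-square error between paired forecasts and observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sqrt(sum(squared) / len(squared))


def persistence_forecasts(observations, lead):
    """Persistence baseline: the forecast for t + lead is the value at t."""
    return observations[:-lead]


def rmse_reduction(model_forecasts, observations, lead):
    """Fractional RMSE reduction of a model relative to persistence.
    model_forecasts must align with observations[lead:]."""
    targets = observations[lead:]
    baseline = rmse(persistence_forecasts(observations, lead), targets)
    return 1.0 - rmse(model_forecasts, targets) / baseline
```

Under this definition, a value of 0.286 would correspond to an RMSE that is 28.6% lower than that of the persistence forecasts.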

