### Citations

6580 | Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context: ...in the training process. For each new added node, the activation function center and the output connection weight are decided according to an extended chained version of the Nadaraya–Watson estimator (Bishop, 1995; Schioler & Hartmann, 1992; Specht, 1990). Then, the diffusion parameters of the kernel functions are determined by an empirical-risk-driven rule based on a genetic-like optimization technique (Carozz...
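The Nadaraya–Watson estimator referenced in this context is a kernel-weighted average of training targets. A minimal sketch follows; the Gaussian kernel and the bandwidth `h` are illustrative assumptions, not the "extended chained version" used in the cited paper:

```python
import numpy as np

def nadaraya_watson(x0, X, y, h=1.0):
    """Nadaraya-Watson estimate at query point x0 (1-D array of features).

    Illustrative sketch: Gaussian kernel, fixed bandwidth h.
    """
    # Gaussian kernel weights centred on the query point x0
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2.0 * h * h))
    # Kernel-weighted average of the targets; guard against all-zero weights
    return float(np.dot(w, y) / max(w.sum(), 1e-300))
```

With two symmetric training points, the estimate at their midpoint is their mean, and a small bandwidth makes the estimate follow the nearest point.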

3453 | UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
- Blake, Merz
- 1998
Citation Context: ...1)e^(−|A_i|β))^(1−d(x_i, x'_i)) (35), where d(x, y) = 1 if x = y and 0 otherwise, |A_i| is the size of A_i, and A_i is the set of values of the ith attribute. Experiments are performed on 3 of 5 UCI data sets (Blake & Merz, 1998) used in previous papers to assess the diffusion kernel. Each dataset is randomly divided into training (80%) and test (20%) data. A summary of dataset information is reported in Table 1. According...
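The experimental protocol in this context (each dataset randomly divided into 80% training and 20% test data) can be sketched as follows; the function name and the fixed seed are illustrative assumptions, the proportions come from the text:

```python
import numpy as np

def split_80_20(X, y, seed=0):
    """Random 80%/20% train/test split of a dataset (illustrative sketch)."""
    # Shuffle indices reproducibly, then cut at the 80% mark
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.8 * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]
```

The split is a partition: every sample lands in exactly one of the two subsets.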

2803 | Learning with Kernels
- Schölkopf, Smola
- 2002
Citation Context: ...ng suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algorithms (Aizerman, Braverman, & Rozonoér, 1964; Boser, Guyon, & Vapnik, 1992; Kimeldorf & Wahba, 1971; Schölkopf & Smola, 2002) is to express the similarities...

1857 | A training algorithm for optimal margin classifiers
- Boser, Guyon, et al.
- 1992
Citation Context: ..., 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algorithms (Aizerman, Braverman, & Rozonoér, 1964; Boser, Guyon, & Vapnik, 1992; Kimeldorf & Wahba, 1971; Schölkopf & Smola, 2002) is to express the similarities...

500 | Convolution kernels on discrete structures
- Haussler
- 1999
Citation Context: ...vectorial representation, and the concepts of ‘similarity’ and ‘distance’, well defined in Euclidean spaces, become quite fuzzy. Starting from the convolution kernel idea introduced by Haussler (Haussler, 1999), recent works (Hammer & Jain, 2004; Kondor & Lafferty, 2002; Nicotra, Micheli, & Starita, 2004; Tan & Wang, 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discret...

366 | Probabilistic neural networks
- Specht
- 1990
Citation Context: ...For each new added node, the activation function center and the output connection weight are decided according to an extended chained version of the Nadaraya–Watson estimator (Bishop, 1995; Schioler & Hartmann, 1992; Specht, 1990). Then, the diffusion parameters of the kernel functions are determined by an empirical-risk-driven rule based on a genetic-like optimization technique (Carozza & Rampone, 1999, 2001). We also conside...

220 | Diffusion kernels on graphs and other discrete input spaces
- Kondor, Lafferty
- 2002
Citation Context: ...‘similarity’ and ‘distance’, well defined in Euclidean spaces, become quite fuzzy. Starting from the convolution kernel idea introduced by Haussler (Haussler, 1999), recent works (Hammer & Jain, 2004; Kondor & Lafferty, 2002; Nicotra, Micheli, & Starita, 2004; Tan & Wang, 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algo...
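For categorical data, the Kondor–Lafferty construction can model each attribute as a complete graph over its value set, for which the heat kernel exp(−βL) has a closed form: the diagonal entry is (1 + (n−1)e^(−nβ))/n and the off-diagonal entry is (1 − e^(−nβ))/n for n values. A minimal sketch follows; the product-over-attributes combination and the 1/n normalisation are assumptions, and the cited papers may normalise differently:

```python
import math

def complete_graph_diffusion_kernel(x, x_prime, sizes, beta=0.5):
    """Diffusion kernel for categorical vectors (illustrative sketch).

    sizes[i] is |A_i|, the number of values of attribute i; beta is the
    diffusion parameter. Each attribute is modelled as a complete graph.
    """
    k = 1.0
    for a, b, n in zip(x, x_prime, sizes):
        e = math.exp(-n * beta)
        # Closed form of exp(-beta*L) on K_n: diagonal vs off-diagonal entry
        k *= (1 + (n - 1) * e) / n if a == b else (1 - e) / n
    return k
```

Matching values always contribute a larger factor than mismatching ones, and the kernel factorises over attributes.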

63 | On the relationship between generalization error, hypothesis complexity, and sample complexity for radial basis functions - Niyogi, Girosi - 1996

53 | Learning kernels from biological networks by maximizing entropy - Tsuda, Noble - 2004

50 | Graph-driven feature extraction from microarray data using diffusion kernels and kernel CCA
- Vert, Kanehisa
- 2002
Citation Context: ...tly, the use of discrete diffusion kernels (Kondor & Lafferty, 2002) has been suggested for graph-structured data. They have been successfully applied to making predictions from biological networks (Vert & Kanehisa, 2003). While such kernels appear appropriate for expressing similarities between graph-structured data, they highlight the difficulty of correctly modulating the associated diffusion parameter (Tsuda & St...

37 | Some results on Tchebycheffian spline functions
- Kimeldorf, Wahba
- 1971
Citation Context: ...ectly motivated by applying suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algorithms (Aizerman, Braverman, & Rozonoér, 1964; Boser, Guyon, & Vapnik, 1992; Kimeldorf & Wahba, 1971; Schölkopf & Smola, 2002) is to express the similarities...

19 | Mapping neural networks derived from the Parzen window estimator
- Schiøler, Hartmann
- 1992
Citation Context: ...process. For each new added node, the activation function center and the output connection weight are decided according to an extended chained version of the Nadaraya–Watson estimator (Bishop, 1995; Schioler & Hartmann, 1992; Specht, 1990). Then, the diffusion parameters of the kernel functions are determined by an empirical-risk-driven rule based on a genetic-like optimization technique (Carozza & Rampone, 1999, 2001). W...

16 | A support vector machine with a hybrid kernel and minimal Vapnik–Chervonenkis dimension
- Tan, Wang
- 2004
Citation Context: ..., become quite fuzzy. Starting from the convolution kernel idea introduced by Haussler (Haussler, 1999), recent works (Hammer & Jain, 2004; Kondor & Lafferty, 2002; Nicotra, Micheli, & Starita, 2004; Tan & Wang, 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algorithms (Aizerman, Braverman, & Rozonoér, 1964; Boser,...

12 | Neural methods for nonstandard data
- Hammer, Jain
- 2004
Citation Context: ...the concepts of ‘similarity’ and ‘distance’, well defined in Euclidean spaces, become quite fuzzy. Starting from the convolution kernel idea introduced by Haussler (Haussler, 1999), recent works (Hammer & Jain, 2004; Kondor & Lafferty, 2002; Nicotra, Micheli, & Starita, 2004; Tan & Wang, 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discrete, categorical data. The common idea...

5 | Codes Interpolation from Noisy Patterns by Means of a Vector Quantization
- Rampone
- 1995
Citation Context: ...risk of net_M with b_j ∈ I_k, k ≠ m. Then the expected number of drawings in I necessary to acquire at least a value in I_m is n log n (25). Proof: see Feller (1970), example IX.3.d, and (Carozza & Rampone, 2001; Rampone, 1995). 4.2. Kernel nodes. The average error of the net_M estimator, the expected risk (Niyogi & Girosi, 1996), is unknown because our only source of information is the data set (11). So, we approximate the expe...

4 | Function approximation from noisy data by an incremental RBF network
- Carozza, Rampone
- 1999
Citation Context: ...the difficulty of correctly modulating the associated diffusion parameter (Tsuda & Stafford Noble, 2004). In this paper, we combine the diffusion kernels with an incremental network-based estimator (Carozza & Rampone, 1999, 2001; Chentouf, Jutten, Maignan, & Kanevski, 1997). Diffusion kernel nodes are iteratively added in the training process. For each new added node, the activation function center and the output conne...

4 | Incremental neural networks for function approximation
- Chentouf, Jutten, et al.
- 1997
Citation Context: ...modulating the associated diffusion parameter (Tsuda & Stafford Noble, 2004). In this paper, we combine the diffusion kernels with an incremental network-based estimator (Carozza & Rampone, 1999, 2001; Chentouf, Jutten, Maignan, & Kanevski, 1997). Diffusion kernel nodes are iteratively added in the training process. For each new added node, the activation function center and the output connection weight are decided according to an extended c...

4 | An introduction to probability theory and its applications (3rd ed.)
- Feller
- 1968
Citation Context: ...irical risk of net_M with b_j ∈ I_m, and E′_M the empirical risk of net_M with b_j ∈ I_k, k ≠ m. Then the expected number of drawings in I necessary to acquire at least a value in I_m is n log n (25). Proof: see Feller (1970), example IX.3.d, and (Carozza & Rampone, 2001; Rampone, 1995). 4.2. Kernel nodes. The average error of the net_M estimator, the expected risk (Niyogi & Girosi, 1996), is unknown because our only source of info...

2 | An incremental multivariate regression method for function approximation from noisy data
- Carozza, Rampone
- 2001
Citation Context: ...and E′_M the empirical risk of net_M with b_j ∈ I_k, k ≠ m. Then the expected number of drawings in I necessary to acquire at least a value in I_m is n log n (25). Proof: see Feller (1970), example IX.3.d, and (Carozza & Rampone, 2001; Rampone, 1995). 4.2. Kernel nodes. The average error of the net_M estimator, the expected risk (Niyogi & Girosi, 1996), is unknown because our only source of information is the data set (11). So, we appro...

2 | Fisher kernel for tree structured data
- Nicotra, Micheli, et al.
- 2004
Citation Context: ...well defined in Euclidean spaces, become quite fuzzy. Starting from the convolution kernel idea introduced by Haussler (Haussler, 1999), recent works (Hammer & Jain, 2004; Kondor & Lafferty, 2002; Nicotra, Micheli, & Starita, 2004; Tan & Wang, 2004; Watkins, 1999) were directly motivated by applying suitable kernel methods to discrete, categorical data. The common idea behind kernel-based algorithms (Aizerman, Braverman, & Rozo...


1 | Carozza, Rampone / Neural Networks 18 (2005) 1087–1092 — table fragment: Breast cancer 9 10 — Hamming distance LMC error 7.44% (206 SV); diffusion kernel LMC error 3.70% (43 SV); diffusion kernel CRM error 4.39% (49 ...

1 | On using a Support Vector Machine in Learning Feed-Forward Control
- de Kruif, de Vries
- 2001
Citation Context: ...also involves a significant error reduction. The use of diffusion kernels combined with an incremental learning algorithm tends towards an iterative learning mechanism for Support Vector Machines (SVM) (de Kruif & de Vries, 2001): SVM tends to prune a set of support vectors during parameter optimization, while our model grows a small set of kernel nodes related to generalization ability. As the number of centers grows...