V. Peltonen, J. Tuomi, A. Klapuri, J. Huopaniemi, T. Sorsa. "Computational auditory scene recognition". In Proc. IEEE ICASSP 2001.
....(BER) is defined as the ratio of the energy at a certain frequency band to the total energy. Thus the BER for the ith subband in time frame k is: $C_i(k) = \frac{\sum_{n \in S_i} |X_k(n)|^2}{\sum_{n=0}^{M} |X_k(n)|^2}$ (1) where $S_i$ is the set of Fourier transform coefficients belonging to the ith subband [7]. Feature vectors are extracted from the preprocessed signal. Human auditory perception does not operate on a linear frequency scale. Therefore we apply a filter bank consisting of triangular filters spaced uniformly on the mel scale. An approximation between a frequency value in Hertz and in mel ....
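The BER definition quoted above can be illustrated with a minimal sketch. It assumes the power spectrum is taken from a plain DFT of one frame and that a subband is given as a set of bin indices; the function names `power_spectrum` and `band_energy_ratio`, the test signal, and the two band ranges are hypothetical choices for illustration and do not come from the cited papers. The mel-spaced triangular filter bank is not modeled here.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum |X(n)|^2 for bins 0..N//2 (O(N^2), sketch only)."""
    n_samples = len(frame)
    spectrum = []
    for n in range(n_samples // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * n * t / n_samples)
            for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def band_energy_ratio(frame, band_bins):
    """Energy in the bins of one subband divided by the total energy of the frame."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # assumption: a silent frame gets a ratio of zero
    return sum(power[n] for n in band_bins) / total


# Hypothetical test signal: a pure tone at bin 4 of a 32-sample frame.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
low_band = range(0, 8)    # bins 0..7 (illustrative band edges)
high_band = range(8, 17)  # bins 8..16
ber_low = band_energy_ratio(frame, low_band)
ber_high = band_energy_ratio(frame, high_band)
```

Because the two illustrative bands partition all bins, their ratios sum to one, and almost all of the energy of the tone falls into the low band.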
Peltonen, V., Computational Auditory Scene Recognition. In Proc. International Conference on Acoustics, Speech, and Signal Processing, Orlando, Florida, May 2002.
....periodicity measurement is performed in the residual signal after preprocessing with a sinusoidal model. The band energy ratio (BER) feature was used to model the signal's rough spectral energy distribution. BER is defined as the ratio of the energy at a certain frequency band to the total energy [3]. Since human auditory perception does not operate on a linear frequency scale, we apply a filter bank consisting of triangular filters spaced uniformly on the mel scale. At each frequency band, an autocorrelation function (ACF) is calculated over the BER values within a three-second-long sliding ....
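The autocorrelation step described in the excerpt above could look roughly like the following sketch. The excerpt gives only the three-second window length, so everything else here is an assumption: the frame rate of 10 BER values per second, the hop size, the mean removal, and the normalization by the window energy. The names `autocorrelation` and `sliding_acf` are hypothetical.

```python
import math


def autocorrelation(values, max_lag):
    """Normalized autocorrelation of a mean-removed sequence for lags 0..max_lag."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    energy = sum(c * c for c in centered)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)  # constant window: no periodicity to report
    return [
        sum(centered[t] * centered[t + lag] for t in range(len(centered) - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def sliding_acf(ber_track, window_len, hop, max_lag):
    """ACF of each window of BER values for one frequency band."""
    return [
        autocorrelation(ber_track[start:start + window_len], max_lag)
        for start in range(0, len(ber_track) - window_len + 1, hop)
    ]


# Hypothetical BER track for one band: a modulation repeating every 10 frames.
ber_track = [0.5 + 0.4 * math.sin(2 * math.pi * t / 10) for t in range(60)]
# Assumed 10 BER values per second, so a three-second window holds 30 values.
acfs = sliding_acf(ber_track, window_len=30, hop=10, max_lag=15)
first = acfs[0]
```

In this toy example the ACF has a local peak at lag 10, the modulation period in frames, which is the kind of periodicity cue such a feature is meant to expose.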
Peltonen, V., Computational Auditory Scene Recognition. In Proc. ICASSP, Orlando, Florida, May 2002.
V. Peltonen, J. Tuomi, A. Klapuri, J. Huopaniemi, T. Sorsa. "Computational auditory scene recognition". In Proc. IEEE ICASSP 2001.
....3. FEATURES. Ten fundamental acoustic features were investigated for classification of auditory scenes. In addition, the variance and delta features of the basic features were also studied. We provide here a very short description of each feature; more detailed descriptions can be found in [9]. The features are grouped into three categories according to their processing domain. Time domain features: Zero crossing rate (ZCR) is defined as the number of zero voltage crossings within a frame. Short time average energy is the energy of a frame. Frequency domain features: Let ....
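The two time domain features named in the excerpt above can be sketched in a few lines. This is an illustration under stated assumptions, not the cited authors' implementation: a zero crossing is counted whenever consecutive samples differ in sign (with zero treated as non-negative), and "average energy" is read as the mean of the squared samples. The function names and the example frame are hypothetical.

```python
def zero_crossing_count(frame):
    """Number of sign changes between consecutive samples of one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def short_time_energy(frame):
    """Mean of the squared samples of one frame (assumed reading of 'average energy')."""
    return sum(x * x for x in frame) / len(frame)


# Hypothetical frame of eight samples.
frame = [1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
zcr = zero_crossing_count(frame)
energy = short_time_energy(frame)
```

For this frame the sign changes five times and every sample has unit magnitude, so the mean squared value is 1.0.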
V. Peltonen, "Computational auditory scene recognition," M.Sc. thesis, Tampere University of Tech., Finland, 2001.
Peltonen. (2000). "Computational Auditory Scene Recognition". MSc thesis, Tampere University of Technology, Department of Information Technology, August 2001.
V. Peltonen, J. Tuomi, A. Klapuri, J. Huopaniemi, and T. Sorsa. Computational auditory scene recognition. In IEEE Int'l Conf. on Acoustics, Speech, and Signal Processing, volume 2, pages 1941.