Results 11 - 20 of 30
Multipitch Estimation And Sound Separation By The Spectral Smoothness Principle
2001
Cited by 26 (6 self)
A processing principle is proposed for finding the pitches and separating the spectra of concurrent musical sounds. The principle, spectral smoothness, is used in the human auditory system, which separates sounds partly by assuming that the spectral envelopes of real sounds are continuous. Both theoretical and experimental evidence is presented for the vital importance of spectral smoothness in resolving sound mixtures. Three algorithms of varying complexity are described which successfully implement the new principle. In validation experiments, random pitch and sound source combinations were analyzed in a single time frame. The number of simultaneous sounds ranged from one to six, and the database comprised sung vowels and 26 musical instruments. Using a specific yet straightforward smoothing operation corrected approximately half of the pitch errors that occurred in a system which was otherwise identical but did not use the smoothness principle. In random four-voice mixtures, pitch error rate...
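As a rough illustration of the smoothness idea in the abstract above, the following sketch limits each harmonic amplitude to the mean of its neighbours, so that a partial inflated by an overlapping sound is pulled back toward a continuous envelope. The function name `smooth_partials`, the `window` parameter, and the min-of-local-mean rule are assumptions made for this example; they are not claimed to be any of the three algorithms the paper describes.

```python
def smooth_partials(amplitudes, window=3):
    """Limit each harmonic amplitude to the mean of its neighbourhood.

    Hypothetical smoothing rule for illustration only: a partial that
    sticks out above its neighbours is assumed to contain energy from
    another sound and is reduced to the local mean.
    """
    half = window // 2
    smoothed = []
    for i, amplitude in enumerate(amplitudes):
        lo = max(0, i - half)
        hi = min(len(amplitudes), i + half + 1)
        local_mean = sum(amplitudes[lo:hi]) / (hi - lo)
        # Never raise an amplitude, only cap it at the local mean.
        smoothed.append(min(amplitude, local_mean))
    return smoothed


# Harmonic amplitudes of one candidate sound; the third partial is
# inflated, as if it coincided with a partial of another sound.
partials = [1.0, 0.8, 3.0, 0.5, 0.4]
smoothed = smooth_partials(partials)
```

In this example only the amplitudes that exceed their local mean are reduced; partials that already lie on or below the smooth envelope are left unchanged.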
Organization of Hierarchical Perceptual Sounds: Music Scene Analysis with Autonomous Processing Modules and a Quantitative Information Integration Mechanism
Proc. International Joint Conf. on Artificial Intelligence, 1995
Cited by 16 (1 self)
We propose a process model for hierarchical perceptual sound organization, which recognizes perceptual sounds included in incoming sound signals. We consider perceptual sound organization as a scene analysis problem in the auditory domain. Our model consists of multiple processing modules and a hypothesis network for quantitative integration of multiple sources of information. When input information for a processing module becomes available, the module activates to process it and asynchronously writes its output to the hypothesis network. On the hypothesis network, the individual pieces of information are integrated and an optimal internal model of perceptual sounds is automatically constructed. Based on the model, a music scene analysis system has been developed for acoustic signals of ensemble music, which recognizes rhythm, chords, and source-separated musical notes. Experimental results show that our method permits autonomous, stable and effective information integration to construct the internal model of hierarchical perceptual sounds.
A Probabilistic Model for the Transcription of Single-Voice Melodies
Tampere University of Technology, 2003
Cited by 16 (3 self)
A method is proposed for the automatic transcription of single-voice melodies from an acoustic waveform into a symbolic musical notation (a MIDI file). The system consists of a signal processing front-end which calculates a continuous pitch track and of a probabilistic model which converts the pitch track into a discrete musical notation. Our proposed probabilistic model consists of three parts operating in parallel: a pitch trajectory model, a musicological model, and a duration model. The first handles imperfections in the performed/estimated pitch values using a hidden Markov model, the second estimates the musical key signature to improve the transcription accuracy, and the last models the duration of the notes.
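To make the step from a continuous pitch track to discrete notes more concrete, here is a minimal Viterbi sketch in which the hidden states are candidate MIDI notes. It covers only the general idea of a hidden-Markov pitch trajectory model; the Gaussian-shaped emission score, the fixed `switch_cost`, and all names are assumptions chosen for illustration, and the musicological and duration models mentioned in the abstract are not represented.

```python
def viterbi_notes(pitch_track, notes, sigma=0.5, switch_cost=4.0):
    """Decode the most likely note sequence for a pitch track.

    pitch_track: estimated pitches in (fractional) MIDI note numbers.
    notes: candidate integer MIDI notes (the hidden states).
    sigma, switch_cost: hypothetical parameters, not from the paper.
    """
    def emit(note, pitch):
        # Log-score of observing `pitch` while `note` is sounding.
        return -0.5 * ((pitch - note) / sigma) ** 2

    def step(prev, cur):
        # Staying on a note is free; changing note pays a fixed penalty.
        return 0.0 if prev == cur else -switch_cost

    score = {n: emit(n, pitch_track[0]) for n in notes}
    back = []
    for pitch in pitch_track[1:]:
        new_score, pointers = {}, {}
        for n in notes:
            best_prev = max(notes, key=lambda p: score[p] + step(p, n))
            new_score[n] = score[best_prev] + step(best_prev, n) + emit(n, pitch)
            pointers[n] = best_prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(notes, key=lambda n: score[n])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


track = [60.1, 59.8, 61.0, 60.2, 62.1, 61.9, 62.0]
decoded = viterbi_notes(track, notes=list(range(58, 65)))
```

With these example values, the isolated 61.0 frame is absorbed into the surrounding note 60 because switching notes twice costs more than one poorly matching frame, whereas the sustained move to about 62 is decoded as a new note.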
Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps
2007
Cited by 15 (2 self)
We provide a new solution to the problem of feature variations caused by the overlapping of sounds in instrument identification in polyphonic music. When multiple instruments play simultaneously, partials (harmonic components) of their sounds overlap and interfere, which makes the acoustic features different from those of monophonic sounds. To cope with this, we weight features based on how much they are affected by overlapping. First, we quantitatively evaluate the influence of overlapping on each feature as the ratio of the within-class variance to the between-class variance in the distribution of training data obtained from polyphonic sounds. Then, we generate feature axes using a weighted mixture that minimizes the influence via linear discriminant analysis. In addition, we improve instrument identification using musical context. Experimental results showed that the recognition rates using both feature weighting and musical context were 84.1% for duos, 77.6% for trios, and 72.3% for quartets; those without using either were 53.4%, 49.6%, and 46.5%, respectively.
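The influence measure described in this abstract, the ratio of within-class to between-class variance for a single feature, can be sketched as follows. This is only an illustration under stated assumptions: the function name `overlap_influence`, the unweighted average over classes, and the toy data are all hypothetical, and the subsequent weighting and linear discriminant analysis steps are not shown.

```python
from statistics import fmean, pvariance


def overlap_influence(samples_by_class):
    """Within-class / between-class variance ratio for one feature.

    samples_by_class maps an instrument label to the values that one
    feature takes in training data. A larger ratio means the classes
    are harder to tell apart with this feature. Classes are averaged
    without weighting, which assumes roughly equal class sizes, and
    at least two classes with different means are required.
    """
    values = list(samples_by_class.values())
    within = fmean([pvariance(v) for v in values])
    between = pvariance([fmean(v) for v in values])
    return within / between


# Toy data: one feature that separates the classes well, and one whose
# values are spread out (as overlapping partials might cause).
stable = {"piano": [1.0, 1.2, 0.8], "flute": [5.0, 5.2, 4.8]}
smeared = {"piano": [1.0, 3.0, 2.0], "flute": [2.5, 1.5, 3.5]}

stable_ratio = overlap_influence(stable)
smeared_ratio = overlap_influence(smeared)
```

Here the `smeared` feature yields a much larger ratio than the `stable` one, so a weighting scheme along the lines the abstract describes would rely on it less.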
Data Reprocessing in Signal Understanding Systems
1996
Cited by 14 (2 self)
Data Reprocessing in Signal Understanding Systems. September 1996. Frank I. Klassner, III: B.S., University of Scranton; M.S., University of Massachusetts Amherst; Ph.D., University of Massachusetts Amherst. Directed by: Professor Victor R. Lesser.
Signal understanding systems have the difficult task of interpreting environmental signals: decomposing them and explaining their components in terms of an arbitrary number of instances of perceptual object categories whose properties can interact with one another. This dissertation addresses the problem of designing blackboard-based perceptual systems for interpreting signals from complex environments. A "complex environment" is one that can (1) produce signal-to-noise ratios that vary unpredictably over time, and (2) contain perceptual objects that mutually interfere with each other's signal signatures, or have arbitrary time-dependent behaviors. The traditional design paradigm for perceptual systems assumes that some particular set of ...
A Multiresolution Time-Frequency Analysis And Interpretation Of Musical Rhythm
1999
Cited by 10 (0 self)
This thesis describes an approach to representing musical rhythm in computational terms. The purpose of such an approach is to provide better models of musical time for machine accompaniment of human musicians and, in that attempt, to better understand the processes behind human perception and performance. The intersections between musicology and artificial intelligence (AI) are reviewed, describing the rewards from the interdisciplinary study of music with AI techniques, and the converse benefits to AI research. The arguments for formalisation of musicological theories using AI and cognitive science concepts are presented. These bear upon the approach of research, considering ethnographic and process models of music versus traditionally descriptive methods of music study. This enquiry investigates the degree to which the human task of music can be studied and modelled computationally. It simultaneously performs the AI task of problem domain identification and constraint. The psycholo...
Computer Music Analysis
1998
Cited by 6 (3 self)
Computer music analysis is investigated, with specific reference to the current research fields of automatic music transcription, human music perception, pitch determination, note and stream segmentation, score generation, time-frequency analysis techniques, and musical grammars. Human music perception is investigated from two perspectives: the computational model perspective desires an algorithm that perceives the same things that humans do, regardless of how the program accomplishes this, and the physiological model perspective desires an algorithm that models exactly how humans perceive what they perceive.
A predominant-f0 estimation method for real-world musical audio signals: MAP estimation for incorporating prior knowledge about f0s and tone models
in Proc. Workshop on Consistent and ..., 2001
Cited by 5 (0 self)
In this paper we describe a robust method, called PreFEst, for estimating the fundamental frequency (F0) of melody and bass lines in monaural audio signals containing sounds of various instruments. Most previous F0-estimation methods have difficulty dealing with such complex audio signals because they are designed for mixtures of only a few sounds. Without assuming the number of sound sources, PreFEst can obtain the most predominant F0 (corresponding to the melody or bass line) supported by harmonics within an intentionally limited frequency range. It estimates the relative dominance of every possible F0 (represented as a probability density function of the F0) and the shape of harmonic-structure tone models by using MAP (Maximum A Posteriori Probability) estimation that takes their prior distribution into account. Experimental results showed that a real-time system implementing this method is robust enough to detect the melody and bass lines in compact-disc recordings.
Sound Scene Segmentation by Dynamic Detection of Correlogram Comodulation
the International Joint Conference on AI Workshop on Computational Auditory Scene Analysis, 1999
Cited by 5 (2 self)
A new technique for sound-scene analysis is presented. This technique operates by discovering common modulation behavior among groups of frequency subbands in the autocorrelogram domain. The analysis is conducted by first analyzing the autocorrelogram to estimate the amplitude modulation and period modulation of each channel of data at each time step, and then using dynamic clustering techniques to group together channels with similar modulation behavior. Implementation details of the analysis technique are presented, and its performance is demonstrated on a test sound.
Joint Detection and Tracking of Time-Varying Harmonic Components: a Flexible Bayesian Approach
in IEEE Transactions on Speech, Audio and Language Processing, 2006
Cited by 5 (0 self)
This paper addresses the joint estimation and detection of time-varying harmonic components in audio signals. We follow a flexible viewpoint, where several frequency/amplitude trajectories are tracked in the spectrogram using particle filtering. The core idea is that each harmonic component (composed of a fundamental partial together with several overtone partials) is considered a target. Tracking requires defining a state-space model with state transition and measurement equations. Particle filtering algorithms rely on a so-called sequential importance distribution, and we show that it can be built on previous multipitch estimation algorithms, so as to yield an even more efficient estimation procedure with established convergence properties. Moreover, as our model captures all the harmonic model information, it actually separates the harmonic sources. Simulations on synthetic and real music data illustrate the value of our approach.

