Room 410, EEE Building. Velasquez Street.
University of the Philippines,
Diliman, Quezon City 1101.

Telephone: 981-8500 loc 3370
E-mail: dsp@eee.upd.edu.ph


 
 
THE AUDIO GROUP
   
 

 

Although CDs have been in use for nearly twenty years, audio has remained mostly analog. Until now, digital music on the CD has been converted to analog inside the CD player and taken through analog signal processing and amplification before being applied to the speakers. But with the recent increase in such digital infrastructure as the Internet, digital networks in buildings and automobiles, and wireless digital communications, along with new digital sound sources like MP3 and DV, an unprecedented opportunity to process digital audio data has been created. And increasingly, there will be a need to do that kind of processing.

 

 

PRESENT PROJECTS

 

Modal Distribution Analysis and Sum of Sinusoids Synthesis of Kulintang Musical Signals
 

:: Franklin C. Agsaway


:: Recent developments in time-frequency analysis have found applications in various fields, including music processing. The development of faster processors and less expensive memory has also made software-based synthesis more accessible to musicians. Analysis and synthesis software has already been developed for various musical instruments, yet few of these instruments are ethnic or traditional. The growth of ethnomusicology in Asia, particularly in the Philippines, creates a need for software-based synthesis of instruments like those in the Kulintang.


A Kulintang is an ensemble of musical instruments traditionally played by the peoples of Western Mindanao, particularly in Maguindanao and Lanao. A basic ensemble consists of a dabakan, a babandir, a kulintang, one or two agaongs, and four gandingans. The dabakan is a drum made of lizard or goat skin, while the rest are gongs, or sets of gongs, made of iron and bronze.


Recordings of Kulintang signals obtained in the Pseudo-Anechoic Audio Room of the UP DSP Laboratory were analyzed using time-frequency analysis. Amplitude and frequency parameters of the signals were obtained from the modal time-frequency distribution of the signals. Finally, synthesis of the signals was done using the obtained parameters in a sum of sinusoids synthesis refined with back-extended exponential decay synthesis. Real-time synthesis was implemented using the ADSP-21065L digital signal processor. The synthesized signals were then evaluated using listening tests.
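The sum of sinusoids step described above can be illustrated with a minimal sketch. It assumes each partial is a sinusoid with a plain exponential decay envelope, which is a simplification and not the project's back-extended exponential decay refinement. The function name, the partial values, and the 8 kHz sample rate are all hypothetical.

```python
import math

def synthesize_partials(partials, duration_s, sample_rate=44100):
    """Sum exponentially decaying sinusoids.

    partials: list of (frequency_hz, amplitude, decay_per_second) tuples.
    """
    n_samples = int(duration_s * sample_rate)
    signal = [0.0] * n_samples
    for freq, amp, decay in partials:
        for n in range(n_samples):
            t = n / sample_rate
            envelope = amp * math.exp(-decay * t)
            signal[n] += envelope * math.sin(2.0 * math.pi * freq * t)
    return signal

# Hypothetical partials for one gong stroke (illustrative values only,
# not measurements from the project).
gong_partials = [(220.0, 1.0, 3.0), (447.0, 0.5, 5.0), (683.0, 0.25, 8.0)]
tone = synthesize_partials(gong_partials, duration_s=0.5, sample_rate=8000)
```

In an analysis and synthesis workflow, the frequency, amplitude, and decay of each partial would come from the time-frequency analysis instead of being typed in by hand.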

 


 _____________________________________________________________________

 

Beat Recognition Using Neural Networks
 

:: Paolo Antonio C. Castro

:: Beat recognition is the process of inferring the beats from music. For a computer to be considered musically intelligent, it must first understand the beat and follow the tempo of the music as a human listener can. It is a big step in closing the gap between man and machine. Such computers can be used for automatic indexing of music into measures instead of seconds. Recognizing the beat can also aid in automatic music accompaniment and concert stage lighting control.
 

A beat recognition system was implemented starting with drum onset detection. Drum sounds in music are detected with an onset detection system utilizing a neural network. Features were extracted from the samples and used to train the network to recognize either a bass or a snare drum. Agents with tempo hypotheses are then formed from the detected drum onsets and the intervals between them. Each agent outputs its possible beats, and the beat from the highest-scoring agent is considered the correct beat.
 

A total of 120 songs were selected to train and test the system. Since the system requires a song to have a tempo between 80 and 160 beats per minute, a 30-second portion of each song that satisfies the required tempo was extracted. Of these samples, 80 were used for training, 20 for onset detection testing, and another 20 for beat detection testing. Fifteen of the onset detection testing samples were also used for beat detection testing, making a total of 35. The outputs for these 35 samples were compared to hand-segmented ground truths for evaluation. The current project achieved an overall beat recognition accuracy of 76.42%, while the previous project's system obtained 71.13% on the same test samples, an improvement of 5.29 percentage points.
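The idea of forming agents from onset intervals and picking the highest-scoring one can be sketched as follows. This is only an illustration under simple assumptions: each pair of onsets whose spacing falls in the 80 to 160 beats-per-minute range becomes one agent, and an agent's score is the number of onsets that land near its predicted beats. The function names and the 50 ms tolerance are hypothetical, and the project's actual scoring rule is not described in the text.

```python
def score_agent(period_s, phase_s, onsets, tolerance_s=0.05):
    """Count onsets that fall within tolerance of the agent's predicted beats."""
    hits = 0
    for onset in onsets:
        # Index of the predicted beat nearest to this onset.
        beat_index = round((onset - phase_s) / period_s)
        predicted = phase_s + beat_index * period_s
        if abs(onset - predicted) <= tolerance_s:
            hits += 1
    return hits

def best_tempo(onsets, min_bpm=80, max_bpm=160):
    """Form one agent per inter-onset interval; return the best tempo in BPM."""
    best = None
    for i in range(len(onsets)):
        for j in range(i + 1, len(onsets)):
            period = onsets[j] - onsets[i]
            if period <= 0:
                continue
            bpm = 60.0 / period
            if not (min_bpm <= bpm <= max_bpm):
                continue
            score = score_agent(period, onsets[i], onsets)
            if best is None or score > best[0]:
                best = (score, bpm)
    return None if best is None else best[1]

# Hypothetical onset times in seconds: a steady pulse plus one off-beat hit.
tempo = best_tempo([0.0, 0.5, 1.0, 1.5, 2.0, 2.25])
```

For the hypothetical onsets shown, the agent with a 0.5 s period explains five of the six onsets and wins over the 0.75 s agent, which explains only three.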

 


_____________________________________________________________________

 

A Multiple Audio Codec Interface to an ADSP-21065L SHARC Processor
 

:: Jaeson C. Paras


:: In this project, a multiple input-output interface between analog audio channels and the serial ports of an EZ-KIT Lite™ ADSP-21065L SHARC Evaluation Board was designed, implemented, and tested. The multiple analog audio channels are input to four stereo audio codecs. The audio codecs are responsible for the analog-to-digital conversion of the inputs and the digital-to-analog conversion of the processed output. The direct memory access (DMA) controller of the evaluation board controls the data transfers between the codecs and the digital signal processor (DSP).


The daughter board, containing four stereo audio codecs interfaced to the serial ports of a digital signal processor, was designed and implemented. The daughter board's footprint and layout were designed using OrCAD. Parts and devices were purchased, and fabrication was done through our friends at ASTEC Custom Power. Software was developed to configure the hardware for the appropriate tasks. Testing was done through the Analog Devices VisualDSP++ 2.0 emulation software and an industry-standard IEEE 1149.1 JTAG Test Emulation Access Port. Results have shown that the daughter board is fully functional and was able to input, process, and output digital audio data.

 


 

     
  PAST PROJECTS
         
  Digital Parametric Equalizer on an ADSP 21065L
  ::  Jerremeo Raynier T. Gabas
  ::  Equalizers are commonly implemented using analog circuits, but with the advent of high-speed digital signal processors, their capabilities can now be emulated using a digital algorithm. This project will implement a parametric equalizer on an ADSP-21065L that can be controlled through a graphical user interface on a personal computer.
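As an illustration of how one band of a digital parametric equalizer can be built, the sketch below computes a peaking biquad filter from a center frequency, a gain in dB, and a Q value. It assumes the widely used Audio EQ Cookbook peaking formulas; the text does not say which filter design the project uses, and the function names and the 48 kHz sample rate are hypothetical.

```python
import cmath
import math

def peaking_biquad(center_hz, gain_db, q, sample_rate):
    """Return normalized (b, a) coefficients for one peaking-EQ band."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    # Normalize so that a[0] == 1.
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_db_at(b, a, freq_hz, sample_rate):
    """Magnitude response of the biquad at one frequency, in dB."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))

def apply_biquad(b, a, samples):
    """Filter a sequence with the direct form I difference equation."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

# Hypothetical band: +6 dB boost at 1 kHz, Q of 1, 48 kHz sampling.
b, a = peaking_biquad(1000.0, 6.0, 1.0, 48000.0)
```

The three parameters map directly onto the controls a user would expect from a parametric equalizer, which is why a graphical interface only needs to send center frequency, gain, and Q for each band.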

 

   
   
  4-Channel Digital Automatic Microphone Mixer Using ADSP-21065L Digital Signal Processor
  ::  Ericson L. Machacon
  ::  A four-channel digital automatic microphone mixer will be implemented using the ADSP-21065L digital signal processor. The four-channel interface will be made possible by the daughter board developed by the University of the Philippines' Electrical and Electronics Engineering Department Digital Signal Processing Laboratory. A synthesis of the gain sharing algorithm and automatic voice detection shall be used to control common mixing problems such as acoustic feedback and comb filtering. A graphical user interface shall be developed for a user to control the 'automixer' using a personal computer.
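The gain sharing idea can be sketched in a few lines. This assumes one common formulation, in which each channel's gain equals its share of the total input level so that the gains always sum to one; the project's exact rule and its voice detection stage are not described here, and the function names and level values are hypothetical.

```python
def gain_sharing(levels, floor=1e-12):
    """Return one gain per channel, proportional to its share of total level."""
    total = sum(levels)
    if total <= floor:
        # No channel is active: split the gain evenly.
        return [1.0 / len(levels)] * len(levels)
    return [level / total for level in levels]

def automix(frame, levels):
    """Mix one sample per channel using the shared gains."""
    gains = gain_sharing(levels)
    return sum(g * x for g, x in zip(gains, frame))

# Hypothetical short-term levels for four microphones: one active talker,
# one quieter talker, two idle channels.
gains = gain_sharing([3.0, 1.0, 0.0, 0.0])
```

Because the gains sum to one regardless of how many microphones are open, the total system gain stays constant, which is what helps keep acoustic feedback under control.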


   
   
  A Real-time Room Acoustic Simulator Using the ADSP 21065L EZ-Kit Lite
  ::  Nur Ishmael H. Malonzo
  ::  There is a great demand for live performances everywhere. Considerable resources are spent to give desirable acoustics to their venues. A real-time room acoustic simulator is a cheaper and more dynamic alternative to actual physical manipulation in getting desired acoustical conditions. A significant part of producing the desired acoustics of a room is to reduce the effects of the natural acoustics of that room.

This project implements a real-time acoustic simulator and an adaptively trained acoustic equalizer. The simulator and equalizer are integrated as one working unit. The project used the time-divided impulse response method, a method developed by de Jesus et al., for the acoustic simulation part and a least mean square method for the acoustic equalization part.
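The least mean square (LMS) update itself can be illustrated on a simpler task than equalization. The sketch below adapts a short FIR filter until it matches an unknown response, which is system identification and not the project's equalizer; the tap count, step size, and the three-tap "room" response are hypothetical values chosen for the example.

```python
import random

def fir_filter(coeffs, samples):
    """Apply an FIR filter to a list of samples."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

def lms_identify(inputs, desired, num_taps, step_size):
    """Adapt FIR weights so the filter output tracks the desired signal."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent input first
    for x, d in zip(inputs, desired):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))
        error = d - y
        # LMS update: move each weight along the input, scaled by the error.
        weights = [w + step_size * error * h for w, h in zip(weights, history)]
    return weights

# Hypothetical three-tap response to be identified from white noise.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
room = [0.8, -0.3, 0.1]
measured = fir_filter(room, noise)
estimate = lms_identify(noise, measured, num_taps=3, step_size=0.1)
```

An adaptive equalizer uses the same update but is trained so that the cascade of room and filter approaches a flat response, instead of copying the room.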

The Real-time Room Acoustics Simulator was implemented on the ADSP 21065L EZ-Kit Lite. It is able to simulate a reverberation time of up to 0.3 seconds. With the integrated acoustic equalizer, it is able to equalize some reverberations. The project fell short of its original goal of fully adaptive equalization and did not demonstrate the perceptual effectiveness of the equalizer, but improvement in these aspects can be expected with further work.

This project was done using two ADSP 21065L EZ-Kit Lites and was coded entirely in ADSP 21065L assembly language to get the most out of the computational and memory resources of each board. 

 

       
       
    Real Time Karaoke Grader Using Modal Analysis Implemented on an ADI Sharc
    :: Jerremeo Raynier T. Gabas
    ::

Karaoke singing is one of the most common Filipino pastimes. In this project, the researcher will develop a real-time system that quantifies a person's singing quality by comparing the singer's pitch and timing with those of the vocal track. Modal Distribution is a time-frequency distribution that can be used to track the timing and pitch of the vocal track and the singer's voice. Modal distribution will be used to analyze and compare the vocal track and the singer's voice, thereby computing a "grade" for the singer.
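One way a "grade" could be computed from two pitch tracks is sketched below. It assumes the pitch of the singer and of the vocal track has already been estimated frame by frame, and it scores only pitch agreement, not timing. The 50-cent tolerance, the function names, and the convention that 0 Hz marks an unvoiced frame are all hypothetical and do not come from the project.

```python
import math

def cents_between(sung_hz, reference_hz):
    """Pitch difference in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(sung_hz / reference_hz)

def grade(sung_hz, reference_hz, tolerance_cents=50.0):
    """Percentage of voiced frames where the sung pitch is within tolerance."""
    voiced = [(s, r) for s, r in zip(sung_hz, reference_hz) if s > 0 and r > 0]
    if not voiced:
        return 0.0
    on_pitch = sum(
        1 for s, r in voiced if abs(cents_between(s, r)) <= tolerance_cents
    )
    return 100.0 * on_pitch / len(voiced)

# Hypothetical frames: the singer holds 440 Hz while the reference
# moves up a semitone in the second frame.
score = grade([440.0, 440.0], [440.0, 466.16])
```

Measuring the difference in cents makes the tolerance independent of register, so a low voice and a high voice are judged by the same musical interval.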

 

 

 

 


 

   
Simulation of Three-Dimensional Sound Using Two Channels
    :: Mishael Paul Atienza
    :: Conventional 3-D sound is done using at least three speakers placed around the listener; higher quality is achieved with an increasing number of speakers. However, this makes the system expensive as well as cumbersome, since more units have to be positioned accurately during setup.

The UP-DSP Lab has done an offline simulation of 3-D sound using only two channels, and this project implements this in real time. This project is able to process sound and output it positionally with no audible delay. It was done using the Pentium III Processor, and has a graphical user interface using a suitable programming language. This GUI enables the user to control the movement of the sound, also in real time.

At the heart of this project are the Head-Related Transfer Functions, or HRTFs. These describe how a sound reaches each ear as a function of the source's 3-D position. The HRTFs are used in Balanced Model Truncation, Cross-talk Cancellation and Dual-Channel Equalization, and Speaker and Room Response Cancellation, all of which are methods used to accurately "place" the sound source in 3-D.

This project could benefit the various entertainment industries by offering lower cost, greater flexibility, and greater efficiency.
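The basic use of HRTFs can be sketched as a pair of convolutions: a mono source is filtered with the left-ear and right-ear impulse responses for the desired position. The sketch below shows only that step, not the truncation, cross-talk cancellation, or equalization methods named above, and the two short impulse responses are hypothetical stand-ins for measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def place_source(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one source position."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Hypothetical impulse responses for a source on the listener's left:
# the right ear receives the sound later and quieter.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]
left, right = place_source([1.0, 2.0], hrir_left, hrir_right)
```

Moving the source in real time amounts to switching or interpolating between impulse response pairs as the position changes.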

 

       
       
    Real Time 3D Sound Using Two Channels
    :: Ian Dexter S. Garcia
    :: The objective of this project is to determine if the 3-D listening experience can be simulated using a less expensive, conventional two-channel (two-speaker) system. To accomplish this, the project addressed the processes involved in 3-D sound simulation. The processes, including a graphical user interface, were implemented using MATLAB. Through listening tests, it was determined whether the implemented system can be used for 3-D sound. The listening tests also determined which among the implemented 3-D sound simulation configurations is best suited for real-time implementation.

The sound booth inside room 414 at the NEC Bldg. was outfitted with egg cartons and carpet to minimize its reverberation. This acoustically treated the room for more accurate listening tests.

During the listening tests, sound samples simulating different sound source positions were played by the system. The subject listened to the sounds and then entered the perceived positions. Tests were done to determine the accuracy and directionality of the sound in the elevation and azimuth coordinates. The mean error was 17 degrees for the azimuth and 28 degrees for the elevation.

It has been determined through the tests described above that the system indeed captures 3-D sound. The implemented 3-D sound system can be marketed to researchers and virtual-reality systems developers. A real-time project implementation connected to home stereo or computer systems can be marketed to consumers.

3-D sound is a field where the Philippines can have a niche in the entertainment industry, both locally and globally. This is therefore a field which must be pursued by researchers and engineers in the future.

 

     
     
    TMS320C54X-based Coconut Age Classification System Using the Hidden Markov Model
    :: Raquel B. David
Lesly Zaren V. Endrinal
Aileen P. Santos
    :: For many years, coconut vendors have relied on the traditional tap system to tell what type of meat a coconut has. If a customer asks for a specific coconut type, they simply tap the coconut with their bolo and, based on the sound produced, decide whether it is mala-kanin, mala-uhog, or mala-tainga.

A coconut age classifier will be able to tell the age and type of a coconut based on the sound produced upon tapping it. This will be useful in wholesale coconut industries, where many customers, particularly foreign traders, demand a particular type of meat depending on their applications.

The researchers will create a model that mimics the human ear. Using this model, an unknown coconut sample can be classified as mala-kanin, mala-tainga, or mala-uhog.
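Classification with hidden Markov models usually means training one model per class and choosing the class whose model gives the observed sequence the highest likelihood. The sketch below shows that decision step with the forward algorithm for discrete HMMs. The two-state models, their probabilities, and the three-symbol coding of the tap sound are hypothetical placeholders, not parameters from the project.

```python
def forward_likelihood(observations, start, transition, emission):
    """P(observations | model) for a discrete HMM via the forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[prev] * transition[prev][s] for prev in range(n_states))
            * emission[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)

def classify(observations, models):
    """Return the name of the model that best explains the observations."""
    return max(
        models, key=lambda name: forward_likelihood(observations, *models[name])
    )

# Hypothetical models: (start, transition, emission) per class, with tap
# sounds quantized into three symbols 0, 1, 2 (illustrative values only).
shared_start = [0.5, 0.5]
shared_transition = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "mala-uhog": (shared_start, shared_transition,
                  [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]),
    "mala-kanin": (shared_start, shared_transition,
                   [[0.1, 0.8, 0.1], [0.2, 0.6, 0.2]]),
    "mala-tainga": (shared_start, shared_transition,
                    [[0.1, 0.1, 0.8], [0.1, 0.3, 0.6]]),
}
label = classify([1, 1, 1, 1], models)
```

In a real system the symbols would come from features of the recorded tap, and each model's probabilities would be trained on labeled coconuts of that type.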

 

       
       
    Room Acoustic Simulation Using Digital Signal Processing
    :: Ellen de Jesus
Edwin Umali
Karl Villareal
    :: Different kinds of sounds are best heard in different locations. It is well established in the acoustic community that human ears prefer reverberant sounds, as evidenced by the proliferation of 'shower singers'. Sounds can be 'shaped' either through the acoustic environment or at the source. Hence, one could either build a church to one's acoustic preference, or one could engineer the physics of sound itself.

Room Acoustic Simulation (RAS) was developed to avoid huge investments in building desirable acoustic structures and environments. Using digital signal processing, the RAS team developed simulations of four acoustic environments: a small room with concrete walls, a church, a concert hall, and an open space. RAS was implemented on a Pentium computer using MATLAB. Later versions of RAS are envisioned to be real-time and implemented using digital signal processing (DSP) hardware.

Three novel methods were put together to come up with RAS: the Image Method, the Common Acoustical Poles and Zeros (CAPZ) method, and the Multiple Input and Output (MINT) method of inversion of room transfer functions (RTF). In line with this, the Room Acoustics Simulation Team (RAST) has modified the CAPZ method to have a faster convergence rate even for large reverberation times (RVT) and a small virtual memory requirement for computer simulation.
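The core idea of the Image Method can be shown with first-order reflections only. Each wall of a rectangular room is replaced by a mirror image of the source, and the reflection's arrival time follows from the straight-line distance between that image and the listener. The sketch below stops at first-order images and ignores wall absorption; the room size, positions, and the speed of sound value are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def first_order_images(source, room):
    """Mirror the source across each of the six walls of a shoebox room."""
    images = []
    for axis in range(3):
        low = list(source)
        low[axis] = -source[axis]  # wall at coordinate 0
        high = list(source)
        high[axis] = 2.0 * room[axis] - source[axis]  # wall at room[axis]
        images.append(tuple(low))
        images.append(tuple(high))
    return images

def arrival_delays(source, listener, room):
    """Sorted arrival times in seconds: direct path plus first reflections."""
    paths = [tuple(source)] + first_order_images(source, room)
    return sorted(math.dist(p, listener) / SPEED_OF_SOUND for p in paths)

# Hypothetical 4 m x 3 m x 2 m room with source and listener 2 m apart.
delays = arrival_delays((1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (4.0, 3.0, 2.0))
```

A fuller simulation repeats the mirroring to get higher-order images and scales each arrival by distance and wall reflection coefficients to build the room impulse response.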

This project is the very first of its kind in the Philippines and intends to address the stagnant condition of audio engineering in the country where audio entertainment is a premium.

 

     
     
   

DSP Solutions and Enhancements in the Car Acoustic Environment

    ::

Franz de Leon
Finley delos Santos Dy

    ::

This project involves the original conceptualization, design, and implementation of a novel digital signal processing solution that corrects the non-ideal response of interior car acoustics. Typical car audio systems simply equalize the frequency response of the car. In this project, however, a system was created which makes it possible for the listeners to further enhance the listening experience for desired positions. Moreover, “virtual acoustics” was created, which could move the apparent position of the sound sources and give the sensation of surround sound. This is accomplished in real time by the use of techniques such as cross-talk cancellation, multi-channel equalization, and head-related transfer functions for three-dimensional sound.

The interior acoustics of a sedan with a four-speaker sound system were modeled off-line using the Maximum Length Sequence technique. The data gathered was processed using MATLAB®. The least mean square adaptive filtering algorithm was utilized to form the inverse filter network for cross-talk cancellation and dual-channel equalization. The 3-D sound algorithm was implemented in real time using two ADSP 21065L EZ-Kit DSP boards. A computer-controlled interface was developed for real-time controls using Visual Basic.

Listening tests indicate that the system was effective in creating highly directional 3-D sound. Moreover, the tests indicate success in shifting the “sweet spot” for any of the five car positions. This technology can be integrated in car audio systems for an enhanced listening experience.
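A maximum length sequence can be generated with a linear feedback shift register, as in the sketch below. The 4-bit register and its feedback taps are chosen only to keep the example small; the sequence length used in the project is not stated. The function names are hypothetical.

```python
def mls(n_bits=4, taps=(4, 3)):
    """Generate one period of a maximum length sequence with an LFSR.

    The default taps give a maximal sequence for a 4-bit register only;
    other register lengths need their own tap sets.
    """
    state = [1] * n_bits
    sequence = []
    for _ in range(2 ** n_bits - 1):
        sequence.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return sequence

def circular_autocorrelation(bits, lag):
    """Autocorrelation of the sequence mapped to +1/-1, with wrap-around."""
    bipolar = [1 - 2 * b for b in bits]
    n = len(bipolar)
    return sum(bipolar[i] * bipolar[(i + lag) % n] for i in range(n))

sequence = mls()
```

The property that makes these sequences useful for measurement is their nearly impulse-like circular autocorrelation: it equals the sequence length at lag zero and -1 at every other lag, so cross-correlating the recorded response with the sequence recovers the impulse response of the car interior.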

 

       
       
   

A Robust Audio Watermarking Scheme Using the van de Par-Kohlrausch-Charestan-Heusdens Psychoacoustic Model

    ::

Lawrence C. Miranda

    ::

The proliferation of digital audio piracy has created an urgent need to provide copyright protection to media creators. Audio watermarking offers a solution to this copyright problem by inserting imperceptible data, such as copyright information, into digital media.

This project is a novel audio watermarking scheme based on the van de Par-Kohlrausch-Charestan-Heusdens psychoacoustic model. Watermarking was achieved by adding to the host audio a random sequence that has been weighted with the masking threshold, which was determined by the psychoacoustic model. Watermark attacks, which are malicious attempts to corrupt the watermark information, were accounted for. The scheme was shown to produce a watermark that is inaudible and resistant to attacks such as noise addition, resampling, lowpass filtering, and MP3 coding.
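The embedding step described above, a keyed random sequence weighted by a masking threshold and added to the host, can be sketched as follows. The psychoacoustic model is not reproduced: a flat threshold stands in for it, which is an assumption made purely for illustration. The correlation detector, the key values, and the decision level are also hypothetical, since the text does not describe how the watermark is detected.

```python
import math
import random

def watermark_sequence(key, length):
    """Keyed pseudo-random sequence of +1/-1 values."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, key, threshold):
    """Add the keyed sequence to the host, scaled sample by sample."""
    mark = watermark_sequence(key, len(host))
    return [h + t * m for h, t, m in zip(host, threshold, mark)]

def detect(audio, key, decision_level):
    """Correlate the audio with the keyed sequence and compare to a level."""
    mark = watermark_sequence(key, len(audio))
    correlation = sum(a * m for a, m in zip(audio, mark)) / len(audio)
    return correlation > decision_level

# Hypothetical host: a 440 Hz tone sampled at 8 kHz.
host = [0.5 * math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(16384)]
# Flat stand-in for the masking threshold from the psychoacoustic model.
threshold = [0.05] * len(host)
marked = embed(host, key=1234, threshold=threshold)
```

With a real psychoacoustic model the threshold would vary over time and frequency, so the watermark could be stronger where the host masks it and weaker where it would be audible.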

 

       
       
   

A Comparative Study of Multi-agent Algorithms for Real-time Beat and Tempo Synchronization

    ::

Joanne C. Santos

    ::

The first thing a person notices when he hears music is its rhythm. Foot-tapping, hand-clapping, and dancing come naturally to humans. So for a computer to understand music as humans do, the first thing it must learn is to track beats. Once it has determined the beat, it can perform automatic transcription and support other applications such as music video editing and stage lighting control.

In this paper, three algorithms of multiple-agent beat and tempo trackers are simulated and compared, namely: Simon Dixon’s Automatic Extraction of Tempo and Beat from Expressive Performances, Benoit Meudic’s A Causal Algorithm for Beat Tracking, and Masataka Goto’s An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds. These implementations are evaluated based on the accuracy of their prediction, time complexity, and memory requirement.

The methods used by these implementations are combined to create a synchronizer that will follow tempo changes and detect beat locations in popular music from compact discs. The major parts of the implementation include onset detection, tempo induction, and multi-agent beat prediction. An agent is chosen based on the different musical cues that best identify the beat and tempo of a musical piece in real time.
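The first stage, onset detection, can be illustrated with a very simple energy-based detector: an onset is reported whenever the energy of a frame rises sharply compared with the previous frame. This is a generic sketch and not the method of any of the three trackers compared here; the frame size, rise ratio, and energy floor are hypothetical parameters.

```python
def frame_energies(samples, frame_size):
    """Sum of squared samples for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def detect_onsets(samples, frame_size, sample_rate, rise_ratio=2.0, floor=1e-6):
    """Return onset times (seconds) where frame energy jumps sharply."""
    energies = frame_energies(samples, frame_size)
    onsets = []
    for k in range(1, len(energies)):
        if energies[k] > floor and energies[k] > rise_ratio * energies[k - 1]:
            onsets.append(k * frame_size / sample_rate)
    return onsets

# Hypothetical signal at 1 kHz sampling: silence, burst, silence, burst.
signal = [0.0] * 100 + [1.0] * 100 + [0.0] * 100 + [1.0] * 100
onsets = detect_onsets(signal, frame_size=100, sample_rate=1000)
```

The resulting onset times are what the later stages work from: tempo induction looks at the intervals between them, and the beat-predicting agents are scored against them.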

       
     

  :: UP DSP Laboratory Website |Copyright 2004. Webmaster: Ma. Cristina C. Ojeda