|
Room 410, EEE Building. Velasquez Street. |
Although CDs have been in use for nearly twenty years, audio has remained mostly analog. Until now, digital music on the CD has been converted to analog inside the CD player and taken through analog signal processing and amplification before being applied to the speakers. But with the recent increase in such digital infrastructure as the Internet, digital networks in buildings and automobiles, and wireless digital communications, along with new digital sound sources like MP3 and DV, an unprecedented opportunity to process digital audio data has been created. And increasingly, there will be a need to do that kind of processing.
PRESENT PROJECTS
Modal Distribution Analysis and Sum of Sinusoids Synthesis of Kulintang
Musical Signals :: Franklin C. Agsaway
_____________________________________________________________________
Beat Recognition Using Neural Networks
::
Paolo Antonio C. Castro
A beat recognition system was implemented
starting with drum onset detection. Drum sounds in music are detected with
an onset detection system utilizing a neural network. Features were
extracted from the samples and used to train the network to recognize either
a bass or a snare drum. Agents with tempo hypotheses are then formed from
the detected drum onsets and the intervals between them. The agents output
the possible beats, and the beat from the highest-scoring agent is
considered the correct beat. A total of 120 songs were selected to train and test the system. Since the system requires a song to have a tempo between 80 and 160 beats per minute, a 30-second portion of each song that satisfies the required tempo was extracted. Of these samples, 80 were used for training, 20 for onset detection testing, and another 20 were added for beat detection testing. Fifteen of the onset detection testing samples were also used for beat detection testing, making a total of 35. The outputs for these 35 samples were compared to hand-segmented ground truths for evaluation. The current project achieved an overall beat recognition accuracy of 76.42%, while the previous project obtained 71.13% on the same test samples, an improvement of 5.29 percentage points.
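The step of forming tempo hypotheses from onset intervals can be illustrated with a minimal sketch. It is not the project's implementation: the function name, the octave-folding rule, and the 1 BPM resolution are assumptions made for illustration, and only the 80 to 160 BPM range comes from the abstract.

```python
from collections import Counter

def tempo_hypotheses(onsets, min_bpm=80.0, max_bpm=160.0, resolution=1.0):
    """Rank candidate tempi by votes from pairwise inter-onset intervals.

    onsets: onset times in seconds, in ascending order.
    Returns a list of (bpm, votes) pairs, most-voted first.
    """
    votes = Counter()
    for i in range(len(onsets)):
        for j in range(i + 1, len(onsets)):
            interval = onsets[j] - onsets[i]
            if interval <= 0:
                continue
            bpm = 60.0 / interval
            # Assumed rule: fold by factors of two into the allowed range,
            # since an interval may span several beats or a fraction of one.
            while bpm < min_bpm:
                bpm *= 2.0
            while bpm > max_bpm:
                bpm /= 2.0
            votes[round(bpm / resolution) * resolution] += 1
    return votes.most_common()
```

For onsets spaced half a second apart, the most-voted hypothesis is 120 BPM; each hypothesis could then seed one agent that predicts beat positions.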
_____________________________________________________________________
A Multiple Audio Codec Interface to an ADSP-21065L SHARC Processor
::
Jaeson C. Paras
|
||||||
| PAST PROJECTS | ||||||
| Digital Parametric Equalizer on an ADSP 21065L | ||||||
| :: | Jerremeo Raynier T. Gabas | |||||
| :: |
Equalizers are commonly implemented using analog circuits, but with the advent of high-speed digital signal processors, their capabilities can now be emulated using a digital algorithm. This project will implement a parametric equalizer on an ADSP-21065L that can be controlled through a graphical user interface on a personal computer.
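One band of a parametric equalizer is often realized as a second-order peaking filter. The sketch below uses the widely published "Audio EQ Cookbook" peaking formulas; the abstract does not say which filter design the project used, so the design and all function and parameter names here are assumptions.

```python
import cmath
import math

def peaking_biquad(fs, f0, gain_db, q):
    """Return normalized (b, a) coefficients of a peaking EQ biquad.

    fs: sample rate in Hz, f0: center frequency in Hz,
    gain_db: boost (+) or cut (-) at f0, q: quality factor.
    """
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    # Normalize so that a[0] == 1.
    return [c / a[0] for c in b], [c / a[0] for c in a]

def magnitude_db(b, a, fs, f):
    """Evaluate the filter's magnitude response in dB at frequency f."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    z2 = z1 * z1
    h = (b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2)
    return 20.0 * math.log10(abs(h))
```

A graphical interface would only need to send the three band parameters (center frequency, gain, Q); the coefficients can be recomputed whenever one of them changes.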
|
|||||
| 4-Channel Digital Automatic Microphone Mixer Using ADSP-21065L Digital Signal Processor | ||||||
| :: | Ericson L. Machacon | |||||
| :: |
A four-channel digital automatic microphone mixer will be implemented using the ADSP-21065L digital signal processor. The four-channel interface will be made possible by the daughter board developed by the University of the Philippines' Electrical and Electronics Engineering Department Digital Signal Processing Laboratory. A synthesis of the gain sharing algorithm and automatic voice detection shall be used to control common mixing problems such as acoustic feedback and comb filtering. A graphical user interface shall be developed so that a user can control the 'automixer' from a personal computer.
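The gain sharing idea can be sketched as follows, assuming the commonly described form in which each channel's gain is its share of the total input level, so the gains always sum to one. The function names and the handling of silence are illustrative assumptions, not details taken from the project.

```python
def gain_sharing_gains(levels):
    """Return one gain per channel, proportional to its share of the total level.

    levels: non-negative short-term level estimates (for example, power),
    one per microphone channel.
    """
    total = sum(levels)
    if total <= 0.0:
        # Assumed behavior for silence: share the gain equally.
        return [1.0 / len(levels)] * len(levels)
    return [level / total for level in levels]

def mix_frame(samples, levels):
    """Mix one sample per channel using the gain-sharing gains."""
    gains = gain_sharing_gains(levels)
    return sum(g * s for g, s in zip(gains, samples))
```

Because the gains sum to one, opening an additional microphone does not raise the total system gain, which is what helps against acoustic feedback.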
|
|||||
| A Real-time Room Acoustic Simulator Using the ADSP 21065L EZ-Kit Lite | ||||||
| :: | Nur Ishmael H. Malonzo | |||||
| :: |
There is great demand for live performances everywhere, and considerable resources are spent to give their venues desirable acoustics. A real-time room acoustic simulator is a cheaper and more dynamic alternative to physically modifying a venue to obtain the desired acoustical conditions. A significant part of producing the desired acoustics of a room is reducing the effects of the natural acoustics of that room. This project implements a real-time acoustic simulator and an adaptively trained acoustic equalizer, integrated as one working unit. The project used the time-divided impulse response method, developed by de Jesus et al., for the acoustic simulation and a least mean square method for the acoustic equalization. The Real-time Room Acoustics Simulator was implemented on the ADSP 21065L EZ-Kit Lite. It is able to simulate a reverberation time of up to 0.3 seconds. With the integrated acoustic equalizer, it is able to equalize some reverberations. Although the project fell short of the original goal of fully adaptive equalization and did not prove the perceptual effectiveness of the equalizer, improvement in these aspects is expected if further work is done. This project was done using two ADSP 21065L EZ-Kit Lites and was coded entirely in ADSP 21065L assembly language to get the most out of the computational and memory resources of each board.
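The least mean square (LMS) update at the core of adaptive equalization can be sketched in a few lines. For simplicity this sketch adapts a filter to match an unknown FIR system rather than to invert a room response, which is a swapped-in, easier setting; the function names, tap count, and step size are assumptions.

```python
def fir_filter(h, x):
    """Filter signal x with FIR coefficients h (direct convolution)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

def lms(x, d, num_taps, mu):
    """Adapt an FIR filter so that its output for input x tracks d.

    Returns the final weights and the error signal.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        # LMS update: step along the instantaneous gradient estimate.
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors
```

With a white-noise input and a noise-free desired signal, the weights converge toward the unknown system's coefficients as long as the step size `mu` is small relative to the input power.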
|
|||||
| Real Time Karaoke Grader Using Modal Analysis Implemented on an ADI Sharc | ||||||
| :: | Jerremeo Raynier T. Gabas | |||||
| :: |
Karaoke singing is one of the most common Filipino pastimes. In this project, the researcher will develop a real-time system that quantifies a person's singing quality by comparing the singer's pitch and timing with those of the vocal track. Modal distribution is a time-frequency distribution that can be used to track the timing and pitch of the vocal track and the singer's voice. It will be used to analyze and compare the vocal track and the singer's voice, thereby computing a "grade" for the singer.
|
|||||
Simulation of Three-Dimensional Sound Using Two Channels |
||||||
| :: | Mishael Paul Atienza | |||||
| :: | Conventional 3-D sound is done using at least three speakers placed around the listener; higher quality is achieved with an increasing number of speakers. However, this renders the system expensive, as well as cumbersome, as more units have to be positioned accurately when being set up. The UP-DSP Lab has done an offline simulation of 3-D sound using only two channels, and this project implements this in real time. This project is able to process sound and output it positionally with no audible delay. It was done using the Pentium III Processor, and has a graphical user interface written in a suitable programming language. This GUI enables the user to control the movement of the sound, also in real time. At the heart of this project are the Head-Related Transfer Functions, or HRTFs. These describe how the ears receive sound as a function of the source's 3-D position. The HRTFs are used in Balanced Model Truncation, Cross-talk Cancellation and Dual-Channel Equalization, and Speaker and Room Response Cancellation, all of which are methods used to accurately "place" the sound source in 3-D. This project could benefit the various entertainment industries by offering lower cost, greater flexibility, and greater efficiency.
|
|||||
| Real Time 3D Sound Using Two Channels | ||||||
| :: | Ian Dexter S. Garcia | |||||
| :: | The
objective of this project is to determine whether the 3-D listening experience
can be simulated using a less expensive, conventional two-channel
(two-speaker) system. To accomplish this, the project addressed the
processes involved in 3-D sound simulation. The processes, including a
graphical user interface, were implemented using MATLAB. Through listening
tests, it was determined whether the implemented system can be used for
3-D sound. The listening tests also determined which among the implemented
3-D sound simulation configurations is best suited for real-time
implementation. The sound booth inside room 414 at the NEC Bldg. was outfitted with egg cartons and carpet to minimize its reverberation, acoustically treating the room for more accurate listening tests. During the listening tests, sound samples simulating different sound source positions were played by the system. Each subject listened to the sounds and then entered the perceived positions. Tests were done to determine the accuracy and directionality of the sound in the elevation and azimuth coordinates. The mean error was 17 degrees for the azimuth and 28 degrees for the elevation. It has been determined through the tests described above that the system indeed captures 3-D sound. The implemented 3-D sound system can be marketed to researchers and virtual-reality systems developers. A real-time project implementation connected to home stereo or computer systems can be marketed to consumers. 3-D sound is a field where the Philippines can have a niche in the entertainment industry, both locally and globally. This is therefore a field which must be pursued by researchers and engineers in the future.
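A mean localization error of this kind can be computed as below. The abstract does not state how the errors were calculated, so this is only one plausible approach; the wrap-around treatment of azimuth and the function names are assumptions.

```python
def angular_error(perceived_deg, actual_deg):
    """Absolute angular difference in degrees, wrapped to the range [0, 180]."""
    diff = (perceived_deg - actual_deg + 180.0) % 360.0 - 180.0
    return abs(diff)

def mean_error(perceived, actual):
    """Mean absolute angular error over paired lists of angles in degrees."""
    errors = [angular_error(p, a) for p, a in zip(perceived, actual)]
    return sum(errors) / len(errors)
```

The wrap-around matters for azimuth: a response of 350 degrees to a source at 10 degrees is counted as a 20-degree error, not 340 degrees.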
|
|||||
| TMS320C54X-based Coconut Age Classification System Using the Hidden Markov Model | ||||||
| :: |
Raquel B. David Lesly Zaren V. Endrinal Aileen P. Santos |
|||||
| :: | For
many years, the traditional tap system has been the method used by
coconut vendors to tell what meat type a coconut has. If a
customer asks for a specific coconut type, vendors simply tap the coconut
with their bolo and, based on the sound produced, decide whether it
is mala-kanin, mala-uhog, or mala-tainga. A coconut age classifier will be able to tell the age and type of a coconut based on the sound produced upon tapping it. This will be useful in the wholesale coconut industry, where many customers, particularly foreign traders, demand a particular type of meat depending on their applications. The researchers will create a model that mimics the human ear; using this model, an unknown coconut sample can be classified as mala-kanin, mala-tainga, or mala-uhog.
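Classification with hidden Markov models usually means training one model per class and choosing the class whose model assigns the highest likelihood to the observed sequence. The sketch below shows that decision rule with the forward algorithm. It assumes discrete observation symbols and hand-supplied model parameters; the feature extraction and training used in the project are not described in the abstract and are not shown.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood of a discrete observation sequence under one HMM.

    obs: list of symbol indices; start[s]: initial state probabilities;
    trans[p][s]: transition probabilities; emit[s][o]: emission probabilities.
    No scaling is applied, so very long sequences may underflow.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

def classify(obs, models):
    """Return the label of the model with the highest likelihood.

    models: dict mapping a class label to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))
```

In a coconut classifier, `models` would hold one trained HMM for each of mala-kanin, mala-uhog, and mala-tainga, and `obs` would be the quantized features of a tap sound.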
|
|||||
| Room Acoustic Simulation Using Digital Signal Processing | ||||||
| :: |
Ellen de Jesus Edwin Umali Karl Villareal |
|||||
| :: | Different
kinds of sounds are best heard in different locations. It is well
established in the acoustic community that human ears prefer reverberant
sounds, as evidenced by the proliferation of 'shower singers'. Sounds can
be 'shaped' either through the acoustic environment or at the source.
Hence, one could either build a church to one's acoustic preference, or
one could engineer the physics of sound itself. Room Acoustic Simulation (RAS) was developed to avoid huge investments in building desirable acoustic structures and environments. Using digital signal processing, the RAS team developed the simulation of four acoustic environments: a small room with concrete walls, a church, a concert hall, and an open space. RAS was implemented on a Pentium computer using MATLAB. Later versions of RAS are envisioned to be real-time and implemented using digital signal processing (DSP) hardware. Three novel methods, the Image Method, the Common Acoustical Poles and Zeros (CAPZ), and the Multiple Input and Output (MINT) method of inversion of room transfer functions (RTF), were put together to come up with RAS. In line with this, the Room Acoustics Simulation Team (RAST) has modified the CAPZ method to have a faster convergence rate even for large reverberation times (RVT) and a small virtual memory requirement for computer simulation. This project is the very first of its kind in the Philippines and intends to address the stagnant condition of audio engineering in the country where audio entertainment is a premium.
|
|||||
|
DSP Solutions and Enhancements in the Car Acoustic Environment |
||||||
| :: |
Franz de Leon |
|||||
| :: |
This project involves the original conceptualization, design, and implementation of a novel digital signal processing solution that corrects the non-ideal response of interior car acoustics. Typical car audio systems simply equalize the frequency response of the car. In this project, however, a system was created which makes it possible for listeners to further enhance the listening experience for desired positions. Moreover, “virtual acoustics” was created, which could move the apparent position of the sound sources and give the sensation of surround sound. This is accomplished in real time by the use of techniques such as cross-talk cancellation, multi-channel equalization, and head-related transfer functions for three-dimensional sound. The interior acoustics of a sedan with a four-speaker sound system were modeled off-line using the Maximum Length Sequence technique. The data gathered was processed using MATLAB®. The least-mean-square adaptive filtering algorithm was utilized to form the inverse filter network for cross-talk cancellation and dual-channel equalization. The 3-D sound algorithm was implemented in real time using two ADSP 21065L EZ-Kit DSP boards. A computer-controlled interface was developed for real-time controls using Visual Basic. Listening tests indicate that the system was effective in creating highly directional 3-D sound. Moreover, the tests indicate success in shifting the “sweet spot” to any of the five car positions. This technology can be integrated in car audio systems for an enhanced listening experience.
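A maximum length sequence (MLS) can be generated with a linear feedback shift register. The sketch below is illustrative only: the register length and tap positions used in the project are not given, so the 4-bit register with taps (4, 3) is an assumption, and other lengths need their own tap sets.

```python
def mls(nbits=4, taps=(4, 3)):
    """Generate one period of a +/-1 maximum length sequence.

    nbits: shift register length; taps: 1-based register positions whose
    XOR forms the feedback bit. The taps must correspond to a primitive
    polynomial for the period to reach 2**nbits - 1.
    """
    state = [1] * nbits
    seq = []
    for _ in range(2 ** nbits - 1):
        seq.append(1.0 if state[-1] else -1.0)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return seq

def circular_autocorr(seq, lag):
    """Circular autocorrelation of seq at the given lag."""
    n = len(seq)
    return sum(seq[i] * seq[(i + lag) % n] for i in range(n))
```

The property that makes the sequence useful for measurement is its nearly impulsive circular autocorrelation: the sequence length at lag zero and -1 at every other lag. Cross-correlating a recorded response with the excitation therefore approximates the impulse response of the cabin.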
|
|||||
| :: |
Lawrence C. Miranda |
|||||
| :: |
The proliferation of digital audio piracy has created an urgent need to provide copyright protection to media creators. Audio watermarking offers a solution to this copyright problem through the insertion of imperceptible data, such as copyright information, into digital media. This project is a novel audio watermarking scheme based on the van de Par-Kohlrausch-Charestan-Heusdens psychoacoustic model. Watermarking was achieved by adding to the host audio a random sequence that has been weighted with the masking threshold, which was determined by the psychoacoustic model. Watermark attacks, which are malicious attempts to corrupt the watermark information, were accounted for. The scheme was shown to produce a watermark that is inaudible and resistant to attacks such as noise addition, resampling, lowpass filtering, and MP3 coding.
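The embedding step described above, adding a key-dependent random sequence weighted by a masking threshold, can be sketched as follows. The psychoacoustic model itself is not reproduced: the `threshold` argument stands in for its output, and the correlation detector, function names, and key handling are assumptions for illustration.

```python
import random

def make_sequence(key, n):
    """Pseudo-random +/-1 sequence of length n derived from an integer key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(host, key, threshold):
    """Add the key's sequence to the host, weighted per sample by threshold.

    threshold: placeholder for the masking threshold a psychoacoustic
    model would supply (one weight per sample here).
    """
    seq = make_sequence(key, len(host))
    return [h + t * s for h, t, s in zip(host, threshold, seq)]

def detect(signal, key):
    """Correlate the signal with the key's sequence (normalized by length)."""
    seq = make_sequence(key, len(signal))
    return sum(x * s for x, s in zip(signal, seq)) / len(signal)
```

With the correct key, the detector output is close to the average embedding weight; with a wrong key it stays near zero, so a fixed decision level between the two separates marked from unmarked audio.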
|
|||||
|
A Comparative Study of Multi-agent Algorithms for Real-time Beat and Tempo Synchronization |
||||||
| :: |
Joanne C. Santos |
|||||
| :: |
The first thing a person notices when he hears music is its rhythm. Foot-tapping, hand-clapping, and dancing come naturally to humans. So for a computer to understand music like humans do, the first thing it must learn to do is track beats. Once it has determined the beat, it can perform automatic transcription and support other applications such as music video editing and stage lighting control. In this paper, three multiple-agent beat and tempo tracking algorithms are simulated and compared, namely: Simon Dixon’s Automatic Extraction of Tempo and Beat from Expressive Performances, Benoit Meudic’s A Causal Algorithm for Beat Tracking, and Masataka Goto’s An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds. These implementations are evaluated based on the accuracy of their predictions, time complexity, and memory requirements. The methods used by these implementations are combined to create a synchronizer that will follow tempo changes and detect beat locations in popular music from compact discs. The major parts of the implementation include onset detection, tempo induction, and multi-agent beat prediction. The agent that best identifies the beat and tempo of a musical piece in real time is chosen based on different musical cues. |
|||||
|
:: UP DSP Laboratory Website |Copyright 2004. Webmaster: Ma. Cristina C. Ojeda |