Using Computer Vision to Train a Sound Tracking System

Files

UG_2000-2.pdf (188.06 KB)

Date

2000

Abstract

In this research, computer vision was used to locate a sound source for feedback into an audio system. The camera was first calibrated to determine the relationship between the world coordinates and the pixel coordinates of an object.

To aid in the calibration process, computer vision techniques such as gradient calculation and the Hough Transform were used to extract the calibration points from a series of images. These points, along with their corresponding world coordinates, were then used in Roger Tsai's camera model to calibrate the camera.
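As a rough illustration of the Hough Transform step, the sketch below votes for straight lines in (rho, theta) space. It assumes the calibration target contains straight edges and that edge pixels have already been found by thresholding the gradient magnitude. The abstract does not say which Hough variant was used, so the function names and the line parameterization here are illustrative assumptions, not the author's implementation.

```python
import math

def hough_lines(edge_points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge pixels."""
    diag = int(math.ceil(math.hypot(width, height)))
    # rho can range over [-diag, diag]; shift by diag so it indexes a list.
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    for x, y in edge_points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + diag][t] += 1
    return acc, diag

def strongest_line(acc, diag, n_theta=180):
    """Return (rho, theta, votes) for the accumulator cell with the most votes."""
    votes, r, t = max(
        (acc[r][t], r, t) for r in range(len(acc)) for t in range(n_theta)
    )
    return r - diag, math.pi * t / n_theta, votes

# Hypothetical input: ten edge pixels on the vertical line x = 5.
points = [(5, y) for y in range(10)]
acc, diag = hough_lines(points, width=20, height=20)
rho, theta, votes = strongest_line(acc, diag)
```

Intersections of the strongest lines could then serve as calibration points, although that pairing step is also an assumption about how the points were extracted.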

The intrinsic and extrinsic camera parameters were then used to find the vector of the sound source in an image. Again, vision processing was used to extract the sound source from an image, using red as the detectable feature. The largest red region was isolated, and the centroid of that region was used to mark the location of the sound source.
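The red-region step can be sketched as a threshold followed by a connected-component search. The example below is a minimal sketch in plain Python; the threshold values in `is_red`, the 4-connectivity, and the image representation (rows of `(r, g, b)` tuples) are assumptions chosen for illustration, since the abstract does not give them.

```python
from collections import deque

def is_red(pixel, min_red=150, margin=50):
    """Assumed test for 'red': strong red channel that dominates green and blue."""
    r, g, b = pixel
    return r >= min_red and r - g >= margin and r - b >= margin

def largest_red_centroid(image):
    """image: list of rows of (r, g, b) tuples. Returns (cx, cy) or None."""
    h = len(image)
    w = len(image[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    best = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or not is_red(image[y0][x0]):
                continue
            # Breadth-first flood fill over 4-connected red pixels.
            region = []
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and is_red(image[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    # Centroid: mean pixel coordinate of the largest region.
    cx = sum(x for x, _ in best) / len(best)
    cy = sum(y for _, y in best) / len(best)
    return cx, cy

# Hypothetical 6x6 test image: a 2x2 red block plus one stray red pixel.
black, red = (0, 0, 0), (255, 0, 0)
img = [[black] * 6 for _ in range(6)]
for y in (1, 2):
    for x in (3, 4):
        img[y][x] = red
img[5][0] = red
centroid = largest_red_centroid(img)
```

Keeping only the largest region is what lets the stray red pixel be ignored, at the cost of assuming the sound source is the biggest red object in view.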

Finally, Tsai's model was used in reverse to find the vector in the world, passing through the camera, along which the sound source lies.
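Running a camera model "in reverse" means mapping a pixel back to a ray in world coordinates. The sketch below does this for a simplified pinhole model, assuming the extrinsic convention X_cam = R * X_world + T. It deliberately omits the radial distortion and scale-factor terms of Tsai's full model, and the parameter names (`f`, `cx`, `cy`, `R`, `T`) are illustrative rather than taken from the thesis.

```python
import math

def pixel_to_world_ray(u, v, f, cx, cy, R, T):
    """Back-project pixel (u, v) to a world-space ray (origin, unit direction).

    f is the focal length in pixels, (cx, cy) the principal point,
    R a 3x3 rotation (list of rows) and T a 3-vector, with
    X_cam = R * X_world + T. Lens distortion is ignored (assumption).
    """
    # Direction of the viewing ray in camera coordinates.
    d_cam = ((u - cx) / f, (v - cy) / f, 1.0)
    # Rotate into world coordinates: d_world = R^T * d_cam.
    d_world = [sum(R[i][j] * d_cam[i] for i in range(3)) for j in range(3)]
    norm = math.sqrt(sum(c * c for c in d_world))
    d_world = [c / norm for c in d_world]
    # Camera centre in world coordinates: C = -R^T * T.
    origin = [-sum(R[i][j] * T[i] for i in range(3)) for j in range(3)]
    return origin, d_world

# Hypothetical parameters: identity rotation, camera offset by T.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
origin, direction = pixel_to_world_ray(
    u=820, v=240, f=500, cx=320, cy=240, R=identity, T=[1, 2, 3]
)
```

A single camera recovers only the direction to the source, not its distance, which is why the result is a ray rather than a point.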
