Using Computer Vision to Train a Sound Tracking System
In this research, computer vision was used to locate a sound source for feedback into an audio system. The camera was first calibrated to determine the relationship between the world coordinates and the pixel coordinates of an object.
To aid in the calibration process, computer vision techniques such as gradient calculation and the Hough Transform were used to extract the calibration points from a series of images. These points, along with their corresponding world coordinates, were then used in Roger Tsai's camera model to calibrate the camera.
The intrinsic and extrinsic camera parameters were then used to find the vector of the sound source in an image. Again, vision processing was used to extract the sound source from an image using red as a detectable feature. The largest red region was isolated, and the centroid of that region was used to mark the location of the sound source.
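The red-region step can be illustrated with a minimal sketch. It is not the implementation used in this research: the threshold values, the 4-connected flood fill, and the names `is_red` and `largest_red_centroid` are all assumptions chosen for illustration, and the image is taken to be a nested list of (R, G, B) tuples.

```python
from collections import deque

def is_red(pixel, min_red=150, max_other=100):
    """Hypothetical redness test; both thresholds are illustrative assumptions."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other

def largest_red_centroid(image):
    """Return the (row, col) centroid of the largest 4-connected red region.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    Returns None when no pixel passes the redness test.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    best = []
    for r0 in range(height):
        for c0 in range(width):
            if seen[r0][c0] or not is_red(image[r0][c0]):
                continue
            # Breadth-first flood fill collects one connected red region.
            region = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr][nc] and is_red(image[nr][nc])):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    n = len(best)
    return (sum(r for r, _ in best) / n, sum(c for _, c in best) / n)
```

Keeping only the largest region means small red specks elsewhere in the frame do not shift the centroid.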
Finally, Tsai's model was used in reverse to find the vector in the world along which the camera lies.
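Using the camera model in reverse amounts to back-projecting a pixel into a ray in world coordinates. The sketch below is a simplification and not the procedure from this research: it assumes an ideal pinhole camera with no lens distortion (Tsai's full model also includes radial distortion and a horizontal scale factor), the convention that a world point maps to camera coordinates as R·p + T, and a hypothetical function name and parameter set.

```python
import math

def pixel_to_world_ray(u, v, f, cx, cy, dx, dy, R, T):
    """Back-project pixel (u, v) into a world-space ray.

    Assumptions (illustrative only): pinhole camera, no lens distortion;
    f is the focal length, (cx, cy) the image center in pixels, and
    (dx, dy) the pixel size in the same length units as f.
    R (3x3 nested list) and T (length 3) map world to camera coordinates
    as p_cam = R * p_world + T.

    Returns (origin, direction): the camera center in world coordinates
    and a unit direction vector in world coordinates.
    """
    # Pixel position on the sensor, relative to the image center.
    x = (u - cx) * dx
    y = (v - cy) * dy
    d_cam = (x, y, f)

    def rt_mul(vec):
        # Multiply by the transpose of R (the inverse of a rotation).
        return tuple(sum(R[k][i] * vec[k] for k in range(3)) for i in range(3))

    # Camera center in world coordinates: -R^T * T.
    origin = tuple(-c for c in rt_mul(T))
    # Ray direction rotated into the world frame, then normalized.
    d_world = rt_mul(d_cam)
    norm = math.sqrt(sum(c * c for c in d_world))
    direction = tuple(c / norm for c in d_world)
    return origin, direction
```

A single camera yields only a ray rather than a point, because depth along the ray is not observable from one image.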