Egocentric Vision in Assistive Technologies For and By the Blind

Thumbnail Image


Lee_umd_0117E_22976.pdf (28.06 MB)
No. of downloads:

Publication or External Link





Visual information in our surroundings, such as everyday objects and passersby, is often inaccessible to people who are blind. Cameras that leverage egocentric vision, in an attempt to approximate the visual field of the camera wearer, hold great promise for making the visual world more accessible for this population. Typically, such applications rely on pre-trained computer vision models and thus are limited. Moreover, as with any AI system that augments sensory abilities, conversations around ethical implications and privacy concerns lie at the core of their design and regulation. However, early efforts tend to decouple perspectives, considering only either those of the blind users or potential bystanders.

In this dissertation, we revisit egocentric vision for the blind. Through a holistic approach, we examine the following dimensions: type of application (objects and passersby), camera form factor (handheld and wearable), user’s role (a passive consumer and an active director of technology), and privacy concerns (from both end-users and bystanders). Specifically, we propose to design egocentric vision models that capture blind users’ intent and are fine-tuned by the user in the context of object recognition. We seek to explore societal issues that AI-powered cameras may lead to, considering perspectives from both blind users and nearby people whose faces or objects might be captured by the cameras. Last, we investigate interactions and perceptions across different camera form factors to reveal design implications for future work.