Remote sighted assistance (RSA) technology, which connects visually impaired users to sighted human agents through live video chat on smartphones, helps people with low or no vision complete tasks that require sight. But what happens when computer vision technologies can’t help an agent complete specific requests, like reading the directions on a medicine bottle or interpreting the flight information displayed on an airport’s digital screens?
According to researchers from the Penn State College of Information Sciences and Technology, certain issues can’t be resolved with current computer vision tools. Instead, the researchers suggest they can be handled by humans and AI working together, improving both the technology and the experience of visually impaired users and the agents who support them.
In a study presented at the 27th International Conference on Intelligent User Interfaces (IUI) in March, the researchers identified five emerging problems in RSA that they believe human-AI collaboration needs to address. Solving these problems could advance the field of computer vision and help create the next generation of RSA services, says John M. Carroll, distinguished professor of information sciences and technology.
“We’re interested in developing this particular paradigm because it is a collaborative activity involving sighted and non-sighted people, as well as computer vision capabilities,” Carroll said. “We framed it in a very rich way where there are a lot of interesting issues of human-human interaction, human-technology interaction and technology innovation.”
Remote sighted assistance technology is available as free applications that connect visually impaired people with sighted volunteers, or as paid services that connect them with sighted professionals. A user who needs help with a task requiring vision, such as finding an empty table in a restaurant, reading a food packaging label, or identifying an object’s color, calls an agent through the live video feature on their mobile device. The agent then views the world around the user through this lens and acts as their eyes to help complete the task.
However, according to Syed Billah, assistant professor of IST and co-author of the paper, the assistance agents provide isn’t simple.
“For example, creating a worldview by looking through the camera is mentally demanding for the agents,” said Billah. “The good news is that part of this task can be offloaded to computers running a 3D reconstruction algorithm.”
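The paper doesn’t prescribe a particular implementation, but a minimal sketch of this kind of offloading, using the open-source Open3D library, might look like the following; it assumes the phone can supply paired color and depth frames, and the file names and default camera intrinsics are placeholders for illustration.

```python
import open3d as o3d

# Placeholder frames standing in for a color + depth capture from the
# user's phone camera (assumes a depth-capable device).
color = o3d.io.read_image("frame_color.png")
depth = o3d.io.read_image("frame_depth.png")

# Fuse the pair into an RGB-D image, then back-project it into a 3D
# point cloud using a default pinhole camera model.
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(color, depth)
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)

# The agent can now inspect the scene as geometry rather than
# rebuilding a mental 3D model from a flat video feed.
o3d.visualization.draw_geometries([pcd])
```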
But some of the assistance that agents offer, like helping a visually impaired person navigate a parking garage or reading the label on a prescription bottle, comes with higher stakes.
“To address these problems, there is room for improvement with the current computer vision technology,” said Billah.
In their study, the researchers reviewed existing RSA technologies and interviewed users to learn about the navigational challenges and technical issues they encounter while using the services. They identified a set of problems that could be resolved with existing computer vision technology and offered design suggestions to address them. They also identified five new problems that, because of their complexity, cannot be solved by current computer vision methods.
The researchers believe these problems could open new possibilities for improving RSA design and the user experience:
- Recognizing that things smartphones typically flag as obstacles may not be obstacles for visually impaired users, but helpful instruments instead. For instance, a wall bordering a walkway could be marked as an obstacle by common navigation applications, yet a visually impaired person using a cane depends on it to guide their way.
- Helping users navigate their surroundings when the live camera feed is lost to the low cellular bandwidth often encountered indoors.
- Recognizing content displayed on LCD screens, such as flight information in airports or the temperature controls in hotel rooms.
- Recognizing text on irregularly shaped surfaces. Crucial details are often printed on surfaces that are difficult for the human agents assisting visually impaired people to read, such as dosage instructions on a curved pill bottle or the ingredient list on a bag of chips (see the OCR sketch after this list).
- Predicting the direction in which objects or people will move. Agents need to quickly relay information about moving elements of the user’s surroundings, such as pedestrians or cars, so the user can avoid collisions and stay safe. But the researchers found that it is difficult for agents to track these people and objects, and nearly impossible to anticipate their movements (a baseline tracking sketch also follows the list).
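To illustrate why the curved-surface case is hard, here is a minimal sketch of the kind of off-the-shelf OCR pipeline such text would typically pass through, using OpenCV with the Tesseract engine via pytesseract; the input file name is a placeholder. Pipelines like this assume roughly flat, front-facing text, which is precisely the assumption that curved packaging breaks.

```python
import cv2
import pytesseract

# Placeholder frame grabbed from the live video feed.
frame = cv2.imread("pill_bottle.jpg")

# Standard preprocessing: grayscale plus adaptive thresholding copes
# with uneven lighting, but does nothing about surface curvature.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
binary = cv2.adaptiveThreshold(
    gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY, blockSize=31, C=10)

# Tesseract expects flat, front-facing text; characters warped around
# a bottle are where this pipeline starts to fail.
print(pytesseract.image_to_string(binary))
```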
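Likewise, for the motion-prediction problem, a rough baseline sketch using dense optical flow in OpenCV hints at what automated support for an agent might look like; the video source and the naive constant-velocity extrapolation are illustrative assumptions, not the method from the paper.

```python
import cv2

# Placeholder video source standing in for the user's live camera feed.
cap = cv2.VideoCapture("street_scene.mp4")
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Dense optical flow: a per-pixel motion vector between frames.
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Naive constant-velocity prediction: assume the dominant motion
    # continues for one more frame. Real scenes rarely cooperate,
    # which is the crux of the problem the researchers describe.
    dx, dy = flow.reshape(-1, 2).mean(axis=0)
    print(f"dominant motion next frame: ({dx:.1f}, {dy:.1f}) px")
    prev_gray = gray
```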
The researchers believe their work can improve the experience of visually impaired users and agents alike.
“In the future we imagine that we can use computer vision to give the agent a very immersive experience and provide them with the mixed reality technology,” said Rui Yu, a doctoral candidate at IST. “And we will be able to directly help the users get some basic information about their environment based on computer vision technology.”
Sooyeon Lee, a former doctoral student at the College of IST and current postdoctoral researcher at the Rochester Institute of Technology, and Jingyi Zhang, a doctoral student in informatics, were also part of the study, which was funded by the U.S. National Institutes of Health and the National Library of Medicine.