Proc SIGCHI Conf Hum Factor Comput Syst. 2019 May:2019:336.
doi: 10.1145/3290605.3300566.

Hands Holding Clues for Object Recognition in Teachable Machines

Kyungjun Lee et al. Proc SIGCHI Conf Hum Factor Comput Syst. 2019 May.

Abstract

Camera manipulation confounds the use of object recognition applications by blind people. This is exacerbated when photos from this population are also used to train models, as with teachable machines, where out-of-frame or partially included objects against cluttered backgrounds degrade performance. Leveraging prior evidence on the ability of blind people to coordinate hand movements using proprioception, we propose a deep learning system that jointly models hand segmentation and object localization for object classification. We investigate the utility of hands as a natural interface for including and indicating the object of interest in the camera frame. We confirm the potential of this approach by analyzing existing datasets from people with visual impairments for object recognition. With a new publicly available egocentric dataset and an extensive error analysis, we provide insights into this approach in the context of teachable recognizers.

Keywords: blind; egocentric; hand; k-shot learning; object recognition.
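
The abstract and Figure 2 describe a three-step pipeline: a hand segmentation model is fine-tuned to estimate the center of the object held near the hand, a fixed-size box placed at that center crops the object out of the frame, and the crop is passed to a classifier. The following is a minimal sketch of that flow, assuming PyTorch; the module definitions, the 128-pixel crop, and names such as CenterEstimator, crop_around_center, and HandGuidedRecognizer are illustrative placeholders, not the authors' implementation (only the 19-object count is taken from Figure 5).

    # Minimal sketch of the hand-guided pipeline in Figure 2 (illustrative, not the paper's code).
    import torch
    import torch.nn as nn


    class CenterEstimator(nn.Module):
        """Stand-in for the fine-tuned hand segmentation network (Steps I-II):
        maps an RGB image to a single-channel heatmap whose peak is taken as
        the center of the object held near the hand."""
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 1),
            )

        def forward(self, image):                     # image: (B, 3, H, W)
            return self.backbone(image)               # heatmap: (B, 1, H, W)


    def crop_around_center(image, heatmap, box=128):
        """Step II -> III glue: place a fixed-size box at the heatmap peak and
        crop the image so only the object of interest reaches the classifier."""
        _, _, h, w = image.shape
        flat = heatmap.flatten(1).argmax(dim=1)       # peak index per sample
        ys = torch.div(flat, w, rounding_mode="floor")
        xs = flat % w
        crops = []
        for img, y, x in zip(image, ys, xs):
            top = int(torch.clamp(y - box // 2, 0, h - box))
            left = int(torch.clamp(x - box // 2, 0, w - box))
            crops.append(img[:, top:top + box, left:left + box])
        return torch.stack(crops)


    class HandGuidedRecognizer(nn.Module):
        """Step III: classify the cropped region around the estimated center."""
        def __init__(self, num_classes=19, box=128):  # 19 objects as in Figure 5
            super().__init__()
            self.center = CenterEstimator()
            self.box = box
            self.classifier = nn.Sequential(
                nn.Flatten(), nn.Linear(3 * box * box, num_classes),
            )

        def forward(self, image):
            heatmap = self.center(image)
            crop = crop_around_center(image, heatmap, self.box)
            return self.classifier(crop)


    if __name__ == "__main__":
        model = HandGuidedRecognizer()
        logits = model(torch.randn(2, 3, 256, 256))   # two dummy photos
        print(logits.shape)                           # torch.Size([2, 19])

In the teachable setting the figures describe, presumably only this final classification step is retrained from the user's own k examples (k = 1, 5, 20 in Figure 9), while the hand-guided localization stages stay fixed.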

Figures

Figure 1:
An illustration of our hand-guided object recognition approach on an example from our egocentric dataset. Given a photo of an object held near a hand, the system first identifies the hand and then estimates the object's center; the region around that center is cropped and passed to the recognition model.
Figure 2:
In our approach, a hand segmentation model (Step I) is fine-tuned to estimate the center of the object in proximity to the hand (Step II). A bounding box placed at that center isolates the object and crops the image, which is then passed to the object classification model (Step III).
Figure 3:
An (input, annotation, output) example for our hand segmentation (a) and object localization (b) models.
Figure 4:
Examples from each dataset. Glassense-Vision, VizWiz, and our benchmark examples are selected to include hands.
Figure 5:
Nineteen objects used in our data collection. Objects in the same category are displayed in proximity.
Figure 6:
Positive and negative outputs of our object localization model on the Glassense-Vision and VizWiz datasets.
Figure 7:
Our hand-guided object recognition method (CO) tends to improve recognition accuracy on average for S and B compared to the original HO and O methods.
Figure 8:
The accuracy gain of our method (CO) over HO and O is more pronounced in cluttered backgrounds (wild).
Figure 9:
On average CO outperforms HO and O consistently across training sample sizes k = 1, 5, 20.
Figure 10:
The presence of hands (HO and CO) seems to have a different effect for generic vs. teachable models.
Figure 11:
Positive and negative results on TEgO, with out-of-frame hands in some of the negative examples.
Figure 12:
Confusion matrix for the CO models showing that misclassification occurs among objects of similar shape. Cans and bottles are indicated as “-c” and “-b”, respectively.
