Luigi Seminara

Recognition of actions on objects using Microsoft HoloLens 2




Abstract

The goal of this thesis is to develop a method for recognizing human-object interactions from the 3D coordinates of hand key points captured by the Microsoft HoloLens 2 wearable device. Recognizing interactions with wearable devices makes it possible to build systems that improve the safety of factory workers or assist them during their activities. For example, if a user picks up a drill, the device could display information about it, such as the battery level or a guide on how to use it. Specifically, the proposed method attempts to distinguish three actions: take, release and push. It does so with a recurrent neural network, a Long Short-Term Memory (LSTM) network, which takes the sequences of hand key points as input. The study reported in this document covers the tools used, the phases that led to the results, and the analysis of those results. The results show that none of the models developed solves the problem optimally. However, by recasting the initial problem as two macro problems, it was found that the proposed method can distinguish the moments in which an action is performed from the moments in which no action is performed.
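To make the input and output of such a classifier concrete, the following is a minimal sketch of the forward pass of a single-layer LSTM over a sequence of hand key points, followed by a softmax over the three actions. It is not the model developed in the thesis. The number of key points per frame (26), the hidden size, the sequence length and the random, untrained weights are all assumptions chosen for illustration, and bias terms are omitted for brevity.

```python
import math
import random

ACTIONS = ["take", "release", "push"]  # the three classes named in the abstract


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def flatten_frame(keypoints):
    """Turn [(x, y, z), ...] into a flat feature vector [x0, y0, z0, x1, ...]."""
    return [coord for point in keypoints for coord in point]


class TinyLSTMClassifier:
    """Hypothetical, untrained single-layer LSTM with a softmax output."""

    def __init__(self, input_size, hidden_size, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        self.gates = {
            name: random_matrix(hidden_size, input_size + hidden_size, rng)
            for name in ("input", "forget", "cell", "output")
        }
        self.classifier = random_matrix(num_classes, hidden_size, rng)

    def step(self, x, h, c):
        z = x + h  # list concatenation of the frame features and previous hidden state
        i = [sigmoid(v) for v in matvec(self.gates["input"], z)]
        f = [sigmoid(v) for v in matvec(self.gates["forget"], z)]
        g = [math.tanh(v) for v in matvec(self.gates["cell"], z)]
        o = [sigmoid(v) for v in matvec(self.gates["output"], z)]
        # Cell state: keep part of the old memory, add part of the new candidate.
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, sequence):
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for frame in sequence:
            h, c = self.step(frame, h, c)
        # Classify from the last hidden state with a numerically stable softmax.
        logits = matvec(self.classifier, h)
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


NUM_KEYPOINTS = 26   # assumed number of tracked joints per frame
SEQUENCE_LENGTH = 10  # assumed number of frames per sample

rng = random.Random(1)
# Synthetic stand-in for a recorded sequence of 3D key points.
sequence = [
    flatten_frame([(rng.random(), rng.random(), rng.random())
                   for _ in range(NUM_KEYPOINTS)])
    for _ in range(SEQUENCE_LENGTH)
]

model = TinyLSTMClassifier(input_size=NUM_KEYPOINTS * 3, hidden_size=8,
                           num_classes=len(ACTIONS))
probs = model.predict_proba(sequence)
predicted = ACTIONS[probs.index(max(probs))]
```

Because the weights are random, the predicted label carries no meaning here. The sketch only shows the data flow: each frame becomes a flat vector of 3D coordinates, the LSTM consumes the frames in order, and the final hidden state is mapped to one probability per action.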