New research leads to human interactions being recognised by computers

Thanks to new research led by DPhil student Alonso Patron-Perez of the Department of Engineering Science, computers can now automatically recognise human interactions, such as handshakes and high fives, in video footage. The work is part of research to enable computers to automatically analyse the content of the vast amount of video footage generated from sources such as TV, films, YouTube and CCTV.

Image: Hug identified by computer © OU/ABC

Alonso Patron-Perez said: “Human actions and activities are of central importance in video analysis. This new work makes it possible to recognise two-person human interactions, such as hugs, kisses and handshakes, automatically. Once you can recognise these interactions the applications are numerous: for instance, you could automatically search home videos and YouTube for kisses and handshakes, or even fast-forward CCTV to find incidents.”

The method, developed by an Oxford University team including Alonso Patron-Perez, Dr Ian Reid, Dr Marcin Marszalek, and Professor Andrew Zisserman, is built on algorithms from computer vision and machine learning.

Teaching computers to recognise the interactions involves a number of steps: first, humans are detected and tracked through the video footage; then, once the position of the humans in the video is established, different cues such as head orientation and relative motion of people’s bodies are computed to determine if an interaction occurs and, if it does, what kind of interaction it is.
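To make the second step more concrete, here is a minimal Python sketch of how cues of this kind could be computed for two people who have already been detected and tracked. It is an illustration under stated assumptions, not the team's implementation: the `TrackPoint` structure, the `pair_cues` function and the three cue names are hypothetical, and head orientation is simplified to a single angle in the image plane.

```python
import math
from dataclasses import dataclass


@dataclass
class TrackPoint:
    """One tracked person in one frame (hypothetical structure)."""
    x: float           # horizontal position in the frame, in pixels
    y: float           # vertical position in the frame, in pixels
    head_angle: float  # direction the head faces, in radians (assumed given)


def pair_cues(a_prev, a_curr, b_prev, b_curr):
    """Compute simple relative cues for persons A and B across two frames."""
    dist_prev = math.hypot(b_prev.x - a_prev.x, b_prev.y - a_prev.y)
    dist_curr = math.hypot(b_curr.x - a_curr.x, b_curr.y - a_curr.y)

    # Relative motion: positive when the two people are moving closer together.
    closing_speed = dist_prev - dist_curr

    # Direction from each person towards the other in the current frame.
    bearing_a_to_b = math.atan2(b_curr.y - a_curr.y, b_curr.x - a_curr.x)
    bearing_b_to_a = math.atan2(a_curr.y - b_curr.y, a_curr.x - b_curr.x)

    # Head orientation cue: 1.0 when a head points straight at the other
    # person, -1.0 when it points directly away.
    facing_a = math.cos(a_curr.head_angle - bearing_a_to_b)
    facing_b = math.cos(b_curr.head_angle - bearing_b_to_a)

    return {
        "distance": dist_curr,
        "closing_speed": closing_speed,
        # The weaker of the two scores: high only if both face each other.
        "mutual_facing": min(facing_a, facing_b),
    }
```

In a sketch like this, two people who approach one another while facing each other would produce a positive `closing_speed` and a `mutual_facing` value close to 1, which is the sort of evidence a later stage could use to decide whether an interaction is taking place.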

All this information is computed for several examples of each interaction (the team has focused on four interactions so far: handshakes, high fives, hugs and kisses), and machine learning methods are then used to learn a model for each interaction from these examples.
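The idea of learning one model per interaction from labelled examples can be sketched as follows. The article does not say which machine learning methods the team used, so this sketch swaps in a deliberately simple stand-in, a nearest-centroid classifier. The function names and the toy feature values (distance, closing speed, mutual facing) are invented purely for illustration and are not data from the research.

```python
import math
from collections import defaultdict


def fit_centroids(examples):
    """Learn one 'model' per interaction: the mean feature vector of its examples.

    examples: list of (label, feature_vector) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for label, vec in examples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the interaction whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Toy, made-up examples: [distance, closing_speed, mutual_facing].
toy_examples = [
    ("handshake", [60.0, 1.0, 0.9]),
    ("handshake", [70.0, 1.5, 0.8]),
    ("hug", [10.0, 4.0, 0.9]),
    ("hug", [14.0, 5.0, 1.0]),
]

model = fit_centroids(toy_examples)
print(predict(model, [15.0, 4.2, 0.9]))  # prints "hug"
```

A real system would need many more examples per interaction and a far richer description of each one, but the overall shape is the same: compute cues for labelled examples, learn a model for each interaction, then compare new footage against those models.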

Alonso Patron-Perez said: “Once a computer has learnt these models, human interactions can then be located and recognised in new videos, with the computer able to determine when in the video interactions occur, which people are interacting and what kind of interactions are involved. This work enables computers to make sense of how people are behaving in video footage in a way that has simply not been possible before.”

Alonso’s work also won the best industrial paper award at the British Machine Vision Conference in 2010.