It has finally happened - the Kinect motion-sensing gadget from Microsoft has been hooked up to control avatars in virtual worlds that use the Second Life viewer:


Thanks to New World Notes for spotting this!

This kind of thing tends to get people very excited, and rightly so - making interaction with 3D virtual environments more intuitive and less clunky is key to realising the benefits that these environments hold. I'm easily swept up by the giddy shiny too, imagining being able to have all my movements and facial gestures immediately reflected in my avatar, visible to everyone, under my control (turn down transmission of grumpy, accentuate transmission of happy, apply an improve-posture algorithm, etc.). This is maybe a step along the way, but before getting too excited it's worth understanding exactly what is happening here and what the limitations currently are.

Currently in Second Life and OpenSim, only the position and rotation of the avatar are sent to clients - when my avatar moves, your viewer gets told more or less 'this avatar moved to position x and is facing south'. Nothing about the position of joints or limbs is transmitted.
When you see animations, even the walking motion, it is because the server sends a message to all the clients telling them to start playing a certain animation - a rigid, predefined series of steps or frames for the avatar. This even extends to walking - when you start moving, the server sends a message to the viewers of all the other nearby avatars to say 'start playing the walking animation for this AV, and update his position to ...'.
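
To make that split concrete, here is a minimal sketch (in Python, with made-up message names - this is not the actual Second Life protocol) of roughly what each viewer is told:

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-ins for the kind of updates a viewer receives.
# The real Second Life/OpenSim protocol uses different message names and fields;
# the point is only that no per-joint data is included.

@dataclass
class AvatarPositionUpdate:
    avatar_id: str
    position: tuple[float, float, float]          # world coordinates
    rotation: tuple[float, float, float, float]   # orientation as a quaternion

@dataclass
class PlayAnimationMessage:
    avatar_id: str
    animation_id: str   # names a pre-recorded animation asset

# When my avatar starts walking, nearby viewers get something like:
updates = [
    AvatarPositionUpdate("my-avatar", (128.0, 64.0, 22.5), (0.0, 0.0, 0.7, 0.7)),
    PlayAnimationMessage("my-avatar", "built-in-walk"),
]
# Note: nothing here describes where the arms, legs or head actually are.
```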

This has several repercussions. Firstly, the range of movements that your avatar can express is limited to the animations that you have access to. However, one of the great things about Second Life and OpenSim is that users can create their own animations and make them available to others, just as they can with other objects. But those animations are still pre-recorded, fixed sequences - all you can do is effectively press 'play' (although some subtle effects can be achieved by blending animations together, I think).
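
On the blending point, here is a rough illustration, using a hypothetical pose format rather than the viewer's real animation system, of what blending two pre-recorded frames amounts to - a weighted mix of fixed data, not new motion:

```python
# Hypothetical: a pose is a mapping of joint name -> rotation angles (degrees).
# Blending two pre-recorded poses is just a weighted mix per joint; you still
# cannot produce any movement that neither source animation contains.

def blend_poses(pose_a: dict, pose_b: dict, weight: float) -> dict:
    """Linear blend of two poses; weight=0 gives pose_a, weight=1 gives pose_b."""
    return {
        joint: tuple((1 - weight) * a + weight * b
                     for a, b in zip(pose_a[joint], pose_b[joint]))
        for joint in pose_a
    }

rest_frame = {"right_shoulder": (0.0, 0.0, 0.0),  "right_elbow": (0.0, 0.0, 0.0)}
wave_frame = {"right_shoulder": (0.0, 0.0, 90.0), "right_elbow": (0.0, 45.0, 0.0)}

half_wave = blend_poses(rest_frame, wave_frame, 0.5)
```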

Also, there could be significant lag or time differences between the arrival of the 'start playing' message at the viewers of all the nearby avatars. Some animations are built into the viewer, such as walking. Those that aren't need to be sent to the viewer before they can start playing. So if someone starts dancing, there will be a delay before you see that happen - a delay that depends, among other things, on how quickly your viewer can download the details of the animation. Everyone sees animations at slightly different times.
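
A sketch of the viewer-side logic (hypothetical Python, not the actual viewer code) shows where that variable delay comes from:

```python
import time

# Hypothetical cache of animation assets this viewer already holds.
animation_cache: dict[str, bytes] = {"built-in-walk": b"<baked-in animation data>"}

def fetch_animation_asset(animation_id: str) -> bytes:
    """Stand-in for downloading animation data from the asset server."""
    time.sleep(0.5)  # network delay varies per viewer, hence the staggered playback
    return b"<downloaded animation data>"

def start_playing(animation_data: bytes) -> None:
    print(f"playing {len(animation_data)} bytes of animation")

def handle_play_animation(animation_id: str) -> None:
    if animation_id not in animation_cache:
        # Viewers that have never seen this animation must download it first,
        # so each one starts playing at a slightly different moment.
        animation_cache[animation_id] = fetch_animation_asset(animation_id)
    start_playing(animation_cache[animation_id])

handle_play_animation("some-custom-dance")
```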

This all puts several layers of separation between the motion you imagine your avatar having and what everyone else actually sees.

The introduction of the Kinect in the video above actually adds another layer on top. The Kinect device, plus various pieces of open source software, allows the PC to recognise a set of gestures made by the user and convert those gestures into instructions to the viewer, exactly as if you had pressed the relevant key on the keyboard. In some cases the gestures have been set up to trigger the playing of animations as described above - so waving triggers the playing of a 'wave' animation. In other cases, the gestures are being used to drive the user interface rather than the avatar - moving the camera viewpoint, for example. You could imagine gestures for opening the inventory, taking a snapshot and so on.
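
In other words, the gesture recognition sits outside the viewer and simply injects the same inputs you could have produced by hand. A sketch of that kind of mapping (the gesture names and bindings here are made up, not taken from any specific project):

```python
# Hypothetical mapping from recognised Kinect gestures to viewer actions.
# Conceptually this is what the stack in the video does: the viewer itself
# has no idea a Kinect is involved.

GESTURE_TO_ACTION = {
    "wave_right_hand": ("trigger_animation", "wave"),     # play a canned animation
    "push_forward":    ("send_keypress", "W"),            # walk forward
    "swipe_left":      ("send_keypress", "LEFT_ARROW"),   # turn / move the camera
    "raise_both_arms": ("ui_command", "open_inventory"),  # drive the UI instead
}

def on_gesture_detected(gesture: str) -> None:
    action, argument = GESTURE_TO_ACTION.get(gesture, ("ignore", None))
    print(f"gesture '{gesture}' -> {action}({argument})")

on_gesture_detected("wave_right_hand")
```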

Naturally you might imagine the next step is to have the avatar ape the exact movements of the user. All we have to do is hack the code of the viewer to allow it to get the avatar pose directly from the Kinect, right? Actually, no. You could potentially do that - and have your avatar, as shown to you by your client, moving in sync with your body as captured by the Kinect. This has already been done with other motion capture technology. But then all of that avatar body and limb position information would have to be sent in real time to all the other nearby clients, and there is simply no mechanism in Second Life or OpenSim to do this at the moment. There was a big project a few years ago to introduce full puppeteering into Second Life - it would have included the necessary streaming of avatar motion, but also wonderful things like rag-doll physics and the ability to manipulate the avatar freely. A comprehensive post on the work, including demo videos, is available at avatarpuppeteering.com. According to the SL wiki the project was put on permanent hold, and the link to the source code seems broken.
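
To give a sense of what that streaming of avatar motion would involve, here is a back-of-envelope sketch with made-up field names - no message like this exists in the current Second Life/OpenSim protocol:

```python
from dataclasses import dataclass

# Hypothetical per-frame pose update that real-time puppeteering would need,
# sent many times a second to every nearby viewer.

@dataclass
class JointRotation:
    joint_name: str                               # e.g. "left_elbow"
    rotation: tuple[float, float, float, float]   # quaternion

@dataclass
class AvatarPoseUpdate:
    avatar_id: str
    timestamp_ms: int
    joints: list[JointRotation]   # roughly 20 joints from a Kinect-style skeleton

# Rough bandwidth estimate: 20 joints * 4 floats * 4 bytes = 320 bytes per frame;
# at 30 frames per second that is around 10 KB/s per avatar, to every nearby viewer.
```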

Given the huge success of the Kinect, the high level of interest in it from the PC, gaming and virtual worlds communities, and the recent suggestions that Microsoft would release a Windows SDK for the Kinect, I'd love to see this puppeteering project revived with a focus on real-time avatar motion capture and streaming. That is unlikely, so maybe the code could be made available so that others can continue the work. There are of course many challenges - how do you mix control of the UI with control of the avatar? How do you control the extent to which the avatar copies the user's movements? However, at a time when the focus seems to be shifting away from PC-based, fully immersive virtual experiences towards mobile-device-based toys, this kind of immersion-boosting technology would make a PC-based rich client very relevant. And maybe get people dreaming again.