
Hand tracking

My initial attempt at bolting all the bits together has run into a delay: I've decided to try to epoxy-weld some parts together and don't have any suitable adhesive, so yesterday my focus was on marker-less hand tracking.

There are a lot of impressive videos on YouTube showing hand tracking, but when you throw in a moving background, varying lighting, and a video stream that 99.999% of the time won't contain the object you want to track (but may contain hundreds of thousands of similar objects), things don't tend to work as well as we would like.

The last time I looked at hand tracking was probably five years ago, and I wasn't too impressed by the reliability or robustness of the various techniques on offer; I can't say I'm any more impressed today. I recently had a quick look at several approaches: Touchless works, but is marker-based; TLD works, but loses the target in long videos and ends up learning a different target (though it might be applicable if some form of re-initialisation were employed); and HandVu (which I had high hopes for) was too sensitive to lighting. As I said, these were quick looks, and I will revisit TLD at least in the near future.
MIT budget "data-glove"

While I don't want to use fiducial markers, when MIT are proposing the use of gloves that even my wife would be ashamed to wear (and she loves colourful things) in order to improve recognition accuracy, one has to conclude that we just haven't solved this problem yet.

Just how bad is it, though? There have been multiple studies [cite] investigating skin detection and proposing value ranges for HSV skin segmentation, and the theory behind a lot of this work looks solid (e.g. hue and saturation values for skin fall within a narrow range, because skin pigmentation is the result of blood colour plus the amount of melanin [cite]). But throwing my training samples at them (incredibly cluttered background, variable lighting conditions) produces far too many false positives and false negatives to be of any practical value. Looking at the underlying RGB and HSV values also suggests that this approach will be of little practical use in "everyday" scenarios, so I'll be moving on to fiducial markers for today.
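For reference, the kind of classifier these studies propose boils down to a per-pixel HSV range check. Here's a minimal sketch; the hue and saturation thresholds below are illustrative assumptions of my own, not values taken from any particular study:

```python
import colorsys

def is_skin(r, g, b, hue_max_deg=50.0, sat_range=(0.20, 0.70)):
    """Naive HSV skin test of the sort the studies propose.

    The thresholds are assumed for illustration only: skin hue is taken
    to sit between red and yellow-orange (0-50 degrees), with moderate
    saturation. No single range survives cluttered backgrounds and
    variable lighting, which is exactly the problem described above.
    """
    # colorsys works on [0, 1] floats; hue comes back as a fraction of a turn.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    return hue_deg <= hue_max_deg and sat_range[0] <= s <= sat_range[1]

print(is_skin(220, 180, 150))  # a typical skin tone -> True
print(is_skin(50, 80, 200))    # a saturated blue -> False
```

The trouble, as my training samples show, is that any warm, moderately saturated surface (wood, cardboard, a beige wall under tungsten lighting) falls into the same range, which is where the false positives come from.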

