
Hand tracking

My initial attempt at bolting all the bits together has run into a delay, since I've decided to try to epoxy-weld some parts together and don't have any suitable epoxy, so yesterday my focus was on marker-less hand tracking.

There are a lot of impressive videos on YouTube showing hand tracking, but when you throw in a moving background, varying lighting and a video stream that 99.999% of the time won't contain the object you want to track (but may contain hundreds of thousands of similar objects), things don't tend to work as well as we would like.

The last time I looked at hand tracking was probably 5 years ago, and I wasn't too impressed by the reliability or robustness of the various techniques on offer; I can't say I'm any more impressed today. I recently had a quick look at various rapid approaches: Touchless works but is marker based; TLD works but loses the target in long videos and ends up learning a different target (though it might be applicable if some form of re-initialisation were employed); and HandVu (which I had high hopes for) was too sensitive to lighting. As I said, these were quick looks, and I will revisit TLD at least in the near future.
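To give a flavour of the re-initialisation idea, here's a rough sketch, assuming the OpenCV contrib tracking module's cv::TrackerTLD (an assumption on my part; the original authors' implementation would work just as well). The redetect() function is a hypothetical placeholder for any independent detector that can re-seed the tracker once it reports failure:

#include <opencv2/opencv.hpp>
#include <opencv2/tracking.hpp> // opencv_contrib tracking module

// Hypothetical stand-in for an independent detector (a cascade, a colour
// model, etc.) that proposes a fresh bounding box once TLD has lost the
// target. Left as a stub here.
static bool redetect(const cv::Mat& /*frame*/, cv::Rect2d& /*box*/)
{
    return false; // plug a real detector in here
}

int main()
{
    cv::VideoCapture cap(0);
    cv::Mat frame;
    if (!cap.read(frame)) return 1;

    // Manual initialisation on the first frame.
    cv::Rect2d box = cv::selectROI("init", frame);
    cv::Ptr<cv::Tracker> tracker = cv::TrackerTLD::create();
    tracker->init(frame, box);

    while (cap.read(frame)) {
        if (tracker->update(frame, box)) {
            cv::rectangle(frame, box, cv::Scalar(0, 255, 0), 2);
        } else if (redetect(frame, box)) {
            // Target lost: re-seed with a fresh model rather than letting
            // TLD keep learning whatever it has drifted onto.
            tracker = cv::TrackerTLD::create();
            tracker->init(frame, box);
        }
        cv::imshow("tracking", frame);
        if (cv::waitKey(1) == 27) break; // Esc to quit
    }
    return 0;
}

The point of discarding and recreating the tracker, rather than just updating it, is to throw away whatever appearance model TLD has learned since it drifted.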
MIT budget "data-glove"

While I don't want to use fiducial markers, when MIT are proposing the use of gloves that even my wife would be ashamed to wear (and she loves colourful things) in order to improve recognition accuracy, well, one has to realise that we just haven't solved this problem yet.

Just how bad is it though? Well, there have been multiple studies [cite] investigating skin detection and proposing values for HSV skin segmentation, and the theory behind a lot of this work looks solid (e.g. HS values for skin fall within a narrow range because skin pigmentation is the result of blood colour and the amount of melanin [cite]), but throwing my training samples (incredibly cluttered backgrounds, variable lighting conditions) at them produces far too many false positives and false negatives to be of any practical value. Looking at the underlying RGB and HSV values also suggests that this approach will be of little practical application in "everyday" scenarios, hence I'll be moving on to fiducial markers today.
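For the curious, the kind of segmentation I tested boils down to something like the following minimal OpenCV sketch. The HSV bounds here are illustrative values of the sort the literature proposes, not ones I'm endorsing; they are exactly what fails on cluttered backgrounds:

#include <opencv2/opencv.hpp>

// Minimal HSV skin segmentation of the kind discussed above. The bounds
// are illustrative: in practice they admit far too much of a cluttered
// background (false positives) and miss skin under strong or coloured
// lighting (false negatives).
int main()
{
    cv::VideoCapture cap(0);
    cv::Mat frame, hsv, mask;

    while (cap.read(frame)) {
        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);

        // Hue roughly 0-20 (OpenCV hue runs 0-179), moderate saturation,
        // any reasonably bright value.
        cv::inRange(hsv, cv::Scalar(0, 40, 60), cv::Scalar(20, 150, 255), mask);

        // Clean up speckle before looking for hand-sized blobs.
        cv::morphologyEx(mask, mask, cv::MORPH_OPEN,
                         cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5)));

        cv::imshow("skin mask", mask);
        if (cv::waitKey(1) == 27) break; // Esc to quit
    }
    return 0;
}

Point it at any scene with wooden furniture or a beige wall and the mask lights up everywhere, which is the false-positive problem in a nutshell.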

