
Finally...

The children are back at school and I'm back from my hols (a rather interesting time in Estonia, if you're interested).

I've spent most of the last week becoming increasingly frustrated with my attempts at image segmentation. I've moved to a C++ implementation for speed and, while the VERY simplistic HSV segmentation technique I am using works, the problem is that I cannot get it to work robustly and doubt that it ever will.
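For anyone wondering what "very simplistic HSV segmentation" boils down to, here is a minimal sketch in plain C++. It is not my actual tracker code: the names `HsvRange` and `segmentCentroid`, the flat RGB buffer layout and the threshold values are all assumptions made for illustration, and a real implementation would more likely lean on a vision library. The idea is just to convert each pixel to HSV, keep the pixels that fall inside a calibrated band, and take the centroid of whatever survives.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Hue in degrees [0, 360), saturation and value in [0, 1].
struct Hsv { double h, s, v; };

// Standard RGB -> HSV conversion.
Hsv toHsv(Rgb p) {
    const double r = p.r / 255.0, g = p.g / 255.0, b = p.b / 255.0;
    const double maxc = std::max({r, g, b});
    const double minc = std::min({r, g, b});
    const double delta = maxc - minc;
    double h = 0.0;
    if (delta > 0.0) {
        if (maxc == r)      h = 60.0 * std::fmod((g - b) / delta, 6.0);
        else if (maxc == g) h = 60.0 * ((b - r) / delta + 2.0);
        else                h = 60.0 * ((r - g) / delta + 4.0);
        if (h < 0.0) h += 360.0;
    }
    const double s = (maxc > 0.0) ? delta / maxc : 0.0;
    return {h, s, maxc};
}

// Hypothetical calibration band for the marker colour.
// Assumes hMin <= hMax, i.e. no wrap-around through red.
struct HsvRange { double hMin, hMax, sMin, vMin; };

// pixels == 0 means nothing in the image matched the band.
struct Centroid { double x, y; std::size_t pixels; };

// image is a row-major width * height buffer (an assumed layout).
Centroid segmentCentroid(const std::vector<Rgb>& image, std::size_t width,
                         std::size_t height, const HsvRange& range) {
    assert(image.size() == width * height);
    double sumX = 0.0, sumY = 0.0;
    std::size_t count = 0;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const Hsv c = toHsv(image[y * width + x]);
            const bool inBand = c.h >= range.hMin && c.h <= range.hMax &&
                                c.s >= range.sMin && c.v >= range.vMin;
            if (inBand) {
                sumX += static_cast<double>(x);
                sumY += static_cast<double>(y);
                ++count;
            }
        }
    }
    if (count == 0) return Centroid{0.0, 0.0, 0};
    return Centroid{sumX / count, sumY / count, count};
}
```

Note that every pixel inside the band counts, wherever it is in the frame, so a second green-ish object simply drags the centroid towards itself. That is one reason a scheme this simple is hard to make robust.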

I've now covered the range of available techniques, and even tried to plumb the depths of the just-emerging ones, and it seems that every computer-vision-based object tracking implementation or algorithm suffers from the same robustness problem (OpenTLD, CamShift, Touchless, HSV segmentation, cvBlob etc.). YES, it can be made to work, but the issues include (depending on the algorithm):

- Object drift: over time the target marker ceases to be recognised and other objects become the target focus.
- Multiple objects: while the camera is moving, new objects appear in frame, some of which cannot be differentiated from the target.
- Target object loss: due to changes in size, lighting, speed etc. the target is lost completely.
- Target jitter: the centroid of the target cannot be determined accurately, so it jumps about from frame to frame.

I'll expand that list as I think of more.
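On the jitter item, one common mitigation (not something I have said I am using, just a sketch) is to low-pass filter the measured centroid with an exponential moving average. The class name `CentroidSmoother` and the parameter `alpha` are made up for this example.

```cpp
#include <cassert>

struct Point { double x, y; };

// Exponential moving average over successive centroid measurements.
// alpha close to 1 trusts each new measurement (little smoothing);
// alpha close to 0 smooths heavily but lags behind real movement.
class CentroidSmoother {
public:
    explicit CentroidSmoother(double alpha) : alpha_(alpha) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feed in one raw measurement, get back the smoothed position.
    Point update(Point measured) {
        if (!initialised_) {
            // First sample: nothing to blend with yet.
            state_ = measured;
            initialised_ = true;
        } else {
            state_.x += alpha_ * (measured.x - state_.x);
            state_.y += alpha_ * (measured.y - state_.y);
        }
        return state_;
    }

    // Call when the target is lost so stale state is not blended in.
    void reset() { initialised_ = false; }

private:
    double alpha_;
    Point state_{0.0, 0.0};
    bool initialised_ = false;
};
```

The trade-off is lag: the more the jitter is smoothed away, the further the cursor trails a fast-moving marker. It also does nothing for drift or outright target loss, which are failures of detection rather than of measurement noise.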

So basically, given a semi-static camera, a semi-uniform "background" and uniform lighting, an object can be tracked with some degree of reliability.

It's also worth noting that two variables, fiducial colour and lighting uniformity, have the largest impact on tracking reliability. I was incredibly optimistic this week when I tried to segment a light green pen top and found that it tracked very accurately during one experiment; but when I returned to the same code and object later in the day, under different lighting conditions, reliability fell massively until I recalibrated.
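To make "recalibrated" a bit more concrete, here is one way a recalibration step could look. This is an assumed approach for illustration, not a description of my actual procedure: sample some pixels known to belong to the marker under the current lighting, then rebuild the threshold band from their statistics. The names `HsvSample`, `HsvBand` and `calibrate`, and the parameters `k` and `margin`, are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hue in degrees [0, 360), saturation and value in [0, 1].
struct HsvSample { double h, s, v; };

// Threshold band to use for segmentation after calibration.
struct HsvBand { double hMin, hMax, sMin, vMin; };

// Build a band from pixels known to belong to the marker.
// Hue band: mean +/- k standard deviations.
// Saturation/value floors: lowest sampled value minus a safety margin.
// Assumes the marker hue sits well away from the 0/360 wrap-around
// (fine for green, wrong for red).
HsvBand calibrate(const std::vector<HsvSample>& samples, double k, double margin) {
    assert(!samples.empty());
    double meanH = 0.0, minS = 1.0, minV = 1.0;
    for (const HsvSample& p : samples) {
        meanH += p.h;
        minS = std::min(minS, p.s);
        minV = std::min(minV, p.v);
    }
    meanH /= static_cast<double>(samples.size());

    double variance = 0.0;
    for (const HsvSample& p : samples) {
        variance += (p.h - meanH) * (p.h - meanH);
    }
    const double sd = std::sqrt(variance / static_cast<double>(samples.size()));

    return HsvBand{std::max(0.0, meanH - k * sd),
                   std::min(360.0, meanH + k * sd),
                   std::max(0.0, minS - margin),
                   std::max(0.0, minV - margin)};
}
```

With only a handful of near-identical samples the hue band collapses to almost nothing, so in practice the samples need to cover the lit and shaded sides of the marker. A band like this is only valid for the lighting it was sampled under, which is exactly the fragility described above.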

I am unsure of how to proceed next, I must admit; while I didn't expect things to be 100% reliable, I had expected one of the available techniques to produce better results than I have had so far. If I had more raw power to throw at things (and more raw time) I'd return to looking at some of the AI techniques (deep convolutional networks) as well as the somewhat simpler SIFT/SURF implementations, but sadly I am out of time for this portion of my research (and in many ways it's THE most crucial aspect)...
