Have we got a video (2)?

Yes, but I'm not posting it yet *grin*

A very frustrating bug cropped up when I tried tying the camshift based detector into the marker tracking service - only 1/3 of the marker co-ordinate updates were being processed! Sure, my code is ugly, inefficient and leaking memory left, right and centre, BUT that's no reason to just silently discard the data I'm generating (and yes, I am generating what I think I'm generating). I strongly suspect the culprit is asyncproc - I've had some experience before with trying to parse data via pipes and hence know it's... not the preferred way to do things, but proof-of-concept wise I'd hoped it would save the hassle of having to get processes talking to each other properly. *sigh* "If it's worth doing once, it's worth doing right."
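For the record, the "right" way I have in mind is just a dedicated reader thread consuming the tracker's stdout, rather than polling via asyncproc. A minimal sketch only - the `./tracker` name and the one-"x1 y1 x2 y2"-line-per-frame format are illustrative assumptions, not my actual setup:

```python
import subprocess
import threading

def read_markers(proc, on_update):
    """Consume every line of the tracker's stdout as it arrives.

    A blocking readline loop in its own thread sees every update;
    nothing gets dropped between polls the way it can with polling.
    """
    for line in iter(proc.stdout.readline, b''):
        parts = line.split()
        if len(parts) == 4:  # assumed wire format: "x1 y1 x2 y2" per frame
            on_update([int(p) for p in parts])

# './tracker' is a stand-in name for the camshift tracker binary.
proc = subprocess.Popen(['./tracker'], stdout=subprocess.PIPE)
reader = threading.Thread(target=read_markers, args=(proc, print))
reader.daemon = True
reader.start()
```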

Anyways, I've worked around it, and have the basics up and running. What are the basics?

- Basic gmail reader. Main purpose here is to look at pinch scrolling.

- Basic notifier. Shows new mails as they arrive. Purpose is to examine if the pico-projector provides efficient support for micro-interactions.

- Basic music player. Main purpose here is to test simple gesture control.

- Basic text input area. Main purpose is to test the air-writing concept.

The test runs I've done so far suggest that there's work to be done, but the basic concept is sound and the hardware/software is sufficient for a small-scale study. Some thoughts:

- Dwell regions need to be active rather than passive. By this I mean that it's too easy to enter a region and unintentionally execute the associated behaviour. Requiring that the markers are engaged while within the dwell region should address this (I've sketched what I mean in code after this list).

- Engaged distance needs to be a function of the area of the fiducials, otherwise the engaged state triggers at different marker separations depending on the distance from the camera.

- If the application supports any form of position-based input (whether it be dwell regions, hover buttons, clicks etc.), use positions substantially away from the edge of the "screen".

- Marker trails: I'm actually finding these confusing, in part because there are two of them. I had thought a while ago about drawing a "pointer" midway between the two fiducials when in the "engaged" state. I think I need to experiment with that.

- I need to work on this concept of "command mode" a bit more. The basic concept I've been working on is that the user draws a circle, and then either a word or a symbol (e.g. a music note), to execute a command. "Commands" in this context are things such as switching between "applications". While it works reasonably well, at the moment the main module interprets gestures and then passes a text string to the application; given the way the gesture recogniser works, performance would improve if the unistroke recogniser used a reduced set of gestures applicable to each module (see the second sketch below). I think this would improve gesture recognition substantially, but I need to get some metrics to test that.

- Performance: performance is DIRE! I've managed to do some rather nifty things with Python that I had thought would require C++, but the slapdash approach I've taken to this prototype framework has things running at about 10% of the speed it should.
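To make the dwell-region, engage-distance and mid-point cursor ideas concrete, here's roughly the shape of the logic I have in mind - purely a sketch, with illustrative names and thresholds rather than anything lifted from my actual code:

```python
import math
import time

ENGAGE_RATIO = 1.5  # illustrative: engage when the gap is < 1.5x marker width

def marker_width(area):
    """Approximate a fiducial's on-screen width from its tracked blob area."""
    return math.sqrt(area)

def is_engaged(p1, p2, area1, area2):
    """Engaged when the gap between markers is small *relative to marker
    size*, so the threshold scales with distance from the camera."""
    gap = math.hypot(p1[0] - p2[0], p1[1] - p2[1])
    mean_width = (marker_width(area1) + marker_width(area2)) / 2.0
    return gap < ENGAGE_RATIO * mean_width

def pointer(p1, p2):
    """Single cursor drawn midway between the two fiducials."""
    return ((p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0)

class DwellRegion(object):
    """An *active* dwell region: the dwell timer only runs while engaged,
    so passing through the region disengaged never fires the action."""

    def __init__(self, rect, dwell_secs, action):
        self.rect = rect  # (x, y, w, h)
        self.dwell_secs = dwell_secs
        self.action = action
        self._entered = None

    def update(self, pos, engaged):
        x, y, w, h = self.rect
        inside = engaged and x <= pos[0] <= x + w and y <= pos[1] <= y + h
        if not inside:
            self._entered = None
        elif self._entered is None:
            self._entered = time.time()
        elif time.time() - self._entered >= self.dwell_secs:
            self._entered = None  # re-arm rather than re-firing every frame
            self.action()
```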
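And on the command mode point, the change is small: give each module a recogniser built from only the templates that module cares about, rather than one global recogniser matching everything. Again a sketch - `Recogniser` here is a stand-in for whatever $1-style unistroke recogniser is in use, and the template names and points are made up:

```python
import math

class Recogniser(object):
    """Stand-in for a $1-style unistroke recogniser: it only ever
    matches against the templates it was constructed with."""

    def __init__(self, templates):
        self.templates = templates  # {name: list of (x, y) points}

    def recognise(self, stroke):
        # Naive nearest-template match; a real $1 recogniser would
        # resample, rotate, scale and translate the stroke first.
        def dist(a, b):
            return sum(math.hypot(p[0] - q[0], p[1] - q[1])
                       for p, q in zip(a, b))
        return min(self.templates,
                   key=lambda name: dist(stroke, self.templates[name]))

# Illustrative per-module vocabularies; real templates are recorded strokes.
MUSIC_TEMPLATES = {'play': [(0, 0), (1, 1)], 'skip': [(0, 1), (1, 0)]}
MAIL_TEMPLATES = {'open': [(0, 0), (1, 0)], 'archive': [(0, 0), (0, 1)]}

music_recogniser = Recogniser(MUSIC_TEMPLATES)
mail_recogniser = Recogniser(MAIL_TEMPLATES)
```

Shrinking each module's search space like this should cut the between-module confusions, which is exactly what the metrics need to confirm.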

Another long night ahead of me *sigh*

UPDATE: Frustratingly, I've found that the camshift tracker is also very lighting dependent. What was a reasonably positive experience under daytime lighting conditions deteriorated rapidly once night fell and I attempted to use the code under artificial lighting. No amount of tweaking parameters or changing marker colour would rectify things:
- The yellow marker was reasonably stable, as good as the light green marker I had been using for the HSV tracker. That said, under my home's lighting white takes on a yellowish hue, and hence the tracker would get confused on areas of white.
- Red/orange/pink were all terrible, frequently being confused with skin tones, areas of my carpet or lights on my laptop.
- Dark green was not detected at all.
- Dark blue frequently became confused with my laptop screen, laptop keyboard or areas of my (black) t-shirt.

So the biggest problem the project has is that implementing a robust system is VERY difficult, which is going to make user testing potentially tricky.

THANKFULLY this has all framed my user study very tightly: a within-group study looking at natural gesture drawing compared to use of the system; command gestures; a limited number of letters, words and sentences; some basic interface command and control tasks; and videos on public/private perception of WGI usage, one set using an "exaggerated" UI (e.g. SixthSense) compared to the discreet "OS6Sense" style (do people even perceive it as being discreet?).

Only 4 weeks later than I had initially intended doing it *sigh*

Oh, the video? Hmmmm, not today - there are still a few tweaks I need to make before putting it online (e.g. putting something other than Jakalope in my media dir... I like Jakalope, but it's their latest album and I can't help but feel that I'm listening to Britney Spears! The shame!)
