
Posts

Showing posts from October, 2011

Final survey now up

Now the hard part: I have to find some people to participate! I've vastly overstated the amount of time needed to take the surveys, since I know everyone is different in how they approach these. If there's any incentive, you get to see me performing gestures... more likely to be a deterrent given I didn't even shave beforehand! If you wander across this blog before December 1st, please take 20 minutes to participate in one of the surveys; it would be a huge help.

Camshift Tracker v0.1 up

https://code.google.com/p/os6sense/downloads/list I thought I'd upload my tracker; watch the video from yesterday for an example of the sort of performance to expect under optimal conditions! Optimal conditions means stable lighting and removing elements of a similar colour to the one you wish to track. Performance is probably a little worse than (and at best similar to) the Touchless SDK. Under suboptimal conditions... well, it's useless, but then so are most trackers, which is my real complaint about much of the computer vision research out there: not that the algorithms perform poorly, but that there is far too little honesty about just how poorly they perform under non-laboratory conditions. I've a few revisions to make to improve performance and stability, and I'm not proud of the code. It's been... 8 years since I last did anything with C++ and to be frank I'd describe this more as a hack. Once this masters is out of the way I plan to look a…
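For the curious, the colour-based CamShift loop boils down to something like the following minimal sketch in OpenCV's Python bindings (a recent OpenCV build is assumed, and the ROI selection and window names are illustrative rather than taken from the tracker itself):

```python
import cv2
import numpy as np

# Minimal colour-based CamShift tracking loop (illustrative, not the
# uploaded tracker's code). Pick the marker region by hand, build a hue
# histogram for it, then back-project and CamShift every frame.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
x, y, w, h = cv2.selectROI("select marker", frame)
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

track_window = (x, y, w, h)
term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # CamShift adapts the search window's size and orientation to the blob.
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term)
    pts = cv2.boxPoints(rot_rect).astype(np.int32)
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
    cv2.imshow("tracker", frame)
    if cv2.waitKey(30) & 0xFF == 27:      # Esc to quit
        break
```

As the post says, this works nicely with stable lighting and no similarly coloured clutter in view, and degrades quickly outside those conditions.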

Survey 1

Well, with my current need to avoid people as much as possible, I've had to make a last-minute change to my methodology for data gathering. Hopefully I'll be able to mingle with the general populace again next week and do a user study, but this week at least I'm in exile! Hence I have put together 3 surveys, of which the first is online. The first one's quite lengthy, but it would be a huge help if anyone who wanders across this would take 20 minutes to participate. Gesture Survey 1

Video 2!

http://www.youtube.com/watch?v=v_cb4PQ6oRs Took me forever to get around to that one, but I've been trying to solve lots of little problems. There's no sound, so please read my comments on the video for an explanation of what you're seeing. The main issue I'm now having is with the fiducial tracking: the distance between the centroids of the two fiducials is what I use to recognise when a pinch gesture is made. However, distance from the camera causes the area of each fiducial to vary and, at the same time, the often poor quality of the bounding area for each fiducial makes the area vary as well, so I can't get the pinch point to the level where it provides "natural feedback" to the user, i.e. the obvious point where the system's perception and the user's perception should agree is when the user can feel that they have touched their fingers together. As it stands, due to the computer vision problems my system can be off by as much…
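One obvious mitigation is to stop using a fixed pixel distance for the pinch and instead normalise the centroid distance by a scale estimate derived from the fiducial areas, so the threshold tracks distance from the camera. A rough sketch (the function and the 1.2 ratio are illustrative assumptions, not what the prototype does):

```python
import math

def pinch_engaged(c1, c2, area1, area2, ratio=1.2):
    """Rough pinch test: compare the centroid distance against a scale
    estimate derived from the fiducial areas, so the threshold scales with
    distance from the camera rather than being a fixed pixel value."""
    dist = math.hypot(c1[0] - c2[0], c1[1] - c2[1])
    # Square root of the mean area as a proxy for fiducial diameter.
    scale = math.sqrt((area1 + area2) / 2.0)
    return dist < ratio * scale
```

Of course, if the bounding areas themselves are noisy (as described above), the scale estimate inherits that noise, which is exactly the problem.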

More Observations

After this post I AM going to make videos ;) I spent some time doing some basic tests last night under non-optimal (but good) conditions: 1) Double click/single click/long tap/short tap. These can all be supported using in-air interactions and pinch gestures. I'd estimate I had over 90% detection accuracy for everything apart from single click. Single click is harder to do since it can only be flagged after the delay for detecting a double click has expired, and this leads to some lag in the responsiveness of the application (see the sketch below). 2) The predator/planetary cursor design. In order to increase the stability of my primary marker when only looking at a single point, e.g. when air-drawing, I decided to modify my cursor design. I feel that both fiducial points should be visible to the user, but it didn't quite "feel" right to me using either the upper or lower fiducial when concentrating on a single point, hence I've introduced a mid-point cursor that is always 1/2 wa…
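A sketch of what that tap classification looks like in practice - the timing constants are illustrative guesses, and the press/release callbacks are assumed to come from the pinch detector:

```python
import time

class TapClassifier:
    """Classify pinch press/release pairs into tap types. The delayed
    single click is the source of the lag mentioned above: it can only be
    confirmed once the double-click window has expired."""

    DOUBLE_WINDOW = 0.35   # max gap between taps for a double click (s)
    LONG_TAP = 0.60        # a press held longer than this is a long tap (s)

    def __init__(self, emit):
        self.emit = emit            # callback receiving the event name
        self.press_time = None
        self.pending_single = None  # time of an unresolved short tap

    def press(self):
        self.press_time = time.time()

    def release(self):
        now = time.time()
        held = now - self.press_time
        if held >= self.LONG_TAP:
            self.emit("long_tap")
            self.pending_single = None
        elif self.pending_single and now - self.pending_single <= self.DOUBLE_WINDOW:
            self.emit("double_click")
            self.pending_single = None
        else:
            # Can't emit single_click yet: a second tap may still arrive.
            self.pending_single = now

    def poll(self):
        # Called every frame; fires the delayed single click once the
        # double-click window has expired.
        if self.pending_single and time.time() - self.pending_single > self.DOUBLE_WINDOW:
            self.emit("single_click")
            self.pending_single = None
```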

So Where's the Survey/Video?

I've had a very unexpected event happen in that my little one has come down with mumps (and has already mostly recovered from it), and it's something I've never had or been immunised against, hence I've had to cancel the study I had organised for this weekend (it obviously would not be ethical for me to be in close contact with people while I might have a serious communicable illness... I just wish others would take a similar attitude when ill). And I may have to avoid contact with people for up to 3 weeks, since the contagious period is 5 days before developing symptoms and 9 days afterwards, which rather puts a damper on my plans for a user study... 3 weeks from now I had planned to be writing up my analysis, NOT still analysing my results. PANIC! Hence, I've adapted my research plan - I'm going to be putting up a survey this weekend which I'll run for 3 weeks, run a limited (5 users! lol) user study of the prototype just after that, and have to base my results/d…

OmniTouch

Well, there's obviously going to be a flurry of interest in WGIs given the publication of the OmniTouch paper. Brilliant stuff; anyone want to fund me to buy a PrimeSense camera? Seriously though, depth cameras solve a lot of the computer vision problems I have been experiencing, and I was very tempted to work with a Kinect, the problem being that the Kinect's depth perception doesn't work below 50cm, and that would have led to an interaction style similar to Mistry's, one which I have discounted due to various ergonomic and social acceptance factors. If I had access to this technology I would be VERY interested in applying it to the non-touch gestural interaction style I've been working on, since I see the near-term potential of combined projection/WGI in enabling efficient micro-interactions (interactions which take less time to perform than it does to take a mobile phone from your pocket/bag). Anyways, good stuff, and it's nice to see an implementation demonstrating some…

Have we got a video (2)?

Yes, but I'm not posting it yet *grin* A very frustrating bug cropped up when I tried tying the camshift-based detector into the marker tracking service - only 1/3 of the marker co-ordinate updates were being processed! Sure, my code is ugly, inefficient, and leaking memory left, right and centre, BUT that's no reason to just silently discard the data I'm generating (and yes, I am generating what I think I'm generating). I strongly suspect the culprit is asyncproc - I've had some experience before with trying to parse data via pipes and hence know it's... not the preferred way to do things; however, proof-of-concept wise, I hoped it would save the hassle of having to get processes talking to each other. *sigh* "If it's worth doing once, it's worth doing right." Anyways, I've worked around it, and have the basics up and running. What are the basics? - Basic Gmail reader. Main purpose here is to look at pinch scrolling. - Basic notifier. Shows new mails as they…
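For the record, a hedged sketch of the general workaround direction: read the tracker's output on a dedicated thread via plain subprocess rather than asyncproc, pushing every line into a queue so nothing is polled-and-missed. The command name and the "x,y" line format are assumptions, not the actual protocol:

```python
import queue
import subprocess
import threading

def start_tracker(cmd=("./tracker",)):
    """Launch the tracker process and return (proc, queue-of-updates).
    A reader thread blocks on the pipe, so every line is captured."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)
    updates = queue.Queue()

    def reader():
        for line in proc.stdout:      # blocks until a complete line arrives
            updates.put(line.strip())
        updates.put(None)             # sentinel: tracker exited

    threading.Thread(target=reader, daemon=True).start()
    return proc, updates

# Usage: drain the queue in the main loop; no update is silently dropped.
# proc, updates = start_tracker()
# while True:
#     line = updates.get()
#     if line is None:
#         break
#     x, y = map(float, line.split(","))   # assumed "x,y" per line
```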

New Detector Done

Much better, but I'm still not happy with it - camshift + back projection + Kalman means that the marker coordinates are a lot smoother with far less noise (obviously), but the nature of detecting markers in segmented video still leads to a less than robust implementation. There's room for improvement, and I still need to add some form of input dialog for naming markers (and I must confess I am CLUELESS on the C++ side for that... wxWidgets? Qt?), but I'm that little bit happier. As per usual I had hoped for a video, but the lack of a dialog makes configuring things a manual process (I've got basic save/load support working, but given how sensitive this still is to lighting it's a lot of messing around), hence I'm delaying yet again. Given my page views, though, I don't think I will be disappointing many people. What is frustrating is the amount of time I've had to spend on basic work with computer vision rather than looking at the actual interactions for this…
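For anyone wondering what the Kalman half of that pipeline amounts to, here's a minimal constant-velocity sketch for smoothing the 2D centroid that camshift hands back (OpenCV's cv2.KalmanFilter; the noise covariances are illustrative and would need tuning per camera and lighting):

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter for smoothing a 2D marker centroid.
# State is (x, y, vx, vy); the measurement is the raw (x, y) from camshift.
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

def smooth(cx, cy):
    """Feed the raw centroid in, get the filtered estimate back."""
    kf.predict()
    est = kf.correct(np.array([[cx], [cy]], np.float32))
    return float(est[0, 0]), float(est[1, 0])
```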

Rewrite of fiducial detector

It's the last thing I want to do - I've roughed out code for most of the UI elements, the plumbing for the back-end works (although you can hear it rattle in places and there is considerable scope for improvement), but the marker detection code just isn't up to the job and is getting a rewrite to use camshift and a Kalman filter. I tried the Kalman on the current code and it's effective in smoothing the jitter caused by variations in centroid position, but the continual loss of the marker, and the extreme threshold values I am having to use to sense when the markers are engaged/unengaged, are making it a frustrating experience. I MUST come up with something working by Monday so that I can do something with this, and I was hoping to be tweaking various parameters of the interaction today, but instead I'm going right back to stage one. Very frustrating, but I ran a few experiments with the camshift algorithm and feel it's required to make the air-writing implementation flow smoothly. All-nighter it lo…
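A simple hysteresis on the centroid distance is one way to tame that kind of engage/unengage flicker: engage below one threshold, release only above a larger one, so jitter near the boundary doesn't toggle the state. A rough sketch (the ratios are illustrative, not the prototype's tuned values):

```python
def update_engaged(engaged, dist, scale, engage_ratio=1.0, release_ratio=1.4):
    """Hysteresis for the pinch state: the release threshold is deliberately
    looser than the engage threshold so centroid jitter near the boundary
    doesn't make the engaged/unengaged state flicker."""
    if not engaged and dist < engage_ratio * scale:
        return True
    if engaged and dist > release_ratio * scale:
        return False
    return engaged
```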

Drag Drop - a gesture design dilemma!

So I've run into an interesting interaction design problem. I've implemented some very basic list interface elements and initially supported the scrolling interaction via dwell regions. I'm unhappy with this for a number of reasons: 1) Dwell regions are not obvious to a user, since there is no visual feedback as to their presence. While I can provide feedback, there are times where I may choose not to do so (e.g. where the dwell region overlaps with the list). 2) Dwell regions, when combined with other UI elements, can hinder the user's interaction - e.g. if a user wishes to select an item that is within the dwell region, and the dwell region initiates the scrolling behaviour, causing the user's selected item to move. 3) The interaction is very basic and I don't really want to implement any more support for these. The obvious alternative to a dwell region, though, is drag and drop (or in the case of OS6Sense, pinch and unpinch); however, since these are gestures, there's…
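For context, a dwell region is little more than a timer attached to a rectangle; a minimal sketch (the field names and the 0.8 s dwell time are illustrative, not the prototype's values):

```python
import time

class DwellRegion:
    """Fires its action when the cursor has stayed inside the rectangle
    for `dwell` seconds - and illustrates the problems above: the region
    is invisible unless drawn, and it fires even when the user only meant
    to select an item that happens to sit inside it."""

    def __init__(self, rect, action, dwell=0.8):
        self.rect = rect          # (x, y, w, h)
        self.action = action      # e.g. start scrolling the list
        self.dwell = dwell
        self.entered = None

    def update(self, cx, cy):
        x, y, w, h = self.rect
        inside = x <= cx <= x + w and y <= cy <= y + h
        if not inside:
            self.entered = None
            return
        if self.entered is None:
            self.entered = time.time()
        elif time.time() - self.entered >= self.dwell:
            self.action()
            self.entered = None   # re-arm after firing
```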

Another couple of observations

Schwaller, in his reflections, noted that developing an easy way to calibrate the marker tracking was important. I've observed that, for application development, providing alternate input methods is equally important... quite a general usability principle of course, and it all harkens back to providing multiple redundant modalities, but... My framework is about 50% of the way there; I'm becoming VERY tempted to look at a native Android client, but on x86, since I have the horsepower to drive things. If I had more time I'd go for it but...

Some observations/questions

Study delayed, since I think I can make progress with the prototype and answer some of my questions while opening up new ones :/ I'm glad I know this sort of last-minute thing is quite common in research, or I might be panicking (3 months to go, omg!). I'm still having problems with marker tracking due to varying lighting conditions. At home, my "reliable" green marker doesn't like my bedroom but is great downstairs and in my office. Blue/red/yellow all tend to suffer from background noise. I may have to try pink! Basically, I know that colour-based segmentation and blob tracking is a quick and easy way of prototyping this, but real world? Terrible! If using dynamic gestures, what are the best symbols to use? In fact, is any semiotic system useful for gesture interaction? One could also ask whether symbolic gestures are really that useful for a wearable system... Where should a camera point? i.e. where should its focus be? I've found myself starting gestures sli…
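For reference, the quick-and-easy colour segmentation plus blob tracking described above boils down to something like this (OpenCV 4.x Python assumed; the green HSV range is an illustrative guess and, as noted, exactly the thing that falls apart when the lighting changes):

```python
import cv2
import numpy as np

# Illustrative HSV range for a green marker; in practice these bounds need
# re-tuning per room, which is precisely the problem described above.
LOWER = np.array([45, 80, 80], np.uint8)
UPPER = np.array([75, 255, 255], np.uint8)

def marker_centroid(frame_bgr):
    """Threshold in HSV, keep the largest blob, return its centroid."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None                        # marker lost this frame
    blob = max(contours, key=cv2.contourArea)
    m = cv2.moments(blob)
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]
```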

Another Quick Update

I've been very busy putting together a framework to support a number of small applications for implementation - the apps are intended to be nothing more than proof of concept and to explore some of the interaction issues, e.g. are dwell regions a better option than selectable areas (we're in eye-tracking territory now)? Can these be applied to navigation? How do we implement a mobile projected UI (terra incognita, I believe)? The framework is largely event/message driven, since that affords loose coupling and dynamic run-time binding for both messages and classes ~ if I wasn't farting around with abstraction of the services (useful in the longer term... services become both sinks and producers of events) it would probably come in at < 200 lines of code... The point being, while I'm not supposed to be writing code at this stage I am, and I hope to have at least a video by the end of the weekend (yes, a week late).
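To give a flavour of what event/message driven with loose coupling means here, a minimal publish/subscribe sketch (the method names and event names are illustrative, not the framework's actual API):

```python
from collections import defaultdict

class EventBus:
    """Minimal message bus: services subscribe to named events and publish
    them, so each service can act as both a sink and a producer without
    knowing about the others."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, **payload):
        for handler in self._handlers[event]:
            handler(**payload)

# Usage: the tracker service publishes marker updates, the UI subscribes.
# bus = EventBus()
# bus.subscribe("marker_moved", lambda x, y: cursor.move(x, y))
# bus.publish("marker_moved", x=120, y=340)
```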

VERY excited!

Back to the research today, and today is the day where I had set myself the serious goal of knuckling down and rewriting my introduction and literature review, because I am VERY unhappy with both. I'd finished up my study definition document for my exploratory study next week and was doing some research into social acceptability and gestures... when it hit me. Most of the research suggests that only discreet gestures are socially acceptable (thus SixthSense/Minority Report style interaction is unlikely to be accepted by users in many social situations), so I asked myself: 1) Why look at how users naturally perform the gestures? Good question... and to be honest, because I honestly don't KNOW what I will find out. I *think* I know, but there's a huge gulf there! 2) How do I make a discreet gesture-based system? And I had also been asking myself: 3) How do I expand the number of states that I can represent using my current implementation? And it hit me like an express train…