My point was that using the word 'intuitive' isn't right here. If it were actually 'intuitive' then it wouldn't need to be explained or the explanation would be very common sense. A mouse is intuitive. The explanation is something like "moving this mouse attached to the computer moves this pointer on the screen. When the pointer on the screen is over something you want to interact with, click the button on the mouse". Even small children get that easily.
A set of gestures, although they can be learned and some may make more sense than others, is not 'intuitive'.
Well, "pinching" and "stretching", IMHO, are incredibly intuitive - far more so than sliding a zoom slider on the screen using the "intuitive" mouse.
In other words, yes, mice are "intuitive". However, they are used to manipulate on-screen artifacts whose effects are, in fact, quite unintuitive. Gestures are even more intuitive than the mouse (to make a gesture I simply move my fingers on the surface), and they have the distinct advantage of allowing for multiple action vectors (i.e., your fingers may all move in different directions, to the extent that human fingers and hands allow). This lets them manipulate on-screen objects in a much more intuitive manner.
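As a rough sketch of why pinch/stretch maps so directly to zooming: the whole gesture reduces to the ratio between the current and starting distance of two touch points. The function name and coordinate convention below are my own illustration, not any particular platform's API:

```python
import math

def pinch_scale(t1_start, t2_start, t1_now, t2_now):
    """Map a two-finger pinch/stretch to a zoom factor.

    Each argument is an (x, y) touch position. A result > 1 means the
    fingers moved apart (zoom in); < 1 means they pinched together
    (zoom out).
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    start = dist(t1_start, t2_start)
    now = dist(t1_now, t2_now)
    if start == 0:
        return 1.0  # degenerate case: both touches began at the same point
    return now / start

# Fingers start 100 px apart and end 200 px apart: 2x zoom.
pinch_scale((0, 0), (100, 0), (0, 0), (200, 0))
```

The point is that the user's physical motion and the on-screen effect share one obvious quantity (finger spread), which is exactly the kind of directness a mouse-driven zoom slider lacks.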
There are four obvious aspects to an intuitive interface:
1. It works analogously to something else the user is familiar with
2. It works the same way in a similar context (i.e., the "zoom" gesture in Excel is the same as the "zoom" gesture in Safari)
3. Commands are as distinct as their effects (i.e., similar gestures achieve similar effects, and there is generous distinction between any two gestures)
4. Effects are reinforced in multiple media (i.e., on-screen realtime display of what the user is doing on the gesture pad; perhaps dynamic texturization of the gesture pad itself could be used for additional feedback)
When all you have is "point", "poke", and "double poke", you fail a lot of those tests by default.