I think some of what Gates says about convergence is correct, but he still misses some points.
Right now we use different devices for different tasks because each specialised device is better and cheaper than a general device. Given sufficiently advanced technology, the technical barriers to convergence will fall, but will the social, psychological, and ergonomic reasons be addressed? Also, given that advanced technology, will the specialised devices still stay ahead within their own domains?
What are these domains? What tasks do users actually wish to do? Can they be divided into "dimensions" or aspects of domains? Let me try to list some dimensions, to explain:
- Content type
  - Text
  - Picture
  - Audio
  - Video
- Storage
  - Local
  - Networked
- Consumption vs Production
- Background vs Foreground (concentration, interaction)
- Private vs Shared
- Shape (portability, compactness)
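These dimensions can be written down as a small data model. This is only a rough sketch: the class names, field names, and the `shared_dimensions` helper are hypothetical, invented for illustration, and the two sample records use the values given for "Phoning someone" and "Listening to music" in the per-use-case breakdowns.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Content(Flag):
    TEXT = auto()
    PICTURE = auto()
    AUDIO = auto()
    VIDEO = auto()


class Storage(Flag):
    LOCAL = auto()
    NETWORKED = auto()


class Mode(Flag):
    CONSUMPTION = auto()
    PRODUCTION = auto()


class Audience(Flag):
    PRIVATE = auto()
    SHARED = auto()


@dataclass(frozen=True)
class UseCase:
    """One task, described along the dimensions listed above."""
    name: str
    content: Content
    storage: Storage
    mode: Mode
    foreground: bool   # False means a background activity
    audience: Audience
    shape: str         # free text; shape is too varied to enumerate here


# Dimensions compared by shared_dimensions (shape is left out on purpose).
COMPARED = ("content", "storage", "mode", "foreground", "audience")


def shared_dimensions(a: UseCase, b: UseCase) -> list[str]:
    """Return the names of the dimensions on which two use cases agree exactly."""
    return [d for d in COMPARED if getattr(a, d) == getattr(b, d)]


phoning = UseCase(
    name="Phoning someone",
    content=Content.AUDIO,
    storage=Storage.NETWORKED,
    mode=Mode.CONSUMPTION | Mode.PRODUCTION,
    foreground=True,
    audience=Audience.SHARED,
    shape="screen ~5 lines of text, as small as possible",
)

music = UseCase(
    name="Listening to music",
    content=Content.AUDIO,
    storage=Storage.LOCAL | Storage.NETWORKED,
    mode=Mode.CONSUMPTION,
    foreground=False,
    audience=Audience.PRIVATE | Audience.SHARED,
    shape="as small as possible, screen with 5+ lines",
)

print(shared_dimensions(phoning, music))
```

Comparing two use cases this way shows how little they may have in common beyond content type, which is the kind of mismatch that works against convergence.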
Let's give some use cases for technology, and then explore the relevant aspects/dimensions.
- Reading text (PDF, webpage, book)
- Writing documents (word processor, spreadsheet, presentation)
- Texting, emailing, blogging
- Text chatting
- Organising contact information
- Taking pictures
- Looking at pictures
- Listening to music
- Phoning someone
- Watching TV, movies
- Video conferencing
- Playing video games
Reading text
------------
- Content type = Text
- Storage = Local and Networked
- Consumption
- Foreground
- Private
- Shape: Screen as small as a book when mobile, or monitor size when stationary, or projector sized when sharing
Writing documents (word processor, spreadsheet, presentation)
-------------------------------------------------------------
- Content type = Text, Picture
- Storage = Local
- Production
- Foreground
- Private
- Shape: Screen monitor sized, full sized keyboard and mouse preferred, but can handle mobile sizes
Texting, emailing, blogging
---------------------------
- Content type = Text
- Storage = Networked
- Consumption and Production
- Foreground
- Shared
- Shape: Need adequate text entry, better than a cellphone, but can settle for BlackBerry size. Screen should show 4+ lines of text
Text chatting
-------------
- Content type = Text
- Storage = Networked
- Consumption and Production
- Foreground: Intermittent, since typing is slow and people multitask
- Shared
- Shape: Same entry as texting, but need a larger screen to show the last several sent and received messages
Organising contact information
------------------------------
- Content type = Text, some Picture
- Storage = Local, and Networked for backing up and syncing
- Consumption and Production
- Foreground
- Private
- Shape: Same entry as texting, but need a screen somewhere in between the texting and text chatting sizes
Taking pictures
---------------
- Content type = Picture, Video
- Storage = Local, Networked to archive, print, and share
- Production
- Foreground
- Private and Shared
- Shape: The lens can vary in size for optical zoom. Screen around 2"x2"; don't want larger. CCD of 4+ megapixels for serious pictures, and less for fun pictures
Looking at pictures
-------------------
- Content type = Picture, Video
- Storage = Local
- Consumption
- Foreground
- Private and Shared
- Shape: Private viewing can vary from 2"x2" to 6"x4" to monitor size. Group viewing goes from book size to large television size
Listening to music
------------------
- Content type = Audio
- Storage = Local is preferred, Network for radio or streaming or subscription
- Consumption
- Background
- Private and Shared
- Shape: For private listening, the smaller the better. More songs require organisation, so a screen with 5+ lines. Shared listening requires larger speakers.
Phoning someone
---------------
- Content type = Audio
- Storage = Networked
- Consumption and Production
- Foreground
- Shared
- Shape: Screen ~5 lines of text. As small as possible, while fingers can still dial easily. Or use voice to dial.
Watching TV, movies
-------------------
- Content type = Video
- Storage = Local preferred, Network for streaming or subscription
- Consumption
- Foreground
- Private and Shared
- Shape: As large as possible is preferred, but travellers like a display from 6"x4" to 12"x8"
Video conferencing
------------------
- Content type = Video
- Storage = Networked
- Consumption and Production
- Foreground
- Shared
- Shape: Need screen from 4"x4" upwards
Playing video games
-------------------
- Content type = Video
- Storage = Local preferred, Network for streaming or subscription
- Consumption
- Foreground
- Private and Shared
- Shape: Need screen from 4"x4" upwards
So, what has the possibility of converging, based on form factor?
- Small screen of a few lines of text, up to 2"x2"
  - Having a keyboard
    - Texting, emailing, blogging [screen 4+ lines, mini kb]
    - Organising contact information [screen 6+ lines, mini kb]
    - Phoning someone [screen 5 lines, numerical keypad]
  - Not keyboard, but some buttons
    - Taking pictures [screen 2"x2"]
    - Looking at pictures [screen 2"x2" to 6"x4"]
    - Listening to music [screen 5+ lines, song selection buttons]
    - Video conferencing [screen 4"x4"+]
    - Playing video games [screen 4"x4"+, joystick buttons]
- Book sized screen (assume it folds down when in pocket)
  - Having a keyboard
    - Reading text (PDF, webpage, book) [book->projector]
    - Writing documents [monitor, full kb / notebook size]
    - Texting, emailing, blogging [screen 4+ lines, mini kb]
    - Text chatting [screen 10+ lines, mini kb]
    - Organising contact information [screen 6+ lines, mini kb]
  - Not keyboard, but some buttons
    - Reading text (PDF, webpage, book) [book->projector]
    - Looking at pictures [screen 2"x2" to 6"x4"]
    - Watching TV, movies [6"x4" to 12"x8" to larger]
    - Video conferencing [screen 4"x4"+]
    - Playing video games [screen 4"x4"+, joystick buttons]
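The grouping above can also be treated as data, which makes it easy to ask which use cases are tied to a single form factor. The following is a minimal sketch only: the `GROUPS` table transcribes the list above, while the key names ("small", "book", "keyboard", "buttons") and the helper function are hypothetical labels chosen for illustration.

```python
from collections import defaultdict

# (screen size, input style) -> use cases, transcribed from the list above.
GROUPS = {
    ("small", "keyboard"): [
        "Texting, emailing, blogging",
        "Organising contact information",
        "Phoning someone",
    ],
    ("small", "buttons"): [
        "Taking pictures",
        "Looking at pictures",
        "Listening to music",
        "Video conferencing",
        "Playing video games",
    ],
    ("book", "keyboard"): [
        "Reading text",
        "Writing documents",
        "Texting, emailing, blogging",
        "Text chatting",
        "Organising contact information",
    ],
    ("book", "buttons"): [
        "Reading text",
        "Looking at pictures",
        "Watching TV, movies",
        "Video conferencing",
        "Playing video games",
    ],
}


def form_factors_per_use_case(groups):
    """Invert the table: map each use case to the form factors that suit it."""
    index = defaultdict(list)
    for form_factor, use_cases in groups.items():
        for use_case in use_cases:
            index[use_case].append(form_factor)
    return dict(index)


index = form_factors_per_use_case(GROUPS)

# Use cases that appear under exactly one form factor.
single_form = sorted(name for name, forms in index.items() if len(forms) == 1)

for name in single_form:
    print(name, "->", index[name][0])
```

Use cases that show up under only one form factor are the ones most likely to pin a device to that shape; the rest are more flexible about which device they land on.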
Predictions (some are obvious, already happening)
- Look for phones to converge with PDAs (contact info, small text and graphics), and move further into BlackBerry-style texting/emailing
- Small (flash) music players that involve no user interaction will remain single purpose, to keep costs down
- Large capacity music players need a screen large enough for organising the music, which can be large enough for previewing pictures and playing video games. The different hand control requirements might limit this.
- I think that camera CCDs will be integrated right into music players/recorders, since both require large storage and have similar screen and input requirements
- Video will only take off if there are video inputs and outputs, to act as a PVR
>>> So, no, Mr Gates, music players won't converge with cell phones. Instead, expect a Phone+PDA communication device and an iPod+Camera audio and picture device
- How will book sized text entry (and viewing) devices work? Why have one when you could use your notebook or phone? I think they'll be niche
  - It would be a convergence of Tablet+PDA, which are both dead or dying
- Book sized viewers, on the other hand, may succeed.
  - Might replace books, especially if they can fold or roll to pocket size
  - Sony PSP?
  - Future of tablets? Tablets fail because they suck for text entry. So forget data entry, and make a cheap data viewer
  - Why won't these converge with phones, iPods? Phones and iPods want to be as small as possible. A viewer wants as large a screen as possible, while still being portable.
    - Unless video phoning takes off: then people will want a large screen when chatting, and it could all collapse into one.
Ok, so I don't want to have three devices on me: my Phone+PDA, iPod+Camera, and eBook+PSP. How will this work?
Well, when all folded up, all three would fit in one large pocket anyway.
Plus, with Bluetooth, you can keep them in a bag, and only hold one at a time.
So, why isn't this happening yet?
- Cell phone standards change rapidly, whereas audio, video and text consumption does not. So no one would want to couple phones to anything yet.
- Networks are not fast enough to send video, so video phones aren't here yet
- Camera phone CCDs are not at 4 or 5 megapixels yet
- We don't have flexible or foldable screens
- Costs are too high for some of the specialised devices, let alone converged devices. We have to wait for more market penetration of the specialised devices before even bothering with convergence.
- Cell phones still use those damn numerical keypads. They haven't clued in to using keyboards with one letter per key, which requires a folding phone
- Storage densities aren't large enough to fit a full movie on a small enough disk or chip. Maybe blue lasers will solve this. HD movies will push this further away.