16 December 2002
Paul Novarese is right to be sceptical about embedded speech recognition, but two points may work in its favour:
- Voice-to-SMS has a smaller speech recognition "problem space": people will typically use voice for SMS entry because they're, e.g., driving (or, yes, because it's easier for them). The kinds of message they'll send present a smaller set of recognition/lexical possibilities than voice-commanding the PC (which has diverse use cases such as dictating a letter or email, surfing the web, navigating the OS...)
- Speech recognition *is* getting better. Even though the typical PC-based system requires the same speak-and-refine cycle to learn how you say words (aka speaker-dependent speech recognition), some people find dictation-on-PC works really well. And speaker-independent services (where anyone can call in, and the speech rec usually sits on a server at the end of the phone line) are pretty competent these days: TellMe, BeVocal, SRC, Eckoh, etc.
And he hints at a potential problem: if you're embedding a speech rec engine in a device and performing neither over-the-air updates nor speaker-dependent learning, then it had better be really good.
See also: David Isen: SMS-with-voice-input as killer app
11:26:06 PM
|
|
While Googling is innocuous, it is not entirely reliable. For starters, people share names. You can't be sure if you're reading about one guy who has had a varied career (or who can't hold a job) or several people. And you can't be sure which one will be buying you a cocktail. (The surgeon? The sock collector?) Googling myself -- which sounds more perverse than it is -- turned up an architecture prof at McGill University, a video editor, a broker of sports tickets and a table-tennis champ; I'm now jealous of all of them.
Moral of the story: if you want good Google juice, don't have a common name, and don't share a name with someone famous. See also: Google dating.
11:25:02 PM
|
|
It's critical to note that if you get the same model phone with a different carrier, it may run a fundamentally different kind of code. The two major platforms are J2ME and BREW.
Colin Fahey's home test (namely: how he successfully created and downloaded his own "hello world" program to his cell phone, over the air, through his own WWW site, without paying for anything more than air time and data transfer) concludes that J2ME is free, open, developer-friendly (but less optimised for mobile phone performance); Qualcomm's BREW is closed, developer-hostile, and therefore market-hostile.
Related: Gamasutra's Comparing J2ME and BREW for wireless games, which had this last word:
Based purely on API features, BREW has more options, and the low-level C/C++ access to the phone's features is a more familiar environment to the seasoned game programmer. Additionally, Qualcomm continues to update BREW with new SDK and tool releases at a faster pace than J2ME spec revisions.
As a development environment, J2ME is far easier to get started with and much simpler to implement than BREW, but J2ME has been around far longer than BREW. This year will be critical for BREW - if Qualcomm can sign more carriers, produce more powerful handsets, and refine the toolset, it will be a major rival to Sun's early lead in mobile application development platforms.
Related2: Brew and Beyond (also at Gamasutra) makes some interesting points, which suggest that BREW is getting some traction and addressing some of the issues that make it developer-unfriendly compared with J2ME:
- Qualcomm will allow the use of the free GCC ARM compiler in the creation of BREW applications
- The domestic/overseas BREW markets are growing: Verizon, Alltel and U.S. Cellular will bring around 10m potential customers; a new deal with China Unicom another 55m; KDDI in Japan...
- ... but "Qualcomm's biggest hurdle has been getting into the critical European market. Because of Europe's early adoption of GSM and GPRS, there seems to be a major political barrier with getting Qualcomm, the creator of the competing CDMA standard, a foothold in the European market. [...] Considering the massive investment European carriers have made in GSM/GPRS and various flavors of 3G, not to mention the legal and bureaucratic obstacles to installing competing technologies in Europe, it is more likely that BREW will be coming to GSM/GPRS chipsets than CDMA coming to Europe."
11:22:11 PM
|
|
Speech recognition will always fail as an input mechanism for screen-based computers. Keyboard input (or handwriting) uses eye-hand co-ordination, which we have been developing for thousands of years. On the other hand, speaking to a screen is not natural. The form factor that works for speech recognition is a headset, where people get the input from earphones rather than from screens.
2:37:11 PM
|
|
© Copyright 2003 rodcorp.