14 August 2002
For Rodney Brooks, Oxygen is about making computers enter the human world rather than the other way round. For Victor Zue, it's about semantics and intent rather than syntax and form (e.g. understanding what was meant rather than merely transcribing what a speech interface hears).
TR: When an intelligent room gets crowded, how does the computer know who to pay attention to?
ZUE: We are trying to combine speech and vision in ways that they can complement each other. In a very noisy environment you invariably begin to pay attention to people in terms of their facial expressions. Lip reading can improve speech recognition performance. We might also be able to steer the microphone array toward the person whose mouth is moving.
It's a hugely ambitious project, and there's a long way to go: "sometimes people laugh, and the curtains open".
11:37:46 AM
Great article by Jonathan Eisenzopf on VoiceXML myths. We particularly liked these:
- Speech Recognition is 98% accurate: actually we've seen those levels of accuracy, but as JE says, you need to have limited grammars and directed interaction. Background noise gets addressed by using frequency-based or grammar-based barge-in rather than energy-based barge-in.
- VoiceXML is portable if I use the standard tags: "Wrong. Even though I wish it were so, I can't copy my applications from Tellme, to BeVocal, to Voxeo, to VoiceGenie and have it work without any changes. I'm not sure that I will EVER be able to because of the subtle differences in how vendors implement the standard." This relates to one of JE's other misconceptions - that VoiceXML gateways are all the same - and Rodcorp suspects that there's no real incentive for gateway vendors to go further than they have to date. "Supports VoiceXML" just means that the box is ticked. Maybe some of them would care to correct/comment? Eric?
- It's easy to write VoiceXML applications: it's too easy for web developers to see VoiceXML as a silver bullet: slap in some VoiceXML, and it's done. Wrong. Developers need coding, telephony, networking and UI skills.
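To make the "limited grammars and directed interaction" point concrete, here is a toy sketch in Python rather than VoiceXML. Everything in it is an assumption for illustration: the grammar phrases, the function names, and above all the use of plain string similarity as a stand-in for a real recognizer. It only shows why a narrow question with a handful of allowed answers is easier to get right than open-ended speech.

```python
from difflib import get_close_matches

# Hypothetical grammar: the only answers the prompt invites.
GRAMMAR = ["checking", "savings", "credit card"]

def match_in_grammar(heard, grammar=GRAMMAR, cutoff=0.6):
    """Return the grammar phrase closest to what was heard, or None (a no-match).

    String similarity stands in for acoustic matching here; a real
    recognizer scores audio against the grammar, not text.
    """
    hits = get_close_matches(heard.lower().strip(), grammar, n=1, cutoff=cutoff)
    return hits[0] if hits else None

def directed_prompt(utterances, max_tries=3):
    """Ask one constrained question and reprompt on a no-match.

    `utterances` simulates what the caller says on each attempt.
    Returns (matched phrase or None, number of attempts used).
    """
    tries = utterances[:max_tries]
    for attempt, heard in enumerate(tries, start=1):
        choice = match_in_grammar(heard)
        if choice is not None:
            return choice, attempt
    return None, len(tries)
```

With a three-phrase grammar a slightly garbled "savngs" still lands on "savings", while anything outside the grammar is rejected and the caller is asked again. The larger the grammar, the more near-misses compete with each other, which is roughly why the 98% figure only holds for tightly directed dialogs.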
9:24:25 AM
Booting up my collection of old Macs and external hard disks etc to get all the writing, art and other stuff on them onto CD, I noticed 3 interesting things:
- The kit all still works well, except for some SCSI niggles mounting the external drives. Macs do have a certain quality (Walter Benjamin's aura maybe?)
- The PowerMac 8100 (80MHz, circa 1994) boots up faster than my current PC laptop. (However, Elsie the LC (25MHz, 1993) doesn't.)
- I did some good work 1991-6!
9:22:02 AM
Dave, John, the Userland team,
I started using Radio's news aggregator today and... what a tool! Until today I understood XML/RSS as "a good thing" but only in a vague, abstract way. But today I'm actually seeing how powerful it is - and using it. Thank you.
12:29:44 AM
© Copyright 2003 rodcorp.