e "Super Human Speech Recognition Initiative," IBM's push aims to create new technology that supports what IBM Voice Systems Director Nigel Beck calls "conversational computing."
The Super Human Speech Recognition Initiative's ultimate goal is to create technology that performs better than humans for any transcription task, without the need for customization. It seeks accurate transcription of everything from voice mail to meetings and customer service calls -- with full audio (and possibly video) searching capabilities. Along the way, the company plans a number of milestones that it expects will have wide-ranging applications in everything from data mining in call centers to interpersonal communication to biometrics....
'The state of the speech world is roughly where the state of the Web world was six years ago,' Beck said....
To that end, working with partners Motorola and Opera, IBM has submitted a specification for Multimodal Access to the World Wide Web Consortium (W3C). The specification, XHTML+VoiceXML, would allow users to access data on devices through multiple modes of interaction.
'Multimodal is the mixing of voice and data,' Beck said. 'People operate in multiple modes at once....'
The company has also put together a number of prototypes to display its ideas.
One, Meeting Miner, is an agent used during meetings to passively capture and analyze meeting discussion. It also has the capability of becoming an active participant in the meeting when it finds information it determines to be pertinent to the discussion. Meeting Miner uses the audio streams from one or more microphones to capture the speech during the meeting and converts it to a text transcript." [allNetDevices Wireless News]
[The Shifted Librarian]
10:23:29 AM