How does Audio Blogger work?

Jeremy Allaire's Radio
An exploration of media, communications and applications over the Internet.

Audio Blogger was a very simple application to build, using ColdFusion MX, Flash MX, Dreamweaver MX, Flash Communication Server MX and Flash Remoting. This short article provides a high-level overview of how the utility was built and how it operates.

What does it do?

Audio Blogger allows anyone to easily create a piece of original voice content and then embed that content in any web page or weblog. The voice content can be of any length, and it will automatically stream to end-users. The utility allows anyone to create an account, create audio blog items, and review how many people have listened to the items.

Why is it interesting?

In the past, enabling end-users to create and publish original audio content to the Web was time-consuming and required technical proficiency with sound-recording programs, encoding tools, file management tools, web servers, and so on. With the advent of Flash Player 6 and the Flash Communication Server MX, it's now relatively easy to allow end-users to capture and publish audio content directly to a server as a recording or live broadcast. Just as the transparent and ubiquitous web browser enabled the easy creation and publishing of document content with HTML, this new platform enables the creation and publishing of richer forms of communication that involve audio, video, graphics and data.

How does it run?

The "runtime" for Audio Blogger consists of a few pieces:

Flash Player 6. Blog item creation, including capturing and publishing audio, is handled in a Flash-based application interface. Once an audio blog item has been created, it can be embedded in any web page with a generic SWF (Flash) file that provides the basic mechanism for playback.
ColdFusion MX. The user and message management is handled by ColdFusion MX.
Flash Remoting MX. Communication between the Flash client application and ColdFusion-based web services is handled with Flash Remoting.
Flash Communication Server MX. Audio capture, storage and streaming playback are handled by FlashCom.

How is the front-end put together?

There are two client-side pieces: the "authoring" or audio capture application, and the playback mechanism.

Authoring

The "authoring" utility is a Flash MX-based application built using a variety of pre-built components as well as a range of built-in APIs provided by the Flash runtime. The primary UI components used to create the interface include:

Push Button component
Scrollbar component
DataGrid component
ToolTip component

The primary client-side APIs used to create the authoring experience include:

NetServices APIs (Flash Remoting)
NetConnection (for real-time connections to FlashCom)
Microphone API (for capture and encoding of microphone data)
NetStream API (for client-to-server and server-to-client audio streaming and storage)

Playback

The playback SWF file is very small (13k) and very simple. It takes URL arguments to determine which audio blog entry to load. When the file loads, it initializes a connection to the FlashCom server, preparing for streaming. When the user selects play, it uses Flash Remoting to add a log entry to the messages database recording that the item was listened to, and then streams the recorded file to the player.
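
To make the order of those steps concrete, here is a minimal sketch in Python rather than the ActionScript the real player is written in. Everything in it is an assumption made for illustration: the `id` URL parameter, the example URL, and the `PlaybackClient` class are hypothetical, and the "connection", "log" and "stream" steps are only recorded in a list rather than performed against a real server.

```python
from urllib.parse import urlparse, parse_qs


def entry_id_from_url(url, param="id"):
    """Return the audio blog entry id carried in the player's URL arguments.

    The parameter name "id" is a hypothetical choice for this sketch.
    """
    values = parse_qs(urlparse(url).query).get(param)
    if not values:
        raise ValueError("missing entry id in URL: " + url)
    return values[0]


class PlaybackClient:
    """Toy stand-in for the playback file; it records each step instead of doing I/O."""

    def __init__(self, url):
        self.entry_id = entry_id_from_url(url)
        self.steps = []
        # On load: open the streaming connection before the user presses play.
        self.steps.append("connect")

    def play(self):
        # On play: log the listen first, then start streaming the recording.
        self.steps.append("log_listen:" + self.entry_id)
        self.steps.append("stream:" + self.entry_id)
        return self.steps


client = PlaybackClient("http://example.com/player.swf?id=42")
print(client.play())
```

The point of the sketch is the sequencing: the connection is opened up front so that playback can begin quickly, and the listen is logged before the stream starts.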

What is the back-end?

There are two primary pieces to the back-end: the audio streaming and storage provided by FlashCom and the message and user management provided by CFMX.

FlashCom

Interestingly, the Flash Communication Server portion involves no server-side code. FlashCom is used to capture, store and stream audio files. This is a very limited use of FlashCom, which provides a much broader, more general-purpose infrastructure for communications applications. In any case, everything needed to connect to the server, capture and publish audio, record it to the server and eventually play it back is handled using the NetConnection and NetStream APIs that are built into the Flash Player.

For those not familiar with how this works, here goes. Flash Player 6 includes a built-in voice-centric audio codec for both encoding and decoding. It also includes an API for using local, PC-based microphones. In an application like this, the player uses the local microphone to capture voice data and encodes it in real time into a very efficient, compressed form. It then streams the data in real time to the FlashCom server, where it can be rebroadcast (subscribed to) in real time, stored for later playback, or both. At a later point (in this case), a client connects to the FlashCom server and requests the recorded stream, which is then played back in real time.
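
The publish, rebroadcast and store-for-later behavior can be modeled with a toy in-memory server. This is a rough Python sketch, not FlashCom's actual implementation or API: `ToyStreamServer`, its method names and the stream name `entry42` are all hypothetical, and the byte chunks stand in for audio that the player has already encoded.

```python
class ToyStreamServer:
    """In-memory model of a server that relays live streams and can record them."""

    def __init__(self):
        self.recordings = {}   # stream name -> list of stored chunks
        self.subscribers = {}  # stream name -> callbacks for live listeners

    def subscribe(self, name, callback):
        """Register a live listener for a stream."""
        self.subscribers.setdefault(name, []).append(callback)

    def publish(self, name, chunk, record=True):
        """Accept one encoded chunk from the publishing client."""
        # Live path: hand the chunk to every current subscriber.
        for callback in self.subscribers.get(name, []):
            callback(chunk)
        # Stored path: keep the chunk so the stream can be replayed later.
        if record:
            self.recordings.setdefault(name, []).append(chunk)

    def play(self, name):
        """Return the stored chunks of a recorded stream, in publish order."""
        return list(self.recordings.get(name, []))


server = ToyStreamServer()
heard_live = []
server.subscribe("entry42", heard_live.append)

# The publishing client sends chunks as they are captured.
for chunk in (b"he", b"ll", b"o"):
    server.publish("entry42", chunk)

# A later client asks for the recording and gets the same chunks back.
assert heard_live == server.play("entry42") == [b"he", b"ll", b"o"]
```

Audio Blogger only uses the stored path, but the same published stream could serve live listeners too, which is the "or both" case described above.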

ColdFusion MX

In this case, ColdFusion provides two crucial functions: managing user account creation, authentication and access control for the "authoring" component, and managing a database of recorded audio items, including various metadata associated with those items. The entire server-side portion is provided by two components (CFCs) that are exposed as web services to Flash via Flash Remoting.
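
As an illustration of that split into two components, here is a rough Python sketch of what the two services might look like. It is not the actual CFC code: the class names, method names, fields and in-memory storage are all assumptions made for the example, and the password hashing is simply one reasonable choice rather than anything the original application is known to do.

```python
import hashlib
import hmac
import os


class UserService:
    """Hypothetical stand-in for the user-management component."""

    def __init__(self):
        self._users = {}  # username -> (salt, password hash)

    def _hash(self, password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def create_account(self, username, password):
        if username in self._users:
            raise ValueError("username already taken")
        salt = os.urandom(16)
        self._users[username] = (salt, self._hash(password, salt))

    def authenticate(self, username, password):
        if username not in self._users:
            return False
        salt, expected = self._users[username]
        return hmac.compare_digest(expected, self._hash(password, salt))


class MessageService:
    """Hypothetical stand-in for the audio-item (message) component."""

    def __init__(self):
        self._items = {}  # item id -> metadata for one recorded item
        self._next_id = 1

    def create_item(self, owner, title):
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = {"owner": owner, "title": title, "listens": 0}
        return item_id

    def log_listen(self, item_id):
        # Called by the player each time the item is played.
        self._items[item_id]["listens"] += 1

    def listen_count(self, item_id):
        return self._items[item_id]["listens"]


users = UserService()
users.create_account("jeremy", "s3cret")
messages = MessageService()
if users.authenticate("jeremy", "s3cret"):
    item = messages.create_item("jeremy", "First audio post")
    messages.log_listen(item)
    print(messages.listen_count(item))
```

In this sketch the authoring application would call both services, while the playback file only needs `log_listen`.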