Standardizing and modeling. Why not begin with the early voice blog best practices and evolve from there?
Take, for example, Adam's model, where he creates a voice file that relates directly to a post. At present the voice file has a one-to-one relationship with a post, but it could just as easily have been a one-to-many or even a many-to-one relationship.
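To make the relationship question concrete, here is a rough sketch in Python. It is not Adam's actual implementation; the names `VoiceFile`, `Post`, `posts_sharing` and the file names are all made up for illustration. The idea is only that if a post holds a list of voice files, the same model covers one-to-one, one post with many voice files, and many posts pointing at one voice file.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceFile:
    url: str  # hypothetical location of the audio file


@dataclass
class Post:
    title: str
    # A list covers both cases: a single entry is one-to-one,
    # several entries is one post with many voice files.
    voice_files: list[VoiceFile] = field(default_factory=list)


def posts_sharing(voice: VoiceFile, posts: list[Post]) -> list[Post]:
    """Return every post that links to the given voice file (many posts, one file)."""
    return [p for p in posts if voice in p.voice_files]


# Illustrative data only.
intro = VoiceFile("intro.mp3")
reply = VoiceFile("reply.mp3")

one_to_one = Post("First voice post", [intro])
one_to_many = Post("Post with two recordings", [intro, reply])

# Both posts reference intro.mp3, so the file is shared by many posts.
shared = posts_sharing(intro, [one_to_one, one_to_many])
```

Starting with a list even when there is only one voice file per post means the model would not have to change if the practice evolves later.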
Another practice is to just allow the voice file to flow with the text of the post, just like the clip art or photo images that are common today in many HTML pages.
In either case the audio icon plays an important role: it is the visual cue that identifies the presence of a voice file.
Is that enough to get us busy?
What do you think?