Steve Goldstein: Finding your voice

Steve Goldstein’s Amplifi Media works with media companies and podcasters to develop audio content strategies. Goldstein writes frequently at Blogstein, the Amplifi blog.


Two weeks ago, it was the bright lights of Las Vegas at CES (the Consumer Electronics Show), with rides, giveaways and manufacturers of everything from refrigerators to toilets to cars touting their smart speaker integrations.

One week later, I was far from the glitz of Vegas in Chattanooga, Tennessee, at a conference of several hundred Amazon Echo developers working to figure out what people will respond to and how they interact with smart speakers.

It is by no means certain that people are interested in a whole lot more than weather forecasts and music from their devices, but this group is aggressively going after it.

Some things observed:

Innovation abounds – Smart speakers are a nascent category exploding with experimentation. We saw the gamut, from interactive book adaptations aimed at children (who index as early users of smart speakers) to navigational skills, voice-based surveys and way more.

Brush your teeth, a lot, and listen to podcasts – Gimlet Media won Alexa Skill of the Year for “Chompers,” a two-minute daily feature that accompanies kids as they brush their teeth. Wilson Standish from Gimlet was on hand and also joined me and a few others for a session on smart speakers and podcasts. The pairing certainly has its early-inning challenges, but the opportunity is clear.

To your health – Health is a big voice category with plenty of developers on hand. Notable is the Mayo Clinic, which has spent several years developing an impressive first-aid skill. Dr. Sandhya Pruthi spoke to the group about the initiative, which takes the valuable resources of the clinic’s website and reinterprets them for voice with shorter, more directed answers to first-aid queries. There is little time to waste when someone is choking or has an odd reaction to a bee sting. A large part of the population, especially as the nation gets older, feels more comfortable with voice. Look for more innovation in this sector.

Discovery is on everyone’s mind – Similar to the challenge facing many podcasters, finding an audience/users for a skill/show can be elusive with 70,000 active skills worldwide and a still-confusing “enable” process. Megaphones matter.

Interactive voice games – Among the highlights of the conference was the appearance of Nolan Bushnell, the inventor of Atari and Chuck E. Cheese. He regaled the crowd in his Alexa-“triggered” light-up sneakers and used his keynote to announce his new venture, X2games, which features interactive voice games on the Alexa platform. St. Noire, a murder mystery with various clues, is the first game to market from Bushnell and his co-founder, Hollywood creative director Zai Ortiz.

Some skills have daunting menus – I’m thinking about those old endless phone tree menus (and not so old … I’m looking at you, every airline). There are many skills that offer too many selections and options and thus are difficult to follow. Many (most?) end up in the “I tried it once, but never again” bin. The business faces a great deal of skill abandonment, and this is certainly a part of that.

Complex multi-part requests – The mission for many is to go beyond simple data requests, such as the forecast, to more conversational engagement with voice devices. SoundHound VP/GM Katie McMahon and others talked about the advances made toward multi-part requests, such as asking for “Italian restaurants with a four-star rating, but excluding pizza shops.” Both Amazon and Google are marching quickly toward conversational exchanges with their devices. However, they don’t yet know that Tom Brady is playing in the Super Bowl (I asked), so we are not there just yet.

Searching for audio has been elusive – Israeli company Audioburst listens to and tracks audio from hundreds of radio stations and podcasts and serves up “relevant” audio curated by the user. Try their skill “news feed.” I had some trouble getting things going: I didn’t ask for weather, but it opened to a forecast and then a traffic report from a New York radio station. I see the vast potential, but so far it surfaces superfluous content.

The data is talking – Voicebot.AI chief Bret Kinsella kicked off the conference with a series of useful data points illustrating the different hierarchy of voice use in different environments. On smart speakers, for example, people ask questions, stream music and check the weather. On smartphones they ask questions, seek directions or call someone. Voice systems in the car are dominated by calling, asking for directions and sending texts.

Congratulations to Bradley Metrock, who dreamed up the conference and put together a varied agenda and an attendee list ranging from big companies to small entrepreneurs. It was a pleasure to be there and meet so many people aiming to build the “voicefirst” future.

Steve Goldstein