“HD” Audio for Voice

For nearly a decade I’ve been using Skype and it’s now the ‘hub’ of my voice communications, especially since (in 2005) Skype enabled “SkypeIn” which allowed a user to select a phone number, pay a one-time $60 fee, and have that number “point” to Skype. This is especially useful when a landline or mobile phone user calls — or when I forward my direct landline or Google Voice number to Skype — but it misses one huge benefit achieved with Skype-to-Skype calling: high fidelity audio.

The sampling rate of the plain old telephone system (POTS) is 8 kHz, and Skype usually samples at 16 kHz (depending on one’s bandwidth at the time). But that tells only a small part of the story, since Skype’s SILK codec (which can actually sample at 8, 12, 16 or 24 kHz, at a bitrate from 6 to 40 kbit/s) can scale up or down on the fly and give one the best possible call quality with the available resources.
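To put rough numbers on those sampling rates: a sampled signal can only represent frequencies up to half its sampling rate (the Nyquist limit). Here is a minimal Python sketch of that arithmetic. The function and variable names are my own for illustration, not anything from Skype or SILK, and the result is a theoretical ceiling rather than what a given call actually delivers.

```python
# Illustrative only: the names below are hypothetical, not part of any Skype/SILK API.

def nyquist_khz(sample_rate_khz: float) -> float:
    """Highest audio frequency (kHz) a given sampling rate can represent."""
    return sample_rate_khz / 2

# Sampling rates mentioned in the post: POTS at 8 kHz, SILK at 8, 12, 16 or 24 kHz.
SAMPLE_RATES_KHZ = [8, 12, 16, 24]

def bandwidth_table(rates_khz):
    """Map each sampling rate to its theoretical upper audio frequency."""
    return {rate: nyquist_khz(rate) for rate in rates_khz}

if __name__ == "__main__":
    for rate, limit in bandwidth_table(SAMPLE_RATES_KHZ).items():
        print(f"{rate} kHz sampling -> audio up to {limit} kHz")
```

So going from 8 kHz to 16 kHz sampling doubles the theoretical ceiling from 4 kHz to 8 kHz, which is where much of the extra crispness in consonants comes from.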

So imagine my delight to see this post at Skype today talking about an even newer codec called “Opus”, a “...totally open, royalty-free, highly versatile audio codec. Opus is unmatched for interactive speech and music transmission over the Internet, but also intended for storage and streaming applications. It is standardized by the Internet Engineering Task Force (IETF) as RFC 6716 which incorporated technology from Skype’s SILK codec and Xiph.Org’s CELT codec.”

Why should you care? 

When humans communicate, signal matters. Think about a conversation you were having with someone when a jet airplane went overhead, a leaf blower started up right behind you, or a huge garbage truck drove by. “Huh? What did you say?” is the refrain as we ask the person talking to repeat themselves. Too much noise stomps all over the audio ‘signal’ or volume coming from that other person.

Or you’re watching a movie in a dark theater. Imperceptibly the dialogue audio begins to drop and you lean forward, body tensing, as you strain to hear what is being said. Suddenly ARGGH!! a hand pops out and grabs the protagonist by the neck and the entire audience SCREAMS as their tension is released. It works because our bodies tense up as we try to stay connected to the thought stream of the person talking.

So now imagine you’re on one of those conference calls we all hate but find necessary. Some people are calling in on mobile phones, some on landlines, and some maybe on Skype (FreeConferenceCallHD allows all, for example) and the ambient noise on the conference call makes it really hard to hear. THAT is why better audio is key.

If this Opus codec is leveraged by developers, and all indications are it already is, we will see high fidelity audio services popping up all over the place: in mobile apps, in our web browsers, and in services that take advantage of it for conference calling.

FINALLY we’ll be able to stop saying, “Huh!?! What did you say? Can you repeat that?”


About Steve Borsch

Strategist. Learner. Idea Guy. Salesman. Connector of Dots. Friend. Husband & Dad. CEO. Janitor. More here.
