In the last few weeks I've been doing a lot of work on voice communications (and messaging / video / context):
- I attended Enterprise Connect in Orlando discussing collaboration, UCaaS, cPaaS, WebRTC and related themes
- I spoke at a private workshop, for a Tier-1 operator group's communications-service internal experts team
- I've helped a client advise a strategy around the new European eCall in-vehicle emergency-call standard
- I've been writing a report on VoLTE adoption and impact, for my Future of the Network research stream published by STL Partners / Telco 2.0 (Subscribe! Link here)
A common, over-arching, theme is starting to form for me. The future sources of value in voice are all about SPs / vendors asking the right questions when they design new services and solutions.
Historically, most value in voice communications has come from telephony (Sidenote: voice is 1000 applications/functions. Phone calls are merely one of these). And in particular, the revenue has stemmed from answering the following:
- Who is calling?
- Where are they?
- Who is being called?
- Where are they?
- How long did they speak for?
- Plus (sometimes):
  - When did they call?
  - What networks were they on?
  - Was the call high-quality? (drops, glitches etc)
  - Is it an emergency?
Clearly, the answers to these questions are worth a lot of money: many billions of dollars. But equally clearly, they don't seem to be enough to protect the industry from competition and substitution from other voice-comms providers, or alternative ways of conducting conversations and transactions. As a result, voice telephony services are (mostly) being bundled as flat-rate offers into data-led bundles for consumers, or perhaps per-month/per-seat fees for unified comms (or SIP trunks) for business.
In other words, current voice revenues are being delivered based on answering fewer questions than in the past. Unsurprisingly, this is not helping to defend the voice business.
The current "mainstream" telecoms industry seems to be focused only on adding a few more questions to the voice roster:
- Is it VoIP / VoLTE / VoWiFi? (Answer = sometimes, but "so what" for the customer?)
- Can we use it to drag through RCS? (Answer = No)
- How can we reduce the costs of implementation? (Answer = maybe NFV/cloud)
- Are there special versions for emergencies? (Answer = yes, eg MCPTT and eCall)
- Is there a role for CSPs in business UCaaS? (Answer = yes, but it's hard to differentiate against Microsoft, Cisco, RingCentral, Vonage and 100 others)
- What do we do about Amazon Echo? (Answer = "Errrrmmmm... chatbots?")
Fixed and cable operators are in a slightly better position - they have long had hybrid business models partnering with PBX/UC vendors for businesses and can monetise various solutions, especially where they bundle with enterprise connectivity. For fixed home telephony, most operators have long viewed basic calls as a commodity, and are either protected by regulators via line-rental and emergency-call requirements, or can outsource provision to third parties.
In my view, there are many other questions that can be asked and answered - and that is where the value lies for the future of voice communications. None are easy to achieve, but then they wouldn't be valuable if they were:
- Why is the call occurring? (To buy something, ask a question, catch up with a friend, arrange a meeting or 100 other underlying purposes)
- Where is the call being made and received (physically)? For instance indoors, in a noisy bar, on a beach with crashing waves, in a car, in a location with eavesdroppers?
- Is the communication embedded in an app, website or business process?
- Is the call part of an ongoing (multi-occasion) conversation or relationship?
- Is a "call" the right format, with interruptive ringing and no pre-announcement? Is a push-to-talk, one-way, "whisper mode", broadcast, team or other form more appropriate?
- Are both/all parties human, or is a machine involved as well?
- What device(s) are being used? (eg headset, car, wearable, TV, Echo, whiteboard?)
- Who gets to record the call, and own/delete/transcribe the recording?
- Are the call records secure, and can they be tampered with?
- What's the most effective style of the call? (Business-like, genial, brusque, get-to-the-point-quickly etc)
- What languages and accents are being spoken? Can these be adjusted for better understanding? What about background noise - is that helpful or hindering?
- Can the call add/drop other parties? Are these pre-arranged, or can they be suggested by the system in context?
- Are the participants displaying emotion? (Happiness, anger, eagerness, impatience, boredom etc.) How can this be measured, and if necessary, managed?
- Is there a role for ultrasound and/or data-over-sound signalling before or during the call?
- How can the call be better scheduled / postponed / rescheduled?
- Is a normal phone number the best "identifier"? What about a different number, or a social / enterprise / gaming / secure identity?
- Are there multiple networks involved/available for connection, or just one? What happens when there are multiple choices of access or transit providers? What happens where the last 10m is over WiFi or Bluetooth beyond the SP's visibility?
- Is encryption needed? Whose?
- What solutions are needed to meet the needs of specific vertical-markets or other user groups? (Banking, healthcare, hospitality, gaming etc)
- What are the desired/undesired psychological effects of the communications event? How can the user interface and experience be improved?
- Did the call meet the underlying objectives of all parties? How could a similar call be improved the next time?
- How do we track, monetise and bill any of this?
I'm seeing various answers to some of these questions - for example, contact-centre solutions seem to be most advanced on some of the emotional analysis, language-detection and other aspects. There are some interesting human-driven psychology considerations being built into new codec designs like EVS (eg uncomfortable silences between words). MVNOs and cPaaS players are doing cool things to "program" telephony for different applications and devices. The notion of "hypervoice" was a good start, but hasn't had the traction it deserved. Machine-learning is being applied to help answer some of these questions - most obviously with Alexa/Siri/Assistant voice products, but also behind the scenes in some UC and contact-centre applications.
But we still lack any consistent recognition that voice is "more than calls". 99% of effort still seems to go on "person A calls person B for X minutes". Very little is being done around intention and purpose - ask a CSP "Why do people make phone calls?" and most can't give a list of the top-10 uses for a "minute". Most people still use "voice" and "telephony" synonymously - a sure-fire indicator they don't understand the depth of possibility here. And we still get hung up on replacing voice with video (they have a Venn overlap, but most uses are still voice-centric or video-centric).
Until both the telco and traditional enterprise solutions marketplaces expand their views of voice (and entrench that vision among employees, vendors and partners), we should continue to expect Internet- and IoT-based innovators to accelerate past the humble, 140yr-old phone call. Start asking the right questions, and look for ways to provide answers.
4 comments:
Another great post Dean, thank you!
It reminded me of a previous post of yours where you argued that VoLTE & WiFi calling were just excuses by the telcos to avoid real voice innovation.
I'm curious where in the "stack" you see the answers to these questions being implemented?
E.g. by the operators, by new 3rd-party services/platforms built specially to answer these questions, by companies like Twilio/Plivo/CallRail, etc.?
Thanks!
I see a mix of approaches. Quite a lot is going into the cPaaS layer (so Twilio et al) although at least in theory that could be done by larger telcos, at least for on-net connections.
Another chunk is going into enterprise systems, especially contact centres (traditional and/or cloud), which are coming both from the UC/UCaaS side and (especially with Amazon & MS) from the cloud domain as well.
There are point innovators offering APIs for contextual analysis, speech/tone/emotion analysis and data-over-sound, or sometimes those are built into other applications. Very little in telco-land though.
Telcos have a number of standalone apps (eg Telefonica TUGo, DT immmr) which have a variety of differentiators. One good idea in telco-land is pre- and post-call information, but it's unfortunately been wasted by constraining it to an RCS use-case, rather than being looked at & implemented more broadly.
> the revenue has stemmed from answering the following
Then follows a list of billing parameters.
Revenue stemmed from one thing: simple utility. The rentier profits from a path- and technology-dependent monopoly.
So the list of questions you describe is irrelevant to revenue; it is merely the set of measurable and enforceable parameters that the billing system could use within the envelope of acceptability to consumers. Long distance must cost more; not so much these days on the Internet, though my traffic was once metered by national and international destinations.
Then a much longer list of things that might affect how people communicate by voice and the context of such exchanges. Consider each of those variables (and their answers) as dimensions and you'll quickly recognise that no one entity can cope with the complexity. Even individual users with a single set of desires (and even those vary over time and mood) have difficulty resolving all the options to a satisficing outcome. That telcos might be able to do so in a cost-effective manner is risible.
Sure all of those things are important at some time to someone, but who measures, or can, "Did the call meet the underlying objectives of all parties?" Who even knows with any certainty what the objectives are?
Telephony, once all that telecommunications in the main was, succeeded due to its monopoly and simplicity. It was dial-tone and some routing (from the customer perspective) and we used this plain product to meet our ends. We continue to do so with the much more flexible commodity service of IP transit. Sadly for the incumbents, without a monopoly they lose share to competition. And they'll continue to do so.
Every attempt at complex and sophisticated solutions to consumer needs has failed miserably, because that list of questions you propose is too long, too complex and, in the final analysis, none of a carrier's business to know, let alone solve optimally for an increasingly diverse audience.
The future of carriage is Einstein's advice, as simple as possible, but no simpler.