
Thursday, December 27, 2012

WebRTC is the new battleground for peer-to-peer vs. server-based models for communications

I'm doing a really deep dive into WebRTC technology and business models at the moment. My view is that it's going to be a huge trend during 2013, and will be subject to the highest levels of hype, hope, marketing, debunking, politics, ignorance and misinformation. I'm not predicting it will take over the world (yet) - but I certainly am predicting that it's going to be a major disruptor.

**NEW Feb 2013 - Disruptive Analysis WebRTC report - details here** 

It's a fast-moving and multi-layer landscape encompassing telcos, network suppliers, device vendors, Internet players, software developers, chip vendors, industry bodies, enterprise communications specialists and probably regulators. Because my research and analysis "beat" covers all of those, I'm hoping to be the best-placed analyst and strategy consultant to decode the various threads and tease out predictions, opportunities, threats and variables.

One of the most interesting aspects is the linkage between intricate technology issues, and the ultimate winners and losers from a business point of view. Just projecting based on the "surface detail" from PR announcements and vendor slide-decks misses what's going on beneath. I'm finding myself going ever deeper down the rabbit-hole, before I can return and emerge with a synthesised and sanitised picture of possibilities, probabilities and impossibilities. That's not to say that there aren't also a set of top-down commercial and end-user trends driving WebRTC as well - but that's for another day.

A cursory glance at the WebRTC landscape reveals a number of technical battlegrounds - or at least foci of debate:

  • Codec choices, especially VP8 vs. H.264 for video
  • Current draft WebRTC vs. Microsoft's proposed CU-RTC-Web vs. whatever Apple has up its sleeve
  • The role of WebSockets, PeerConnection, SPDY and assorted other protocols for creating realtime-suitable browser or application connections
  • What signalling protocols will get adopted along with WebRTC - SIP, XMPP and so on
  • What does WebRTC offer that Flash, Silverlight and other platforms don't?
  • What bits of all this does each major browser support, when, and how? How and when are browsers updated?

While a lot of these seem remote and abstruse, there is another (mostly unspoken) layer of debate here:

Is WebRTC mostly about browser-to-browser use cases? Or is it aimed more at browser-server/gateway applications?

That is the secret question, the chicken-and-egg at the heart of all this. Some of the technical debates above tend to favour one set of use cases over the other - perhaps by making things easier for developers, or introducing the role of third parties who operate the middle-boxes and monetise them as "services". Because of this, it is also the hidden impetus behind the proposals and political machinations of various vendors and service providers. Other, less "Machiavellian" players are going to find themselves in the role of passengers on the WebRTC train, their prospects enhanced or damaged by these external factors beyond their control.
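The distinction can be sketched as a toy model (plain JavaScript objects, not the real WebRTC API): even in the "peer-to-peer" pattern, a rendezvous server is needed to exchange the initial signalling, but once that handshake completes, data can flow directly between peers with no middle-box on the path - which is exactly where the service-provider role thins out.

```javascript
// A minimal signalling relay: its only job is to pass offer/answer
// messages between registered peers. It never touches the "media".
class SignallingRelay {
  constructor() { this.peers = {}; }
  register(name, peer) { this.peers[name] = peer; }
  forward(from, to, message) { this.peers[to].onSignal(from, message); }
}

class Peer {
  constructor(name, relay) {
    this.name = name;
    this.relay = relay;
    this.remote = null;
    this.inbox = [];
    relay.register(name, this);
  }
  call(remoteName) {
    this.relay.forward(this.name, remoteName, { type: 'offer' });
  }
  onSignal(from, message) {
    if (message.type === 'offer') {
      this.relay.forward(this.name, from, { type: 'answer' });
      this.remote = from;
    } else if (message.type === 'answer') {
      this.remote = from;
    }
  }
  // Once signalling is done, "media" goes peer-to-peer: the relay
  // is no longer on the path at all.
  sendDirect(peer, data) { peer.inbox.push(data); }
}

const relay = new SignallingRelay();
const alice = new Peer('alice', relay);
const bob = new Peer('bob', relay);
alice.call('bob');           // offer/answer travels via the relay
alice.sendDirect(bob, 'hi'); // direct, no server involved
console.log(alice.remote, bob.remote, bob.inbox[0]); // bob alice hi
```

In the real API, the relay corresponds to whatever signalling channel the application chooses (often a WebSocket server), and the direct leg corresponds to the RTCPeerConnection media/data path.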

Let's take an example. Cisco and Ericsson are both fans of H.264 being made a mandatory video codec for WebRTC. Now there are some good objective reasons for this - it is widespread on the Internet and on mobile devices and it is acknowledged as being of good quality and bandwidth-efficiency. But.... and this is the pivot point... it is not open-source, but instead incurs royalty payments for any application with more than 100K users. Conversely, Google's preferred VP8 is royalty-free but has limited support today - especially in terms of hardware acceleration on mobile devices. Maybe in future we'll see VP8-capable chipsets, but for now it has to be done in software, at considerable cost in terms of power use.
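The codec battle surfaces concretely in the SDP offer that a browser generates during session setup. As an illustration (the SDP fragment below is a hypothetical sample, but the `m=video` / `a=rtpmap` line format is standard SDP), one can scan an offer to see which video codecs are being proposed:

```javascript
// Hypothetical fragment of an SDP offer proposing both contested codecs.
const sampleSdp = [
  'm=video 9 UDP/TLS/RTP/SAVPF 100 101',
  'a=rtpmap:100 VP8/90000',
  'a=rtpmap:101 H264/90000',
].join('\r\n');

// Collect the codec names from the a=rtpmap attribute lines.
function videoCodecs(sdp) {
  const codecs = [];
  for (const line of sdp.split('\r\n')) {
    const match = line.match(/^a=rtpmap:\d+ ([A-Za-z0-9-]+)\//);
    if (match) codecs.push(match[1]);
  }
  return codecs;
}

console.log(videoCodecs(sampleSdp)); // [ 'VP8', 'H264' ]
```

Whichever codec is made mandatory-to-implement determines what every endpoint must be able to answer with - and, per the argument above, who ends up paying for it.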

On the face of it, Cisco and Ericsson are behaving entirely rationally and objectively here. A widely adopted, hardware-embedded codec is clearly a good basis for WebRTC. But.... by choosing one with a royalty element, they are also swaying the market towards use-cases that have business models associated with them; especially ones that are based on "services" rather than "functions", as someone, somewhere, will need to pay the H.264 licence. (Ericsson is one of the MPEG-LA patent holders for it, too). That works against "free-to-air" WebRTC applications that work purely in a browser-to-browser or peer-to-peer fashion. I guess that it could just push the licensing cost onto the browser providers, ie Google, Mozilla etc, but that doesn't help non-browser in-app implementations of WebRTC APIs.

But looking more broadly at all the battles above, I see a "meta-battle" which perhaps hasn't even been identified, and which also links to things like WebSockets (which is a browser-server protocol) and PeerConnection (browser-browser) as well as the role of SIP (very server/gateway-centric).
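Part of what makes this meta-battle interesting is that, from a developer's viewpoint, the two transports are partly substitutable: both WebSockets and DataChannels expose roughly the same `send()`/`onmessage` surface, so application code can be written against a common shape and run over either a browser-server or a browser-browser pipe. A sketch (the `FakeTransport` below is a stand-in; in a browser you would pass a real WebSocket or RTCDataChannel object):

```javascript
// Stand-in transport with the shared send()/onmessage surface.
// It loops each message straight back, in place of a real network.
class FakeTransport {
  constructor() { this.onmessage = null; }
  send(data) {
    if (this.onmessage) this.onmessage({ data });
  }
}

// App-level code written against the common shape: it does not care
// whether the transport underneath is server-mediated or peer-to-peer.
function makeChatChannel(transport, onText) {
  transport.onmessage = (event) => onText(event.data);
  return { say: (text) => transport.send(text) };
}

const received = [];
const chat = makeChatChannel(new FakeTransport(), (t) => received.push(t));
chat.say('hello');
console.log(received); // [ 'hello' ]
```

That interchangeability is precisely why the choice between the two patterns is a business question rather than a purely technical one.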

In a browser-to-browser communications scenario, there is very little role for communications service providers, or those vendors who provide complex and expensive boxes for them. Yes, there is a need for addressing and assorted capabilities for dealing with IP and security complexities like firewalls and NATs, but the actual "business logic" of the comms capability gets absorbed into the browser, rather than a server or gateway. It's a bit like having the Internet equivalent of a pair of walkie-talkies - once you've got them, there's no recurring service element tied to "sessions". Only with WebRTC, they'd be "virtual" walkie-talkies blended into apps and web-pages.

Now, the server-side specialists have other considerations here too. Firstly, they have existing clients - telcos - that would like to inter-work with all the various end-points that support WebRTC. Those organisations want to re-use, extend and entrench their existing service models, especially telephony and SIP/IMS-based platforms and offerings. Various intermediaries such as Voxeo, Twilio and others are helping developers target and extend the reach of those services via APIs, as discussed in my last post. Some vendors like SBC suppliers are perhaps a bit less exposed than those more focused on switching and application servers.

There is also the enterprise sector here, which would clearly like to see its call-centres and websites connect to end-users via whatever channel makes most sense. WebRTC offers all sorts of possibilities for voice, video and data interaction with customers and suppliers. They'd also (generally) prefer to reduce their reliance on expensive services-based business models in the middle, but they're a bit more pragmatic if the costs become low enough to be ignored in the wider scheme of things.

Now all of this looks like a big Venn diagram. There are some use-cases for which servers and gateways are absolutely essential - for example, calling from a browser to a normal phone. Equally, there are others for which P2P makes a huge amount of sense, especially where lowest-latency connections (and maximum security/privacy) are desirable. It's the bit in the middle that is the prize - how exactly do we do video-calling, or realtime gaming, or TV-hyper-karaoke, or a million other possible new & wonderful applications? Are they enabled by communications services? Or are they just functions of a browser or web-page? We don't have a special service provider to enable italic words online, so why do we need one for spoken words or moving visuals?

This isn't the only example of a P2P vs. P2Server battle - obviously the music industry knows this, as well as (historically) Skype. But it goes further, for example in local wireless connectivity (Bluetooth or WiFi Direct, vs. service-based hotspots or Qualcomm's proposed LTE-Direct). The Internet itself tends to reduce the role of service providers, although the line dividing them from content/application providers is much more blurry.

It would be wrong to classify Google as being purely objective here either. Despite high-profile moves like Google Voice, Gmail and Chat, I think that its dirty secret is that it doesn't actually want to control or monetise communications per se. I suspect it sees a trillion-dollar market in telecoms services such as phone calls and SMS's that could - eventually - be dissipated to near-zero and those sums diverted into alternate businesses in cloud infrastructure, advertising and other services.

I suspect Google believes (as do I) that a lot of communications will eventually move "into" applications and contexts. You'll speak to a taxi driver from the taxi app, send messages inside social networks, or conclude business deals inside a collaboration service. You'll do interviews "inside" LinkedIn, message/speak to possible partners inside a dating app etc. If your friend wants to meet you at the pub, you'll send the message inside a mapping widget showing where it is... and so on.

I think Google wants to monetise communications context rather than communications sessions, through advertising or other enabling/exploiting capabilities.

Even when abstracted via network APIs, conventional communications services pull through a lot of "baggage" (ie revenue and subscriber lock-in). They perpetuate the use of scarce (and costly) resources like E.164 phone numbers.

I also think that Microsoft and Apple are somewhere in the middle of this continuum, which is why they are procrastinating. They both have roles to play in both scenarios - and therefore, perhaps, are the kingmakers. Both are advocates on the specific issue of H.264 - Apple because of FaceTime, and Microsoft for reasons that seem unclear to me, as Skype is adopting VP8. More generally, Microsoft seems more server/network-centric, but is also wary of doing anything that allows the IE browser to fall further behind.

Either way, this contretemps is about more than just technology - it is, ultimately, rooted in the nature of WebRTC as a business. Specifically, it is about drawing the boundary between WebRTC services and WebRTC features.

I'm not making a judgement call here. This is not so much an iceberg analogy as a tectonic one. We've got a number of plates colliding. The action - the subduction zone - is occurring at a deep level. And over the next few years we're going to get some sudden movements that generate earthquakes and tsunamis.

(Amusingly, the first line on the tectonics web-page says "When two oceanic plates collide, the younger of the two plates, because it is less dense, will ride over the edge of the older plate" - perhaps a better analogy than I realised at first!)

Keep reading this blog in the coming days: I'm working on the first seismic map of the WebRTC world. Sign up for updates here and follow @disruptivedean on Twitter.



Ramsundar Kandasamy said...

Hi Dean,

Why do you see a meta-battle between WebSockets/PeerConnection/Sip?

When the PC-based DataChannel API is ready, we can forget WebSockets. In fact the JS API for DC looks (exactly) the same as that of WS.

In some of my WebRTC demos, I use WS to carry "meta-data" between endpoints. For such use cases, once the DC API is ready, WS will be considered deprecated. One server less in the setup :)

Steven Sokol said...

Not sure I understand your argument here. WebSocket is a browser-server communications technology while DataChannel will enable browser-browser data communications. Both have their place. You may be able to move some meta-data functions to the data channel, but the process of setting up a session between two browsers will still require some (albeit basic) signaling mechanism between the participants. WebSocket is arguably the best of several methods for handling this function.

Dean Bubley said...

Thanks - perhaps I should have been a little clearer what I meant by "meta-battle". I'm not expecting pitched fights between different factions - more using the alternatives (and their maturity) as symbolising the choice for WebRTC in general: is it primarily going to benefit SPs or not?

Ramsunder's comment "one server less" actually nails it - as well as being more convenient for developers, it also implies less opportunity for service providers, who (generally) interact with users via servers or gateways along the signalling or media path.

Steven - much the same for your comment as well. "Both have their place" - yes, but which will be more important overall? Whichever generates the most interesting use-cases, investment and value will (probably) drive the future direction of WebRTC as a whole.

Unknown said...

Dean – great synopsis, but I disagree with the title. Is there really still a P2P vs. server-based battle? That debate has been around since Napster and I think the general computing industry has come to some consensus on it. Each is valid depending on what you want to do.
I do not see what WebRTC adds to P2P, or takes away from server-based deployments, that would change the debate. From a provider/website perspective, P2P pushes processing costs to the user - which is good for the provider/website. But P2P also inherently limits control - which is usually not perceived as bad until you need the control to add capabilities or improve the experience. P2P media is usually preferred, unless you need to transcode media, do some other DSP processing on it, or need to be able to capture it for lawful interception purposes. P2P signaling works for simple call models and addressing requirements. When you get into the messy real-world of legacy requirements, multi-vendor implementations, disparate address spaces, and regulation, more server control is always needed. For example, P2P signaling made sense for Skype when it was small, but now that it is very large and people really care about it, they have moved with Microsoft into more of a server-based model.

Dean Bubley said...

Chad - yes, I agree that the more complicated the use-case (multi-vendor, transcoding etc), the more likely it is that servers and services will be involved.

And while I agree that "Each is valid depending on what you want to do", I'm thinking about the bit that's the overlap on the Venn diagram, because my sense with WebRTC is that it's larger than we anticipate.

With hardware, P2P voice is very niche and usually short-range - walkie-talkies, taxis' dispatch radios, CB and so forth. With apps, P2P addressability has got a bit broader. My belief is that WebRTC makes it even more viable & widespread - although at the same time it *also* creates a set of new server/gateway use-cases too.

But there's also a battle for attention, VC money & M&A exits, and political strength in standards creation.

I wouldn't underestimate the disruptive potential of P2P use-cases in WebRTC, although I suspect it will need an "AH-hah" moment before we really get what is possible. (Maybe "user-generated comms"?)

Meantime, it is important that others - vendors, telcos, enterprises - crack on with server/gateway-based approaches so they don't get lost in the noise when it happens.

There's also a ton of stuff around device/browser evolution which is going to be an indirect driver of all this. Currently working on that as part of my WebRTC report.