Tuesday, April 19, 2016

TelcoFuturism: Will AI & machine-learning kill the need for network QoS?

Following on from my introductory post about TelcoFuturism (link), this is a forward-looking "what if?" scenario. It arises from one impending technology intersection - the crossover between network policy-management, real-time applications (especially voice & video) and machine-learning/artificial intelligence (AI).

One of the biggest clichés in telecoms is that every new network technology allows the creation of special "quality of service" characteristics, that potentially enable new, revenue-generating, differentiated services. But while QoS and application-based traffic-engineering certainly is useful in some contexts - for example, managed IPTV on home broadband lines, or prioritisation of specific data on enterprise networks - its applicability to a wider audience remains unproven. 

In particular, end-to-end QoS on the public Internet, paid for by application or content providers and enforced by DPI and in-network policy engines, remains a fantasy. Not only does Net Neutrality legislation prohibit it in many cases, but the concept is an undesirable and unworkable fallacy to begin with.

App-specific QoS doesn't work technically on most shared networks (ask my colleague Martin Geddes, who'll enlighten you about the maths of contention-management). There's no way to coordinate it all the way from server to user access. While CDNs and maybe future mobile edge nodes might help a bit, that's only a mid-point, for certain applications. On mobile devices, the user is regularly using one of millions of 3rd-party WiFi access points, over which the app-provider has no control, and usually no knowledge. The billing and assurance systems aren't good enough to charge for QoS and ensure it was delivered as promised. Different apps behave differently on different devices and OSs, and there are no native APIs for developers to request network QoS anyway. And increasing use of end-to-end encryption makes it really hard to separate out the packets for each application, without a man-in-the-middle.

There's also another big problem: network quality and performance isn't just about throughput, packet-loss, latency or jitter. It's also about availability - is the network working at all? Or has someone cut a fibre, misconfigured a switch, or just not put radio coverage in the valley or tunnel or basement you're in? If you fall off 4G coverage back to 3G or 2G, no amount of clever policy-management is going to paper over the cracks. What's the point of five-nines reliability, if it only applies 70% of the time?

Another overlooked part of QoS management is security. Can DDoS overload the packet-scheduling so that even the "platinum-class" apps won't get through? Does the QoS/policy infrastructure change or expand the attack surface? Do the compromises needed to match encryption + QoS introduce new vulnerabilities? Put simply, is it worth tolerating occasionally-glitchy applications, in order to reduce the risks of "existential failure" from outages or hacks? 

There are plenty of other "gotchas" about the idea of paid QoS, especially on mobile. I discussed them in a report last year (link) about "non-neutral" business models, where I forecast that this concept would have a very low revenue opportunity.

There's another awkwardness, too: app developers generally don't care about network QoS enough to pay for more of it, especially at large-enough premiums to justify telcos' extra cost and pain of more infrastructure and IT (and lawyers).

While devs might want to measure network throughput or latency, the general tendency is to work around the limitations, not pay to fix them. That's partly because the possibility isn't there today, but also because they don't want to negotiate with 1000 carriers around the world with different pricing schemes and tax/regulatory environments (not to mention the 300 million WiFi owners mentioned earlier). Most would also balk at paying for networks' perceived failings, or possibly to offset rent-seeking or questionable de-prioritisation. Startups probably don't have the money, anyway.

Moreover - and to the core of this post - in most cases, it's better to use software techniques to "deal with" poor network quality, or avoid it. We already see a whole range of clever "adaptive" techniques employed, ranging from codecs that change their bit-rate and fidelity, through to forward error-correction, or pre-caching of data in advance if possible. A video call might drop back to voice-only, or even messaging as a fallback. Then there's a variety of ways of repairing damage, such as packet-loss concealment for VoIP. In some cases, the QoS-mitigation goes up to the UI layer of the app: "The person you're talking to has a poor connection - would you like to leave a voicemail instead?"
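
To make the adaptive idea concrete, here's a minimal Python sketch of the kind of logic a real-time media engine might apply - the mode names, bitrate ladder and thresholds are purely illustrative assumptions, not any particular product's behaviour: step the codec bitrate down as loss and latency worsen, and fall back from video to audio to messaging rather than asking the network for guarantees.

```python
# Illustrative sketch only: hypothetical thresholds for adapting a real-time
# session to measured network conditions, instead of requesting QoS.

BITRATE_LADDER_KBPS = [2000, 1000, 500, 250]   # video bitrates to try, best first

def choose_mode(packet_loss: float, rtt_ms: float) -> str:
    """Pick a session mode from observed network stats (invented thresholds)."""
    if packet_loss > 0.20 or rtt_ms > 800:
        return "messaging"        # network too poor for real-time media
    if packet_loss > 0.08 or rtt_ms > 400:
        return "audio_only"       # drop video, keep the call alive
    return "video"

def choose_video_bitrate(packet_loss: float) -> int:
    """Walk down the bitrate ladder as loss increases."""
    step = min(int(packet_loss / 0.02), len(BITRATE_LADDER_KBPS) - 1)
    return BITRATE_LADDER_KBPS[step]

if __name__ == "__main__":
    for loss, rtt in [(0.01, 80), (0.05, 250), (0.12, 500), (0.30, 900)]:
        mode = choose_mode(loss, rtt)
        rate = choose_video_bitrate(loss) if mode == "video" else None
        print(f"loss={loss:.0%} rtt={rtt}ms -> {mode}", f"@ {rate} kbps" if rate else "")
```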

And this is where machine-learning and AI comes in. Because no matter how fast network technology is evolving - NFV & SDN, 5G, "network-slicing" or anything else - the world of software and cognitive intelligence is evolving faster still. 

I think that machine-learning and (eventually) AI will seriously damage the future prospects for monetising network QoS. As Martin points out regularly, you can't "put quality back into the network" once it's lost. But you can put quality, cognitive smarts or mitigation into the computation and app-logic at each end of the connection - and that's what's already occurring, and is about to accelerate further.

At the moment, most of the software mitigation techniques are static point solutions - codecs built into the media engines, for instance. But the next generation is more dynamic. An early example is enterprise SD-WAN technology, which can combine multiple connections and make decisions about which application data to send down which path. It's mostly being used to combine cheap commodity Internet access connections, to reduce the need to spend much more on expensive managed MPLS WANs. In some cases, it's cheaper and more reliable to buy three independent Internet connections, mark and send the same packets down all of them simultaneously, and just use whichever copy arrives first at the other end, to minimise latency. As I wrote recently (link), SD-WAN allows the creation of "Quasi-QoS".
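
As a rough illustration of the "send it down every path, keep whichever copy wins the race" idea - not any vendor's actual implementation - the sketch below duplicates each packet across all available links and de-duplicates by sequence number at the far end.

```python
# Toy model of path-duplication with first-arrival de-duplication.
# All class and link names are invented for illustration.

from dataclasses import dataclass

@dataclass
class Packet:
    seq: int
    payload: bytes
    path: str          # which underlay link carried this copy

class Deduplicator:
    def __init__(self) -> None:
        self.seen: set[int] = set()

    def accept(self, pkt: Packet) -> bool:
        """Return True for the first copy of a sequence number, False for duplicates."""
        if pkt.seq in self.seen:
            return False
        self.seen.add(pkt.seq)
        return True

def send_on_all_paths(seq: int, payload: bytes, paths: list[str]) -> list[Packet]:
    """Duplicate one application packet across every available connection."""
    return [Packet(seq, payload, path) for path in paths]

if __name__ == "__main__":
    rx = Deduplicator()
    copies = send_on_all_paths(1, b"hello", ["dsl", "cable", "lte"])
    # Simulate arrival order: the LTE copy wins the race, the rest are discarded.
    for pkt in sorted(copies, key=lambda p: {"lte": 12, "cable": 25, "dsl": 40}[p.path]):
        if rx.accept(pkt):
            print(f"delivered seq={pkt.seq} via {pkt.path}")
```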

Furthermore, an additional layer of intelligence and analytics allows the SD-WAN controller (sitting in the cloud) to learn which connections tend to be best, and under which conditions. The software can also learn how to predict the warning-signs of problems, and what the best fixes are. Potentially it could also signal to the app, to allow preventative measures to be taken - although this will obviously depend on the timescales involved (it won't be able to cope with millisecond transients, for instance).
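
A toy version of that learning loop might look like the following - the per-hour, exponentially-weighted latency estimate is my own simplification, purely to illustrate how a controller could learn which path tends to be best under which conditions.

```python
# Hypothetical sketch of a controller learning per-path, per-hour performance.

from collections import defaultdict

class PathScorer:
    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha
        # (path, hour-of-day) -> smoothed latency estimate in ms
        self.estimates: dict[tuple[str, int], float] = defaultdict(lambda: 100.0)

    def observe(self, path: str, hour: int, latency_ms: float) -> None:
        """Fold a new measurement into the running estimate for this path/hour."""
        key = (path, hour)
        self.estimates[key] += self.alpha * (latency_ms - self.estimates[key])

    def best_path(self, paths: list[str], hour: int) -> str:
        """Pick the path with the lowest learned latency for this hour."""
        return min(paths, key=lambda p: self.estimates[(p, hour)])

if __name__ == "__main__":
    scorer = PathScorer()
    # Pretend the LTE link degrades badly every evening while DSL stays steady.
    for _ in range(50):
        scorer.observe("lte", 20, 180.0)
        scorer.observe("dsl", 20, 60.0)
    print(scorer.best_path(["lte", "dsl"], hour=20))   # -> dsl
```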

But that is just the start, and is still just putting intelligence into the network, albeit an overlay.

What happens when the applications themselves get smarter? Many are already "network-aware" - they know if they're connected via WiFi or 4G, for example, and adapt their behaviour to optimise for cost, bandwidth or other variables. They may be instrumented to monitor quality and self-adapt, warn the user, or come up with mitigation strategies. They have access to location, motion-sensor and other APIs that could inform them about which network path to choose.
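
For illustration, a "network-aware" app's policy logic might look something like this hypothetical sketch (the connection types and thresholds are invented): adjust prefetching and media quality to the connection, rather than requesting guarantees from it.

```python
# Invented example of application-side adaptation to connection type.

from dataclasses import dataclass

@dataclass
class NetworkState:
    kind: str          # "wifi", "4g", "3g", ...
    metered: bool      # does the user pay per byte?
    signal: float      # 0.0 (none) .. 1.0 (excellent)

def sync_policy(net: NetworkState) -> dict:
    """Decide how aggressively to transfer data given the current connection."""
    if net.kind == "wifi" and not net.metered:
        return {"prefetch": True, "media_quality": "high"}
    if net.kind == "4g" and net.signal > 0.5:
        return {"prefetch": False, "media_quality": "medium"}
    return {"prefetch": False, "media_quality": "low"}

if __name__ == "__main__":
    print(sync_policy(NetworkState("wifi", metered=False, signal=0.9)))
    print(sync_policy(NetworkState("3g", metered=True, signal=0.3)))
```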

But even that is still not really "learning" or AI. Now consider the next stage - perhaps a VoIP application spots glitches but, rather than an inelegant drop, subtly adds an extra "um" or "err" in your voice (or just a beep) to buy itself an extra 200ms to wait for the network to catch up? Perhaps it is possible to send voice-recognised words and tone to a voice-regenerating engine at the far end, rather than the modulated wave-forms of your actual speech?
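
Purely as a thought-experiment, the "buy yourself 200ms" trick might look something like the playout loop below - every name and threshold is invented, and a real media engine would be far more sophisticated.

```python
# Speculative sketch: play a short filler when the jitter buffer runs dry,
# giving late packets time to arrive instead of dropping the call.

from collections import deque

FILLER_MS = 200                       # how much time the filler buys

class PlayoutBuffer:
    def __init__(self) -> None:
        self.frames: deque[bytes] = deque()

    def push(self, frame: bytes) -> None:
        self.frames.append(frame)

    def next_audio(self) -> bytes:
        """Return the next frame to play, or a filler if the network is late."""
        if self.frames:
            return self.frames.popleft()
        return synth_filler(FILLER_MS)   # hypothetical "um"/tone generator

def synth_filler(duration_ms: int) -> bytes:
    """Stand-in for a generated filler sound of the requested length."""
    return b"\x00" * (8 * duration_ms)   # 8 kHz, 8-bit silence as a placeholder

if __name__ == "__main__":
    buf = PlayoutBuffer()
    buf.push(b"frame-1")
    print(buf.next_audio())          # real audio
    print(len(buf.next_audio()))     # filler: 1600 bytes of placeholder audio
```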

Or look forward another few years, and perhaps imagine that you have a "voice bot" that can take over the conversation on your behalf, within certain conversational or ethical guidelines. Actually, perhaps you could call it an "ambassador" - representing your views, and empowered to take action in your absence if necessary. If two people in a trusted relationship can send their ambassadors to each other's phones, the computers can take over if there's a network problem. Your "mini-me" would be an app on your friend's or client's device, creating "the illusion of realtime communications".
   
Obviously it would need training, trust and monitoring, but in some cases it might even generate better results. "Siri, please negotiate my mobile data plan renewal for the best price, using my voice". "Cortana, please ask this person out on a date, less awkwardly than I normally do" (OK, maybe not that one...)

Investment banks already use automated trading systems, so there are already examples of important decisions being made robotically. If the logic and computation can be extended locally to "the other end" - with appropriate security and record-keeping - then the need for strict network QoS might be reduced. 

Machine-learning may also be useful to mitigate the risks from network unavailability, or security exploits. If the app knows from past experience that you're about to drive through a coverage blackspot, it can act accordingly in advance. The OS could suggest an alternative app or method for achieving your underlying goal or outcome - whether that is communication or transaction - like a SatNav suggesting a new route when you miss a turn.
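
A simple sketch of the blackspot idea, with made-up coordinates and a hypothetical list of learned dead zones, might look like this: check the next few points on the route against known gaps, and prefetch before arriving.

```python
# Illustrative only: predict an upcoming coverage gap from past signal logs.

from dataclasses import dataclass
from math import hypot

@dataclass
class Blackspot:
    x: float           # simplified planar coordinates for the example
    y: float
    radius_km: float

KNOWN_BLACKSPOTS = [Blackspot(10.0, 4.0, 0.5)]    # "learned" from past journeys

def blackspot_ahead(route: list[tuple[float, float]], lookahead: int = 5) -> bool:
    """Check whether any of the next few route points falls inside a known blackspot."""
    for (px, py) in route[:lookahead]:
        for spot in KNOWN_BLACKSPOTS:
            if hypot(px - spot.x, py - spot.y) <= spot.radius_km:
                return True
    return False

if __name__ == "__main__":
    upcoming = [(9.0, 4.0), (9.6, 4.1), (10.1, 4.0), (10.8, 3.9)]
    if blackspot_ahead(upcoming):
        print("Coverage gap ahead: prefetching messages and maps now")
```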



For some applications, maybe the network is only used as a secondary approach, for error-correction or backup. In essence, this takes the idea of "edge computing" to its ultimate logical extension - the "edge" moves right out to the other user's device or gateway, beyond the network entirely. (This isn't conceptually much different to a website's JavaScript apps running in your browser.)

Obviously, this approach isn't going to work ubiquitously. Network QoS will still be needed for transmitting unpredictable real-time data, or dealing with absolutely mission-critical applications. Heavy-lifting will still need to be done in the cloud - whether that's a Google search, or a realtime lookup in a sales database. Lightweight IoT devices won't be able to support local computing while maintaining low power consumption. But clever application design, plus cognitively-aware systems, can reduce the reliance on the access network in many cases. It could be argued that this simply sets a lower quality threshold - but at a certain point that threshold coincides with what is routinely available from a normal Internet connection, or perhaps two or three bonded or load-balanced together.

But overall, just as we expect to see robots taking over from humans in "automatable jobs", so too will we see computation and AI taking over from networks in dealing with "automatable data". The basis for the network "translocating" data becomes less of an issue, if the same data (or a first-approximation) can be generated locally to begin with.