Originally published as a LinkedIn Newsletter article.
I think the low-latency 5G Emperor is almost naked. Not completely starkers, but certainly wearing some unflattering Speedos.
Much of the promise around 5G – and especially the “ultra-reliable low-latency” (URLLC) versions of the technology – centres on minimising network round-trip times for demanding applications and new classes of device.
Edge-computing architectures like MEC also often focus on latency as a key reason for adopting regional computing facilities - or even servers at the cell-tower. Similar justifications are being made for LEO satellite constellations.
The famous goal of 1 millisecond latency is often mentioned, usually in the context of applications like autonomous vehicles with snappy responses, AR/VR headsets without nausea, cloud-gaming, the “tactile Internet” and remote drone/robot control.
(In theory this is for end-to-end "user plane latency" between the user and server, so includes both the "over the air" radio and the backhaul / core network parts of the system. This is also different to a "roundtrip", which is there-and-back time).
Usually, that 1ms objective is accompanied by some irrelevant and inaccurate mention of 20 or 50 billion connected devices by [date X], and perhaps some spurious calculation of trillions of dollars of (claimed) IoT-enabled value. Gaming usually gets a mention too.
I think there are two main problems here:
- Supply: It’s not clear that most 5G networks and edge-compute will be able to deliver 1ms – or even 10ms – especially over wide areas, or for high-throughput data.
- Demand: It’s also not clear there’s huge value & demand for 1ms latency, even where it can be delivered. In particular, it’s not obvious that URLLC applications and services can “move the needle” for public MNOs’ revenues.
Delivering URLLC requires more than just “network slicing” and a programmable core network with a “slicing function”, plus a nearby edge compute node for application-hosting and data processing, whether that is in the 5G network (MEC or AWS Wavelength) or some sort of local cloud node like AWS Outposts. That low-latency slice needs to span the core, the transport network and, critically, the radio.
Most people I speak to in the industry look through the lens of the core network slicing or the edge – and perhaps the IT systems supporting the 5G infrastructure. There is also sometimes more focus on the UR part than the LL part, even though the two actually have different enablers.
Unfortunately, it looks to me as though the core/edge is writing low-latency checks that the radio can’t necessarily cash.
Without going into the abstruse nature of radio channels and frame-structure, it’s enough to note that ultra-low latency means the radio can’t wait to bundle a lot of incoming data into a packet, and then get involved in to-and-fro negotiations with the scheduling system over when to send it.
Instead, it needs to have specific (and ideally short) timed slots in which to transmit/receive low-latency data. This means that it either needs to have lots of capacity reserved as overhead, or the scheduler has to de-prioritise “ordinary” traffic to give “pre-emption” rights to the URLLC loads. Look for terms like Transmission Time Interval (TTI) and grant-free UL transmission to drill into this in more detail.
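As a rough sense-check of how much the frame structure alone contributes, here is a minimal sketch. The model is deliberately crude (worst-case slot alignment plus one transmission slot, ignoring HARQ retransmissions, scheduling requests and processing time), and the TTI values are commonly quoted figures rather than anything from a specific deployment:

```python
# Rough worst-case air-interface delay from slot alignment alone: a packet
# arriving just after a transmission opportunity waits up to one TTI for the
# next slot, then takes roughly one TTI to send. Ignores HARQ retransmissions,
# scheduling-request round-trips and device/network processing time.
def worst_case_alignment_ms(tti_ms: float) -> float:
    wait_for_next_slot = tti_ms   # just missed the previous opportunity
    transmit = tti_ms             # one slot to carry the data
    return wait_for_next_slot + transmit

for label, tti_ms in [
    ("LTE 1ms TTI",                         1.0),
    ("5G NR slot, 30kHz SCS (0.5ms)",       0.5),
    ("5G NR 2-symbol mini-slot (~0.07ms)",  0.071),
]:
    print(f"{label:38s} >= {worst_case_alignment_ms(tti_ms):.2f} ms before anything else")
```

Even this toy model shows why short TTIs and mini-slots matter: with a conventional slot length, the frame structure alone can eat most of a 1ms budget before the backhaul and core get involved.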
It’s far from clear that on busy networks, with lots of smartphone or “ordinary” 5G traffic, there can always be a comfortable coexistence of MBB data and more-demanding URLLC. If one user gets their 1ms latency, is it worth disrupting 10 – or 100 – users using their normal applications? That will depend on pricing, as well as other factors.
This gets even harder where the spectrum used is a TDD (time-division duplexing) band, where there’s also another timeslot allocation used for separating up- and down-stream data. It’s a bit easier in FDD (frequency-division) bands, where up- and down-link traffic each gets a dedicated chunk of spectrum, rather than sharing it.
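A similar sketch for the TDD point, assuming a downlink-heavy slot pattern such as “DDDSU” (the pattern and the 0.5ms slot length are assumptions for illustration, not a statement about any particular operator's configuration):

```python
# In a TDD band, uplink data can only be sent in "U" slots. With a
# downlink-heavy pattern, a URLLC uplink packet may sit in the buffer for
# several slots before it even gets a transmission opportunity.
SLOT_MS = 0.5                        # e.g. a 30kHz subcarrier-spacing slot
pattern = ["D", "D", "D", "S", "U"]  # illustrative downlink-heavy TDD pattern

def worst_wait_for_uplink_ms(pattern, slot_ms):
    # Maximum number of slots between a packet arriving and the next "U" slot.
    worst_slots = 0
    for start in range(len(pattern)):
        gap = 0
        while pattern[(start + gap) % len(pattern)] != "U":
            gap += 1
        worst_slots = max(worst_slots, gap)
    return worst_slots * slot_ms

print(f"worst-case wait for an uplink slot: {worst_wait_for_uplink_ms(pattern, SLOT_MS):.1f} ms")
# ~2.0ms of queueing before transmission even starts, already past a 1ms
# end-to-end budget. In an FDD band the uplink never waits for a "U" slot.
```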
There’s another radio problem here as well – spectrum licence terms, especially where bands are shared in some fashion with other technologies and users. For instance, the main “pioneer” band for 5G in much of the world is 3.4-3.8GHz (which is TDD). But current rules – in Europe, and perhaps elsewhere – essentially prohibit the types of frame-structure that would enable URLLC services in that band. We might get to 20ms, or maybe even 10-15ms if everything else stacks up. But 1ms is off the table, unless the regulations change. And of course, by that time the band will be full of smartphone users generating lots of ordinary traffic. There may be some Net Neutrality issues around slicing, too.
There's a lot of good discussion - some very technical - on this recent post and comment thread of mine: https://www.linkedin.com/posts/deanbubley_5g-urllc-activity-6711235588730703872-1BVn
Various mmWave bands, however, have enough capacity to cope with URLLC more readily. But as we already know, mmWave cells also have very short range – perhaps just 200 metres or so. We can forget about nationwide – or even full citywide – coverage. And outdoor-to-indoor coverage won’t work either. And if an indoor network is deployed by a 3rd party such as a neutral host or roaming partner, it's far from clear that URLLC can work across the boundary.
Sub-1GHz bands, such as 700MHz in Europe, or perhaps refarmed 3G/4G FDD bands such as 1.8GHz, might support URLLC and have decent range/indoor reach. But they’ll have limited capacity, so again coexistence with MBB could be a problem, as MNOs will also want their normal mobile service to work (at scale) indoors and in rural areas too.
What this means is that we will probably get (for the foreseeable future):
- Moderately Low Latency on wide-area public 5G Networks (perhaps 10-20ms), although where network coverage forces a drop back to 4G, then 30-50ms.
- Ultra* Low Latency on localised private/enterprise 5G Networks and certain public hotspots (perhaps 5-10ms in 2021-22, then eventually 1-3ms maybe around 2023-24, with Release 17, which also supports deterministic "Time Sensitive Networking" in devices)
- A promised 2ms on Wi-Fi 6E, when it gets access to big chunks of 6GHz spectrum
This really isn't ideal for all the sci-fi low-latency scenarios I hear around drones, AR games, or the cliched surgeon performing a remote operation while lying on a beach. (There's that Speedo reference, again).
* see the demand section below on whether 1-10ms is really "ultra-low" or just "very low" latency
Almost 3 years ago, I wrote an earlier article on latency (link), some of which I'll repeat here. The bottom line is that it's not clear that there's a huge range of applications and IoT devices that URLLC will help, and where they do exist they're usually very localised and more likely to use private networks rather than public.
One paragraph I wrote stands out:
I have not seen any analysis that tries to divide the billions of devices, or trillions of dollars, into different cohorts of time-sensitivity. Given the assumptions underpinning a lot of 5G business cases, I’d suggest that this type of work is crucial. Some of these use-cases are slow enough that sending data by 2G is fine (or by mail, in some cases!). Others are so fast they’ll need fibre – or compute capability located locally on-device, or even on-chip, rather than in the cloud, even if it’s an “edge” node.
I still haven't seen any examples of that analysis. So I've tried to do a first pass myself, albeit using subjective judgement rather than hard data*. I've put together what I believe is the first attempted "heatmap" for latency value. It includes both general cloud-compute and IoT, both of which are targeted by 5G and various forms of edge compute. (*get in touch if you'd like to commission me to do a formal project on this)
A lot of the IoT examples I hear about are either long time-series collections of sensor data (for asset performance-management and predictive maintenance), or have fairly loose timing constraints. A farm’s moisture sensors and irrigation pumps don’t need millisecond response times. Conversely, a chemical plant may need to measure and alter pressures or flows in microseconds.
I've looked at time-ranges for latency from microseconds to days, spanning 12 orders of magnitude (see the later section for more examples). As I discuss below, not everything hinges on the most-mentioned 1-100 millisecond range, or the 3-30ms subset of it that 5G addresses.
I've then compared those latency "buckets" with distances from 1m to 1000km - 7 orders of magnitude. I could have gone out to geostationary satellites, and down to chip scales, but I'll leave that exercise to the reader.
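One constraint on the distance axis is pure physics: light in fibre covers roughly 200km per millisecond, or about 5 microseconds per kilometre each way. A quick sketch of the propagation floor for those distance buckets (the 1.3x routing factor is my assumption, since fibre paths are rarely straight lines):

```python
# Round-trip propagation floor over fibre, ignoring every piece of equipment.
US_PER_KM_ONE_WAY = 5.0   # light in fibre travels at roughly 200,000 km/s
ROUTE_FACTOR = 1.3        # assumption: real fibre routes are longer than geodesics

for km in (1, 10, 100, 1000):
    rtt_ms = 2 * km * US_PER_KM_ONE_WAY * ROUTE_FACTOR / 1000
    print(f"{km:>5} km  ->  >= {rtt_ms:.2f} ms round-trip before any processing")
```

On those assumptions, a 1ms round-trip budget is exhausted by roughly 75km of fibre alone, which is why the distance axis matters just as much as the radio.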
The question for me is - are the three or four "battleground" blocks really that valuable? Is the 2-dimensional Goldilocks zone of not-too-distant / not-too-close and not-too-short / not-too long, really that much of a big deal?
And that's without considering the third dimension of throughput rate. It's one thing having a low-latency "stop the robot now!" message, but quite another doing hyper-realistic AR video for a remote-controlled drone or a long session of "tactile Internet" haptics for a game, played indoors at the edge of a cell.
If you take all those $trillions that people seem to believe are 5G-addressable, what % lies in those areas of the chart? And what are the sensitivities to coverage and pricing, and what substitute risks apply - especially private networks rather than MNO-delivered "slices" that don't even exist yet?
Here are some more examples of timing needs for a selection of applications and devices. Yes, we can argue some of them, but that's not the point - it's that this supposed magic range of 1-100 milliseconds is not obviously the source of most "industry transformation" or consumer 5G value:
- Sensors on an elevator’s doors may send sporadic data, to predict slowly-worsening mechanical problems – so an engineer might be sent a month before the normal maintenance visit. Similarly, a building’s structural condition, vegetation cover in the Amazon, or oceanic acidity isn’t going to shift much month-by-month.
- A car might download new engine-management software once a week, and upload traffic observations and engine-performance data once a day (maybe waiting to do it over WiFi, in the owner’s garage, as it's not time-critical).
- A large oil storage tank, or a water well, might have a depth-gauge giving readings once an hour.
- A temperature sensor and thermostat in an elderly person’s home, to manage health and welfare, might track readings and respond with control messages every 10 minutes. Room temperatures change only slowly.
- A shared bicycle might report its position every minute – and unlock in under 10 seconds when the user buys access with their smartphone app
- A payment or security-access tag should check identity and open a door, or confirm a transaction, in a second or two.
- Voice communication seems laggy with anything longer than about 200 milliseconds of latency.
- A networked video-surveillance system may need to send a facial image, and get a response in 100ms, before the person of interest moves out of camera-shot.
- An online video-game ISP connection will be considered “low ping” at maybe 50ms latency.
- A doctor’s endoscope or microsurgery tool might need to respond to controls (and send haptic feedback) 100 times a second – ie every 10ms
- Teleprotection systems for high-voltage utility grids can demand 6-10ms latency times
- A rapidly-moving drone may need to react in 2-3 milliseconds to a control signal, or a locally-recognised risk.
- A sensitive industrial process-control system may need to be able to respond in 10s or 100s of microseconds to avoid damage to finely-calibrated machinery
- Image sensors and various network sync mechanisms may require response times measured in nanoseconds
- Photon sensors for various scientific uses may operate at picosecond durations
- Ultra-fast laser pulses for machining glass or polymers can be measured in femtoseconds
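To make the orders-of-magnitude point concrete, here is a rough binning of those examples, flagging which ones land in the 3-30ms window discussed earlier. The figures simply restate the list above, and the bucketing is my subjective reading rather than measured data:

```python
# Approximate latency requirements from the examples above, in seconds.
examples_s = {
    "structural / environmental sensors": 30 * 24 * 3600,  # ~ monthly
    "connected-car software updates":     24 * 3600,       # ~ daily
    "tank / well depth gauge":            3600,            # ~ hourly
    "home welfare thermostat":            600,             # ~ every 10 minutes
    "shared-bike unlock":                 10,
    "payment / door access":              1.5,
    "voice call (mouth-to-ear)":          0.2,
    "video surveillance match":           0.1,
    "online gaming 'low ping'":           0.05,
    "tele-surgery haptics":               0.01,
    "grid teleprotection":                0.008,
    "drone control loop":                 0.003,
    "industrial process control":         100e-6,
    "image sensor / network sync":        100e-9,
}

for name, seconds in examples_s.items():
    flag = "  <- in the 3-30ms '5G battleground'" if 0.003 <= seconds <= 0.030 else ""
    print(f"{name:36s} ~{seconds:12.6g} s{flag}")
```

Only a handful of the examples fall inside that window, which is exactly the point of the heatmap exercise above.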
Latency is important, for application developers, enterprises and many classes of IoT device and solution. But we have been spectacularly vague at defining what "low-latency" actually means, and where it's needed.
A lot of what gets discussed in 5G and edge-computing conferences, webinars and marketing documents is either hyped, or is likely to remain undeliverable. A lot of the use-cases can be adequately serviced with 4G mobile, Wi-Fi - or a person on a bicycle delivering a USB memory stick.
What is likely is that average latencies will fall with 5G. An app developer that currently expects a 30-70ms latency on 4G (or probably lower on Wi-Fi) will gradually adapt to 20-40ms on mostly-5G networks and eventually 10-30ms. If it's a smartphone app, they likely won't use URLLC anyway.
Specialised IoT developers in industrial settings will work with specialist providers (maybe MNOs, maybe fully-private networks and automation/integration firms) to hit more challenging targets, where ROI or safety constraints justify the cost. They may get to 1-3ms at some point in the medium term, but it's far from clear they will be contributing massively to MNOs' or edge-providers' bottom lines.
As for wide-area URLLC? Haptic gaming from the sofa on 5G, at the edge of the cell? Remote-controlled drones with UHD cameras? Two cars approaching each other on a hill-crest on a country road? That's going to be a challenge for both demand and supply.