I think the low-latency 5G Emperor is almost naked. Not completely starkers, but certainly wearing some unflattering Speedos.
Much of the promise around 5G – and especially the “ultra-reliable low-latency” (URLLC) versions of the technology – centres on minimising network round-trip times, for demanding applications and new classes of device.
Edge-computing architectures like MEC also often focus on
latency as a key reason for adopting regional computing facilities - or
even servers at the cell-tower. Similar justifications are being made
for LEO satellite constellations.
The famous goal of 1 millisecond latency is often mentioned, usually in the context of applications like autonomous vehicles needing snappy responses, AR/VR headsets without nausea, cloud gaming, the “tactile Internet” and remote drone/robot control.
(In theory this is end-to-end "user plane latency" between the user and the server, so it includes both the "over the air" radio portion and the backhaul / core network parts of the system. It is also different from a "round trip", which is the there-and-back time.)
Usually,
that 1ms objective is accompanied by some irrelevant and inaccurate
mention of 20 or 50 billion connected devices by [date X], and perhaps
some spurious calculation of trillions of dollars of (claimed)
IoT-enabled value. Gaming usually gets a mention too.
I think there are two main problems here:
- Supply: It’s
not clear that most 5G networks and edge-compute will be able to
deliver 1ms – or even 10ms – especially over wide areas, or for
high-throughput data.
- Demand: It’s also
not clear there’s huge value & demand for 1ms latency, even where it
can be delivered. In particular, it’s not obvious that URLLC
applications and services can “move the needle” for public MNOs’
revenues.
Supply
Delivering URLLC requires more than just “network slicing” and a programmable core network with a “slicing function”, plus a nearby edge-compute node for application hosting and data processing – whether that sits in the 5G network (MEC or AWS Wavelength) or in some sort of local cloud node like AWS Outposts. That low-latency slice needs to span the core, the transport network and, critically, the radio.
Most people I speak to in the industry look through the lens of core-network slicing, or of the edge – and perhaps the IT systems supporting the 5G infrastructure. There is also sometimes more focus on the UR part than the LL, even though the two actually have different enablers.
Unfortunately, it looks to me as though the core/edge is writing low-latency checks that the radio can’t necessarily cash.
Without
going into the abstruse nature of radio channels and frame-structure,
it’s enough to note that ultra-low latency means the radio can’t wait to
bundle a lot of incoming data into a packet, and then get involved in
to-and-fro negotiations with the scheduling system over when to send it.
Instead,
it needs to have specific (and ideally short) timed slots in which to
transmit/receive low-latency data. This means that it either needs to
have lots of capacity reserved as overhead, or the scheduler has to
de-prioritise “ordinary” traffic to give “pre-emption” rights to the
URLLC loads. Look for terms like Transmission Time Interval (TTI) and
grant-free UL transmission to drill into this in more detail.
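To see why the TTI matters, here's a back-of-envelope sketch of NR air-interface transmission times. It relies only on the standard NR relationships (a 14-symbol slot lasting 1/2^μ ms for numerology μ), and it ignores processing, retransmission and queueing delays – so these figures are floors, not real-world latencies:

```python
# Rough 5G NR transmission-time-interval (TTI) durations.
# NR numerology mu gives a subcarrier spacing of 15 * 2**mu kHz and a
# 14-symbol slot lasting 1 / 2**mu milliseconds; "mini-slot" scheduling
# can use as few as 2, 4 or 7 symbols for lower latency.

def tti_ms(mu: int, symbols: int = 14) -> float:
    """Approximate TTI in milliseconds (cyclic-prefix detail ignored)."""
    slot_ms = 1.0 / (2 ** mu)
    return slot_ms * symbols / 14

for mu in (0, 1, 3):
    scs_khz = 15 * 2 ** mu
    print(f"SCS {scs_khz:>3} kHz: full slot {tti_ms(mu):.3f} ms, "
          f"2-symbol mini-slot {tti_ms(mu, 2):.3f} ms")
```

Even before scheduling grants and HARQ retransmissions, a full slot at the 30kHz spacing typical of mid-band TDD is 0.5ms – which is why hitting 1ms end-to-end leaves so little slack for everything else.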
It’s
far from clear that on busy networks, with lots of smartphone or
“ordinary” 5G traffic, there can always be a comfortable coexistence of
MBB data and more-demanding URLLC. If one user gets their 1ms latency,
is it worth disrupting 10 – or 100 – users using their normal
applications? That will depend on pricing, as well as other factors.
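That trade-off can be framed with trivial arithmetic. This is a deliberately crude model with made-up numbers – a real scheduler is far more subtle – but it shows the shape of the question:

```python
# Toy pre-emption trade-off: if a fraction of air-interface resources is
# reserved (or pre-empted) for URLLC, the remaining MBB users share what
# is left. All numbers here are illustrative, not from any real network.

def mbb_rate_per_user(cell_mbps: float, users: int, urllc_fraction: float) -> float:
    """Average MBB throughput per user after URLLC takes its share."""
    return cell_mbps * (1.0 - urllc_fraction) / users

before = mbb_rate_per_user(500, 100, 0.0)
after = mbb_rate_per_user(500, 100, 0.2)
print(f"Per-user MBB rate: {before:.1f} Mbps -> {after:.1f} Mbps "
      f"when 20% of resources go to URLLC")
```

Whether that kind of "tax" on 100 ordinary users is worth one user's guaranteed latency is ultimately a pricing question.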
This
gets even harder where the spectrum used is a TDD (time-division
duplexing) band, where there’s also another timeslot allocation used for
separating up- and down-stream data. It’s a bit easier in FDD
(frequency-division) bands, where up- and down-link traffic each gets a
dedicated chunk of spectrum, rather than sharing it.
There’s
another radio problem here as well – spectrum license terms, especially
where bands are shared in some fashion with other technologies and
users. For instance, the main “pioneer” band for 5G in much of the world
is 3.4-3.8GHz (which is TDD). But current rules – in Europe, and
perhaps elsewhere - essentially prohibit the types of frame-structure
that would enable URLLC services in that band. We might get to 20ms, or
maybe even 10-15ms if everything else stacks up. But 1ms is off the
table, unless the regulations change. And of course, by that time the
band will be full of smartphone users using lots of ordinary traffic.
There may be some Net Neutrality issues around slicing, too.
There's
a lot of good discussion - some very technical - on this recent post
and comment thread of mine:
https://www.linkedin.com/posts/deanbubley_5g-urllc-activity-6711235588730703872-1BVn
Various
mmWave bands, however, have enough capacity to be able to cope with
URLLC more readily. But as we already know, mmWave cells also have very
short range – perhaps just 200 metres or so. We can forget about
nationwide – or even full citywide – coverage. And outdoor-to-indoor
coverage won’t work either. And if an indoor network is deployed by a third party such as a neutral host or roaming partner, it's far from clear that URLLC can work across the boundary.
Sub-1GHz bands, such as
700MHz in Europe, or perhaps refarmed 3G/4G FDD bands such as 1.8GHz,
might support URLLC and have decent range/indoor reach. But they’ll have
limited capacity, so again coexistence with MBB could be a problem, as
MNOs will also want their normal mobile service to work (at scale)
indoors and in rural areas too.
What this means is that we will probably get (for the foreseeable future):
- Moderately
Low Latency on wide-area public 5G Networks (perhaps 10-20ms), although
where network coverage forces a drop back to 4G, then 30-50ms.
- Ultra*
Low Latency on localised private/enterprise 5G Networks and certain
public hotspots (perhaps 5-10ms in 2021-22, then eventually 1-3ms maybe
around 2023-24, with Release 17, which also supports deterministic "Time
Sensitive Networking" in devices)
- A promised 2ms on Wi-Fi 6E, when it gets access to big chunks of 6GHz spectrum
This
really isn't ideal for all the sci-fi low-latency scenarios I hear
around drones, AR games, or the cliched surgeon performing a remote
operation while lying on a beach. (There's that Speedo reference,
again).
* see the demand section below on whether 1-10ms is really "ultra-low" or just "very low" latency
Demand
Almost 3 years ago, I wrote an earlier article on latency (link),
some of which I'll repeat here. The bottom line is that it's not clear there's a huge range of applications and IoT devices that URLLC will help – and where they do exist, they're usually very localised, and more likely to use private networks than public ones.
One paragraph I wrote stands out:
I
have not seen any analysis that tries to divide the billions of
devices, or trillions of dollars, into different cohorts of
time-sensitivity. Given the assumptions underpinning a lot of 5G
business cases, I’d suggest that this type of work is crucial. Some of
these use-cases are slow enough that sending data by 2G is fine (or by
mail, in some cases!). Others are so fast they’ll need fibre – or
compute capability located locally on-device, or even on-chip, rather
than in the cloud, even if it’s an “edge” node.
I still
haven't seen any examples of that analysis. So I've tried to do a first
pass myself, albeit using subjective judgement rather than hard data*.
I've put together what I believe is the first attempted "heatmap" for
latency value. It includes both general cloud-compute and IoT, both of
which are targeted by 5G and various forms of edge compute. (*get in touch if you'd like to commission me to do a formal project on this)
A
lot of the IoT examples I hear about are either long time-series
collections of sensor data (for asset performance-management and
predictive maintenance), or have fairly loose timing constraints. A
farm’s moisture sensors and irrigation pumps don’t need millisecond
response times. Conversely, a chemical plant may need to measure and alter pressures or flows in microseconds.
I've looked at
time-ranges for latency from microseconds to days, spanning 12 orders of
magnitude (see the later section for more examples). As I discuss below, not everything hinges on the most-mentioned 1-100 millisecond range, or the 3-30ms subset of it that 5G addresses.
I've then compared
those latency "buckets" with distances from 1m to 1000km - 7 orders of
magnitude. I could have gone out to geostationary satellites, and down
to chip scales, but I'll leave that exercise to the reader.
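One axis of that grid has a hard physical floor worth making explicit: light in optical fibre travels at roughly two-thirds of c, or about 200 km per millisecond, so round-trip propagation delay alone rules out some latency/distance combinations, whatever the radio does. A quick sketch:

```python
# Propagation-delay floor for the latency-vs-distance grid.
# Signals in optical fibre travel at roughly two-thirds of c,
# i.e. about 200 km per millisecond one-way.

FIBRE_KM_PER_MS = 200.0

def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    return 2.0 * distance_km / FIBRE_KM_PER_MS

for km in (1, 10, 100, 1000):
    print(f"{km:>4} km: >= {min_rtt_ms(km):.2f} ms round-trip")
```

So a 1ms round-trip budget cannot reach a server much more than ~100km away even over perfect fibre with zero processing – one reason edge nodes get pulled towards the cell site.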
The
question for me is - are the three or four "battleground" blocks really
that valuable? Is the 2-dimensional Goldilocks zone of not-too-distant /
not-too-close and not-too-short / not-too-long, really that much of a
big deal?
And that's without considering the third dimension of
throughput rate. It's one thing having a low-latency "stop the robot
now!" message, but quite another doing hyper-realistic AR video for a
remote-controlled drone or a long session of "tactile Internet" haptics
for a game, played indoors at the edge of a cell.
If you take all
those $trillions that people seem to believe are 5G-addressable, what %
lies in those areas of the chart? And what are the sensitivities to coverage and pricing, and what substitution risks apply – especially private networks rather than MNO-delivered "slices" that don't even exist yet?
Examples
Here are some more
examples of timing needs for a selection of applications and devices.
Yes, we can argue some of them, but that's not the point - it's that
this supposed magic range of 1-100 milliseconds is not obviously the
source of most "industry transformation" or consumer 5G value:
- Sensors on elevator doors may send sporadic data, to predict slowly-worsening mechanical problems – so an engineer might be sent a month before the normal maintenance visit. Similarly, readings of a building’s structural condition, vegetation cover in the Amazon, or oceanic acidity aren’t going to shift much month-by-month.
- A car
might download new engine-management software once a week, and upload
traffic observations and engine-performance data once a day (maybe
waiting to do it over WiFi, in the owner’s garage, as it's not
time-critical).
- A large oil storage tank, or a water well, might have a depth-gauge giving readings once an hour.
- A
temperature sensor and thermostat in an elderly person’s home, to
manage health and welfare, might track readings and respond with control
messages every 10 minutes. Room temperatures change only slowly.
- A
shared bicycle might report its position every minute – and unlock in
under 10 seconds when the user buys access with their smartphone app.
- A payment or security-access tag should check identity and open a door, or confirm a transaction, in a second or two.
- Voice communication seems laggy with anything longer than 200 millisecond latency.
- A
networked video-surveillance system may need to send a facial image,
and get a response in 100ms, before the person of interest moves out of
camera-shot.
- An online video-game ISP connection will be considered “low ping” at maybe 50ms latency.
- A
doctor’s endoscope or microsurgery tool might need to respond to
controls (and send haptic feedback) 100 times a second – i.e. every 10ms.
- Teleprotection systems for high-voltage utility grids can demand 6-10ms latency.
- A rapidly-moving drone may need to react in 2-3 milliseconds to a control signal, or a locally-recognised risk.
- A
sensitive industrial process-control system may need to be able to
respond in 10s or 100s of microseconds to avoid damage to
finely-calibrated machinery.
- Image sensors and various network sync mechanisms may require response times measured in nanoseconds
- Photon sensors for various scientific uses may operate at picosecond durations
- Ultra-fast laser pulses for machining glass or polymers can be measured in femtoseconds
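To visualise that spread, here's a quick sketch that buckets a subset of the examples above by order of magnitude – the figures are my own ballpark estimates taken from the list, not measured data:

```python
import math

# Ballpark response-time needs (in seconds) for a few of the examples
# above; values are illustrative order-of-magnitude estimates only.
examples = {
    "elevator-door wear sensor": 30 * 24 * 3600,  # ~a month
    "oil-tank depth gauge": 3600,                 # hourly
    "shared-bike unlock": 10,
    "voice call": 0.2,
    "low-ping gaming": 0.05,
    "remote-surgery haptics": 0.01,
    "drone control": 0.003,
    "industrial process control": 100e-6,
}

# Print slowest-first, labelled by power of ten.
for name, secs in sorted(examples.items(), key=lambda kv: -kv[1]):
    print(f"10^{math.floor(math.log10(secs)):>3} s  {name}")
```

Of those eight buckets, only a couple land in the 3-30ms window that public 5G might plausibly address – which is rather the point.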
Conclusion
Latency is
important, for application developers, enterprises and many classes of
IoT device and solution. But we have been spectacularly vague at
defining what "low-latency" actually means, and where it's needed.
A
lot of what gets discussed in 5G and edge-computing conferences,
webinars and marketing documents is either hyped, or is likely to remain
undeliverable. A lot of the use-cases can be adequately serviced with
4G mobile, Wi-Fi - or a person on a bicycle delivering a USB memory
stick.
What is likely is that average latencies will fall
with 5G. An app developer that currently expects a 30-70ms latency on
4G (or probably lower on Wi-Fi) will gradually adapt to 20-40ms on
mostly-5G networks and eventually 10-30ms. If it's a smartphone app,
they likely won't use URLLC anyway.
Specialised IoT developers in
industrial settings will work with specialist providers (maybe MNOs,
maybe fully-private networks and automation/integration firms) to hit
more challenging targets, where ROI or safety constraints justify the
cost. They may get to 1-3ms at some point in the medium term, but it's
far from clear they will be contributing massively to MNOs or
edge-providers' bottom lines.
As for wide-area URLLC? Haptic
gaming from the sofa on 5G, at the edge of the cell? Remote-controlled
drones with UHD cameras? Two cars approaching each other on a hill-crest
on a country road? That's going to be a challenge for both demand and
supply.