I think the low-latency 5G Emperor is almost naked. Not completely starkers, but certainly wearing some unflattering Speedos.
Much of the promise around 5G – and especially the “ultra-reliable low-latency” (URLLC) versions of the technology – centres on minimising network round-trip times for demanding applications and new classes of device.
 Edge-computing architectures like MEC also often focus on 
latency as a key reason for adopting regional computing facilities - or 
even servers at the cell-tower. Similar justifications are being made 
for LEO satellite constellations.
The famous goal of 1 millisecond latency is often mentioned, usually in the context of applications like
autonomous vehicles with snappy responses, AR/VR headsets without 
nausea, cloud-gaming, the “tactile Internet” and remote drone/robot 
control.
(In theory this is for end-to-end "user plane 
latency" between the user and server, so includes both the "over the 
air" radio and the backhaul / core network parts of the system. This is 
also different to a "roundtrip", which is there-and-back time).
Usually,
 that 1ms objective is accompanied by some irrelevant and inaccurate 
mention of 20 or 50 billion connected devices by [date X], and perhaps 
some spurious calculation of trillions of dollars of (claimed) 
IoT-enabled value. Gaming usually gets a mention too.
I think there are two main problems here:
- Supply: It’s
 not clear that most 5G networks and edge-compute will be able to 
deliver 1ms – or even 10ms – especially over wide areas, or for 
high-throughput data.
- Demand: It’s also 
not clear there’s huge value & demand for 1ms latency, even where it
 can be delivered. In particular, it’s not obvious that URLLC 
applications and services can “move the needle” for public MNOs’ 
revenues.
Supply
Delivering URLLC 
requires more than just “network slicing” and a programmable core 
network with a “slicing function”, plus a nearby edge compute node for 
application-hosting and data processing, whether that is in the 5G network
(MEC or AWS Wavelength) or some sort of local cloud node like AWS 
Outpost. That low-latency slice needs to span the core, the transport 
network and critically, the radio.
Most people I speak to in the industry look through the lens of core-network slicing or the edge – and perhaps the IT systems supporting the 5G infrastructure. There is also sometimes more focus on the UR part than the LL part, even though the two actually have different enablers.
Unfortunately, it looks to me as though the core/edge is writing low-latency checks that the radio can’t necessarily cash.
Without
 going into the abstruse nature of radio channels and frame-structure, 
it’s enough to note that ultra-low latency means the radio can’t wait to
 bundle a lot of incoming data into a packet, and then get involved in 
to-and-fro negotiations with the scheduling system over when to send it.
Instead,
 it needs to have specific (and ideally short) timed slots in which to 
transmit/receive low-latency data. This means that it either needs to 
have lots of capacity reserved as overhead, or the scheduler has to 
de-prioritise “ordinary” traffic to give “pre-emption” rights to the 
URLLC loads. Look for terms like Transmission Time Interval (TTI) and 
grant-free UL transmission to drill into this in more detail.
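For a feel of the numbers behind TTI length, the sketch below (a rough illustration, not a description of any specific network) computes 5G NR slot and mini-slot durations for the standard numerologies, assuming the usual 14-symbol slot; the mini-slot lengths shown are just common examples.

```python
# Slot and mini-slot durations in 5G NR (normal cyclic prefix, 14 OFDM
# symbols per slot). Shorter transmission time intervals are one of the
# key radio-level enablers of lower latency.
for mu in range(5):                          # NR numerologies 0 to 4
    scs_khz = 15 * 2 ** mu                   # subcarrier spacing
    slot_us = 1000.0 / 2 ** mu               # slot duration in microseconds
    symbol_us = slot_us / 14
    minis = ", ".join(f"{n} symbols = {n * symbol_us:.0f}us" for n in (2, 4, 7))
    print(f"numerology {mu}: SCS {scs_khz:>3}kHz, slot {slot_us:>6.1f}us, mini-slots: {minis}")
```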
It’s
 far from clear that on busy networks, with lots of smartphone or 
“ordinary” 5G traffic, there can always be a comfortable coexistence of 
mobile broadband (MBB) data and more-demanding URLLC. If one user gets their 1ms latency,
is it worth disrupting 10 – or 100 – users using their normal 
applications? That will depend on pricing, as well as other factors.
This
 gets even harder where the spectrum used is a TDD (time-division 
duplexing) band, where there’s also another timeslot allocation used for
 separating up- and down-stream data. It’s a bit easier in FDD 
(frequency-division) bands, where up- and down-link traffic each gets a 
dedicated chunk of spectrum, rather than sharing it.
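The toy calculation below illustrates the TDD point. The "DDDSU" slot pattern and 0.5ms slot length are assumptions for illustration, not a statement about any particular operator's configuration, but they show how simply waiting for the next uplink slot can eat a large chunk of a millisecond-level budget.

```python
# Toy estimate of the extra uplink waiting time a TDD slot pattern adds.
# "DDDSU" (three downlink slots, one special, one uplink) and the 0.5ms
# slot length are assumed examples; real configurations vary.
PATTERN = "DDDSU"
SLOT_MS = 0.5   # e.g. 30kHz subcarrier spacing

def worst_case_uplink_wait(pattern: str, slot_ms: float) -> float:
    """Longest wait for the next uplink slot if data becomes ready at
    the least convenient moment in the repeating pattern."""
    n = len(pattern)
    worst = 0
    for start in range(n):
        wait = next(k for k in range(1, n + 1) if pattern[(start + k) % n] == "U")
        worst = max(worst, wait)
    return worst * slot_ms

print(f"Worst-case wait for an uplink slot: {worst_case_uplink_wait(PATTERN, SLOT_MS)} ms")
# With this pattern a device can wait up to 2.5ms just for an uplink
# opportunity, before any scheduling, processing or retransmission time
# is counted. In FDD the uplink carrier is always available.
```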
There’s 
another radio problem here as well – spectrum license terms, especially 
where bands are shared in some fashion with other technologies and 
users. For instance, the main “pioneer” band for 5G in much of the world
 is 3.4-3.8GHz (which is TDD). But current rules – in Europe, and 
perhaps elsewhere - essentially prohibit the types of frame-structure 
that would enable URLLC services in that band. We might get to 20ms, or 
maybe even 10-15ms if everything else stacks up. But 1ms is off the 
table, unless the regulations change. And of course, by that time the 
band will be full of smartphone users generating lots of ordinary traffic. There may be some Net Neutrality issues around slicing, too.
There's
 a lot of good discussion - some very technical - on this recent post 
and comment thread of mine: 
https://www.linkedin.com/posts/deanbubley_5g-urllc-activity-6711235588730703872-1BVn
Various
 mmWave bands, however, have enough capacity to be able to cope with 
URLLC more readily. But as we already know, mmWave cells also have very 
short range – perhaps just 200 metres or so. We can forget about 
nationwide – or even full citywide – coverage. And outdoor-to-indoor 
coverage won’t work either. And if an indoor network is deployed by a 
3rd party such as a neutral host or roaming partner, it's far from clear
that URLLC can work across the boundary.
Sub-1GHz bands, such as 
700MHz in Europe, or perhaps refarmed 3G/4G FDD bands such as 1.8GHz, 
might support URLLC and have decent range/indoor reach. But they’ll have
 limited capacity, so again coexistence with MBB could be a problem, as 
MNOs will also want their normal mobile service to work (at scale) 
indoors and in rural areas too.
What this means is that we will probably get (for the foreseeable future):
- Moderately
 Low Latency on wide-area public 5G Networks (perhaps 10-20ms), although
 where network coverage forces a drop back to 4G, then 30-50ms.
- Ultra*
 Low Latency on localised private/enterprise 5G Networks and certain 
public hotspots (perhaps 5-10ms in 2021-22, then eventually 1-3ms maybe 
around 2023-24, with Release 17, which also supports deterministic "Time
 Sensitive Networking" in devices)
- A promised 2ms on Wi-Fi 6E, when it gets access to big chunks of 6GHz spectrum
This
 really isn't ideal for all the sci-fi low-latency scenarios I hear 
around drones, AR games, or the clichéd surgeon performing a remote
operation while lying on a beach. (There's that Speedo reference, 
again).
* see the demand section below on whether 1-10ms is really "ultra-low" or just "very low" latency
Demand
Almost 3 years ago, I wrote an earlier article on latency (link),
 some of which I'll repeat here. The bottom line is that it's not clear 
that there's a huge range of applications and IoT devices that URLLC 
will help, and where they do exist, they're usually very localised and more likely to use private networks than public ones.
One paragraph I wrote stands out:
I
 have not seen any analysis that tries to divide the billions of 
devices, or trillions of dollars, into different cohorts of 
time-sensitivity. Given the assumptions underpinning a lot of 5G 
business cases, I’d suggest that this type of work is crucial. Some of 
these use-cases are slow enough that sending data by 2G is fine (or by 
mail, in some cases!). Others are so fast they’ll need fibre – or 
compute capability located locally on-device, or even on-chip, rather 
than in the cloud, even if it’s an “edge” node.
I still 
haven't seen any examples of that analysis. So I've tried to do a first 
pass myself, albeit using subjective judgement rather than hard data*. 
I've put together what I believe is the first attempted "heatmap" for 
latency value. It includes both general cloud-compute and IoT, both of 
which are targeted by 5G and various forms of edge compute. (*get in touch if you'd like to commission me to do a formal project on this)
A
 lot of the IoT examples I hear about are either long time-series 
collections of sensor data (for asset performance-management and 
predictive maintenance), or have fairly loose timing constraints. A 
farm’s moisture sensors and irrigation pumps don’t need millisecond 
response times. Conversely, a chemical plant may need to measure and alter pressures or flows in microseconds.
I've looked at 
time-ranges for latency from microseconds to days, spanning 12 orders of
 magnitude (see later section for more examples). As I discuss below, 
not everything hinges on the most-mentioned 1-100 millisecond range, or the 3-30ms subset of it that 5G addresses.
I've then compared 
those latency "buckets" with distances from 1m to 1000km - 7 orders of 
magnitude. I could have gone out to geostationary satellites, and down 
to chip scales, but I'll leave that exercise to the reader.
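One hard constraint on the distance axis is simple physics: light in optical fibre covers roughly 200,000km per second, or about 5 microseconds per kilometre each way. The short sketch below just works out that floor for a few of the distance buckets.

```python
# Physical floor on latency from propagation delay alone: light in
# optical fibre travels at roughly 200,000 km/s (about 5 microseconds
# per km), ignoring routing detours, queuing and processing entirely.
FIBRE_KM_PER_SEC = 200_000

for distance_km in (0.001, 1, 10, 100, 1000):    # 1 metre to 1000 km
    one_way_ms = distance_km / FIBRE_KM_PER_SEC * 1000
    print(f"{distance_km:>8g} km: one-way >= {one_way_ms:9.4f} ms, "
          f"round trip >= {2 * one_way_ms:9.4f} ms")
# At 100km the round trip alone is about 1ms, and at 1000km about 10ms,
# so the tightest latency targets only make sense when the compute sits
# relatively close to the user.
```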
The question for me is: are the three or four "battleground" blocks really that valuable? Is the two-dimensional Goldilocks zone of not-too-distant / not-too-close and not-too-short / not-too-long really that much of a big deal?
And that's without considering the third dimension of 
throughput rate. It's one thing having a low-latency "stop the robot 
now!" message, but quite another doing hyper-realistic AR video for a 
remote-controlled drone or a long session of "tactile Internet" haptics 
for a game, played indoors at the edge of a cell.
If you take all 
those $trillions that people seem to believe are 5G-addressable, what % 
lies in those areas of the chart? And what are the sensitivities to coverage and pricing, and what substitution risks apply - especially
private networks rather than MNO-delivered "slices" that don't even 
exist yet?
Examples
Here are some more 
examples of timing needs for a selection of applications and devices. 
Yes, we can argue about some of them, but that's not the point - it's that
this supposed magic range of 1-100 milliseconds is not obviously the 
source of most "industry transformation" or consumer 5G value:
- Sensors on elevator doors may send sporadic data, to predict slowly-worsening mechanical problems – so an engineer might be sent a month before the normal maintenance visit. Similarly, a building’s structural condition, vegetation cover in the Amazon, or oceanic acidity isn’t going to shift much month-by-month, so the sensors monitoring them don’t need fast links either.
- A car 
might download new engine-management software once a week, and upload 
traffic observations and engine-performance data once a day (maybe 
waiting to do it over WiFi, in the owner’s garage, as it's not 
time-critical).
- A large oil storage tank, or a water well, might have a depth-gauge giving readings once an hour.
- A
 temperature sensor and thermostat in an elderly person’s home, to 
manage health and welfare, might track readings and respond with control
 messages every 10 minutes. Room temperatures change only slowly.
- A
 shared bicycle might report its position every minute – and unlock in 
under 10 seconds when the user buys access with their smartphone app.
- A payment or security-access tag should check identity and open a door, or confirm a transaction, in a second or two.
- Voice communication seems laggy with anything longer than 200 milliseconds of latency.
- A
 networked video-surveillance system may need to send a facial image, 
and get a response in 100ms, before the person of interest moves out of 
camera-shot.
- An online video-game ISP connection will be considered “low ping” at maybe 50ms latency.
- A
 doctor’s endoscope or microsurgery tool might need to respond to 
controls (and send haptic feedback) 100 times a second – i.e. every 10ms.
- Teleprotection systems for high-voltage utility grids can demand 6-10ms latencies.
- A rapidly-moving drone may need to react within 2-3 milliseconds to a control signal, or to a locally-recognised risk.
- A
 sensitive industrial process-control system may need to be able to 
respond in tens or hundreds of microseconds to avoid damage to finely-calibrated machinery.
- Image sensors and various network sync mechanisms may require response times measured in nanoseconds.
- Photon sensors for various scientific uses may operate at picosecond durations.
- Ultra-fast laser pulses for machining glass or polymers can be measured in femtoseconds.
Conclusion
Latency is
 important, for application developers, enterprises and many classes of 
IoT device and solution. But we have been spectacularly vague about defining what "low-latency" actually means, and where it's needed.
A
 lot of what gets discussed in 5G and edge-computing conferences, 
webinars and marketing documents is either hyped, or is likely to remain
 undeliverable. A lot of the use-cases can be adequately serviced with 
4G mobile, Wi-Fi - or a person on a bicycle delivering a USB memory 
stick.
What is likely is that average latencies will fall
 with 5G. An app developer that currently expects a 30-70ms latency on 
4G (or probably lower on Wi-Fi) will gradually adapt to 20-40ms on 
mostly-5G networks and eventually 10-30ms. If it's a smartphone app, 
they likely won't use URLLC anyway.
Specialised IoT developers in 
industrial settings will work with specialist providers (maybe MNOs, 
maybe fully-private networks and automation/integration firms) to hit 
more challenging targets, where ROI or safety constraints justify the 
cost. They may get to 1-3ms at some point in the medium term, but it's 
far from clear they will be contributing massively to MNOs or 
edge-providers' bottom lines.
As for wide-area URLLC? Haptic 
gaming from the sofa on 5G, at the edge of the cell? Remote-controlled 
drones with UHD cameras? Two cars approaching each other on a hill-crest
 on a country road? That's going to be a challenge for both demand and 
supply.