What is edge AI?

Edge AI is artificial intelligence inference running directly on a device — a microcontroller, embedded processor, or industrial CPU — rather than in the cloud. Data is analysed where it is created. Decisions are made on the device. Nothing needs to leave.

That single architectural difference changes everything: cost, latency, reliability, power consumption, and who can realistically deploy AI at scale.

01 • The problem with cloud AI

The cloud model has hard limits.

For most of the past decade, AI meant cloud AI. Data left the device, travelled to a data centre, was processed by servers running GPU-accelerated models, and a result came back. For some tasks — training large models, processing rich media, running language generation — that architecture still makes sense.

For edge applications, it breaks down fast.

Cloud AI requires a reliable network connection, acceptable round-trip latency, manageable power budgets, and infrastructure costs that make commercial sense at scale. In industrial plant floors, remote infrastructure, battery-operated sensors, and regulated environments, most of those assumptions fail simultaneously.

The problems of running AI with cloud-based data centres

Connectivity

Intermittent in industrial environments, absent in remote deployments, and prohibited in applications where data must not leave the device. Cloud AI is not viable where the network is not.

Latency

A cloud round-trip typically takes 50 to 500 milliseconds. For a motor protection system or an automotive safety function, that response time is simply too slow. The decision needs to happen in microseconds, before the fault propagates.

Power

Transmitting raw sensor data to the cloud typically consumes more energy than running inference locally. For a battery-operated IoT device, this is not a marginal concern. It is the difference between a product with a three-month field life and one measured in years.

Data volume

A dense sensor network generating continuous readings will saturate network infrastructure long before any analysis takes place. Streaming everything upstream is neither efficient nor, in many cases, technically feasible.

Cost at scale

Cloud inference carries a per-inference charge that compounds across large fleets. 10,000 sensors making 100 inferences per day each generate 1 million API calls daily. Edge AI inference carries no marginal cost once deployed.
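The fleet arithmetic above can be checked in a few lines. This is a minimal sketch: the sensor count and inference rate come from the text, while the price per thousand calls is a purely hypothetical placeholder, not a quote from any provider.

```python
# Fleet figures taken from the text.
SENSORS = 10_000
INFERENCES_PER_SENSOR_PER_DAY = 100

# ASSUMPTION: hypothetical price, for illustration only.
ASSUMED_PRICE_PER_1K_CALLS = 0.10


def daily_cloud_calls(sensors: int, inferences_per_sensor_per_day: int) -> int:
    """Total cloud API calls generated by the fleet in one day."""
    return sensors * inferences_per_sensor_per_day


def annual_cloud_cost(daily_calls: int, price_per_1k_calls: float) -> float:
    """Yearly spend if every inference is billed as one cloud call."""
    return daily_calls * 365 * price_per_1k_calls / 1000


calls = daily_cloud_calls(SENSORS, INFERENCES_PER_SENSOR_PER_DAY)
print(f"calls per day: {calls:,}")
print(f"assumed annual cost: {annual_cloud_cost(calls, ASSUMED_PRICE_PER_1K_CALLS):,.2f}")
```

Whatever price is substituted, the cost grows linearly with both fleet size and inference rate, which is the compounding the paragraph describes.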

The shift

Edge AI resolves these constraints by moving inference to the device. Data is analysed where it originates. Decisions are made immediately, locally, without transmission. The cloud is no longer in the critical path.

02 • How edge AI works

Four steps. No cloud required.

Every edge AI system follows the same basic pipeline. A sensor captures raw data. Preprocessing transforms it into a structured form the model can read. Inference runs on the device and returns a result. The result drives an action or a report. Raw data stays on the device. Network connectivity is not in the critical path.

01 — Data acquisition

A sensor captures raw signal: vibration from a motor bearing, acoustic output from a machine, temperature and current from electrical infrastructure, gas concentration in an industrial environment, or a frame from a camera system.

02 — Preprocessing

Raw sensor data is converted into structured features the AI model can read. For vibration analysis, a Fast Fourier Transform produces a frequency spectrum. For audio, mel-frequency cepstral coefficients capture the signal's character. This step runs on-device, in firmware, before inference begins.

03 — Inference

The preprocessed features are passed to the AI model, which runs on the edge processor and returns a classification or prediction. On a well-implemented edge AI system, this takes microseconds to low milliseconds. No network call. No server. No GPU.

04 — Action or reporting

The inference result drives something: an alert, a control signal, a maintenance flag, or a summarised status point sent upstream. Because only the result leaves the device rather than the raw sensor data, bandwidth consumption drops sharply and sensitive data remains local.

The critical architectural point is that the cloud is absent from this pipeline. The system works identically whether the device has a network connection or not.
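The four stages can be sketched end to end. The following is an illustrative Python prototype, not production firmware: the sampling rate, window size, signal frequencies, and threshold are all hypothetical, the sensor is simulated, the naive DFT stands in for an optimised on-device FFT, and the threshold rule stands in for a trained model.

```python
import cmath
import math

SAMPLE_RATE_HZ = 1000   # ASSUMPTION: hypothetical sensor sampling rate
WINDOW = 64             # ASSUMPTION: hypothetical samples per window
FAULT_FREQ_HZ = 250     # ASSUMPTION: hypothetical fault signature frequency


def acquire(fault: bool) -> list[float]:
    """Stage 1 stand-in: synthesise one window of vibration samples."""
    samples = []
    for n in range(WINDOW):
        t = n / SAMPLE_RATE_HZ
        x = math.sin(2 * math.pi * 62.5 * t)              # normal shaft tone
        if fault:
            x += 0.8 * math.sin(2 * math.pi * FAULT_FREQ_HZ * t)
        samples.append(x)
    return samples


def preprocess(samples: list[float]) -> list[float]:
    """Stage 2: magnitude spectrum via a naive O(N^2) DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) / n)
    return spectrum


def infer(spectrum: list[float]) -> str:
    """Stage 3 stand-in: a fixed threshold on one frequency bin."""
    fault_bin = round(FAULT_FREQ_HZ * WINDOW / SAMPLE_RATE_HZ)
    return "fault" if spectrum[fault_bin] > 0.1 else "healthy"


def act(label: str) -> dict:
    """Stage 4: only this compact result would leave the device."""
    return {"state": label, "alert": label == "fault"}


def run_once(fault: bool) -> dict:
    return act(infer(preprocess(acquire(fault))))


print(run_once(fault=False))
print(run_once(fault=True))
```

Nothing in `run_once` touches a network: the raw window of samples is consumed locally and only the small result dictionary is a candidate for upstream reporting.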

03 • Edge vs cloud

Two architectures. Very different results.

Cloud AI and edge AI are not competing versions of the same thing. They are architecturally different approaches suited to different constraints. Cloud AI excels at large-scale training, iterative model development, and tasks requiring enormous compute — language models, image generation, search. Edge AI removes the dependency on connectivity, cloud infrastructure, and per-inference cost. For applications where latency, power, privacy, or reliability matter, the comparison is rarely close.

Dimension	Edge AI	Cloud AI
Inference location	On-device	Remote server
Latency	Microseconds to low milliseconds	50–500 ms typical
Network dependency	None	Required
Data privacy	Data stays on device	Data transmitted and stored remotely
Power per inference	Microwatts to milliwatts	Kilowatts (server-level)
Hardware cost	Sub-$1 to $50 MCU or CPU	GPU server infrastructure
Ongoing inference cost	None after deployment	Per-inference cloud fees
Reliability	Operates offline	Fails without connectivity
Scales with	Device count	Cloud spend

For latency-critical, power-constrained, privacy-sensitive, or high-volume applications, edge AI is not a compromise on cloud AI. It is the correct architecture for the job.

04 • Why it matters

Five reasons edge wins.

01 — Real-time response

Cloud inference latency, including network round-trip and server processing, typically runs between 50 and 500 milliseconds. For motor protection systems detecting bearing failure, automotive safety functions, or audio event detection, that window is too wide. Inference on a microcontroller completes in under a millisecond. For safety-critical or time-sensitive applications, this is not a performance improvement. It is the difference between a system that works and one that does not.

02 — Privacy and data sovereignty

Data that never leaves the device cannot be intercepted, stored on third-party infrastructure, or caught up in cloud data governance obligations. For applications handling medical signals, behavioural patterns, audio from occupied spaces, or commercially sensitive operational data, keeping inference on-device is often a regulatory or contractual requirement. Edge AI lets organisations extract intelligence from sensitive data without exposing it.

03 — Connectivity independence

The most valuable environments for industrial AI — remote infrastructure, plant floors, mining operations, maritime systems, agricultural monitoring — have intermittent or no network access. Cloud AI is simply not viable in these environments. Edge AI works offline by design. Even where connectivity exists, removing the dependency eliminates a failure mode: an edge AI system performs identically whether it has a network connection or not.

04 — Energy efficiency

For battery-operated devices, the dominant power draw is often the radio, not the processor. Transmitting data to the cloud costs energy. Running inference locally costs far less. A well-designed logic-based edge model, consuming 455 microjoules per inference, can run on a lithium battery for ten years at a prediction rate of one every five seconds. That changes the economics of deploying sensors in infrastructure that is difficult or expensive to service.
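The ten-year figure can be sanity-checked with simple arithmetic. In this sketch the energy per inference, the interval, and the lifetime come from the text; the cell capacity and voltage are assumptions chosen for illustration, and the calculation covers inference energy only, ignoring sleep current, sensor power, radio use, and self-discharge.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Figures taken from the text.
ENERGY_PER_INFERENCE_J = 455e-6
INFERENCE_INTERVAL_S = 5
TARGET_YEARS = 10

# ASSUMPTION: a hypothetical lithium cell, not specified in the text.
ASSUMED_CELL_CAPACITY_AH = 2.6
ASSUMED_CELL_VOLTAGE_V = 3.6


def inference_count(years: int, interval_s: int) -> int:
    """Number of inferences performed over the deployment lifetime."""
    return years * SECONDS_PER_YEAR // interval_s


def total_inference_energy_j(years: int, interval_s: int, energy_j: float) -> float:
    """Energy spent on inference alone over the lifetime, in joules."""
    return inference_count(years, interval_s) * energy_j


def cell_energy_j(capacity_ah: float, voltage_v: float) -> float:
    """Nominal stored energy of a cell, in joules (Ah * V * 3600)."""
    return capacity_ah * voltage_v * 3600


def average_power_w(energy_j: float, interval_s: float) -> float:
    """Mean power drawn by inference at the given prediction rate."""
    return energy_j / interval_s


used = total_inference_energy_j(TARGET_YEARS, INFERENCE_INTERVAL_S, ENERGY_PER_INFERENCE_J)
stored = cell_energy_j(ASSUMED_CELL_CAPACITY_AH, ASSUMED_CELL_VOLTAGE_V)
print(f"inferences over lifetime: {inference_count(TARGET_YEARS, INFERENCE_INTERVAL_S):,}")
print(f"inference energy: {used / 1000:.1f} kJ of {stored / 1000:.1f} kJ assumed")
print(f"average inference power: {average_power_w(ENERGY_PER_INFERENCE_J, INFERENCE_INTERVAL_S) * 1e6:.0f} uW")
```

At one inference every five seconds, 455 microjoules per inference averages out to about 91 microwatts, or roughly 28.7 kJ over ten years. Whether that fits depends entirely on the cell actually chosen and on everything else the device draws.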

05 — Total cost at scale

Cloud inference carries a per-call cost that compounds with fleet size. A network of 10,000 sensors making 100 inferences per day each generates 1 million cloud API calls daily. Once an edge AI model is deployed, each additional inference carries no per-call charge. For high-volume, always-on monitoring, the lifetime cost difference between cloud and edge inference frequently determines whether a product is commercially viable at all.

05 • Real constraints

Edge AI has hard constraints. Here is what they are.

The benefits are real. So are the limits. The constraints of edge AI are primarily about the hardware it runs on — and how well the chosen AI algorithm fits within them.

Compute and memory

A mid-range Arm Cortex-M4 has 256 KB of SRAM and 1 MB of flash. A high-end server GPU has 80 GB of memory. Fitting a useful AI model into kilobytes while maintaining accuracy is the central engineering challenge of edge AI. Traditional neural networks were not designed for this constraint. Compressing them involves trade-offs — quantisation, pruning, knowledge distillation — that sometimes preserve accuracy and sometimes do not.
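To make the memory trade-off concrete, the sketch below shows symmetric 8-bit quantisation, one of the compression techniques named above, alongside a footprint check against the 1 MB flash figure. The parameter count and the example weights are hypothetical, and this is a generic textbook scheme rather than any particular toolchain's implementation.

```python
FLASH_BYTES = 1024 * 1024          # 1 MB flash, as quoted in the text
ASSUMED_PARAM_COUNT = 300_000      # ASSUMPTION: hypothetical model size


def model_bytes(param_count: int, bytes_per_param: int) -> int:
    """Storage needed for the weights alone."""
    return param_count * bytes_per_param


def quantise_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantisation: w is approximated by scale * q."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from their int8 codes."""
    return [v * scale for v in q]


float32_size = model_bytes(ASSUMED_PARAM_COUNT, 4)
int8_size = model_bytes(ASSUMED_PARAM_COUNT, 1)
print(f"float32: {float32_size:,} bytes, fits in flash: {float32_size <= FLASH_BYTES}")
print(f"int8:    {int8_size:,} bytes, fits in flash: {int8_size <= FLASH_BYTES}")

weights = [0.5, -1.27, 0.003, 1.0]          # hypothetical example weights
codes, scale = quantise_int8(weights)
restored = dequantise(codes, scale)
worst = max(abs(w - r) for w, r in zip(weights, restored))
print(f"codes: {codes}, worst rounding error: {worst:.4f}")
```

The storage saving is a fixed factor of four, while the rounding error depends on the weight distribution. That is why quantisation sometimes preserves accuracy and sometimes does not: small weights such as 0.003 can collapse to zero.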

Power budgets

Inference on a conventional neural network typically relies on floating-point operations. On hardware without an FPU, those operations are emulated in software, which is slow and energy-intensive. The power budget for a sensor running on a coin cell is measured in microwatts. Neural network inference typically runs in the milliwatt range or above. That gap, for many product designs, is insurmountable with conventional algorithms.

Development complexity

Training a model for edge deployment is more involved than cloud training. The model must be validated on target hardware, in the target operating environment, with a representative data distribution. Historically, this required machine learning expertise that embedded engineering teams rarely have in-house.

How Logic-Based Networks address these constraints

Logic-Based Networks were designed for edge constraints from the ground up. Rather than floating-point matrix multiplication, they use propositional logic: inference is bitwise operations on integer data, running on any 32-bit processor without an FPU. Models are typically under 5 KB. Inference completes in nanoseconds to low microseconds. Output is deterministic — the same input always produces the same result, with no variance or hallucination.
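To give a feel for what "bitwise operations on integer data" can look like, here is a generic illustration of evaluating a propositional rule set with AND and compare operations on a packed integer. It is emphatically not the LBN algorithm or model format, which the text does not describe; the feature names, the rules, and every function name are hypothetical.

```python
def pack_bits(features: list[int]) -> int:
    """Pack a list of 0/1 features into one integer, feature i at bit i."""
    word = 0
    for i, bit in enumerate(features):
        if bit:
            word |= 1 << i
    return word


def clause_fires(x: int, must_be_one: int, must_be_zero: int) -> bool:
    """A conjunction: every bit in must_be_one is set and every bit in must_be_zero is clear."""
    return (x & must_be_one) == must_be_one and (x & must_be_zero) == 0


def classify(x: int, clauses: list[tuple[int, int]]) -> bool:
    """An OR over AND-clauses (disjunctive normal form)."""
    return any(clause_fires(x, ones, zeros) for ones, zeros in clauses)


# Hypothetical binary features: bit 0 = high vibration,
# bit 1 = high temperature, bit 2 = motor running.
# Hypothetical rule: anomaly if (vibration AND running)
#                    OR (temperature AND NOT running).
ANOMALY_CLAUSES = [
    (0b101, 0b000),
    (0b010, 0b100),
]

for reading in ([1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 0]):
    print(reading, classify(pack_bits(reading), ANOMALY_CLAUSES))
```

Each clause costs two AND operations and two comparisons, with no multiplication and no floating point, and the same input always yields the same output. Those are the properties the paragraph attributes to logic-based inference.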

On an Arm Cortex-M4 running the MLPerf Tiny anomaly detection benchmark, LBN inference is 54 times faster and uses 52 times less energy than an equivalent neural network autoencoder. ModelMill automates the training-to-deployment workflow and delivers a C-code SDK that integrates with existing firmware, without requiring ML expertise from the embedded team.

06 • Where it runs

Edge AI across industries.

Edge AI is not a niche application. It is the default architecture for any AI problem where the device is resource-constrained, the environment is connectivity-limited, the data is sensitive, or the inference volume makes cloud costs unworkable. That covers most of industry.

Industrial monitoring and predictive maintenance

Motors, pumps, compressors, and bearings degrade in ways detectable in vibration, current, and acoustic data long before visible failure occurs. Edge AI models trained on healthy and degraded signatures detect anomalies continuously, on the asset itself, without cloud infrastructure. A UK water utility trialling LBNs on sewer monitoring sensors found it could replace a mains-powered LSTM solution requiring £15,000 of power infrastructure per site with a battery-powered sensor running ten years on a lithium battery. Monitoring 100,000 sites, previously unviable at that per-site cost, became viable.

Automotive and vehicle systems

Automotive applications require millisecond-level response and operate in variable or absent connectivity. Edge AI enables real-time classification of driving conditions, vehicle dynamics monitoring, driver behaviour analysis, and powertrain anomaly detection on existing embedded hardware. A leading European ADAS Tier 1 supplier used an R-LBN to replace a physical sensor on a PowerPC e200 platform — a chip from 2006 — achieving a 4 microsecond inference time at 4 KB of memory, 13.4 times faster than the recurrent neural network it replaced, which had required a 200% increase in SoC cost.

Supply chain and demand forecasting

One of the UK's largest food supply companies needed eight demand forecasts per day across 2,400 product lines. Their GPU-based neural network could produce one forecast per day, at a 25% WMAPE. An LBN deployed on a single CPU produces a forecast in 0.03 seconds per product line (all 2,400 in 40 seconds) at 6.42% WMAPE. No GPU. Stood up in four weeks.
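For readers unfamiliar with the metric, WMAPE (weighted mean absolute percentage error) can be computed as below. This sketch assumes the common definition, total absolute error divided by total actual demand; the text does not state which weighting the case study used, and the demand figures here are invented for illustration.

```python
def wmape(actual: list[float], forecast: list[float]) -> float:
    """Weighted MAPE: sum of absolute errors divided by sum of absolute actuals."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when total actual demand is zero")
    total_abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_abs_error / total_actual


# Hypothetical demand for three product lines and the matching forecasts.
actual_units = [100.0, 50.0, 50.0]
forecast_units = [90.0, 55.0, 55.0]
print(f"WMAPE: {wmape(actual_units, forecast_units):.2%}")
```

Because errors are weighted by volume, a miss on a high-selling line moves the score more than the same percentage miss on a slow-moving one, which suits demand planning across thousands of lines.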

IoT sensor networks

Dense sensor networks in smart buildings, agriculture, environmental monitoring, and utility infrastructure generate data volumes that would be prohibitively expensive to transmit and process centrally. Inference at the node level lets each sensor report only meaningful events: anomalies, threshold crossings, classified states. The network reports intelligence, not data.
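Event-only reporting of this kind can be sketched as a node that classifies each reading locally and transmits only when the classified state changes. The threshold, the state labels, and the readings are hypothetical, and a simple comparison stands in for on-device inference.

```python
ASSUMED_THRESHOLD = 30.0   # ASSUMPTION: hypothetical alarm threshold


def classify_reading(value: float, threshold: float = ASSUMED_THRESHOLD) -> str:
    """Stand-in for on-device inference: label one reading."""
    return "high" if value > threshold else "normal"


def events_to_report(readings: list[float]) -> list[tuple[int, str]]:
    """Return (sample index, state) only where the classified state changes."""
    reports = []
    previous = None
    for i, value in enumerate(readings):
        state = classify_reading(value)
        if state != previous:
            reports.append((i, state))
            previous = state
    return reports


readings = [20.0, 21.0, 20.0, 35.0, 36.0, 34.0, 21.0, 20.0]
reports = events_to_report(readings)
print(f"{len(readings)} readings, {len(reports)} messages sent: {reports}")
```

A real node would probably add hysteresis or a minimum dwell time so that a reading hovering near the threshold does not generate a burst of messages.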

Healthcare and wearables

Continuous physiological monitoring — heart rate variability, movement classification, sleep staging, fall detection — on devices with severe power and size constraints. Edge AI keeps sensitive health data on-device, extends battery life, and enables clinical-grade inference on hardware costing a few dollars.

Audio and acoustic monitoring

Classifying machine sounds, detecting acoustic anomalies, or identifying environmental audio events on a microcontroller without sending audio across a network. Applications include industrial condition monitoring, smart building occupancy sensing, and wearable health devices.

A full breakdown of edge AI applications with benchmark data and case study snapshots is available in the applications section.

07 • Hardware

Edge AI on the hardware you already have.

The most significant cost and operational benefit of logic-based edge AI is that it runs on processors already present in your product. No new SoC. No dedicated AI accelerator. No GPU. The hardware tier determines which class of model fits, but the requirement to replace existing infrastructure is usually absent.

Microcontrollers

The lowest tier and the most common production target. Arm Cortex-M series — M0 through M7 — covers most MCU deployments. RISC-V is increasingly common in cost-sensitive designs. ESP32 is widely used in connected IoT products. MCUs operate in the milliwatt range, cost from sub-$1 to around $15, and have SRAM and flash measured in kilobytes to a few megabytes.

LBNs are well-matched to MCU deployment. Inference is standard C code that compiles with existing embedded toolchains. No FPU is required. Memory footprint typically falls within 5 KB, well inside the constraints of a Cortex-M4 or similar.

Application processors and SoCs

Mid-tier hardware with more memory, typically running a real-time OS or embedded Linux. Arm Cortex-A series, NXP i.MX, and similar. This tier accommodates larger models and more complex preprocessing while remaining far below the power and cost of a GPU.

Edge servers and gateways

x86-based industrial computers, gateway devices, or compact servers that aggregate data from multiple sensor nodes before running inference. Higher compute, higher power, higher cost: appropriate for complex classification tasks or inference across aggregated sensor streams. A single CPU running an LBN handled 2,400 supply-chain SKU forecasts in 40 seconds, with each run covering an 18-month forecast horizon, and ran continuously 24 hours a day.

The supported platforms section covers each architecture in detail, including specific hardware targets and deployment guidance.

08 • The direction

More intelligence. Less hardware.

The hardware is getting better. MCU silicon is acquiring more SRAM, faster clocks, and more capable peripherals without meaningfully increasing cost. The range of devices capable of running useful edge AI inference is widening. The per-unit cost of deploying that capability is falling.

The algorithms are keeping pace. Non-neural architectures — logic-based approaches among them — are an active and productive research area. The constraints of edge deployment are driving algorithm design in directions the cloud-first AI paradigm largely bypassed. Models are getting smaller. Inference is getting faster. Accuracy is no longer the trade-off it was assumed to be.

Regulatory pressure is reshaping where computation can happen. Data sovereignty requirements, EU AI Act obligations, and sector-specific privacy rules are creating structural incentives to keep data on-device, not because engineers prefer it, but because the alternatives are becoming legally complex.

These pressures are reinforcing each other. The sectors where edge AI produces the most value — industrial, automotive, healthcare, regulated infrastructure — are the same sectors where regulatory pressure is strongest and where performance requirements rule out cloud dependency. The convergence of algorithmic capability, affordable hardware, and regulatory necessity is not a trend. It is a structural shift in how AI gets deployed.

09 • Getting started

From raw data to deployed model.

The practical path from labelled sensor data to a running edge AI inference system has four stages. ModelMill handles most of the work.

1. Collect labelled data

Capture sensor readings representing the states the model needs to classify. Healthy vs anomalous. Class A vs class B. Normal vs fault condition. The data does not need to be large — LBNs train well on constrained datasets.

2. Train with ModelMill

ModelMill accepts the labelled data and handles preprocessing, configuration, and scaled training across hundreds of LBN candidates. It outputs the model best suited to your target hardware and performance requirements.

3. Receive the deployment package

A C-code SDK containing the trained LBN, its inference engine, build configuration, example code, and integration documentation. Ready to compile with standard embedded toolchains.

4. Integrate and deploy

The SDK integrates with existing firmware. Inference runs on the target hardware: Arm Cortex-M, RISC-V, ESP32, x86, or PowerPC.

No GPU. No cloud service. No dedicated ML expertise required within the firmware team.