SDK architecture

What the SDK contains

ModelMill's edge AI SDK is a self-contained deployment package produced at the end of the five-step ModelMill workflow. It comprises four components.

The Logic-Based Network

The trained LBN is represented as a set of propositional logic clauses — compact, deterministic, and directly executable. Unlike a neural network weight file, the LBN does not require floating-point matrix operations to run. It operates through bitwise logic on integer data: fast on any processor that can handle 32-bit operations, and naturally suited to the constrained environments where edge AI actually lives.
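To make "bitwise logic on integer data" concrete, here is a minimal sketch of how a clause set could be evaluated. It is an illustration only: the source does not describe the LBN's internal encoding, so the clause layout (`demo_clause_t`), the majority-vote rule, and every `demo_` name below are assumptions, not the SDK's format.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CLASSES 2u

/* Hypothetical clause layout over one 32-bit word of binarised features:
 * 'pos' marks bits that must be 1, 'neg' marks bits that must be 0. */
typedef struct {
    uint32_t pos;
    uint32_t neg;
    uint8_t  class_id;   /* class this clause votes for when it holds */
} demo_clause_t;

/* A clause is a conjunction of literals: it holds when every required
 * bit is set and every forbidden bit is clear. Two ANDs, two compares. */
static int demo_clause_holds(const demo_clause_t *c, uint32_t bits)
{
    return ((bits & c->pos) == c->pos) && ((bits & c->neg) == 0u);
}

/* Count satisfied clauses per class and return the class with the most
 * votes. Ties resolve to the lowest class id. */
static uint8_t demo_classify(const demo_clause_t *clauses, size_t n,
                             uint32_t bits)
{
    uint16_t votes[DEMO_NUM_CLASSES] = {0};
    uint8_t best = 0;
    size_t i;
    uint8_t k;

    for (i = 0; i < n; ++i) {
        if (clauses[i].class_id < DEMO_NUM_CLASSES &&
            demo_clause_holds(&clauses[i], bits)) {
            votes[clauses[i].class_id]++;
        }
    }
    for (k = 1; k < DEMO_NUM_CLASSES; ++k) {
        if (votes[k] > votes[best]) {
            best = k;
        }
    }
    return best;
}

/* Illustrative clause set, not taken from any real model. */
static const demo_clause_t demo_clauses[] = {
    { 0x3u, 0x0u, 1 },   /* bit0 AND bit1          -> class 1 */
    { 0x4u, 0x1u, 1 },   /* bit2 AND NOT bit0      -> class 1 */
    { 0x0u, 0x6u, 0 },   /* NOT bit1 AND NOT bit2  -> class 0 */
};
```

Nothing in this loop touches floating point or the heap, which is the property the paragraph above describes. A generated model would carry many more clauses and feature words, but the per-clause work stays a handful of integer instructions.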

The LBN is generated specifically for your training data and your target classification task. It is not a general-purpose model compressed down to fit; it is a model built from the ground up for the problem.

The inference engine

The inference engine is the runtime that executes the LBN on the target hardware. Written in C, it has no external dependencies and compiles cleanly against standard embedded toolchains, including Arm GCC, RISC-V GCC, and Espressif's ESP-IDF. The engine is optimised for the bitwise operations the LBN uses, which is what makes fast inference possible on modest hardware.

Memory allocation is static. There is no dynamic allocation at inference time — which matters in RTOS environments and bare-metal deployments where heap fragmentation is a genuine engineering concern.
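The static-allocation idea can be sketched as follows. This is a generic embedded C pattern under stated assumptions, not the SDK's actual context type: `demo_ctx_t`, its buffer sizes, and the 128-byte budget are hypothetical values chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FEATURE_COUNT    16u
#define DEMO_SCRATCH_WORDS    8u
#define DEMO_RAM_BUDGET_BYTES 128u   /* hypothetical RAM budget */

/* Hypothetical context: every buffer has a compile-time size, so the
 * linker accounts for all inference memory before the firmware runs. */
typedef struct {
    int32_t  features[DEMO_FEATURE_COUNT];
    uint32_t scratch[DEMO_SCRATCH_WORDS];
    uint8_t  ready;
} demo_ctx_t;

/* C99-compatible compile-time check: the array size becomes -1, and the
 * build fails, if the context outgrows the budget. */
typedef char demo_ctx_fits_budget[
    (sizeof(demo_ctx_t) <= DEMO_RAM_BUDGET_BYTES) ? 1 : -1];

/* Lives in static storage (.bss). No malloc or free anywhere. */
static demo_ctx_t demo_ctx;

/* Initialise caller-owned storage in place.
 * Returns 0 on success, -1 if ctx is NULL. */
static int demo_init(demo_ctx_t *ctx)
{
    if (ctx == NULL) {
        return -1;
    }
    memset(ctx, 0, sizeof *ctx);
    ctx->ready = 1u;
    return 0;
}

/* RAM cost of one context, known at compile time. */
static size_t demo_ctx_size(void)
{
    return sizeof(demo_ctx_t);
}
```

Because the caller owns the storage and its size is fixed at build time, the memory cost shows up in the linker map rather than as a runtime failure, and there is no heap to fragment.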

Build configuration and example code

The SDK includes a build configuration appropriate for the target architecture, alongside example code demonstrating initialisation, feature ingestion, and inference call structure. The examples are written to be adapted rather than copied wholesale — they show the integration pattern clearly without making assumptions about your firmware architecture.

Integration documentation accompanies the code, covering the expected input format, output classification schema, platform constraints, and guidance on fitting the inference call into an existing firmware loop. Engineers familiar with embedded ML frameworks will find it considerably more direct than most.

Deployment targets

Supported architectures

ModelMill generates deployment packages for all major 32-bit embedded architectures. The inference engine is the same across targets; only the build configuration and platform-specific optimisations differ.

Arm Cortex-M

The primary target for most embedded AI deployments. Cortex-M0 through M7 are all supported, spanning the range from ultra-low-power IoT sensors to more capable control applications. The LBN's bitwise execution model maps well onto the Cortex-M instruction set — no FPU required, and inference fits comfortably within the cycle budgets of real-time tasks.

Arm Cortex-M deployment details →

RISC-V

RISC-V adoption in embedded systems has accelerated significantly, particularly in cost-sensitive designs and custom silicon. The SDK compiles cleanly for RISC-V RV32 targets, including common SoCs from Espressif and SiFive.

RISC-V deployment details →

ESP32

The ESP32 family is a practical target for connected edge AI: dual-core processing on many parts, integrated wireless, and a development ecosystem (ESP-IDF) engineers already know. The SDK integrates directly with ESP-IDF, with no separate build system required.

ESP32 deployment details →

x86

For edge servers, industrial gateways, and development workflows, the SDK also targets x86. This is useful both for rapid prototyping on a workstation before deploying to an MCU, and for deployments where the target is a compact industrial PC rather than a microcontroller.

x86 deployment details →

Integration methods

Integration patterns

The SDK fits into existing firmware — it does not restructure it. The inference call is a function: feed it preprocessed feature data, receive a classification result. There is no inference server to stand up, no runtime to initialise before the application boots, and no background thread required.

RTOS environments

In FreeRTOS, Zephyr, or similar environments, the inference call can be placed in an existing task without modification. Because memory allocation is static and inference time is bounded and predictable, there are no scheduling surprises. Typical inference latency on Cortex-M4 hardware is in the low-millisecond range — well within the cycle budget of most sensor fusion tasks.

Bare-metal deployments

For bare-metal firmware, the pattern is equally direct. After hardware initialisation, the inference engine is initialised once, and inference runs from the main loop or an interrupt handler. The absence of dynamic allocation means the SDK is safe to call from interrupt context on architectures where that is permitted.
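One common way to arrange the bare-metal pattern is to keep the interrupt handler short and run inference from the main loop when a flag is set. The sketch below illustrates that arrangement. `demo_infer()` is a stand-in with made-up logic rather than the SDK's inference call, and all `demo_` names, the window length, and the threshold are assumptions.

```c
#include <stdint.h>

#define DEMO_FEATURE_COUNT 4

/* Shared between the interrupt handler and the main loop, hence volatile. */
static volatile uint8_t demo_sample_ready;
static volatile int32_t demo_latest_sample;

static int32_t demo_features[DEMO_FEATURE_COUNT];
static int     demo_last_class = -1;

/* Stand-in for the inference call: class 1 when the window sum exceeds
 * an arbitrary threshold. A real build would call the SDK here. */
static int demo_infer(const int32_t *f, int n)
{
    int32_t sum = 0;
    int i;
    for (i = 0; i < n; ++i) {
        sum += f[i];
    }
    return (sum > 100) ? 1 : 0;
}

/* On real hardware this would be wired to a timer or ADC interrupt.
 * It only stores the sample and raises a flag. */
static void demo_sample_isr(int32_t raw)
{
    demo_latest_sample = raw;
    demo_sample_ready = 1u;
}

/* One pass of the main loop. Returns 1 if inference ran, 0 otherwise.
 * A real port would mask the interrupt while copying the sample. */
static int demo_main_loop_step(void)
{
    int i;

    if (!demo_sample_ready) {
        return 0;
    }
    demo_sample_ready = 0u;

    /* Slide the feature window and append the newest sample. */
    for (i = 0; i < DEMO_FEATURE_COUNT - 1; ++i) {
        demo_features[i] = demo_features[i + 1];
    }
    demo_features[DEMO_FEATURE_COUNT - 1] = demo_latest_sample;

    demo_last_class = demo_infer(demo_features, DEMO_FEATURE_COUNT);
    return 1;
}
```

Running inference in the handler itself is the other option the paragraph mentions. The flag-based variant keeps interrupt latency independent of model size, at the cost of one loop iteration of delay.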

Code example

The essential pattern — initialise once, feed features, read result:

#include <stdint.h>
#include "lbn_inference.h"

/* Engine context: static storage, initialised once */
static lbn_ctx_t ctx;

/* Call once at startup (app_init is an example name) */
void app_init(void)
{
    lbn_init(&ctx, &model_params);
}

/* Call from your sensor processing loop (example name) */
void app_process_sample(void)
{
    int32_t features[FEATURE_COUNT];
    lbn_result_t result;

    collect_features(features);           /* your sensor/signal processing */
    lbn_infer(&ctx, features, &result);   /* classify */

    if (result.class_id == CLASS_ANOMALY) {
        trigger_alert();
    }
}

The full SDK documentation covers the complete API surface, including batch inference, confidence thresholds, and output interpretation.

Benchmark results

Performance characteristics

The case for the SDK rests on measured performance. On Arm Cortex-M4 hardware (an STM32-class chip costing under $5), LBN inference benchmarked on MLPerf Tiny delivers:

54× faster inference than an equivalent neural network on the same chip

52× less energy per inference

Kilobytes of memory footprint, not megabytes, fitting within the SRAM available on mid-range MCUs

Full benchmark methodology and results are in the benchmarks section.

The practical consequence is not merely that inference is faster. It is that the class of hardware required to run production-grade AI shrinks substantially. Deployments that would otherwise require an NPU-equipped SoC or a GPU-backed cloud API can run on commodity MCU hardware at commodity MCU costs, with no cloud dependency at any point in the inference path.

FAQs

Frequently asked questions

What language is the SDK?

The SDK is written in C: pure C99 with no external library dependencies, compatible with any embedded toolchain that supports C99. It integrates naturally with C++ codebases without modification to the headers.
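C headers usually become usable from C++ without edits by wrapping their declarations in a linkage guard. The sketch below shows that convention on a hypothetical header. The source does not show the SDK's headers, so `demo_api_result_t` and `demo_api_classify()` are invented for illustration and the guard is assumed, not confirmed.

```c
/* ---- demo_api.h (hypothetical header) ---- */
#ifndef DEMO_API_H
#define DEMO_API_H

#include <stdint.h>

#ifdef __cplusplus
extern "C" {            /* C++ callers get C linkage, no name mangling */
#endif

typedef struct {
    int32_t class_id;
} demo_api_result_t;

/* Returns 0 on success, -1 on invalid arguments. */
int demo_api_classify(const int32_t *features, int count,
                      demo_api_result_t *out);

#ifdef __cplusplus
}
#endif

#endif /* DEMO_API_H */

/* ---- demo_api.c (placeholder implementation) ---- */
int demo_api_classify(const int32_t *features, int count,
                      demo_api_result_t *out)
{
    if (features == 0 || out == 0 || count <= 0) {
        return -1;
    }
    /* Placeholder rule: class 1 when the first feature is positive. */
    out->class_id = (features[0] > 0) ? 1 : 0;
    return 0;
}
```

With the guard in place, a C++ translation unit includes the same header and links against the C object file unchanged.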

Can I deploy to multiple targets from one model?

Yes. A model trained in ModelMill can be deployed to multiple target architectures. The trained LBN is architecture-agnostic; what changes per target is the build configuration. If your product ships on more than one hardware variant, one training run covers all targets.

Does deployment require internet access?

No. The SDK is self-contained. Once ModelMill has generated the deployment package, the model runs entirely on-device with no network dependency. This is a deliberate architectural property — not an afterthought — and it matters for deployments in environments with intermittent connectivity, and for applications where privacy makes cloud inference unsuitable.

Ready to deploy Logic-Based AI?

Get your trained Logic-Based Network in production on any 32-bit processor. No GPU, no cloud dependency, no runtime overhead.

Book a deployment demo