Deploy to any 32-bit processor.
When ModelMill finishes training your Logic-Based Network, it does not hand you a weight file and leave the integration to you. It generates a complete deployment package: trained model, inference engine, build configuration, and integration documentation — everything needed to put your model into production on the target hardware of your choice.
No GPU required. No cloud dependency. No runtime overhead from a framework designed for something else.
What the SDK contains
ModelMill's edge AI SDK is a self-contained deployment package produced at the end of the five-step ModelMill workflow. It comprises four components.
The Logic-Based Network
The trained LBN is represented as a set of propositional logic clauses — compact, deterministic, and directly executable. Unlike a neural network weight file, the LBN does not require floating-point matrix operations to run. It operates through bitwise logic on integer data: fast on any processor that can handle 32-bit operations, and naturally suited to the constrained environments where edge AI actually lives.
The LBN is generated specifically for your training data and your target classification task. It is not a general-purpose model compressed down to fit; it is a model built from the ground up for the problem.
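To make the idea of running propositional clauses with bitwise integer operations concrete, here is a minimal sketch. The clause layout (`clause_t`), the one-bit-per-feature packing, and the vote-counting rule are all assumptions chosen for illustration; they are not ModelMill's actual model format or inference algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CLASSES 2u

/* Hypothetical clause layout (illustration only, not ModelMill's format).
 * Up to 32 Boolean features are packed one per bit into a uint32_t. */
typedef struct {
    uint32_t must_be_set;   /* features used as positive literals */
    uint32_t must_be_clear; /* features used as negated literals  */
    uint8_t  class_id;      /* class this clause votes for        */
} clause_t;

/* A clause is an AND of literals: it fires only if every positive
 * literal is 1 and every negated literal is 0. Two masks, no floats. */
int clause_fires(const clause_t *c, uint32_t x)
{
    return ((x & c->must_be_set) == c->must_be_set) &&
           ((x & c->must_be_clear) == 0u);
}

/* Count firing clauses per class and return the class with the most
 * votes (ties resolve to the lowest class id). */
uint8_t classify(const clause_t *clauses, size_t n, uint32_t x)
{
    uint16_t votes[NUM_CLASSES] = {0};
    uint8_t best = 0;

    assert(clauses != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(clauses[i].class_id < NUM_CLASSES);
        if (clause_fires(&clauses[i], x)) {
            votes[clauses[i].class_id]++;
        }
    }
    for (uint8_t k = 1; k < NUM_CLASSES; k++) {
        if (votes[k] > votes[best]) {
            best = k;
        }
    }
    return best;
}

/* Toy model over features f0..f4 (bit i holds feature fi). */
static const clause_t demo_clauses[] = {
    { 0x03u, 0x08u, 1 }, /* f0 AND f1 AND NOT f3 -> class 1 */
    { 0x14u, 0x00u, 1 }, /* f2 AND f4            -> class 1 */
    { 0x00u, 0x05u, 0 }, /* NOT f0 AND NOT f2    -> class 0 */
};
```

In this sketch each clause costs two AND operations and two comparisons on a 32-bit word, which is the kind of work any 32-bit core handles without an FPU.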
The inference engine
The inference engine is the runtime that executes the LBN on the target hardware. Written in C, it has no external dependencies and compiles cleanly against standard embedded toolchains, including Arm GCC, RISC-V GCC, and Espressif's ESP-IDF. The engine is optimised for the bitwise operations the LBN uses, which accounts for its inference speed on modest hardware.
Memory allocation is static. There is no dynamic allocation at inference time — which matters in RTOS environments and bare-metal deployments where heap fragmentation is a genuine engineering concern.
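The static-allocation pattern described here can be sketched as follows. This is a generic illustration, not the SDK's internals: `engine_ctx_t`, `engine_init`, `engine_load`, and `MAX_FEATURES` are hypothetical names, and the real engine may organise its memory differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FEATURES 16u /* hypothetical compile-time limit */

/* All working memory lives inside one fixed-size struct. */
typedef struct {
    int32_t  features[MAX_FEATURES];
    uint32_t feature_count;
} engine_ctx_t;

/* File-scope storage: sized at link time, never allocated or freed. */
static engine_ctx_t g_ctx;

/* Returns 0 on success, -1 if the request does not fit the buffer. */
int engine_init(engine_ctx_t *ctx, uint32_t feature_count)
{
    if (ctx == NULL || feature_count == 0u || feature_count > MAX_FEATURES) {
        return -1;
    }
    memset(ctx, 0, sizeof *ctx);
    ctx->feature_count = feature_count;
    return 0;
}

/* Copies one sample into the preallocated buffer; the heap is never used. */
int engine_load(engine_ctx_t *ctx, const int32_t *sample, uint32_t n)
{
    if (ctx == NULL || sample == NULL || n != ctx->feature_count) {
        return -1;
    }
    assert(ctx->feature_count <= MAX_FEATURES); /* invariant set by engine_init */
    memcpy(ctx->features, sample, n * sizeof sample[0]);
    return 0;
}
```

Because every buffer has a size known at build time, the linker map shows the full RAM cost up front, and a request that does not fit is rejected at initialisation rather than failing later.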
Build configuration and example code
The SDK includes a build configuration appropriate for the target architecture, alongside example code demonstrating initialisation, feature ingestion, and inference call structure. The examples are written to be adapted rather than copied wholesale — they show the integration pattern clearly without making assumptions about your firmware architecture.
Integration documentation accompanies the code, covering the expected input format, output classification schema, platform constraints, and guidance on fitting the inference call into an existing firmware loop. Engineers familiar with embedded ML frameworks will find it considerably more direct than most framework documentation.
Supported architectures
ModelMill generates deployment packages for all major 32-bit embedded architectures. The inference engine is the same across targets; only the build configuration and platform-specific optimisations differ.
Arm Cortex-M
The primary target for most embedded AI deployments. Cortex-M0 through M7 are all supported, spanning the range from ultra-low-power IoT sensors to more capable control applications. The LBN's bitwise execution model maps well onto the Cortex-M instruction set — no FPU required, and inference fits comfortably within the cycle budgets of real-time tasks.
RISC-V
RISC-V adoption in embedded systems has accelerated significantly, particularly in cost-sensitive designs and custom silicon. The SDK compiles cleanly for RV32 targets, including common SoCs from the Espressif and SiFive families.
ESP32
The ESP32 family is a practical target for connected edge AI: integrated wireless, dual-core processors on many parts, and a development ecosystem (ESP-IDF) that engineers already know. The SDK integrates directly with ESP-IDF, with no separate build system required.
x86
For edge servers, industrial gateways, and development workflows, the SDK also targets x86. This is useful both for rapid prototyping on a workstation before deploying to an MCU and for deployments where the target is a compact industrial PC rather than a microcontroller.
Integration patterns
The SDK fits into existing firmware — it does not restructure it. The inference call is a function: feed it preprocessed feature data, receive a classification result. There is no inference server to stand up, no runtime to initialise before the application boots, and no background thread required.
RTOS environments
In FreeRTOS, Zephyr, or similar environments, the inference call can be placed in an existing task without modification. Because memory allocation is static and inference time is bounded and predictable, there are no scheduling surprises. Typical inference latency on Cortex-M4 hardware is in the low-millisecond range — well within the cycle budget of most sensor fusion tasks.
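As a rough sketch of what placing the call in an existing task might look like under FreeRTOS: `demo_infer` is a stand-in for the SDK's inference call so that the example builds without the SDK, and `read_sensor`, `process_sample`, the 10 ms period, and the `USE_FREERTOS` build flag are assumptions for illustration. The FreeRTOS calls themselves (`xTaskGetTickCount`, `vTaskDelayUntil`, `pdMS_TO_TICKS`) are the standard kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_COUNT 4
#define CLASS_NORMAL  0
#define CLASS_ANOMALY 1

/* Stand-in for the SDK inference call so the sketch builds without the SDK.
 * In a real project this would be the call declared in lbn_inference.h. */
static int demo_infer(const int32_t *features)
{
    int32_t sum = 0;
    for (int i = 0; i < FEATURE_COUNT; i++) {
        sum += features[i];
    }
    return (sum > 100) ? CLASS_ANOMALY : CLASS_NORMAL;
}

static uint32_t s_alerts; /* incremented instead of raising a real alert */

/* One iteration of the task body: classify a sample, act on the result. */
int process_sample(const int32_t *features)
{
    assert(features != NULL);
    int class_id = demo_infer(features);
    if (class_id == CLASS_ANOMALY) {
        s_alerts++;
    }
    return class_id;
}

#ifdef USE_FREERTOS /* define when building against FreeRTOS */
#include "FreeRTOS.h"
#include "task.h"

extern void read_sensor(int32_t *out); /* hypothetical driver function */

/* Periodic task: wake every 10 ms, read, classify, sleep again. */
static void sensor_task(void *params)
{
    static int32_t features[FEATURE_COUNT]; /* static: not on the task stack */
    TickType_t last_wake = xTaskGetTickCount();

    (void)params;
    for (;;) {
        read_sensor(features);
        (void)process_sample(features);
        vTaskDelayUntil(&last_wake, pdMS_TO_TICKS(10));
    }
}
#endif
```

The task function would be registered with `xTaskCreate` or `xTaskCreateStatic` in the usual way. Keeping the per-sample work in `process_sample` means the same logic can be exercised on a workstation without the RTOS.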
Bare-metal deployments
For bare-metal firmware, the pattern is equally direct. After hardware initialisation, the inference engine is initialised once, and inference runs from the main loop or an interrupt handler. The absence of dynamic allocation means the SDK is safe to call from interrupt context on architectures where that is permitted.
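One common way to arrange the main-loop variant is to let an interrupt handler capture the sample and raise a flag, and to run inference from the loop. The sketch below assumes that arrangement; `demo_infer` is a stand-in for the SDK call, and `on_sample_irq`, `main_loop_step`, and `NO_SAMPLE` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_COUNT 4
#define CLASS_NORMAL  0
#define CLASS_ANOMALY 1
#define NO_SAMPLE     (-1)

/* Stand-in for the SDK inference call so the sketch builds without the SDK. */
static int demo_infer(const int32_t *features)
{
    int32_t sum = 0;

    assert(features != NULL);
    for (int i = 0; i < FEATURE_COUNT; i++) {
        sum += features[i];
    }
    return (sum > 100) ? CLASS_ANOMALY : CLASS_NORMAL;
}

/* Shared between the interrupt and the main loop, hence volatile. */
static volatile int32_t s_sample[FEATURE_COUNT];
static volatile uint8_t s_sample_ready;

/* Body of a (hypothetical) timer or ADC interrupt handler: keep it short,
 * store the reading, and raise a flag. */
void on_sample_irq(const int32_t *raw)
{
    for (int i = 0; i < FEATURE_COUNT; i++) {
        s_sample[i] = raw[i];
    }
    s_sample_ready = 1u;
}

/* One pass of the main loop. Returns the class id, or NO_SAMPLE if the
 * interrupt has not delivered anything new. */
int main_loop_step(void)
{
    int32_t features[FEATURE_COUNT];

    if (!s_sample_ready) {
        return NO_SAMPLE;
    }
    /* On real hardware, mask the interrupt around this copy so a new
     * sample cannot overwrite the buffer halfway through. */
    for (int i = 0; i < FEATURE_COUNT; i++) {
        features[i] = s_sample[i];
    }
    s_sample_ready = 0u;
    return demo_infer(features);
}
```

In firmware, `main_loop_step` would be called from the `for (;;)` loop after hardware initialisation, with the result driving whatever alert or actuation logic the application needs.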
Code example
The essential pattern — initialise once, feed features, read result:
    #include "lbn_inference.h"

    /* Initialise the inference engine (call once at startup) */
    lbn_ctx_t ctx;
    lbn_init(&ctx, &model_params);

    /* In your sensor processing loop */
    int32_t features[FEATURE_COUNT];
    collect_features(features);          /* your sensor/signal processing */

    lbn_result_t result;
    lbn_infer(&ctx, features, &result);  /* classify */

    if (result.class_id == CLASS_ANOMALY) {
        trigger_alert();
    }

The full SDK documentation covers the complete API surface, including batch inference, confidence thresholds, and output interpretation.
Performance characteristics
The case for the SDK rests on measured performance. On Arm Cortex-M4 hardware (an STM32-class chip costing under $5), LBN inference benchmarked on MLPerf Tiny delivers:
54× faster inference than an equivalent neural network on the same chip
52× less energy per inference
Kilobytes of memory footprint, not megabytes, fitting within the SRAM available on mid-range MCUs
Full benchmark methodology and results are in the benchmarks section.
The practical consequence is not merely that inference is faster. It is that the class of hardware required to run production-grade AI shrinks substantially. Deployments that would otherwise require an NPU-equipped SoC or a GPU-backed cloud API can run on commodity MCU hardware at commodity MCU costs, with no cloud dependency at any point in the inference path.
Frequently asked questions
What language is the SDK?
The SDK is written in pure C99 with no external library dependencies, so it is compatible with any embedded toolchain that supports C99. It integrates naturally with C++ codebases without modification to the headers.
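For readers wondering how a C header can be included from C++ unchanged, the usual technique is an `extern "C"` guard. The sketch below shows that convention on a hypothetical header; `demo_classifier.h` and `demo_classify` are invented for illustration, and whether the SDK's headers use exactly this guard is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- demo_classifier.h (hypothetical header, shown inline) ---- */
#ifndef DEMO_CLASSIFIER_H
#define DEMO_CLASSIFIER_H

#ifdef __cplusplus
extern "C" { /* C linkage when a C++ translation unit includes this */
#endif

int demo_classify(const int32_t *features, uint32_t count);

#ifdef __cplusplus
}
#endif

#endif /* DEMO_CLASSIFIER_H */

/* ---- demo_classifier.c: plain C99 implementation ---- */
int demo_classify(const int32_t *features, uint32_t count)
{
    int32_t sum = 0;

    assert(features != NULL);
    for (uint32_t i = 0; i < count; i++) {
        sum += features[i];
    }
    return (sum > 0) ? 1 : 0;
}
```

A C compiler never sees the `extern "C"` lines, while a C++ compiler uses them to skip name mangling, so both link against the same object code.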
Can I deploy to multiple targets from one model?
Yes. A model trained in ModelMill can be deployed to multiple target architectures. The trained LBN is architecture-agnostic; what changes per target is the build configuration. If your product ships on more than one hardware variant, one training run covers all targets.
Does deployment require internet access?
No. The SDK is self-contained. Once ModelMill has generated the deployment package, the model runs entirely on-device with no network dependency. This is a deliberate architectural property — not an afterthought — and it matters for deployments in environments with intermittent connectivity, and for applications where privacy makes cloud inference unsuitable.
Ready to deploy Logic-Based AI?
Get your trained Logic-Based Network in production on any 32-bit processor. No GPU, no cloud dependency, no runtime overhead.