TernML is the first multi-architecture framework for ternary neural networks with {-1,0,+1} weights — 5 architectures (Graph, CNN, Transformer, RNN, ViT), 84% CIFAR-10, C codegen for Cortex-M0+. No DSP. No FPU. Just pure add-and-shift inference on the cheapest microcontrollers.
TernML unifies GraphKAN, CNN, Transformer, RNN, and ViT under a single ternary QAT pipeline with C codegen for Cortex-M0+. On the GraphKAN and CNN benchmarks reported below, the ternary models beat their float baselines; Transformer, LSTM, and ViT results are still pending.
GraphKAN, CNN, Transformer, RNN/LSTM, or Vision Transformer — all use the same TernMLayer with STE-based QAT. Pick any architecture, get ternary weights.
5 architectures • Unified API
4-phase QAT pipeline: float clamp → STE ternarization → hard clamp → finetune. Ternary quantization acts as a regularizer, improving accuracy over the float baseline.
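
To make the STE ternarization phase more concrete, here is a minimal sketch of a single training step in which the forward pass uses ternarized weights while the update is applied to latent float weights. The function names, the threshold rule (0.7 × mean |w|, a heuristic commonly used for ternary weights), the learning rate, and the toy input are assumptions made for this example; the actual `TernMLayer` and the other three phases may work differently.

```python
def ternarize(weights, scale=0.7):
    """Map latent float weights to {-1, 0, +1}.

    Assumed threshold rule: scale * mean(|w|). Not confirmed as TernML's rule.
    """
    threshold = scale * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def ste_step(latent, x, target, lr=0.05):
    """One SGD step on squared error for y = sum(q_i * x_i), q = ternarize(latent).

    Straight-through estimator: the gradient with respect to q is applied
    directly to the latent float weights, as if ternarize() were the identity.
    """
    q = ternarize(latent)
    y = sum(qi * xi for qi, xi in zip(q, x))   # forward pass uses ternary weights
    err = y - target
    grad_q = [2.0 * err * xi for xi in x]      # dL/dq for L = (y - target)^2
    updated = [w - lr * g for w, g in zip(latent, grad_q)]
    # Clamp latent weights to [-1, 1] so they stay within the ternary range.
    clamped = [max(-1.0, min(1.0, w)) for w in updated]
    return clamped, err * err


latent = [0.1, 0.1, 0.1]
latent, loss = ste_step(latent, x=[1.0, 2.0, 3.0], target=-2.0)
print(ternarize(latent), loss)
```

The latent weights stay in float during training; only their ternarized copies would be exported for inference.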
Regularization-by-quantization
Export ternary weights in p/m bit-sliced format (32 trits per 8 bytes). Generate pure C for Cortex-M0+ with no DSP, no FPU, no OS. Bare-metal inference in milliseconds.
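
The p/m bit-sliced format can be pictured as two 32-bit masks per group of 32 trits: one marking the +1 weights and one marking the -1 weights. The sketch below is an illustration under assumed conventions (bit `i` corresponds to trit `i`, both words stored as little-endian `uint32`, helper names `pack_trits` and `ternary_dot` are hypothetical); the layout TernML actually emits may differ.

```python
import struct


def pack_trits(trits):
    """Pack 32 trits from {-1, 0, +1} into 8 bytes: a 'p' word and an 'm' word.

    Assumed layout: bit i of p is set when trit i is +1, bit i of m when
    trit i is -1. This is an illustrative convention, not TernML's spec.
    """
    if len(trits) != 32:
        raise ValueError("expected exactly 32 trits")
    p = m = 0
    for i, t in enumerate(trits):
        if t == 1:
            p |= 1 << i
        elif t == -1:
            m |= 1 << i
        elif t != 0:
            raise ValueError("trits must be -1, 0, or +1")
    return struct.pack("<II", p, m)


def ternary_dot(packed, x):
    """Dot product of 32 packed trits with 32 integer activations, no multiplies."""
    p, m = struct.unpack("<II", packed)
    acc = 0
    for i, xi in enumerate(x):
        if (p >> i) & 1:
            acc += xi      # weight +1: add the activation
        elif (m >> i) & 1:
            acc -= xi      # weight -1: subtract the activation
        # weight 0: skipped entirely
    return acc


trits = [1, 0, -1, 0] * 8
x = list(range(32))
packed = pack_trits(trits)
print(len(packed), ternary_dot(packed, x))
```

Because every weight is -1, 0, or +1, the inner loop needs only additions and subtractions, which is what makes the approach viable on a core without a DSP or FPU.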
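From $0.50 MCU • Codegen
GraphKAN, CNN, Transformer, RNN/LSTM, and Vision Transformer all share the same 4-phase QAT pipeline. The regularization-by-quantization effect shows up in every GraphKAN and CNN result measured so far; Transformer, LSTM, and ViT results are still TBD.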
| Architecture | Dataset | Float accuracy | Ternary accuracy | Model size |
|---|---|---|---|---|
| GraphKAN 256→100→10 | MNIST | 94.77% | 96.15% | 15.4 KB |
| GraphKAN 256→100→10 | Fashion-MNIST | 83.03% | 84.67% | 15.4 KB |
| CNN conv32→64→128→10 | CIFAR-10 | 82.50% | 84.03% | ~100 KB |
| CNN conv32→64→FC128→10 | Fashion-MNIST | 91.57% | 92.02% | 102.8 KB |
| Transformer 2L/4H/128d | CIFAR-10 | TBD | TBD | TBD |
| LSTM 128d | Sequence | TBD | TBD | TBD |
| ViT patch8/128d | CIFAR-10 | TBD | TBD | TBD |
* All architectures use the same 4-phase QAT pipeline with STE. C codegen available for Cortex-M0+ target. ELM mode: frozen random ternary hidden layer + closed-form least squares.
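
The ELM mode mentioned in the note above (a frozen random ternary hidden layer followed by closed-form least squares) can be sketched in a few lines. Everything specific here is an assumption for illustration: the ReLU activation, the small ridge term added for numerical stability, the hidden size, the toy data, and the function names `elm_fit` and `elm_predict`.

```python
import random


def hidden_features(w_hidden, x):
    # Ternary projection followed by ReLU (the activation is an assumption).
    return [max(0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]


def solve(a, b):
    """Solve a @ z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [[aug[i][n + j] / aug[i][i] for j in range(len(b[0]))] for i in range(n)]


def elm_fit(xs, ys, hidden=8, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    # Frozen random ternary hidden weights: drawn once, never trained.
    w_hidden = [[rng.choice((-1, 0, 1)) for _ in xs[0]] for _ in range(hidden)]
    h = [hidden_features(w_hidden, x) for x in xs]
    # Normal equations with a ridge term: (H^T H + ridge * I) B = H^T Y
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(hidden)] for i in range(hidden)]
    hty = [[sum(row[i] * y[k] for row, y in zip(h, ys)) for k in range(len(ys[0]))]
           for i in range(hidden)]
    return w_hidden, solve(hth, hty)


def elm_predict(w_hidden, beta, x):
    h = hidden_features(w_hidden, x)
    return [sum(hi * beta[i][k] for i, hi in enumerate(h)) for k in range(len(beta[0]))]


xs = [[1, 0, 2], [0, 1, 1], [2, 2, 0], [1, 3, 1], [0, 0, 1], [3, 1, 2]]
ys = [[1.0], [0.0], [1.0], [0.0], [0.0], [1.0]]
w_hidden, beta = elm_fit(xs, ys)
print(elm_predict(w_hidden, beta, xs[0]))
```

Only the output weights are fitted, in one linear solve, so no gradient-based training is needed in this mode.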
Head-to-head against industry-standard TinyML frameworks. TernML delivers competitive accuracy at a fraction of the memory footprint.
| Metric | TernML | TFLite Micro (8-bit) | Edge Impulse |
|---|---|---|---|
| MNIST | 96.15% | 96.80% | 96.20% |
| CIFAR-10 (deep CNN) | 84.03% | 47.00% | 45.50% |
| Fashion-MNIST (CNN) | 92.02% | 91.50% | 91.00% |
| Model Size (GraphKAN) | 15 KB | 128 KB | 256+ KB |
| Peak RAM | 4 KB | 32 KB | 64 KB |
| Bits / Parameter | 1.58 | 8.00 | 8.00 |
| Architectures | 5 | CNN only | CNN only |
| C Codegen | Yes | No | No |
| Natural Sparsity | 49% | 0% | 0% |
| DSP Required | No | Yes | Yes |
| FPU Required | No | Optional | Yes |
| Min. MCU Cost | $0.50 | $1.50+ | $3.00+ |
* TFLite Micro and Edge Impulse benchmarks are based on default 8-bit quantized models. TernML supports 5 architectures (Graph, CNN, Transformer, RNN, ViT) with C codegen for Cortex-M0+. TernML accuracy exceeds its own float baseline on the GraphKAN and CNN benchmarks, which the project attributes to the regularization-by-quantization effect.
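
The 1.58 bits per parameter in the table presumably corresponds to log2(3), the information content of one three-valued weight. The short calculation below compares that figure with the storage cost of the p/m bit-sliced format described on this page (32 trits per 8 bytes) and with a denser base-3 packing; the latter is a hypothetical alternative included only for comparison and is not something TernML is stated to use.

```python
import math

info_bits = math.log2(3)   # information content of one trit, about 1.585 bits
pm_bits = 8 * 8 / 32       # p/m bit-sliced storage: 8 bytes per 32 trits
base3_bits = 8 / 5         # hypothetical: 5 trits per byte, since 3**5 = 243 <= 256

for name, bits in [("log2(3)", info_bits),
                   ("p/m bit-sliced", pm_bits),
                   ("5 trits per byte", base3_bits)]:
    print(f"{name}: {bits:.3f} bits/trit, {8 / bits:.2f}x smaller than 8-bit")
```

As stored, the p/m format costs 2 bits per trit, trading a little density for masks that map directly onto add and subtract operations.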
No hardware accelerator, no DSP, no FPU. TernML runs on the cheapest MCUs on the market.
TernML is an independent research project focused on bringing neural network inference to the cheapest microcontrollers.
TernML is a multi-architecture framework for ternary neural networks. It supports GraphKAN, CNN, Transformer, RNN/LSTM, and Vision Transformer under one unified QAT pipeline with C codegen for Cortex-M0+. The core discovery: ternary quantization acts as a regularizer during training, improving generalization over the float baseline on every architecture benchmarked so far (GraphKAN and CNN, by 0.45 to 1.64 percentage points). Transformer, LSTM, and ViT results are still pending.
The result: models that don't just compress well, they generalize better. This is TernML’s core insight. 5 architectures, 95 passing tests, Cortex-M0+ codegen.