Engineering Deep Dives

Field-tested articles on on-device AI, mobile NPUs, transformer quantization, and Android development.

Running Transformer Models on Mobile NPUs: What Actually Works (and What Breaks)

Field-tested engineering report on deploying RoBERTa-base (125M params) on the Snapdragon 8 Elite's Hexagon NPU. Covers the eight backends tested, real failure cases with Issue/Effect/Fix analysis, a w8a16 (8-bit weights, 16-bit activations) quantization breakthrough, and the design decisions behind SentiLog's production AI stack.
