How to run a Deep Neural Network on an AA Battery for a Week
- Kenneth Joel
- 3 days ago
- 4 min read
Updated: 2 days ago
A new hardware platform for deep learning on the extreme edge.
When we talk about deep learning and neural networks, we often think of Google Colab or the latest NVIDIA GPUs that can double up as space heaters. But unlike training, deep learning inference can happen on more humble platforms, some humble enough to be considered ‘edge compute’.

No, we aren’t here to talk about the Google Coral Edge TPU Board or the NVIDIA Jetson Nano Dev board. Consuming power in the range of 5–10 watts, they are still power guzzlers compared to our topic of discussion. But if you still want to read about them you can check out this detailed comparison by Manu.
We’re talking about a true extreme edge AI compute platform that can do keyword detection and FaceID, all while sipping power in the milliwatt range. The closest competitor would be a low-power ARM Cortex-M4 or M7, which would be orders of magnitude slower and hence orders of magnitude more energy hungry for a given task. We’re talking about the MAX7800x family of AI microcontrollers, whose name is almost as cool as its core temperature when running a deep neural network.
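To put "an AA battery for a week" in numbers, here is a rough back-of-the-envelope sketch. The battery figures are assumptions, not from the datasheet of any particular cell: a typical alkaline AA is taken to hold about 2500 mAh at a nominal 1.5 V, and all of that energy is assumed to be usable. The 5 W figure is the lower end of the range quoted above for the larger edge boards.

```python
# Rough energy budget for "one AA battery for a week".
# ASSUMPTIONS (not from the article): a typical alkaline AA cell holds
# about 2500 mAh at a nominal 1.5 V, and all of it is usable.
AA_CAPACITY_MAH = 2500
AA_NOMINAL_VOLTS = 1.5
HOURS_PER_WEEK = 7 * 24


def battery_energy_wh(capacity_mah: float, volts: float) -> float:
    """Stored energy in watt-hours."""
    return capacity_mah / 1000 * volts


def average_power_budget_mw(energy_wh: float, hours: float) -> float:
    """Average power draw (in mW) that drains the battery in `hours`."""
    return energy_wh / hours * 1000


def runtime_hours(energy_wh: float, power_w: float) -> float:
    """How long the battery lasts at a constant power draw."""
    return energy_wh / power_w


energy = battery_energy_wh(AA_CAPACITY_MAH, AA_NOMINAL_VOLTS)
budget = average_power_budget_mw(energy, HOURS_PER_WEEK)
print(f"AA energy: {energy:.2f} Wh")
print(f"Average budget for one week: {budget:.1f} mW")
print(f"Runtime of a 5 W board: {runtime_hours(energy, 5.0) * 60:.0f} minutes")
```

Under these assumptions the whole system has to average a little over 20 mW to last a week, while a 5 W board would drain the same cell in under an hour.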

Let’s talk about the secret sauce in the MAX78000. Apart from the ARM Cortex-M4 primary MCU and the RISC-V based Smart DMA, it has a dedicated CNN accelerator that comprises 64 parallel processors. Yes, 64. It also has a 432kB SRAM-based weight memory that holds up to 3.5 million weights (assuming 1-bit weights), and because it is SRAM, the weights can be changed even after deployment. This accelerator can be used to implement a neural network of up to 64 layers (with pooling every alternate layer) or 32 layers without pooling, with a maximum of 1024 inputs or outputs per layer.
This means that instead of forward propagation running serially in triple nested matrix multiplication loops, as it does on classic hardware, it can run with a much higher degree of parallelism on the CNN accelerator. Additionally, while it is primarily a CNN accelerator, it can also be used to implement traditional neural networks as well as RNNs.
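The "triple nested loop" can be sketched in plain Python for a single fully connected layer. This is only an illustration of why the work parallelises; it does not model the MAX78000's actual dataflow, and the function names are made up for this example. The second version computes the same result but isolates the work for each output, which is the kind of independent task that parallel hardware can run side by side.

```python
def forward_serial(inputs, weights):
    """Dense layer forward pass as the classic triple nested loop.

    inputs:  batch x n_in  (list of lists)
    weights: n_in x n_out  (list of lists)
    """
    batch, n_in, n_out = len(inputs), len(weights), len(weights[0])
    out = [[0.0] * n_out for _ in range(batch)]
    for b in range(batch):          # every sample
        for o in range(n_out):      # every output
            for i in range(n_in):   # every input feeding that output
                out[b][o] += inputs[b][i] * weights[i][o]
    return out


def one_output(sample, weights, o):
    """Work for a single output; it does not depend on any other output."""
    return sum(x * w_row[o] for x, w_row in zip(sample, weights))


def forward_by_output(inputs, weights):
    """Same result, restructured so each output is a separate task."""
    n_out = len(weights[0])
    return [[one_output(s, weights, o) for o in range(n_out)] for s in inputs]


x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0],
     [3.0, 1.0, 0.0]]
print(forward_serial(x, w))
print(forward_by_output(x, w))
```

On a single CPU core both versions take the same number of multiply-accumulate steps; the point is that the per-output tasks share no intermediate state, so they can be spread across many processing elements.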
A deep learning engineer may describe a 64-layer neural network with 0.4 to 3.5 million parameters as tiny compared to other modern networks, but a lot can still be done within these constraints. It will be interesting to see how the deep learning community innovates with constrained hyper-parameters.
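The 0.4 to 3.5 million range follows from dividing the weight memory by the width of each weight. The sketch below works that out under two assumptions that are not stated in the article: that "kB" here means 1024 bytes, and that weights can be stored at 1, 2, 4 or 8 bits each. It also ignores biases and any packing overhead.

```python
WEIGHT_MEMORY_KB = 432   # figure quoted in the article
BYTES_PER_KB = 1024      # ASSUMPTION: kB is read as 1024 bytes


def max_weights(bits_per_weight: int) -> int:
    """Upper bound on how many weights of a given width fit in memory."""
    total_bits = WEIGHT_MEMORY_KB * BYTES_PER_KB * 8
    return total_bits // bits_per_weight


# ASSUMPTION: 1-, 2-, 4- and 8-bit weights are the available widths.
for bits in (1, 2, 4, 8):
    print(f"{bits}-bit weights: {max_weights(bits) / 1e6:.2f} million")
```

With 1-bit weights this gives roughly 3.5 million parameters, and with 8-bit weights roughly 0.4 million, matching the two ends of the range above.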
The recently launched MAX78002 takes this a step further by almost doubling most of the specs for the CNN accelerator. We can only expect continued improvements and more powerful accelerators in the near future.
When to use an ‘AI Micro’
Deterministic, low-latency performance is a priority: A completely offline implementation is not only faster than one requiring a network connection but also more deterministic, as it depends only on the hardware performance, which is fixed, rather than on network speeds, which may vary with bandwidth or fail altogether.
Privacy and data security are critical: How would you feel if audio and video recordings of your home were being live streamed to the cloud? Your data is safest when it never leaves the source.
‘Goldilocks’ problem statement: The problem is complex enough to require a neural network but simple enough that it can be solved with a relatively small one.
Here are some examples of use cases that meet the above criteria, broadly divided into three categories:
Speech and Other Audio
Key Word / Safe Word detection for different domains. CNNs that can recognise up to 20 unique words have been built using the MAX78000.
Audio-based fault detection in industrial applications. The unique sound signature of a machine in operation can convey a lot about its health and prove useful for predictive maintenance, particularly in remote locations like offshore oil rigs.
Computer Vision
Remote Vision: Computer vision in cases where a network may be unavailable, for example in wildlife conservation, and animal tracking and surveillance in particular. A ‘Camera Trap’ that locks only when a particular animal is detected would be an interesting solution that can be built with the MAX78000.
Manufacturing: Solving a very specific fault detection problem where downtime due to network issues cannot be tolerated.
‘Visual Wake Words’: Human presence detection is already being implemented on some less capable micros. This can be extended to include gaze detection, which could be particularly useful in home automation.
FaceID: A network can be trained to recognise up to 20 faces for completely offline FaceID systems using the MAX78000.
Other Sensors
Inertial Sensors: Real-time activity recognition and pose detection problems are getting increasingly complex, requiring more powerful techniques to provide insights. This LSTM and CNN based IMU sensor fusion algorithm is one such example.
Biomedical Sensors: An offline neural network can be employed to detect atrial fibrillation or other anomalies in an ECG signal from a wearable ECG patch or Holter monitor. It can also be hooked up to a multi-parameter patient monitor and combine data from multiple sources for early insights into patient deterioration.
Getting started with the MAX7800x family has been simplified with the help of this GitHub repository from Analog Devices, which contains the SDK for the MCU as well as all the necessary tools for training and synthesis of your own custom models.
The MAX78000 is truly a unique microcontroller and is set to revolutionise deep learning as we know it. We can expect even more ultra-low-power deep learning technology, as ARM has announced that ARMv9 will prioritise DSP and ML hardware accelerators. The new ARM Cortex-M55 paired with the Ethos-U55 Neural Processing Unit is expected to give a 480x improvement in ML performance over existing Cortex-M based systems.
Deep learning was already an exciting field. With these latest hardware innovations, we can expect it to scale even greater heights with more widespread application. Stay tuned for more musings on EdgeAI and Deep Learning!
References
YouTube. (2021, November 11). Maxim Integrated MAX78000 AI MCU w/ Neural Network Accelerator | Featured Product Spotlight. YouTube. https://www.youtube.com/watch?v=K-J-HAh4I5Q
Bush, S. (2021, July 21). Ai Camera Reference Design runs from a battery and includes AI hearing. Electronics Weekly. https://www.electronicsweekly.com/news/products/bus-systems-sbcs/ai-camera-reference-design-runs-battery-includes-ai-hearing-2021-07/
Was 2020 the year of edge ai compute? - news. (n.d.). https://www.allaboutcircuits.com/news/2020-year-of-edge-ai-compute/
ARM’s solution to the future needs of AI, security and specialized computing is V9. Arm Newsroom. (2024, April 8). https://newsroom.arm.com/news/arms-solution-to-the-future-needs-of-ai-security-and-specialized-computing-is-v9
Arm ethos-U55 - net. (n.d.-a). https://armkeil.blob.core.windows.net/developer/Files/pdf/product-brief/arm-ethos-u55-product-brief.pdf