Hacker News
Show HN: Llama Running on a Microcontroller (github.com/maxbbraun)
49 points by maxbbraun on Nov 15, 2023 | 5 comments


I was wondering if it's possible to fit a non-trivial language model on a microcontroller. Turns out the answer is some version of yes!

This project is using the Coral Dev Board Micro with its FreeRTOS toolchain. The board has a number of neat hardware features not currently being used here (notably a TPU, sensors, and a second CPU core). It does, however, also have 64MB of RAM. That's tiny for LLMs, which are typically measured in the GBs, but comparatively huge for a microcontroller.

The LLM implementation itself is an adaptation of llama2.c and the tinyllamas checkpoints trained on the TinyStories dataset. The quality of the smaller model versions isn't ideal, but good enough to generate somewhat coherent (and occasionally weird) stories.


The "microcontroller" is a Coral AI accelerator.


Just to clarify: Inference is happening on the Arm Cortex-M7. The Coral TPU chip is off in this implementation.


Thank You.


epic!



