From Porting Models to Custom Silicon to a Tiny LLM That Writes Music: The Story of BookMusic
I started by researching how to port Google’s Magenta RealTime to AWS Inferentia2. I ended up with a small local model writing live-coding patterns that play in the browser while you read a book. This post covers how the idea evolved, why the idea itself was the hard part, and why prompt engineering is still very much a thing.
