How about putting a transformer model on a stock C=64?
A proper decoder-only transformer running on a stock C64. Attention, RMSNorm, feed-forward, residuals, the works. Two layers, four heads, about 25,000 parameters. All int8.
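For a feel of what "all int8" can mean in practice, here is a minimal sketch of an int8 matrix-vector product with a wider accumulator, scaled back down by a right shift and saturated to the int8 range. This is an assumption about one common fixed-point approach, not the actual C64 code: the names `clamp_int8`, `matvec_int8`, and the `shift` parameter are hypothetical and chosen only for illustration.

```python
def clamp_int8(value):
    """Saturate an integer to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, value))


def matvec_int8(weights, x, shift):
    """Multiply an int8 matrix by an int8 vector.

    weights: list of rows, each a list of ints in [-128, 127]
    x:       list of ints in [-128, 127]
    shift:   hypothetical rescale factor; the wide accumulator is
             divided by 2**shift (arithmetic right shift) before clamping
    """
    out = []
    for row in weights:
        acc = 0  # wider than 8 bits, so products can add up without wrapping
        for w, v in zip(row, x):
            acc += w * v
        out.append(clamp_int8(acc >> shift))
    return out


# Weights of 64 with shift=6 act like a scale of 1.0 (64 / 2**6).
print(matvec_int8([[64, 0], [0, -64]], [100, 50], 6))  # [100, -50]
```

On an 8-bit CPU with no hardware multiply, the inner loop is where the time would go, which is why a small model and a shift-based rescale (rather than a division) are attractive in a sketch like this.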
Going for that classic early 90s Amiga FMV look. And for the first time, I attempt some rudimentary lipsync!