RWKV
RNN-meets-transformer language-model architecture with linear attention: it trains in parallel like a transformer but runs inference as an RNN over a fixed-size state, so per-step memory stays constant as context grows. A distinctive open line for long-context and embedded inference.
Why it is included
Non-transformer open architecture with active community ports (CUDA, Metal, Web).
Best for
Experimenters wanting recurrent-style LLMs without full KV cache growth.
Strengths
- Linear time
- Embedded friendly
- Multiple runtimes
Limitations
- Smaller tooling ecosystem than Llama-class models
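The constant-state recurrence behind "no KV cache growth" can be illustrated with a toy loop. This is a hedged sketch of the general idea (an exponentially decayed key/value accumulator), not the actual RWKV WKV kernel; the function and variable names are hypothetical.

```python
import numpy as np

def recurrent_step(state, k, v, decay):
    """One token step: fold the new (k, v) pair into a fixed-size running state.

    state is a (numerator, denominator) pair; unlike a transformer KV cache,
    its size does not depend on how many tokens have been processed.
    """
    num, den = state
    num = decay * num + np.exp(k) * v   # decayed weighted-value accumulator
    den = decay * den + np.exp(k)       # decayed weight normalizer
    return (num, den), num / den        # new state, per-token output

d = 8                                    # hypothetical channel width
state = (np.zeros(d), np.full(d, 1e-9))  # fixed-size state
decay = np.full(d, 0.9)                  # per-channel decay (learned in RWKV)

rng = np.random.default_rng(0)
for _ in range(1000):                    # 1000 tokens; state never grows
    k, v = rng.normal(size=d), rng.normal(size=d)
    state, out = recurrent_step(state, k, v, decay)

assert state[0].shape == (d,)            # same shape after 1000 tokens
```

A transformer would have accumulated a 1000-entry KV cache over the same loop; here the whole past is compressed into two length-`d` vectors, which is what makes RWKV attractive for embedded inference.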
Good alternatives
Transformer LMs (Llama-class) · Linear-attention research stacks
Related tools
AI & Machine Learning
llama.cpp
Plain C/C++ inference for LLaMA-class models with broad community backends.
PyTorch
Deep learning framework with strong research-to-production paths.
Meta Llama (open models)
Meta’s Llama family of open **weights** (subject to the Llama license) with reference code, tooling, and downloads via Hugging Face and the meta-llama org.
Mistral AI (open models)
Mistral’s open-weight checkpoints (e.g. 7B era, Mixtral MoE) and Apache-2.0–licensed **code** alongside proprietary flagship lines—verify each checkpoint.
Qwen
Alibaba’s Qwen family (dense and MoE) with strong multilingual and coding variants; weights and code on Hugging Face under stated licenses per release.
DeepSeek
DeepSeek open-weight models (e.g. the V3/R1 lineage) under MIT or custom terms per release—high-capability coding and reasoning checkpoints.
