Debuting the first production-grade Mamba-based model delivering best-in-class quality and performance.
Mamba, the novel Structured State Space model (SSM) architecture, was designed to address the limitations of the traditional Transformer architecture, but it has shortcomings of its own. Jamba offers the best of both worlds.
Jamba scores highest among models of its size class on reasoning-related benchmarks.
Jamba delivers 3x the throughput on long contexts, making it the most efficient model in its size class.
Jamba is the only model of its size that fits a 140K-token context on a single GPU.
Jamba is built on an SSM-Transformer mixture-of-experts (MoE) architecture that interleaves Transformer and SSM layers, combining the benefits of both.
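To make the layer-interleaving idea concrete, here is a minimal PyTorch sketch. It is illustrative only: the layer ratio, dimensions, and the stand-in SSM mixer (a gated causal convolution rather than a real selective-scan Mamba block) are assumptions, and the MoE feed-forward layers are omitted for brevity.

```python
import torch
import torch.nn as nn

class AttentionLayer(nn.Module):
    """Transformer-style self-attention mixer."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return self.norm(x + out)

class SSMLayer(nn.Module):
    """Stand-in for a Mamba/SSM mixer: a gated causal depthwise convolution
    replaces the real selective state-space scan, which is omitted here."""
    def __init__(self, d_model: int, kernel_size: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size - 1, groups=d_model)
        self.gate = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        # (B, T, D) -> (B, D, T) for the convolution, then truncate to length T.
        h = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.norm(x + h * torch.sigmoid(self.gate(x)))

class HybridStack(nn.Module):
    """Interleaves attention and SSM layers; the 1:3 ratio here is
    illustrative, not Jamba's actual configuration."""
    def __init__(self, d_model: int, n_layers: int = 8, attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList([
            AttentionLayer(d_model) if i % attn_every == 0 else SSMLayer(d_model)
            for i in range(n_layers)
        ])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x
```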
Jamba is a base model intended for use as a foundation layer for fine-tuning, training, and developing custom solutions.
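As a rough sketch of that workflow, assuming the checkpoint is published on the Hugging Face Hub (the model identifier below is illustrative; check the official model card for the exact name and library requirements), the base model can be loaded with the transformers library like any other causal LM and used as a starting point for fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative identifier; confirm the exact name on the official model card.
model_id = "ai21labs/Jamba-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The base model is a plain next-token predictor; attach your own
# fine-tuning loop (e.g., Trainer or LoRA adapters) on top of it.
inputs = tokenizer("Jamba interleaves Transformer and SSM layers to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```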