Enhanced reasoning
A dedicated encoder significantly boosts performance on tasks requiring deep context comprehension, such as math reasoning (GSM8K).
Explore T5Gemma
T5Gemma adapts pretrained decoder-only Gemma 2 models into an encoder-decoder architecture. These models are trained with either PrefixLM for strong generative performance or UL2 for high-quality contextual representations.
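The two training objectives can be sketched in plain Python. This is a simplified illustration of the general ideas, not T5Gemma's actual training code: the token lists, span choices, and sentinel names are illustrative assumptions. PrefixLM splits a sequence into a prefix the encoder reads and a continuation the decoder must generate; UL2-style span corruption masks spans of the input with sentinel tokens and asks the decoder to reconstruct them.

```python
def prefix_lm_split(tokens, prefix_len):
    """PrefixLM (sketch): the encoder sees the prefix,
    the decoder learns to generate the remainder."""
    return tokens[:prefix_len], tokens[prefix_len:]

def ul2_span_corruption(tokens, spans):
    """UL2-style span corruption (sketch): each chosen (start, length)
    span is replaced by a sentinel in the encoder input; the decoder
    target lists each sentinel followed by the tokens it replaced."""
    enc_input, dec_target = [], []
    cursor = 0
    for i, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"  # sentinel naming is illustrative
        enc_input.extend(tokens[cursor:start])
        enc_input.append(sentinel)
        dec_target.append(sentinel)
        dec_target.extend(tokens[start:start + length])
        cursor = start + length
    enc_input.extend(tokens[cursor:])
    return enc_input, dec_target

if __name__ == "__main__":
    tokens = ["The", "cat", "sat", "on", "the", "mat"]
    print(prefix_lm_split(tokens, 3))
    print(ul2_span_corruption(tokens, [(1, 2), (4, 1)]))
```

Under either objective the model remains a standard encoder-decoder; only the way inputs and targets are carved out of the pretraining text differs.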
Model adaptation techniques allow for flexible configurations, including "unbalanced" models in which the encoder and decoder differ in size.
Delivers a superior quality-to-efficiency ratio without extensive compute requirements.
Checkpoints based on the official Gemma 2 2B and 9B models, as well as the "unbalanced" 9B-2B checkpoint.
Small, Base, Large, and XL sizes following the T5 configuration, plus an additional model sized between T5 Large and T5 XL.