Ministral 3 Family Uses Cascade Distillation Method
Mistral released the open-weights Ministral 3 family (14B, 8B, and 3B parameters), compressed from a larger model using a new pruning and distillation method called cascade distillation. Despite their smaller size, the vision-language models rival or beat similarly sized competitors on several benchmarks while requiring far less training data and compute. Learn more in The Batch: https://hubs.la/Q042TGnk0
