Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.

Ministral 3 Family Uses Cascade Distillation Method

Mistral released the open-weights Ministral 3 family (14B, 8B, and 3B parameters), compressed from a larger model using a new pruning and distillation method called cascade distillation. Despite their smaller size, the vision-language models rival or beat similarly sized competitors on several benchmarks while requiring far less training data and compute. Learn more in The Batch: https://hubs.la/Q042TGnk0
