Yet another brilliant week for Open Source AI:
- Meta released Segment Anything 2 – Apache 2.0 license, SOTA unified image & video model, capable of segmenting any object across frames!
- Google dropped Gemma 2 2B – a 2.6B parameter model scoring MMLU: 56.1 & MBPP: 36.6 that beats GPT-3.5 / Mixtral 8x7B on the LMSYS Elo rankings – pretty strong model for on-device.
- Along with Gemma 2 2B, Google and DeepMind also shipped ShieldGemma and GemmaScope. ShieldGemma (2B, 9B & 27B) is a set of safety and content moderation models; GemmaScope is a set of 400 Sparse Autoencoders (SAEs) with over 30 million learned features.
- BlackForestLabs open-sourced FLUX Dev & Schnell – a 12B parameter rectified flow transformer model for text-to-image; based on vibes, it beats Midjourney! Schnell is trained with latent adversarial diffusion distillation for faster generation (1-4 steps).
- Stability released Stable Fast 3D – based on TripoSR, it generates a textured, UV-unwrapped 3D mesh asset. Straight from one image to a 3D model!
And much more: Arcee shipped DistillKit, EleutherAI released SAEs for Llama 3.1 8B, and SeaLLMs v3 landed.
Author: Vaibhav (VB) Srivastav