Gen AI News [November 1st, 2024]

Another great week in open ML!
Here’s a small recap by @mervenoyann.

Model releases

⏯️ Video Language Models

@AIatMeta released LongVU, a new family of state-of-the-art long-video language models based on DINOv2, SigLIP, Qwen2 and Llama 3.2

💬 Small language models

@huggingface released SmolLM2, a new family of smol language models of sizes 135M, 360M and 1.7B under the Apache 2.0 license, along with datasets 🤗

Meta released MobileLLM, a new family of on-device LLMs of sizes 125M, 350M and 600M

🖼️ Image Generation

@StabilityAI released Stable Diffusion 3.5 Medium, a 2B model with a commercially permissive license

🖼️💬Any-to-Any

MiniOmni-2, the closest reproduction of GPT-4o, is released! It is a new LLM that takes image, text and audio input and outputs speech

Dataset releases

🖼️ PD12M, a new captioning dataset of 12.4 million examples, with captions generated using Florence-2

Read other news in our Blog.