The holidays don’t seem to be slowing down Open ML/AI:
- In a double surprise, the Qwen team released Qwen2 Math – 1.5B, 7B & 72B models that beat GPT-4o and Claude 3.5 on AIME 24/AMC 23 – with the 1.5B & 7B under Apache 2.0 and the 72B under the Qianwen license (see the quick loading sketch after this list). They also shipped Qwen2 Audio – 8.5B, Apache 2.0 licensed audio language models (Base + Instruct) achieving SoTA on ASR, S2TT & AIR-Bench, trained on ~550K hours of audio.
- Hugging Face dropped Parler TTS – 885M (Mini) & 2.2B (Large) – fully open-source Text-to-Speech models (English only)! 4x faster than v0.1 (thanks to torch.compile) and trained on ~45,000 hours of speech.
- LG put out EXAONE 7.8B, trained on 8T tokens – beats Llama 3.1 8B, Phi-3 & Mistral, bilingual (English & Korean), 72.0 HumanEval, 34.4 MATH, 9.01 MT-Bench (non-commercial license).
- Hugging Face dropped IDEFICS3 Llama 8B – an Apache 2.0 licensed VLM with enhanced Document QA capabilities! Vision backbone: SigLIP; text backbone: Llama 3.1 8B; 10K context; DocVQA 87.7, MMStar 55.9.
- InternLM open-released InternLM 2.5 20B under Apache 2.0 – up to a 1M context window & trained on copious amounts of synthetic data! Competitive with Gemma 2 27B IT; MMLU: 73.5, MATH: 64.7.
- Tsinghua KEG released CogVideoX 2B – an open, OpenAI Sora-style text-to-video model that generates up to 6 seconds of video at 8 frames per second, with pretty decent quality!
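Most of the checkpoints above load with the standard transformers workflow. Here is a minimal sketch for Qwen2 Math 7B Instruct – the `Qwen/Qwen2-Math-7B-Instruct` repo id is assumed, and it presumes `transformers` plus `accelerate` installed for `device_map="auto"`; treat it as a starting point, not an official snippet.

```python
# Minimal sketch: load an instruct checkpoint from the Hub and ask it a math question.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-Math-7B-Instruct"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Solve x^2 - 5x + 6 = 0."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Strip the prompt tokens and decode only the generated answer.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern (swap the repo id) works for the other text models in the list, license permitting.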
And… a lot more happened: PyTorch released FlexAttention (minimal sketch below), aiola dropped Whisper Medusa (150% faster inference), Maxime released a 1-trillion-parameter Llama 3.1 frankenmerge, etc.
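The draw of FlexAttention is the `score_mod` callback: you express attention variants (causal masking, ALiBi, soft-capping, …) in a few lines of Python and `torch.compile` fuses them into a single kernel. A rough sketch, assuming a PyTorch build recent enough to ship `torch.nn.attention.flex_attention` (2.5+/nightly at the time of writing); shapes and the causal example are illustrative.

```python
# Rough sketch of FlexAttention's score_mod hook (needs PyTorch 2.5+ / a recent nightly).
import torch
from torch.nn.attention.flex_attention import flex_attention

B, H, S, D = 2, 8, 128, 64  # batch, heads, sequence length, head dim (toy sizes)
q, k, v = (torch.randn(B, H, S, D) for _ in range(3))

def causal(score, b, h, q_idx, kv_idx):
    # Keep the score for past/current positions, mask out future ones.
    return torch.where(q_idx >= kv_idx, score, float("-inf"))

out = flex_attention(q, k, v, score_mod=causal)   # eager reference path
# fused = torch.compile(flex_attention)           # compiled path emits a fused kernel on GPU
print(out.shape)  # torch.Size([2, 8, 128, 64])
```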
Author: Vaibhav (VB) Srivastav