AI News [12 Aug 2024]

Holidays don't seem to be slowing down open ML/AI:

  1. In a double surprise, the Qwen team released Qwen 2 Math and Qwen 2 Audio. Qwen 2 Math – 1.5B, 7B & 72B models – beats GPT-4o and Claude 3.5 on AIME 24/AMC 23; the 1.5B & 7B are released under Apache 2.0 and the 72B under the Qianwen license. Qwen 2 Audio – 8.5B, Apache 2.0 licensed Audio Language Models (Base + Instruct) – achieves SoTA on ASR, S2TT & AIR-Bench and was trained on ~550K hours of audio.
    • Qwen 2 Math
  2. Hugging Face dropped Parler TTS – 885M (Mini) & 2.2B (Large) – fully open-source Text-to-Speech models (English only)! 4x faster than v0.1 (thanks to torch.compile) and trained on ~45,000 hours of speech.
    • Parler TTS
  3. LG put out EXAONE 7.8B, a bilingual (English & Korean) model trained on 8T tokens that beats Llama 3.1 8B, Phi3 & Mistral: 72.0 HumanEval, 34.4 MATH, 9.01 MT-Bench (non-commercial license).
    • EXAONE 7.8B
  4. Hugging Face dropped IDEFICS3 Llama 8B – an Apache 2.0 licensed VLM with enhanced Document QA capabilities! Vision backbone: SigLIP, Text backbone: Llama 3.1 8B, 10K context, DocVQA 87.7; MMStar 55.9.
    • Idefics3-8B-Llama3
  5. InternLM released InternLM 2.5 20B under the Apache 2.0 license, with up to a 1M context window & trained on copious amounts of synthetic data! Competitive with Gemma 27B IT; MMLU: 73.5, MATH: 64.7.
  6. Tsinghua KEG released CogVideoX 2B – an OpenAI Sora-like text-to-video model that generates up to 6 seconds of video at 8 frames per second, pretty decent quality!

And a lot more happened: PyTorch released FlexAttention, aiOla dropped Whisper Medusa (150% faster inference), Maxime released a 1-trillion-parameter frankenmerge of Llama 3.1, etc.

Author: Vaibhav (VB) Srivastav
