🎙️ Meta's Voicebox is Paused
PLUS: 🖼️SDXL vs Midjourney, 📜AI Compliance & EU Act and more
This is a recap covering the major news from last week.
🔥Meta’s Voicebox Paused, SDXL 0.9 and OpenAI vs EU Act
🗞️10 Interesting reads: GPT-4’s huge size, AI programming and teaching and more.
🧑‍🎓Transformers, RLHF and Interactive Notebooks
🔥Top 3 AI news in the past week
1. Meta's Voicebox: Release Pause
Meta, just like OpenAI, is on a roll. They introduced a generative speech model called Voicebox, which can perform a range of speech-generation tasks it wasn't specifically trained for.
Like generative systems for images and text, it can produce a variety of styles and even modify provided samples. It's multilingual, covering six languages, and can remove noise, edit content, convert styles, and generate diverse samples.
Why is it important? Before Voicebox, each speech AI task required individual training with curated data. This game-changing model learns from raw audio and corresponding transcriptions. In contrast to previous autoregressive audio models, Voicebox can adjust any part of a sample, not merely the tail end.
What’s next? Meta has only “introduced” Voicebox without a proper release. As per Meta, the Voicebox model is ripe for misuse.
Considering last week’s promise of free-to-use LLMs, this seems like a step back. This might be a reaction to the pushback over Llama, or maybe there are profit motives we aren't seeing.
Though there is already a community implementation in progress.
2. SDXL vs. Midjourney: The Imaging Race
Source: Stability Twitter
Stability announced SDXL 0.9, their new text-to-image model. They are now one step closer to a full 1.0 release.
Why is it important? Stable Diffusion is one of the few text-to-image models that can run on a consumer PC, at least one with an Nvidia GeForce RTX 20-series graphics card. This release adds multiple features, like using an image to generate variations, filling in missing parts of an image, and out-painting to extend images.
What’s next? Last week, Midjourney released v5.2, which also has out-painting features and sharper images.
Stability is providing the SDXL 0.9 weights for research purposes, and they will release 1.0 under the CreativeML license. Something to look forward to.
3. EU Act AI Compliance: Navigating the Future
Last week, we talked about the EU's proposed AI legislation. An interesting study by Stanford shows that none of the leading models fully comply.
Source: Stanford
Why is it important? The EU AI Act governs the usage of AI for 450 million people. And EU rules often have a large effect beyond the bloc's borders (see: the Brussels effect).
Additionally, as per Time, Altman and OpenAI had lobbied for not putting GPT-3 models in the high risk category. “By itself, GPT-3 is not a high-risk system. But [it] possesses capabilities that can potentially be employed in high risk use cases.”
While OpenAI has escaped the high-risk category, it is interesting to see the overall compliance with the law. Fines for non-compliance can go up to 4% of revenue.
As per the research, OpenAI scores 25/48, or just above 50%. Anthropic’s Claude sits last with a 7/48 score.
What’s next? As per the researchers, it is feasible for foundational models to comply with the EU AI Act, and policymakers should push for transparency. It remains to be seen how much lobbying and change happens on this law, especially regarding the transparency requirements.
🗞️10 AI news highlights and interesting reads
GPT-4 is just eight GPT-3s in a trenchcoat.
So, it is no wonder that Harvard’s famous computer science course, CS50, will have a chatbot teacher.
Source: Demystifying GPT Self-Repair for Code Generation
What kind of coding is the future? Self-healing code. Though self-repair is only effective with GPT-4, the best pipeline is GPT-3.5 code -> GPT-4 repair -> human feedback. (See below on how RLHF works)
The OpenAI app store might be coming. I guess the idea will be to charge a flat 30% of revenue, like the App Store.
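The self-repair pipeline above (cheap model drafts, stronger model repairs, human reviews) can be sketched as a simple loop. Here `generate_code`, `run_tests`, and `repair_code` are hypothetical stand-ins for calls to GPT-3.5, a test harness, and GPT-4; they are stubbed so the sketch runs on its own.

```python
def generate_code(task: str) -> str:
    # Stand-in for a cheap first draft from a model like GPT-3.5.
    return "def add(a, b):\n    return a - b"  # deliberately buggy draft

def run_tests(code: str) -> bool:
    # Stand-in for executing the candidate against unit tests.
    namespace = {}
    exec(code, namespace)
    return namespace["add"](2, 3) == 5

def repair_code(code: str, feedback: str) -> str:
    # Stand-in for a stronger model (e.g. GPT-4) patching the draft.
    return code.replace("a - b", "a + b")

def self_repair(task: str, max_rounds: int = 3) -> str:
    """Draft, test, and repair until tests pass or rounds run out."""
    code = generate_code(task)
    for _ in range(max_rounds):
        if run_tests(code):
            return code  # passes: hand off to human review
        code = repair_code(code, feedback="failing unit tests")
    return code

fixed = self_repair("write add(a, b)")
```

The human-feedback step sits after the loop: whatever passes the tests still goes to a person before it ships.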
🧑‍🎓3 Learning Resources
The “T” in GPT stands for Transformers. Here’s an Nvidia explainer on Transformers.
GPT-4 is trained using RLHF. Learn how RLHF actually works and why open-source RLHF is difficult.
Interactive workbooks to combine Generative AI models in one document. I find interactive notebooks to be the best way to learn concepts in programming.
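Since the “T” stands for Transformers, here is a minimal sketch of scaled dot-product attention, the core operation inside every Transformer layer. It is pure Python for illustration; real implementations use batched matrix libraries.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q @ K^T / sqrt(d)) @ V."""
    d = len(queries[0])  # dimension of each query/key vector
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# One query attending over two key/value pairs: it matches the first
# key more closely, so the first value dominates the output.
out = attention(queries=[[1.0, 0.0]],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
```

Because the weights come from a softmax, each output row is a convex combination of the value vectors.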
That’s it folks. Thank you for reading and have a great week ahead.