New Pricing, Models, & Functions: OpenAI's New Updates
PLUS: Meta's Free Model, EU's Bold Regulation, AI Ouroboros and more
Happy Monday!
This week we’ve got:
🔥OpenAI’s updates, Meta’s upcoming free LLM and EU Regulation.
🗞️Interesting reads - a PSA about protecting your keys, the GPT ouroboros, Reddit as OpenAI’s moat, and more.
🧑🎓Learning resources include a step-by-step guide from a non-technical founder who launched his MVP, a chatbot for your Gdrive, and more.
Let’s get cracking.
🔥Top 3 AI news in the past week
1. OpenAI: New Pricing, Models, & Functions
OpenAI has been on a roll. Last week we saw the release of OpenAI’s best practices for using GPT. This week we saw some amazing updates, falling into three major buckets:
First, the price decreases for both embeddings and GPT-3.5 tokens.
Second, new versions of the gpt-4 and gpt-3.5-turbo models, including a new longer-context (16k) variant of gpt-3.5-turbo.
Third, a new function calling capability.
Why is it important? Previously, the output from OpenAI was all text, so calling an external API from GPT was quite difficult: you had to parse the text yourself, and the results were often incorrect. LangChain created its Agents and Tools feature to tackle this problem, but it was still unreliable and prone to issues.
Now you get native support for generating output in a fixed format. You can use that output to make function calls, and you can pass in the functions which need to be called. For example, if your app has multiple API endpoints, you can use GPT to generate the API calls with parameters, and pass the endpoints as function definitions to ensure the correct one is executed.
This functionality can further be used to get structured data (JSON) out of GPT, so you can generate data with GPT and load it straight into your backend.
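Here’s a minimal sketch of what this looks like with the openai Python library as of this update (the get_current_weather function and its schema are made up for illustration):

```python
import json
import openai

# Describe an app endpoint as a function the model may choose to call.
# get_current_weather is a hypothetical example, not a real API.
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Paris"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "How hot is it in Paris right now?"}],
    functions=functions,
    function_call="auto",  # let the model decide whether a call is needed
)

message = response["choices"][0]["message"]
if message.get("function_call"):
    # The model returns the function name plus JSON arguments --
    # structured data you can hand straight to your backend.
    name = message["function_call"]["name"]
    args = json.loads(message["function_call"]["arguments"])
    print(name, args)  # e.g. get_current_weather {'city': 'Paris'}
```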
What’s next? This functionality turns natural-language responses into structured data, which can be used to build “intelligent” backends on top of LLMs. We might see no-code tools adopt it to offer more robust natural-language interfaces for non-technical folks.
The structured-data process goes both ways: you can also feed structured data back into GPT for better responses.
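Continuing the sketch above, the executed function’s result can be passed back as a “function”-role message so the model can turn it into a natural-language answer (the weather result is stand-in data):

```python
# Stand-in result; in a real app this comes from your own endpoint.
result = {"city": "Paris", "temperature": 31, "unit": "celsius"}

followup = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[
        {"role": "user", "content": "How hot is it in Paris right now?"},
        message,  # the assistant message containing the function_call
        {"role": "function", "name": name, "content": json.dumps(result)},
    ],
)
print(followup["choices"][0]["message"]["content"])
```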
This feature also has its share of issues. Function calling suffers from the same prompt injection problems as other LLM input: malicious actors can smuggle harmful instructions into function arguments or API responses. For example, a query generated via a function might contain code that deletes data; without proper validation, that code would be executed automatically. So, using an LLM as the back-end layer needs a proper security implementation.
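One hedged illustration of that point: treat every model-generated call as untrusted input. A backend might dispatch only against an explicit allowlist and validate arguments before anything executes (the function names here are hypothetical):

```python
import json

ALLOWED_FUNCTIONS = {"get_current_weather"}  # read-only endpoints only

def dispatch(function_call, registry):
    """Execute a model-generated call only after it passes validation."""
    name = function_call["name"]
    if name not in ALLOWED_FUNCTIONS:
        raise ValueError(f"Non-allowlisted function requested: {name}")
    try:
        args = json.loads(function_call["arguments"])
    except json.JSONDecodeError:
        raise ValueError("Model returned malformed JSON arguments")
    # Check argument types and values before touching the real backend.
    if not isinstance(args.get("city"), str):
        raise ValueError("'city' must be a string")
    return registry[name](**args)
```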
2. Meta's LLM: Commercial Use Ahead
Llama has been a boon for the open-source community, and many open-source models rely on it. The issue is that Llama is licensed for research use only and cannot be used commercially, so no one can use it to build a product.
Meta is now working on the next version of the model, which will be available for commercial use. This is in stark contrast to both OpenAI and Google, who guard their models and make them available only through an API.
Why is it important? Certain industries cannot use LLM APIs because of strict data-privacy requirements. These companies would want to run their own instance of a foundational model.
A commercially available foundational model is also going to help people who want to keep their “API call” costs close to zero.
A commercially available free-for-all model will also help push the open source community further. Just like Llama.
What’s next? Sam Altman has said OpenAI didn’t release GPT-3 as open source because they didn’t think people would be able to run it. Now OpenAI is working on an open-source model, though it is expected to be weaker than GPT-4.
Let the battle of LLMs begin.
3. EU's Proposed Legislation and Its Impact on AI Usage
The EU Parliament voted to move ahead with the EU AI Act. The act aims to ensure consumer protection against the dangers of AI.
Why is it important? OpenAI and Sam Altman want regulations for models. They have proposed an IAEA-style agency to control the proliferation of LLMs. As per OpenAI, all models should be regulated and monitored. The suggestion of license-based regulation has led to significant backlash, with many people calling it “regulatory capture” - an attempt to shut down competing LLMs.
The EU is approaching regulation from a different angle. It doesn’t focus on how models are developed; rather, it focuses on how AI can be used. The act breaks use cases down into four categories - unacceptable (prohibited), high, medium and low risk. For example,
Building pre-crime software to predict crimes? Building a social credit system? Unacceptable (prohibited).
Using tools to influence elections, or recommendation algorithms? High risk (heavily regulated).
Using generative AI tools to create text or images on news sites? Medium risk (the content must be labelled as AI-generated).
AI providers also need to disclose their training source.
To me this sounds like good legislation. What do you guys think?
But, OpenAI has warned that EU regulations might force them to pull out completely.
What’s next? The disclosure requirements might help various publishing companies. AI companies are in talks with media companies about paying for training data. Google has been leading the charge.
Additionally, OpenAI and DeepMind will open their models to the UK government for safety and research purposes.
🗞️10 AI news highlights and interesting reads
PSA: If you are using Repl to write code, you might want to check your OpenAI API keys. If you have left them embedded in your code, people can find and steal them (see the key-handling sketch after this list).
LLMs rely on human annotation or human feedback to learn, and one way to generate human annotations is crowdsourcing. But what if the crowdsourced human annotators themselves use LLMs? Research shows 33-46% of crowd workers used LLMs. So, basically, we go from Human -> AI -> Human -> AI: the AI ouroboros. Researchers also warn that training models on generated data might cause serious issues.
All this talk about moats - Reddit might be OpenAI’s *future* moat. Given the number of complaints about how the Google search experience deteriorated during the Reddit blackout, this might be true?
Doctors are using ChatGPT - but not to diagnose. Rather, to be more empathetic. We discussed this just a month ago. And guess where the data for this study came from? Reddit AskDocs. Moat FTW?!
The Beatles are set to make a comeback…using generative AI.
Large context lengths are important for a better GPT experience. Here’s the secret sauce behind 100k context lengths.
There is a lot of bad AI research out there, some of it bordering on snake oil. Most AI “research” should be double-checked and challenged. A recent paper on Hugging Face claimed that GPT-4 can ace the MIT curriculum; now others are replicating the results and say that GPT-4 can’t beat MIT.
Are we seeing peak AI? Especially when people from DeepMind and Meta are involved? Mistral AI raised $113 million in a seed round with no product. Some might say the funding is for the team, and the team is really solid. The question, though, is whether the valuation is justified when OpenAI and Google already have a head start.
The AI Hype Wall of Shame - a collection of articles which mislead people about AI.
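As promised in the API-key PSA above, here’s a minimal sketch of safer key handling - read the key from an environment variable or a secrets store instead of embedding it in source:

```python
import os
import openai

# Never hardcode the key in source; anyone viewing a public Repl can copy it.
# Store it as a secret (e.g. in Replit's Secrets tab) and read it at runtime.
openai.api_key = os.environ["OPENAI_API_KEY"]
```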
🧑🎓3 Learning Resources
Building and launching a company using GPT-4 with prompts. (The author didn’t know how to code but created and launched the MVP in a month.)
Chatbot for your Gdrive - https://www.haihai.ai/gpt-gdrive/
Building ChatGPT plugin using Supabase - https://supabase.com/blog/building-chatgpt-plugins-template
That’s it folks. Thank you for reading and have a great week ahead.