Episode Details
Back to Episodes
From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team
Description
Congrats to Damien on successfully running AI Engineer London! See our community page and the Latent Space Discord for all upcoming events.
This podcast came together in a far more convoluted way than usual, but happens to result in a tight 2 hours covering the ENTIRE OpenAI product suite across ChatGPT-latest, GPT-4o and the new o1 models, and how they are delivered to AI Engineers in the API via the new Structured Output mode, Assistants API, client SDKs, upcoming Voice Mode API, Finetuning/Vision/Whisper/Batch/Admin/Audit APIs, and everything else you need to know to be up to speed in September 2024.
This podcast has two parts: the first hour is a regular, well edited, podcast on 4o, Structured Outputs, and the rest of the OpenAI API platform. The second was a rushed, noisy, hastily cobbled together recap of the top takeaways from the o1 model release from yesterday and today.
Building AGI with Structured Outputs — Michelle Pokrass of OpenAI API team
Michelle Pokrass built massively scalable platforms at Google, Stripe, Coinbase and Clubhouse, and now leads the API Platform at Open AI. She joins us today to talk about why structured output is such an important modality for AI Engineers that Open AI has now trained and engineered a Structured Output mode with 100% reliable JSON schema adherence.
To understand why this is important, a bit of history is important:
* June 2023 when OpenAI first added a "function calling" capability to GPT-4-0613 and GPT 3.5 Turbo 0613 (our podcast/writeup here)
* November 2023’s OpenAI Dev Day (our podcast/writeup here) where the team shipped JSON Mode, a simpler schema-less JSON output mode that nevertheless became more popular because function calling often failed to match the JSON schema given by developers.
* Meanwhile, in open source, many solutions arose, including
* Instructor (our pod with Jason here)
* LangChain (our pod with Harrison here, and he is returning next as a guest co-host)
* Outlines (Remi Louf’s talk at AI Engineer here)
* Llama.cpp’s constrained grammar sampling using GGML-BNF
* April 2024: OpenAI started implementing constrained sampling with a new `tool_choice: required` parameter in the API
* August 2024: the new Structured Output mode, co-led by Michelle
* Sept 2024: Gemini shipped Structured Outputs as well
We sat down with Michelle to talk through every part of the process, as well as quizzing her for updates on everything else the API team has shipped in the past year, from the Assistants API, to Prompt Caching, GPT4 Vision, Whisper, the upcoming Advanced Voice Mode API, OpenAI Enterprise features, and why every Waterloo