Tiny model, huge benchmarks & Million-token open-source coding model - AI News (Jun 18, 2026)

Published 4 days, 19 hours ago

Description

Please support this podcast by checking out our sponsors:
- Prezi: Create AI presentations fast - https://try.prezi.com/automated_daily
- KrispCall: Agentic Cloud Telephony - https://try.krispcall.com/tad
- Lindy is your ultimate AI assistant that proactively manages your inbox - https://try.lindy.ai/tad

Support The Automated Daily directly:
Buy me a coffee: https://buymeacoffee.com/theautomateddaily

Today's topics:

Tiny model, huge benchmarks - Sina Weibo’s VibeThinker-3B posts standout reasoning scores (AIME 2026) despite only 3B parameters, fueling debate about benchmark validity, post-training, and real-world reliability.

Million-token open-source coding model - Z.ai releases GLM-5.2 under an MIT license, targeting stable 1M-token context for long-horizon coding agents, with new training focused on messy, hours-long engineering workflows.

Agent tooling inside the browser - OpenAI adds Chrome DevTools Protocol support to Codex browser-use, letting agents read console logs, network traffic, and page state—key for debugging web apps with AI assistance.

Voice AI gets truly interactive - OpenAI is reportedly preparing a new bidirectional voice model (GPT-Bidi-1) designed for natural interruptions and real-time conversation, pushing voice toward a primary AI interface.

Anthropic pauses agent billing shift - Anthropic pauses its planned token-based billing shift for the Claude Agent SDK after developer backlash, highlighting rising sensitivity around agent usage costs and pricing models.

Windows local AI on RTX - Microsoft experiments with running Phi Silica locally on Windows using Nvidia RTX GPUs, expanding on-device AI development beyond NPUs while exposing uneven feature tiers across hardware.

NVIDIA Blackwell tops MLPerf - NVIDIA’s Blackwell platform leads MLPerf Training 6.0 with the fastest time-to-train across workloads, influencing data-center buying decisions for frontier-scale AI training.

Android 17 becomes agent-friendly - Google ships Android 17 to Pixel devices and AOSP, expanding AppFunctions for agent-discoverable actions and enforcing adaptive-first UI rules for foldables, tablets, and desktop mode.

Durable streaming to stop re-billing - A proposed ‘durable buffer’ between agents and LLM providers can resume streaming after crashes, preventing duplicate token charges and improving reliability for long-running workflows.

Discipline replaces vibe coding - Charity Majors argues AI makes code cheap, so teams must invest in specs, invariants, tests, observability, and continuous evaluation—turning 2026 into a ‘return to discipline.’

AI trust gap in America - A Pew survey finds Americans are pessimistic about AI’s long-term impact and distrust both government regulation and corporate safety efforts, even as daily chatbot use and AI-generated summaries rise.

Wearables as next AI platform - Qualcomm pitches AI wearables—glasses, pins, earbuds—as the post-smartphone platform, launching Snapdragon Reality Elite to bring more on-device AI to mixed-reality devices.

Language-driven robot world models - Qwen-RobotWorld introduces a language-con
