Episode Details
🎙️ThursdAI - LLM Finetuning deep dive, current top OSS LLMs (Platypus 70B, OrctyPus 13B) authors & what to look forward to
Description
Brief outline for your convenience:
[00:00] Introduction by Alex Volkov
[06:00] Discussing the Platypus models and data curation process by Ariel, Cole and Nataniel
[15:00] Merging Platypus with OpenOrca model by Alignment Lab
* Combining strengths of Platypus and OpenOrca
* Achieving state-of-the-art 13B model
[40:00] Mixture of Experts (MoE) models explanation by Prateek and Far El
[47:00] Ablation studies on different fine-tuning methods by Teknium
Full transcript is available for our paid subscribers 👇 Why don’t you become one?
Here’s a list of folks and models that appear in this episode. Please follow all of them on X:
* ThursdAI cohosts - Alex Volkov, Yam Peleg, Nisten Tajiraj
* Garage-bAInd - Ariel, Cole and Nataniel (platypus-llm.github.io)
* Alignment Lab - Austin, Teknium (Discord server)
* SkunkWorks OS - Far El, Prateek Yadav, Alpay Ariyak (Discord server)
I am recording this on August 18th, which marks the one-month birthday of the Llama 2 release from Meta. It was the first commercially licensed large language model of its size and quality, and we want to thank the great folks at Meta AI: Yann LeCun, BigZuck and the whole FAIR team. Thank you guys. It's been an incredible month since it was released.
We saw a Cambrian explosion of open source communities who make this world better, even since Llama 1. llama.cpp by Georgi Gerganov is such an incredible example of how the open source community comes together: this one guy, over a weekend, took the open source weights and made them run on CPUs, and much, much faster.
Mark Zuckerberg even talked about this, how amazingly the open source community has adopted Llama, and that Meta is now also adopting many of those techniques and developments back to run their own models cheaper and faster. And so it's been exactly one month since Llama 2 was released.
And literally every ThursdAI since then, we have covered a new state-of-the-art open source model, all based on Llama 2, that topped the open source model charts on Hugging Face.
Many of these top models were fine-tuned by Discord organizations of super smart folks who just like to work together in the open and open source their work.
Many of whom are great friends of the pod.
Nous Research, with whom we've had a special