Episode Details

Cloud Intelligence at the speed of 5000 tok/s - with Ce Zhang and Vipul Ved Prakash of Together AI

Published 2 years, 1 month ago

Description

Our first ever demo day aimed for 15-20 people and ended up ballooning to >200 attendees and being covered in the news. We are now running the 2024 edition in SF on Feb 23: Latent Space Final Frontiers, a startup and research competition in “The Autonomous Workforce”, “Beyond Transformers & GPUs”, and “Embodied AI”.

RSVP here! You can find all LS online/IRL events on our new calendar. Super Early Bird tickets have just gone on sale for AI Engineer World’s Fair, June 25-27!

Today we have the honor of hosting two of Together AI’s co-founders: Ce Zhang (CTO) and Vipul Ved Prakash (CEO). This is a rare opportunity to recap the history of the company since our last check-in with Tri Dao (Chief Scientist), review some of their big releases, and do a deep dive into the state of the AI inference market.

Together has emerged as one of the most consequential startups of the new AI summer, last announcing a ~$100m Series A raise in November (at a ~$360-565m valuation).

Note from future: about a week after this pod was published, rumors were confirmed that Salesforce had led another $100m Series B at a $1b valuation.

But there are at least three Togethers - Together the Research Lab, Together the Fine Tuning & Inference platform, and Together the custom models service. As we clarify on the pod, the overarching philosophy of Together is the ability to improve on all these fronts simultaneously by being “full stack”, from the lowest level kernel and systems programming to the highest level mathematical abstractions driving new model architectures and inference algorithms.

Bringing Research and Industry Together

In just one year, Together has been behind some of the most exciting research in AI:

* RedPajama, a fully open source dataset for model pre-training which mirrored the Llama1 recipe. It was followed by RedPajama2, a filtered and de-duplicated dataset of 30T tokens.

* RedPajama-INCITE-3B and 7B, which were SOTA in a few benchmarks at the time of release.

* FlashAttention-2, developed by Together’s Chief Scientist Tri Dao. We covered FA-2 in a previous episode with him.

* Mamba-3B, the most promising transformer-alternative model that they released in collaboration with Cartesia.
