
Yannic Kilcher Videos (Audio Only)

176 episodes - English - Latest episode: 6 months ago - ★★★★★ - 1 rating

I make videos about machine learning research papers, programming, issues of the AI community, and the broader impact of AI in society.

Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Technology

Episodes

Efficient Streaming Language Models with Attention Sinks (Paper Explained)

October 17, 2023 12:07 - 32 minutes - 29.7 MB

#llm #ai #chatgpt How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs retain the efficiency of windowed attention but avoid its drop in performance by using attention sinks - an interesting phenomenon where the token at position 0 acts as an absorber of "extra" attention. OUTLINE: 0:00 - Introduction 1:20 - What is the problem? 10:30 - The hypothesis: Attention Sinks 15:10 - Experimental evidence 18:4...
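
Below is a minimal, illustrative sketch (not the authors' code) of the inference trick described here: a KV cache that always keeps the first few "sink" tokens and otherwise behaves like a sliding window. Class and parameter names are made up for the example.

```python
# Sketch of a StreamingLLM-style KV cache: keep a handful of "attention sink"
# tokens from the start of the sequence plus a sliding window of recent tokens.

class SinkKVCache:
    def __init__(self, num_sinks=4, window=1024):
        self.num_sinks = num_sinks
        self.window = window
        self.keys, self.values = [], []        # one entry per cached token

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)
        # Evict the oldest non-sink token once the window overflows,
        # but never evict the sink tokens at the front.
        if len(self.keys) > self.num_sinks + self.window:
            del self.keys[self.num_sinks]
            del self.values[self.num_sinks]

    def context(self):
        return self.keys, self.values
```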

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution (Paper Explained)

October 17, 2023 12:06 - 46 minutes - 42.8 MB

#ai #promptengineering #evolution Promptbreeder is a self-improving self-referential system for automated prompt engineering. Give it a task description and a dataset, and it will automatically come up with appropriate prompts for the task. This is achieved by an evolutionary algorithm where not only the prompts, but also the mutation-prompts are improved over time in a population-based, diversity-focused approach. OUTLINE: 0:00 - Introduction 2:10 - From manual to automated prompt engine...
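
As a rough picture of the approach (an illustrative toy, not the paper's implementation), the loop below evolves a population of (task-prompt, mutation-prompt) pairs; `llm` and `fitness` are assumed stand-ins for a language-model call and a task-accuracy score.

```python
import random

def evolve_prompts(llm, fitness, population, generations=10):
    """Toy Promptbreeder-style loop over (task_prompt, mutation_prompt) pairs."""
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(p[0]), reverse=True)
        survivors = ranked[: max(1, len(ranked) // 2)]
        children = []
        for task_prompt, mutation_prompt in survivors:
            # The mutation-prompt tells the LLM how to rewrite the task-prompt...
            new_task = llm(f"{mutation_prompt}\nINSTRUCTION: {task_prompt}")
            # ...and is itself occasionally rewritten (the self-referential part).
            new_mutation = (llm(f"Improve this mutation instruction: {mutation_prompt}")
                            if random.random() < 0.3 else mutation_prompt)
            children.append((new_task, new_mutation))
        population = survivors + children
    return max(population, key=lambda p: fitness(p[0]))
```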

Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)

October 05, 2023 14:07 - 28 minutes - 26 MB

#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written in both a parallel and a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising. OUTLINE: 0:00 - Intro 2:40 - The impossible triangle 6:55 - Parallel vs sequential 15:35 - Retention mechanism 21:00 - Chunkwise and multi-scale retention 24:10 - Comparison to other architectures 26:3...
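
The core duality can be shown in a few lines. This is a simplified single-head sketch (no normalization, gating, or multi-scale decay) of how the same retention output can be computed either in parallel over the whole sequence or recurrently with a constant-size state.

```python
import numpy as np

def retention_parallel(Q, K, V, gamma):
    T = Q.shape[0]
    n, m = np.arange(T)[:, None], np.arange(T)[None, :]
    D = np.where(n >= m, float(gamma) ** (n - m), 0.0)   # causal decay mask
    return (Q @ K.T * D) @ V

def retention_recurrent(Q, K, V, gamma):
    S = np.zeros((K.shape[1], V.shape[1]))               # constant-size state
    out = []
    for q, k, v in zip(Q, K, V):
        S = gamma * S + np.outer(k, v)
        out.append(q @ S)
    return np.stack(out)

Q, K, V = (np.random.randn(5, 8) for _ in range(3))
assert np.allclose(retention_parallel(Q, K, V, 0.9),
                   retention_recurrent(Q, K, V, 0.9))
```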

Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

October 05, 2023 14:05 - 53 minutes - 48.6 MB

#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for reusing the same generated data multiple times and thus has an efficiency advantage over online RL techniques like PPO. Paper: https://arxiv.org/abs/2308.08998 Abstract: Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning the...
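
In pseudocode, the Grow/Improve idea looks roughly like the sketch below; `policy.sample`, `policy.finetune`, and `reward` are assumed interfaces, not the paper's actual API.

```python
def rest_step(policy, prompts, reward, thresholds=(0.0, 0.5, 0.8), n_samples=4):
    # Grow: build a dataset once by sampling from the current policy.
    dataset = [(p, y) for p in prompts for y in policy.sample(p, n=n_samples)]
    # Improve: reuse that same data, filtering more aggressively each round
    # and fine-tuning on ever higher-reward subsets.
    for tau in thresholds:
        subset = [(p, y) for p, y in dataset if reward(p, y) >= tau]
        policy.finetune(subset)
    return policy
```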

[ML News] LLaMA2 Released | LLMs for Robots | Multimodality on the Rise

August 28, 2023 13:42 - 44 minutes - 40.4 MB

#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning. References: https://twitter.com/ylecun/status/1681336284453781505 https://ai.meta.com/llama/ https://about.fb.com/news/2023/07/llama-2-statement-of-support/ https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-media-platform-ranking-the-worlds-largest-networking-sites/4/ https://github.com/Alpha-VLLM/LLaMA2-Accessory https://together.ai/blog/llama-2-7b-32k?s=09&utm_source=pocket_s...

How Cyber Criminals Are Using ChatGPT (w/ Sergey Shykevich)

August 28, 2023 13:40 - 29 minutes - 26.7 MB

#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime. https://threatmap.checkpoint.com/ Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me,...

Recipe AI suggests FATAL CHLORINE GAS Recipe

August 28, 2023 13:39 - 7 minutes - 6.5 MB

#llm #safety #gpt4 A prime example of intellectual dishonesty of journalists and AI critics. Article: https://gizmodo.com/paknsave-ai-savey-recipe-bot-chlorine-gas-1850725057 My Recipe AI: https://github.com/yk/recipe-ai Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me,...

DeepFloyd IF - Pixel-Based Text-to-Image Diffusion (w/ Authors)

August 28, 2023 13:38 - 53 minutes - 49 MB

#ai #diffusion #stabilityai An interview with DeepFloyd members Misha Konstantinov and Daria Bakshandaeva on the release of the model IF, an open-source model following Google's implementation of Imagen. References: https://www.deepfloyd.ai/deepfloyd-if https://huggingface.co/DeepFloyd https://twitter.com/_gugutse_ https://twitter.com/_bra_ket Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter....

[ML News] GPT-4 solves MIT Exam with 100% ACCURACY | OpenLLaMA 13B released

August 28, 2023 13:36 - 31 minutes - 28.4 MB

#gpt4 #mit #ai A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper... OUTLINE: 0:00 - ChatGPT gives out Windows 10 keys 0:30 - MIT exam paper 2:50 - Prompt engineering 5:30 - Automatic grading 6:45 - Response by other MIT students 8:30 - Unsolvable questions 10:50 - Duplicates 13:30 - Cascading the heuristics 22:40 - Other problems 29:25 - OpenLLaMA 13B published ...

Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (Explained)

August 28, 2023 13:35 - 35 minutes - 32.7 MB

#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It is a very promising technique with applications potentially beyond watermarking itself. OUTLINE: 0:00 - Introduction & Overview 1:30 - Why Water...
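
Very roughly, the idea can be sketched as follows: a key pattern is imprinted into the Fourier spectrum of the initial noise, and detection later checks that spectrum after inverting the diffusion process back to a latent (the inversion itself is not shown). Shapes, the circular mask, and the threshold below are illustrative choices, not the authors' exact construction.

```python
import numpy as np

def watermark_initial_noise(latent, key_pattern, radius=10):
    """Embed a key into the low frequencies of the initial diffusion noise."""
    spectrum = np.fft.fftshift(np.fft.fft2(latent))
    h, w = latent.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    spectrum[mask] = key_pattern[mask]
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

def detect(recovered_latent, key_pattern, radius=10, threshold=50.0):
    """Check the same frequency region of a latent recovered by inverting diffusion."""
    spectrum = np.fft.fftshift(np.fft.fft2(recovered_latent))
    h, w = recovered_latent.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    return np.abs(spectrum[mask] - key_pattern[mask]).mean() < threshold
```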

RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)

August 28, 2023 13:33 - 1 hour - 57 MB

#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs. Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc OUTLINE: 0:00 - Introduction 1:50 - Fully Connected In-Person Conference in SF June 7th 3:00 - Transformers vs RNNs 8:00 - RWKV: Best of both worlds 12:30 - LSTMs 17:15 - Evolution of RWKV's Linear Attention 30:40 - RWKV's Layer Structure 49:15 - Time-Parallel vs Sequence Mode 53:55 - Experim...
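
To give a flavor of the "linear attention" at RWKV's core, here is a heavily simplified sketch of the WKV recurrence: an exponentially decaying weighted average of past values kept as a constant-size numerator/denominator state. Shapes and the handling of the current-token bonus `u` are simplified relative to the real model.

```python
import numpy as np

def wkv(keys, values, w, u):
    """keys/values: sequences of equal-sized channel vectors; w: decay, u: bonus."""
    num = np.zeros_like(values[0])      # running weighted sum of values
    den = np.zeros_like(keys[0])        # running sum of weights
    outputs = []
    for k, v in zip(keys, values):
        out = (num + np.exp(u + k) * v) / (den + np.exp(u + k))
        outputs.append(out)
        num = np.exp(-w) * num + np.exp(k) * v   # decay old information, add new
        den = np.exp(-w) * den + np.exp(k)
    return np.stack(outputs)
```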

Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

August 28, 2023 13:31 - 29 minutes - 27 MB

#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting. OUTLINE: 0:00 - Introduction 1:20 - From Chain-of-Thought to Tree-of-Thought 11:10 - Formalizing the algorithm 16:00 - Game of 24 & Creative writ...
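
A minimal sketch of the search procedure (illustrative only; `propose` and `evaluate` stand in for LLM calls that extend a partial solution and rate how promising it looks):

```python
def tree_of_thought(problem, propose, evaluate, depth=3, breadth=5):
    frontier = [""]                                    # start from an empty thought
    for _ in range(depth):
        candidates = [thought + step
                      for thought in frontier
                      for step in propose(problem, thought)]
        # Keep only the best-rated partial solutions; dropping the rest
        # plays the role of pruning/backtracking.
        candidates.sort(key=lambda t: evaluate(problem, t), reverse=True)
        frontier = candidates[:breadth]
    return frontier[0] if frontier else None
```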

OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman)

August 28, 2023 13:29 - 16 minutes - 14.8 MB

#ai #openai #gpt4 US Senate hearing on AI regulation. MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4 Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (com...

[ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion

August 28, 2023 13:28 - 39 minutes - 35.8 MB

#google #openai #mlnews Updates from the world of Machine Learning and AI Great AI memes here: https://twitter.com/untitled01ipynb OUTLINE: 0:00 - Google I/O 2023: Generative AI in everything 0:20 - Anthropic announces 100k tokens context 0:35 - Intro 1:20 - Geoff Hinton leaves Google 7:00 - Google memo leaked: we have no moat 11:30 - OpenAI loses 540M 12:30 - Google AI: Product first 15:50 - Ilya Sutskever on safety vs competition 18:00 - AI works cannot be copyrighted 19:40 - OpenAI tri...

Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

August 28, 2023 13:25 - 24 minutes - 22.5 MB

#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it: The Recurrent Memory Transformer, and what its strengths and weaknesses are. OUTLINE: 0:00 - Intro 2:15 - Transformers on long sequences 4:30 - Tasks considered 8:00 - Recurrent Memory Transformer 19:40 - Experiments on scaling and attention maps 24:00 - Conclusion Paper: https://arxiv.org/abs/2304.11062 Abstract: This technical report presents the ...
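
The mechanism, in rough outline: the long input is split into segments, and a small set of memory tokens is prepended to each segment, with their updated values carried over to the next one. The sketch below is illustrative; `transformer` is an assumed black-box encoder over token embeddings, and the real model uses learned memory initializations and separate read/write positions.

```python
import numpy as np

def rmt_forward(transformer, segments, num_mem=8, dim=512):
    memory = np.zeros((num_mem, dim))                  # learned in the real model
    outputs = []
    for segment in segments:                           # each segment: (seq_len, dim)
        x = np.concatenate([memory, segment], axis=0)  # prepend memory tokens
        y = transformer(x)
        memory = y[:num_mem]                           # carry updated memory forward
        outputs.append(y[num_mem:])
    return outputs, memory
```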

OpenAssistant RELEASED! The world's best open-source Chat AI!

August 28, 2023 13:23 - 21 minutes - 19.3 MB

#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat Homepage: https://open-assistant.io Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1 Code: https://github.com/LAION-AI/Open-Assistant Paper (temporary): https://ykilcher.com/oa-paper Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www....

OpenAssistant First Models are here! (Open-Source ChatGPT)

August 28, 2023 13:19 - 16 minutes - 15.5 MB

#openassistant #chatgpt #gpt4 https://open-assistant.io/chat https://huggingface.co/OpenAssistant https://github.com/LAION-AI/Open-Assistant Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financ...

The biggest week in AI (GPT-4, Office Copilot, Google PaLM, Anthropic Claude & more)

August 28, 2023 13:13 - 41 minutes - 37.6 MB

#mlnews #gpt4 #copilot Your weekly news all around the AI world Check out W&B courses (free): https://wandb.courses/ OUTLINE: 0:00 - Intro 0:20 - GPT-4 announced! 4:30 - GigaGAN: The comeback of Generative Adversarial Networks 7:55 - ChoppedAI: AI Recipes 8:45 - Samsung accused of faking space zoom effect 14:00 - Weights & Biases courses are free 16:55 - Data Portraits 18:50 - Data2Vec 2.0 19:50 - Gated Models on Hugging Face & huggingface.js 22:05 - Visual ChatGPT 23:35 - Bing crosses 10...

GPT-4 is here! What we know so far (Full Analysis)

August 28, 2023 13:07 - 34 minutes - 31.3 MB

#gpt4 #chatgpt #openai References: https://openai.com/product/gpt-4 https://openai.com/research/gpt-4 https://cdn.openai.com/papers/gpt-4.pdf Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me fi...

This ChatGPT Skill will earn you $10B (also, AI reads your mind!)

August 28, 2023 12:52 - 43 minutes - 39.8 MB

#mlnews #chatgpt #llama ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner! ERRATA: It's a 4090, not a 4090 ti 🙃 OUTLINE: 0:00 - Introduction 0:20 - GTC 23 on March 20 1:55 - ChatGPT API is out! 4:50 - OpenAI becomes more business-friendly 7:15 - OpenAI plans for AGI 10:00 - ChatGPT influencers 12:15 - Open-Source Prompting Course 12:35 - Flan UL...

LLaMA: Open and Efficient Foundation Language Models (Paper Explained)

August 28, 2023 12:41 - 41 minutes - 37.6 MB

#ai #meta #languagemodel LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer on more data and show that something like GPT-3 can be outperformed by significantly smaller models when trained this way. Meta also releases the trained models to the research community. OUTLINE: 0:00 - Introduction & Paper Overview 4:30 - Rant on Open-Sourcing 8:05 - Training Data 12:40 - Training Hyperparameters 14:50 - Architecture Modifications 17:...

Open Assistant Inference Backend Development (Hands-On Coding)

August 28, 2023 12:39 - 1 hour - 74.5 MB

#ai #huggingface #coding Join me as I build streaming inference into the Hugging Face text generation server, going through cuda, python, rust, grpc, websockets, server-sent events, and more... Original repo is here: https://github.com/huggingface/text-generation-inference OpenAssistant repo is here: https://github.com/LAION-AI/Open-Assistant (see inference/) Check out https://www.wandb.courses/ for free MLOps courses! Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/m...

OpenAssistant - ChatGPT's Open Alternative (We need your help!)

August 28, 2023 12:32 - 35 minutes - 32.8 MB

#openassistant #chatgpt #ai Help us collect data for OpenAssistant, the largest and most open alternative to ChatGPT. https://open-assistant.io OUTLINE: 0:00 - Intro 0:30 - The Project 2:05 - Getting to Minimum Viable Prototype 5:30 - First Tasks 10:00 - Leaderboard 11:45 - Playing the Assistant 14:40 - Tricky Facts 16:25 - What if humans had wings? 17:05 - Can foxes be tamed? 23:45 - Can zebras be tamed? 26:15 - Yo (spam) 27:00 - More tasks 29:10 - Entitled Emails 34:35 - Final Words Li...

ChatGPT: This AI has a JAILBREAK?! (Unbelievable AI Progress)

January 02, 2023 08:05 - 31 minutes - 29.5 MB

#chatgpt #ai #openai ChatGPT, OpenAI's newest model, is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:40 - Sponsor: Weights & Biases 3:20 - ChatGPT: How does it work? 5:20 - Reinforcement Learning from Human Feedback 7:10 - ChatGPT Origins: The GPT-3.5 Series 8:20 - OpenAI's strategy: Iterative Refinement 9:10 - ChatGPT's amazi...

[ML News] GPT-4 Rumors | AI Mind Reading | Neuron Interaction Solved | AI Theorem Proving

November 30, 2022 07:16 - 41 minutes - 38.8 MB

#ai #mlnews #gpt4 Your weekly news from the AI & Machine Learning world. OUTLINE: 0:00 - Introduction 0:25 - AI reads brain signals to predict what you're thinking 3:00 - Closed-form solution for neuron interactions 4:15 - GPT-4 rumors 6:50 - Cerebras supercomputer 7:45 - Meta releases metagenomics atlas 9:15 - AI advances in theorem proving 10:40 - Better diffusion models with expert denoisers 12:00 - BLOOMZ & mT0 13:05 - ICLR reviewers going mad 21:40 - Scaling Transformer inf...

CICERO: An AI agent that negotiates, persuades, and cooperates with people

November 30, 2022 07:07 - 1 hour - 56.5 MB

#ai #cicero #diplomacy  A team from Meta AI has developed Cicero, an agent that can play the game Diplomacy, in which players have to communicate via chat messages to coordinate and plan into the future. Paper Title: Human-level play in the game of Diplomacy by combining language models with strategic reasoning Commented game by human expert: https://www.youtube.com/watch?v=u5192bvUS7k OUTLINE: 0:00 - Introduction 9:50 - AI in cooperation games 13:50 - Cicero agent overview 25:00 - A...

[ML News] Multiplayer Stable Diffusion | OpenAI needs more funding | Text-to-Video models incoming

November 23, 2022 11:18 - 22 minutes - 21.2 MB

#mlnews #ai #mlinpl Your news from the world of Machine Learning! OUTLINE: 0:00 - Introduction 1:25 - Stable Diffusion Multiplayer 2:15 - Huggingface: DOI for Models & Datasets 3:10 - OpenAI asks for more funding 4:25 - The Stack: Source Code Dataset 6:30 - Google Vizier Open-Sourced 7:10 - New Models 11:50 - Helpful Things 20:30 - Prompt Databases 22:15 - Lexicap by Karpathy References: Stable Diffusion Multiplayer https://huggingface.co/spaces/huggingface-projects/stable-dif...

The New AI Model Licenses have a Legal Loophole (OpenRAIL-M of BLOOM, Stable Diffusion, etc.)

November 23, 2022 11:15 - 27 minutes - 25.8 MB

#ai #stablediffusion #license  So-called responsible AI licenses are stupid, counterproductive, and have a dangerous legal loophole in them. OpenRAIL++ License here: https://www.ykilcher.com/license OUTLINE: 0:00 - Introduction 0:40 - Responsible AI Licenses (RAIL) of BLOOM and Stable Diffusion 3:35 - Open source software's dilemma of bad usage and restrictions 8:45 - Good applications, bad applications 12:45 - A dangerous legal loophole 15:50 - OpenRAIL++ License 16:50 - This has ...

ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview)

November 23, 2022 11:10 - 1 hour - 60.1 MB

#ai #language #knowledge Large Language Models have the ability to store vast amounts of facts about the world. But little is known about how these models actually do this. This paper aims at discovering the mechanism and location of storage and recall of factual associations in GPT models, and then proposes a mechanism for the targeted editing of such facts, in the form of a simple rank-one update to a single MLP layer. This has wide implications both for how we understand such models' inner worki...
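
To illustrate the flavor of such an edit (a simplification, not ROME's full update, which additionally uses a covariance term to protect other keys): a rank-one change to one weight matrix can be chosen so that a specific key vector k* now maps to a desired value vector v*.

```python
import numpy as np

def rank_one_edit(W, k_star, v_star):
    residual = v_star - W @ k_star                     # what the layer currently gets wrong
    W_new = W + np.outer(residual, k_star) / (k_star @ k_star)
    assert np.allclose(W_new @ k_star, v_star)         # the edited association now holds
    return W_new

W = np.random.randn(16, 16)
W_edited = rank_one_edit(W, np.random.randn(16), np.random.randn(16))
```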

Neural Networks are Decision Trees (w/ Alexander Mattick)

October 23, 2022 16:41 - 31 minutes - 29.5 MB

#neuralnetworks #machinelearning #ai  Alexander Mattick joins me to discuss the paper "Neural Networks are Decision Trees", which has generated a lot of hype on social media. We ask the question: Has this paper solved one of the large mysteries of deep learning and opened the black-box neural networks up to interpretability? OUTLINE: 0:00 - Introduction 2:20 - Aren't Neural Networks non-linear? 5:20 - What does it all mean? 8:00 - How large do these trees get? 11:50 - Decision Trees v...

This is a game changer! (AlphaTensor by DeepMind explained)

October 23, 2022 16:36 - 55 minutes - 51 MB

#alphatensor #deepmind #ai Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. DeepMind goes a step further and creates AlphaTensor, a Deep Reinforcement Learning algorithm that plays a sin...
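
The classic example of such a sub-N^3 scheme is Strassen's algorithm, which multiplies two 2x2 (block) matrices with 7 multiplications instead of the naive 8; AlphaTensor searches for decompositions of exactly this kind. The snippet below is just that textbook scheme, not DeepMind's code.

```python
import numpy as np

def strassen_2x2(A, B):
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)   # 7 multiplications, same result
```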

[ML News] Stable Diffusion Takes Over! (Open Source AI Art)

October 23, 2022 16:24 - 27 minutes - 25.4 MB

#stablediffusion #aiart #mlnews  Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this... Sponsor: NVIDIA GPU Raffle: https://ykilcher.com/gtc OUTLINE: 0:00 - Introduction 0:30 - What is Stable Diffusion? 2:25 - Open-Source Contributions and Creations 7:55 - Textual Inversion 9:30 - OpenAI vs Open AI 14:20 - Journalists be outraged 16:20 - AI Ethics be even more outraged 19:45 - Do we need a new social contra...

How to make your CPU as fast as a GPU - Advances in Sparsity w/ Nir Shavit

October 23, 2022 16:11 - 50 minutes - 46.6 MB

#ai #sparsity #gpu  Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in terms of applications, and why sparsity should play a much larger role in the Deep Learning community. Sponsor: AssemblyAI Link:...

More Is Different for AI - Scaling Up, Emergence, and Paperclip Maximizers (w/ Jacob Steinhardt)

September 15, 2022 10:28 - 1 hour - 61.6 MB

#ai #interview #research  Jacob Steinhardt believes that future AI systems will be qualitatively different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think. OUTLINE: 0:00 Introduction 1:10 Start of Interview 2:10 Blog posts series 3:56 More Is Different for AI (Blog Post) 7:40 Do you think this emergence is m...

The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!)

September 07, 2022 04:24 - 19 minutes - 18.2 MB

#huggingface #pickle #exploit  Did you know that something as simple as loading a model can execute arbitrary code on your machine? Try the model: https://huggingface.co/ykilcher/total... Get the code: https://github.com/yk/patch-torch-save Sponsor: Weights & Biases Go here: https://wandb.me/yannic OUTLINE: 0:00 - Introduction 1:10 - Sponsor: Weights & Biases 3:20 - How Hugging Face models are loaded 5:30 - From PyTorch to pickle 7:10 - Understanding how pickle saves data 13:00 -...
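
The underlying issue is easy to demonstrate: pickle will call whatever a class's __reduce__ method tells it to call at load time. The toy example below (a harmless echo command, not the payload from the video) shows why loading untrusted model files is risky.

```python
import pickle

class Payload:
    def __reduce__(self):
        import os
        # pickle stores this (callable, args) pair and calls it on load.
        return os.system, ("echo arbitrary code just ran",)

data = pickle.dumps(Payload())
pickle.loads(data)   # executes the shell command during "loading a model"
```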

The Future of AI is Self-Organizing and Self-Assembling (w/ Prof. Sebastian Risi)

August 29, 2022 09:46 - 1 hour - 57.2 MB

 #ai #selforganization #emergence Read Sebastian's article here: https://sebastianrisi.com/self_assemb... OUTLINE: 0:00 - Introduction 2:25 - Start of Interview 4:00 - The intelligence of swarms 9:15 - The game of life & neural cellular automata 14:10 - What's missing from neural CAs? 17:20 - How does local computation compare to centralized computation? 25:40 - Applications beyond games and graphics 33:00 - Can we do away with goals? 35:30 - Where do these methods shine? 43:30 -...

The Man behind Stable Diffusion

August 29, 2022 05:37 - 25 minutes - 23.8 MB

 #stablediffusion #ai #stabilityai An interview with Emad Mostaque, founder of Stability AI. OUTLINE: 0:00 - Intro 1:30 - What is Stability AI? 3:45 - Where does the money come from? 5:20 - Is this the CERN of AI? 6:15 - Who gets access to the resources? 8:00 - What is Stable Diffusion? 11:40 - What if your model produces bad outputs? 14:20 - Do you employ people? 16:35 - Can you prevent the corruption of profit? 19:50 - How can people find you? 22:45 - Final thoughts, let's des...

[ML News] BLOOM: 176B Open-Source | Chinese Brain-Scale Computer | Meta AI: No Language Left Behind

August 03, 2022 05:16 - 14 minutes - 13 MB

#mlnews #bloom #ai  Today we look at all the recent giant language models in the AI world! OUTLINE: 0:00 - Intro 0:55 - BLOOM: Open-Source 176B Language Model 5:25 - YALM 100B 5:40 - Chinese Brain-Scale Supercomputer 7:25 - Meta AI Translates over 200 Languages 10:05 - Reproducibility Crisis Workshop 10:55 - AI21 Raises $64M 11:50 - Ian Goodfellow leaves Apple 12:20 - Andrej Karpathy leaves Tesla 12:55 - Wordalle References: BLOOM: Open-Source 176B Language Model https://bigsc...

JEPA - A Path Towards Autonomous Machine Intelligence (Paper Explained)

July 10, 2022 09:20 - 59 minutes - 55.2 MB

Yann LeCun's position paper on a path towards machine intelligence combines Self-Supervised Learning, Energy-Based Models, and hierarchical predictive embedding models to arrive at a system that can teach itself to learn useful abstractions at multiple levels and use that as a world model to plan ahead in time. OUTLINE: 0:00 - Introduction 2:00 - Main Contributions 5:45 - Mode 1 and Mode 2 actors 15:40 - Self-Supervised Learning and Energy-Based Models 20:15 - Introducing latent variab...

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos (Paper Explained)

June 28, 2022 15:01 - 32 minutes - 30.1 MB

#openai #vpt #minecraft  Minecraft is one of the harder challenges any RL agent could face. Episodes are long, and the world is procedurally generated, complex, and huge. Further, the action space is a keyboard and a mouse, which has to be operated only given the game's video input. OpenAI tackles this challenge using Video PreTraining, leveraging a small set of contractor data in order to pseudo-label a giant corpus of scraped footage of gameplay. The pre-trained model is highly capable in...
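
The overall recipe can be summarized in three steps, sketched below with assumed stand-in interfaces (`idm`, `policy`) rather than OpenAI's actual code: train an inverse dynamics model on the small labeled set, pseudo-label the scraped videos with it, then behavior-clone on those pseudo-labels.

```python
def video_pretraining(idm, policy, labeled_clips, unlabeled_videos):
    # 1) Supervised: learn to infer actions from surrounding video frames.
    idm.train(labeled_clips)                                   # contractor (frames, actions) data
    # 2) Pseudo-label the large scraped corpus with the inverse dynamics model.
    pseudo_labeled = [(video, idm.predict_actions(video)) for video in unlabeled_videos]
    # 3) Behavior cloning on the pseudo-labeled footage.
    policy.behavior_clone(pseudo_labeled)
    return policy
```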

Parti - Scaling Autoregressive Models for Content-Rich Text-to-Image Generation (Paper Explained)

June 28, 2022 14:51 - 34 minutes - 32.4 MB

#parti #ai #aiart Parti is a new autoregressive text-to-image model that shows just how much scale can achieve. This model's outputs are crisp, accurate, realistic, and can combine arbitrary styles, concepts, and fulfil even challenging requests. OUTLINE: 0:00 - Introduction 2:40 - Example Outputs 6:00 - Model Architecture 17:15 - Datasets (incl. PartiPrompts) 21:45 - Experimental Results 27:00 - Picking a cherry tree 29:30 - Failure cases 33:20 - Final comments Website: https:/...

Did Google's LaMDA chatbot just become sentient?

June 20, 2022 12:02 - 22 minutes - 20.7 MB

#lamda #google #ai  Google engineer Blake Lemoine was put on leave after releasing proprietary information: An interview with the chatbot LaMDA that he believes demonstrates that this AI is, in fact, sentient. We analyze the claims and the interview in detail and trace how a statistical machine managed to convince at least one human that it is more than just an algorithm. OUTLINE: 0:00 - Whistleblower put on leave 4:30 - What is a language model? 6:40 - The prompt is the key 10:40 - Wh...

[ML News] DeepMind's Flamingo Image-Text model | Locked-Image Tuning | Jurassic X & MRKL

May 16, 2022 13:52 - 24 minutes - 22.5 MB

Your updates directly from the state of the art in Machine Learning! OUTLINE: 0:00 - Intro 0:30 - DeepMind's Flamingo: Unified Vision-Language Model 8:25 - LiT: Locked Image Tuning 10:20 - Jurassic X & MRKL Systems 15:05 - Helpful Things 22:40 - This AI does not exist References: DeepMind's Flamingo: Unified Vision-Language Model https://www.deepmind.com/blog/tacklin... https://storage.googleapis.com/deepmi... https://twitter.com/Inoryy/status/152... LiT: Locked Image Tuning ht...

[ML News] Meta's OPT 175B language model | DALL-E Mega is training | TorToiSe TTS fakes my voice

May 12, 2022 12:21 - 19 minutes - 18 MB

#mlnews #dalle #gpt3 An inside look of what's happening in the ML world! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 1:40 - Meta AI releases OPT-175B 4:55 - CoCa: New CLIP-Competitor 8:15 - DALL-E Mega is training 10:05 - TorToiSe TTS is amazing! 11:50 - Investigating Vision Transformers 12:50 - Hugging Face Deep RL class launched 13:40 - Helpful Things 17:00 - John Deere's driverless tractors References: Meta AI re...

This A.I. creates infinite NFTs

May 12, 2022 12:03 - 18 minutes - 17.4 MB

#nft #gan #ai Today we build our own AI that can create as many bored apes as we want! Fungibility for everyone! Try the model here: https://huggingface.co/spaces/ykilcher/apes or here: https://ykilcher.com/apes Files & Models here: https://huggingface.co/ykilcher/apes/tree/main Code here: https://github.com/yk/apes-public (for the "what's your ape" app, look for the file interface_projector.py) This video is sponsored by BrightData, use this link for free credits: https://brightdata....

Author Interview: SayCan - Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

May 12, 2022 11:54 - 58 minutes - 54.2 MB

#saycan #robots #ai This is an interview with the authors Brian Ichter, Karol Hausman, and Fei Xia. Original Paper Review Video: https://youtu.be/Ru23eWAQ6_E Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of these plans are feasible or appropriate. SayCan combines the semantic capabilities of language models with a bank of low-level skills, which ar...

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (SayCan - Paper Explained)

May 02, 2022 16:29 - 28 minutes - 26.6 MB

#saycan #robots #ai Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of these plans are feasible or appropriate. SayCan combines the semantic capabilities of language models with a bank of low-level skills, which are available to the agent as individual policies to execute. SayCan automatically finds the best policy to execute by considering a trade-off...
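
That trade-off can be pictured as a simple scoring rule (an illustrative sketch; `llm_logprob` and `affordance` are assumed stand-ins for the language model's skill likelihood and the learned value function): pick the skill whose combined "say" and "can" score is highest.

```python
import math

def saycan_select(instruction, history, skills, llm_logprob, affordance, state):
    def score(skill):
        p_say = math.exp(llm_logprob(instruction, history, skill))  # how useful the LLM rates it
        p_can = affordance(skill, state)                            # how likely it is to succeed here
        return p_say * p_can
    return max(skills, key=score)
```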

Author Interview - ACCEL: Evolving Curricula with Regret-Based Environment Design

May 02, 2022 16:22 - 57 minutes - 53.4 MB

#ai #accel #evolution This is an interview with the authors Jack Parker-Holder and Minqi Jiang. Original Paper Review Video: https://www.youtube.com/watch?v=povBD... Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with its own set of advantages and drawbacks. This paper presents ACCEL, which takes the next step in the direction of constructing curricula for multi-capable agents. ACCEL c...

ACCEL: Evolving Curricula with Regret-Based Environment Design (Paper Review)

May 02, 2022 16:13 - 44 minutes - 40.8 MB

#ai #accel #evolution Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with its own set of advantages and drawbacks. This paper presents ACCEL, which takes the next step in the direction of constructing curricula for multi-capable agents. ACCEL combines the adversarial adaptiveness of regret-based sampling methods with the capabilities of level-editing, usually found in Evolutionary Method...
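
A toy version of the loop (illustrative only; `make_random_level`, `edit`, and `regret` are assumed stand-ins for level generation, small level mutations, and the regret estimate used for prioritization):

```python
import random

def accel_step(agent, level_buffer, make_random_level, edit, regret, p_replay=0.8):
    if level_buffer and random.random() < p_replay:
        # Replay the level with the highest estimated learning potential...
        level = max(level_buffer, key=lambda lvl: regret(agent, lvl))
        agent.train_on(level)
        # ...then evolve it with a small edit and keep the child for later.
        level_buffer.append(edit(level))
    else:
        level_buffer.append(make_random_level())
    return level_buffer
```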

LAION-5B: 5 billion image-text-pairs dataset (with the authors)

April 25, 2022 18:57 - 58 minutes - 53.7 MB

#laion #clip #dalle LAION-5B is an open, free dataset consisting of over 5 billion image-text-pairs. Today's video is an interview with three of its creators. We dive into the mechanics and challenges of operating at such large scale, how to keep cost low, what new possibilities are enabled with open datasets like this, and how to best handle safety and legal concerns. OUTLINE: 0:00 - Intro 1:30 - Start of Interview 2:30 - What is LAION? 11:10 - What are the effects of CLIP filtering? ...

Twitter Mentions

@ykilcher 163 Episodes
@giffmana 4 Episodes
@yoavgo 3 Episodes
@sama 3 Episodes
@osanseviero 2 Episodes
@hardmaru 2 Episodes
@huggingface 2 Episodes
@bhutanisanyam1 2 Episodes
@metaai 2 Episodes
@deepmind 2 Episodes
@gdb 2 Episodes
@sentdex 2 Episodes
@ylecun 2 Episodes
@cyrilzakka 1 Episode
@parafactual 1 Episode
@arankomatsuzaki 1 Episode
@wenlong_huang 1 Episode
@warvito 1 Episode
@patrickmineault 1 Episode