Andrej Karpathy @karpathy
🧑‍🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥 karpathy.ai Stanford Joined April 2009
Tweets 9K
Followers 970K
Following 899
Likes 11K
Money can't buy happiness. Just like an H100. H100 = happiness.
🔥llm.c update: Our single file of 2,000 ~clean lines of C/CUDA code now trains GPT-2 (124M) on GPU at speeds ~matching PyTorch (fp32, no flash attention) github.com/karpathy/llm.c… On my A100 I'm seeing 78ms/iter for llm.c and 80ms/iter for PyTorch. Keeping in mind this is fp32,…
Consider being a labeler for an LLM. The prompt is “give me a random number between 1 and 10”. What SFT & RM labels do you contribute? What does this do to the network when trained on? In a subtle way, this problem is present in every prompt that does not have a single unique answer.
The history of computing is repeating in an echo, except replace computers that do precise arithmetic on bytes with computers that do statistical arithmetic on tokens.
# scheduling workloads to run on humans
Some computational workloads in human organizations are best "run on a CPU": take one single, highly competent person and assign them a task to complete in a single-threaded fashion, without synchronization. Usually the best fit when…
🧠: “Let’s buy this (text)book! Nice, and now… instead of reading it… let’s buy another one!” 💡 All of the dopamine is generated only at the point of resolving to read something. After that there is no juice left 😅
A few new CUDA hacker friends joined the effort and now llm.c is only 2X slower than PyTorch (fp32, forward pass) compared to 4 days ago, when it was at 4.2X slower 📈 The biggest improvements were: - turn on TF32 (NVIDIA TensorFloat-32) instead of FP32 for matmuls. This is a…
torch.compile is cool but LLM compile: takes your .py repo as string and outputs a brand new, custom, from scratch, minimal code repository directly running your network in highly optimized CUDA
Okay I did a first quick pass of naive CUDA kernels for the forward pass of GPT-2 and pushed everything to one file in llm.c. Still only ~1000 lines of code: github.com/karpathy/llm.c… Current per iteration timings on my Lambda box <3 A100 40GB PCIe, B=4, T=1024: - llm.c: 111ms -…
Btw writing the llm.c training code would imo be a very interesting, impressive, self-contained and very meta challenge for LLM agents. The prompt is: take the PyTorch code train_gpt2.py and write, compile and unit test a single .c file that reproduces the training: train_gpt2.c…
I added a quick crappy tutorial on how PyTorch layers are moved to C, with a few possibly helpful pointers: github.com/karpathy/llm.c…
Yann LeCun @ylecun
710K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
François Chollet @fchollet
469K Followers 770 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
Rob Maurer @TeslaPodcast
292K Followers 106 Following Rob Maurer hosts Tesla Daily - news and analysis on Tesla, Inc., published every weekday.
James Stephenson @ICannot_Enough
203K Followers 717 Following @ElonMusk and I own Tesla (along with several other people).
AK @_akhaliq
309K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Sebastian Raschka @rasbt
266K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
Pierre Ferragu @p_ferragu
93K Followers 60 Following Commenting on technology - no investment recommendations.
Google DeepMind @GoogleDeepMind
942K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
K10✨ @Kristennetten
158K Followers 6K Following At the foot of the mountain .. though not verifiable 🚘 Tesla is ♥️ Earth 🍀Spacex 🚀 • Third Row Founder • FSDBeta Tester 2020
Warren Redlich - Chas.. @WR4NYGov
73K Followers 2K Following Dad, tech enthusiast & libertarian Florida Man retraining my nomad neural nets on the Thai and Asia datasets
Dirty Tesla @DirtyTesLa
98K Followers 1K Following YouTube and Merch: https://t.co/xDxdMQtRrJ MS, Biology; Anti-sleep Ai
Mathias Føns @FonsDK
31K Followers 335 Following Strategy @Shift4. Passionate about Space, EVs & Integrated Commerce. Opinions are my own, NFA
Chuck Cook @chazman
47K Followers 1K Following USNavy CDR S3 Viking, A321 Captain, OG FSDβ, @NavalAcademy Aero Engr, Developer, Arborist, Beekeeper, Aquaponics, Keto Coach, see my https://t.co/G841drcL9G
ALEX @ajtourville
58K Followers 347 Following Physics engineer here for the progress of disruptive innovations and ventures Elon Musk picked my 𝕏 logo design for 𝕏 All-in $TSLA since 08/2019
AI at Meta @AIatMeta
531K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Matt Smith @MatchasmMatt
39K Followers 513 Following Equity Analysis at Rebellionaire. Finance and energy nerd. Giga-dad of 6 young kids. Catholic. Tweets are not financial advice.
JPR007 @jpr007
49K Followers 7K Following Global Citizen Providing Intelligent Information Please use the JPR Library to support our activity Go to https://t.co/dye7rb3Z8k
Not @Notjnot
1 Followers 49 Following
Hu Cang @HuCang2
46 Followers 315 Following
Romy2316 @Romy23161950601
0 Followers 14 Following
Rudolf J @rudolfjurisic
62 Followers 19 Following
Stas Skavronski @SSZeb24
0 Followers 3 Following
AiSpeculator @AiSpeculator
0 Followers 35 Following
snoop2head @snoop2head
1 Followers 10 Following
: D @jds2711
0 Followers 127 Following
Chéron Pierre-Antoin.. @pa_cheron
132 Followers 573 Following
Hassan arzoo @hssnarzoo
16 Followers 211 Following
Indar Pakharia @Dipdumber
3 Followers 61 Following Industrial & Graphic Designer | Innovaton through Design x Engineering | Product Development & User Research - Actively seeking work opportunity
GenuxAI @AiGenux63090
0 Followers 32 Following
전용찬 @yongchanchun
0 Followers 10 Following
Electronicsseeker @libertarian108
1 Followers 186 Following
$tony $mith @stony_smithwola
23 Followers 528 Following Mukongo rebel who is searching ethic life culture lover and future Mr.Polymath
ice @icepeak486
14 Followers 63 Following
Rommy @Red_eyes_DMD
12 Followers 202 Following
Prasann @AnitaPa21283455
12 Followers 426 Following IT Undergrad student at NSUT. Math lover and physics explorer.
Chulhan Lee @chulhan_x_lee
24 Followers 12 Following
Thomas Anastaselos @tanastaselos
127 Followers 208 Following Tech, startups, traveling, running | Revenue operations at @ChartMogul
Lu Otero @luliotero
263 Followers 1K Following
Nicolas @Nicolas7358
77 Followers 460 Following
AsuiketIsTourist @AusketIsTourisk
82 Followers 398 Following
Angii Johnson @AngiiJohns4225
0 Followers 3 Following
Alfonso Pidal @pidal233
120 Followers 359 Following
Nguyễn Kiên @NguynKin134
3 Followers 30 Following
Feiticeiro @thebitwizard_
38 Followers 71 Following
Rak @RakR_105
0 Followers 10 Following
Bianca Rhys @BiancaRhys31282
4 Followers 133 Following
NC @NC8089603155151
12 Followers 35 Following
Tanisha Rivera @TanishaRiv68355
3 Followers 20 Following
Miles @miles155
81 Followers 187 Following
❍rnaməntal @ornamental_ai
0 Followers 11 Following 🔗 https://t.co/M9TxvsElPG 🔗 https://t.co/LvII46ksAg
itsjusttrash @its_justtrash
1 Followers 32 Following
Matt @Kabsch_Tech
29 Followers 33 Following
Gregorio jose Ladera @Gregoriojo82243
154 Followers 3K Following
Violetta D.🧋 @vecanqa
1K Followers 5K Following Communication and Media scientist. | GER & ENG | 'All my opinions are yours.' | Like, RT ≠ Zustimmung | Here for that #ZeroCovid lifestyle. #NoAFD 🇮🇱
Jasvinder Ahuja @jasvinderahuja
397 Followers 467 Following Research: transmission of DNA from parents to offsprings. Views: they keep evolving!
Yann LeCun @ylecun
710K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
François Chollet @fchollet
469K Followers 770 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
AK @_akhaliq
309K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Sebastian Raschka @rasbt
266K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
AI at Meta @AIatMeta
531K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Soumith Chintala @soumithchintala
185K Followers 877 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
PyTorch @PyTorch
379K Followers 77 Following Tensors and neural networks in Python with strong hardware acceleration. PyTorch is an open source project at the Linux Foundation. #PyTorchFoundation
Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Ilya Sutskever @ilyasut
370K Followers 2 Following towards a plurality of humanity loving AGIs @openai
Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Hugging Face @huggingface
342K Followers 189 Following The AI community building the future. https://t.co/VkRPD0VKaZ #BlackLivesMatter #stopasianhate
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Andrew Trask @iamtrask
74K Followers 191 Following @openminedorg, @GoogleDeepMind ethics team, @OxfordUni phd candidate, @UN pet lab, @GovAI_, creator of #GrokkingDeepLearning, NALU, and sense2vec
Jeff Dean (@🏡) @JeffDean
296K Followers 6K Following Chief Scientist, Google DeepMind and Google Research. Co-designer/implementor of things like @TensorFlow, MapReduce, Bigtable, Spanner, Gemini .. (he/him)
Tanishq Mathew Abraha.. @iScienceLuvr
54K Followers 1K Following PhD at 19 | Founder and CEO at @MedARC_AI | Research Director at @StabilityAI | @kaggle Notebooks GM | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6Qb
Kevin Patrick Murphy @sirbayes
42K Followers 334 Following Research Scientist at Google Brain / Deepmind. Interested in Bayesian Machine Learning.
main @main_horse
8K Followers 474 Following AGI Believer. Haven't applied @OpenAI. Likes are not always endorsement.
Austin Huang @austinvhuang
3K Followers 1K Following General intelligence as personal computing. Past: @GoogleDeepMind, MIT, Harvard, Berkeley.
Tristan Hume @trishume
6K Followers 330 Following Performance optimization lead @AnthropicAI. Profiling, distributed systems, dev tools, interpretability. [email protected]
Zhuohan Li @zhuohan123
3K Followers 685 Following CS PhD Student 👨🏻‍💻 @ UC Berkeley 🌁 🤖️ Machine Learning Systems
Justine Tunney @JustineTunney
32K Followers 269 Following I built a C library that lets you compile 12kb static binaries that run natively on Linux, Mac, Windows, FreeBSD, OpenBSD, NetBSD and BIOS using just GCC/Clang.
Wing Lian (caseus) @winglian
9K Followers 2K Following @axolotl_ai dev. OpenAccess AI Collective founder. Alignment Labs. AI/ML tinkerer. Building tools for everyone.
Jim Keller @jimkxa
33K Followers 135 Following CEO @tenstorrent, Cofounder @atomic_semi @BayaSystems and FlexAI board member. Fan of 2x2 matrixes, books, refactoring and creative tension
Woosuk Kwon @woosuk_k
2K Followers 343 Following PhD student at @Berkeley_EECS building @vllm_project
Unsloth AI @UnslothAI
3K Followers 247 Following Making AI & LLMs more accessible + faster for everyone! 🦥 Github: https://t.co/2kXqhhvLsb Discord: https://t.co/1Gmc1SDElj
inigo quilez @iquilezles
47K Followers 40 Following * Math & art. Product Manager. Demoscener. * Created Quill, Shadertoy, Pixar's Wondermoss, Memix, ... * Videos: https://t.co/r50AVR0ITo * Tutos: https://t.co/MVjsGVRH5s
Phillip Lippe @phillip_lippe
2K Followers 426 Following PhD student at @UvA_Amsterdam (@quvalab), @GoogleDevExpert JAX/Flax | Prev.: Intern @GoogleDeepMind, @MSFTResearch
Beff — e/acc ⏩ @BasedBeffJezos
101K Followers 2K Following chief accelerator & founder @ e/acc // thermodynamic priest // Kardashev gradient climber // memetic warlord // building @extropic_ai
Yi Tay @YiTayML
29K Followers 97 Following chief scientist / cofounder @RekaAILabs 🫠 past: research scientist @google brain 🤯 currently learning to be a dad 🍼
Daniel Han @danielhanchen
7K Followers 932 Following Building @UnslothAI. Finetune LLMs 30x faster https://t.co/aRyAAgKOR7. Prev ML at NVIDIA. Hyperlearn used by NASA. I like maths, making code go fast
Hao Liu @haoliuhl
4K Followers 155 Following machine learning, neural networks. phd student @berkeley_ai. https://t.co/ZNJawlrerS
Figure @Figure_robot
71K Followers 1 Following Figure is an AI Robotics company building the world's first commercially viable autonomous humanoid robot.
Matei Zaharia @matei_zaharia
39K Followers 1K Following CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ
Arc Institute @arcinstitute
22K Followers 24 Following A new scientific institution for curiosity-driven biomedical science and technology.
Pessimists Archive @PessimistsArc
91K Followers 65 Following Exploring technophobia and moral panic through the ages. A litany of shameful cynicism and spite. Curated by @louisanslow
Misha Laskin @MishaLaskin
8K Followers 174 Following Staff Research Scientist @DeepMind. Previously @berkeley_ai. YC alum.
Lianmin Zheng @lm_zheng
4K Followers 437 Following CS Ph.D. @ UC Berkeley. Creator of Alpa, Vicuna, and Chatbot Arena. @lmsysorg
Sholto Douglas @_sholtodouglas
15K Followers 856 Following Scaling Gemini @Deepmind - working towards intelligence too cheap to meter
Julia Bauman @JuliaBauman2
8K Followers 455 Following PhD student at @Stanford Genetics in @LarsMSteinmetz lab | Prev @broadinstitute | Explaining cool biotech to the world here and @ 60_SecondScience on TikTok
Waymo @Waymo
91K Followers 222 Following Creating a new way forward with Waymo One in #PHX, #SF, #LA, & #ATX 🤖🚘 Hear from our co-CEOs: @dmitri_dolgov and @TechTekedra. Download the Waymo One app: 👇
LM Studio @LMStudioAI
16K Followers 182 Following Download & run local/open LLMs on your computer 👾 • App: https://t.co/YS5uiRQ7TI (Mac/Windows/Linux)
LLM Security @llm_sec
8K Followers 297 Following Research, papers, jobs, and news on large language model security. Got something relevant? DM / tag @llm_sec
Fern @hi_tysam
2K Followers 199 Following I make tiny, speedy neural networks and community-funded open source research. I also do consulting! Often holds the CIFAR10 speed record ( ;) ). she/they ❤️:')
Teknium (e/λ) @Teknium1
29K Followers 3K Following Cofounder @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE Support me on Github Sponsors
Johann Rehberger @wunderwuzzi23
3K Followers 630 Following Hacking neural networks so that we don’t get stuck in the matrix. Red Team Director @ Electronic Arts. Entrepreneur. Builder and Breaker. Opinions are my own.
Erik Voorhees @ErikVoorhees
682K Followers 4K Following Toward peace, markets, and Bitcoin. Founder of https://t.co/vPo8SbPo6Q
The Commons @thesfcommons
7K Followers 3 Following A fourth place for inner-outer curiosity, co-created play & collective flourishing in SF ☕ join 👉 https://t.co/inNZekso0e
Sumanth Hegde @sumanthrh
784 Followers 18 Following MS CS @UCSanDiego. Previously, GenAI @C3_AI. EE @iitmadras. Machine Learning and Systems. Intensity is all you need.
Stas Bekman @StasBekman
7K Followers 268 Following Toolmaker. Software creator, optimizer and harmonizer. Makes things work and fly at @ContextualAI Training LLM/RAG/Generative AI/Machine Learning/Scalability
Hyung Won Chung @hwchung27
18K Followers 229 Following Research Scientist @OpenAI. Past: @Google Brain / PhD @MIT
Aidan Clark @_aidan_clark_
4K Followers 210 Following Research @OpenAI. Ex: @DeepMind, @BerkeleyDAGRS Hae sententiae verbaque mihi soli sunt
Angela Jiang @jiangelaa
8K Followers 380 Following 🌎 Global affairs @OpenAI ⏮️ Prev: @OpenAI product, @CarnegieMellon CS PhD 🙋🏻‍♀️ 💬 💜 Still prefers humans over computers
Pika @pika_labs
116K Followers 53 Following Video on command. Website: https://t.co/G5bjmrMQsx Discord: https://t.co/bX68ThPTQH About: https://t.co/atvdcgbe9S
Wow! I want one!
Challenge accepted @natfriedman! The quietest way to remove leaves from @sheeprobotics
I've just released llamafile v0.8 which features LLaMA3, Mixtral 8x22b, and Grok support. It goes 25x faster than ollama at running LLaMA3 70B on CPU. My new tensor multiplication kernels let llamafile eval MoE models 2x faster than llama.cpp github.com/Mozilla-Ocho/l…
I've spent the past ~2 weeks building a GPU from scratch with no prior experience. It was way harder than I expected. Progress tracker in thread (coolest stuff at the end)👇
When the beverage companies start getting in on it, it's a bubble.
Today, @CocaColaCo and @Microsoft announced an expanded partnership, including a $1.1B commitment from Coca-Cola to the Microsoft Cloud and its generative AI capabilities to help “quench their thirst” for innovation. Learn more: msft.it/6019YH1QR
I think the hardest thing for me the last few years has been seeing so many talented scientists who obviously belong in the academy turn into tech company middle managers or startup founders.
We have finally done it. After all this time and due to countless requests from our users, we've shipped what I think is our most important and revolutionary feature yet. You can now interrupt Claude's yapping with our new stop generation button!
what the fuck is going on?! - nearly 1 billion dollar valuation - $252M raised - 'Augment is using fine-tuned “industry-leading” open models of some sort.' - collection of random executives with no background in this - 50 employees!?
One year ago, I left Google Brain (now DeepMind) to join a very early startup. We had fewer than 10 people at that time, and have grown many times since. Today, I am extremely proud to share our milestone. We are Augment. You can read about us here. techcrunch.com/2024/04/24/eri…
New video: MambaByte. Argues that w/o attention, byte models are competitive with tokenized models at training. Decoding can be sped-up by token-level speculation and low-entropy parallel verification. youtube.com/watch?v=kcd0BT…
A glimpse at what is cooking for CUDA MODE's upcoming talk this Friday at 12pm Pacific! Jake Hemstad and @g_evtushenko from NVIDIA will discuss how to convert @karpathy llm.c to @nvidia CUDA C++. The link to attend is in the next thread.
PyTorch 2.3 is here 😎🔥 PyTorch 2.3 offers support for user-defined Triton kernels in torch.compile, allowing for users to migrate their own Triton kernels from eager without experiencing performance regressions or graph breaks. Details: hubs.la/Q02tYcYq0
I know it just released, but I don't see many people talking about the Phi-3 tokenizer! 👀 Here's the full list of added special tokens... what do you notice? 🤯 <|assistant|> <|step|> <|function_output|> <|tag|> <|function_call|> <|system|> <|end|> <|raw|> <|continue|> <|user|>…
Cool new work from some colleagues at Apple: more accurate LLMs with fewer parameters and fewer pre-training tokens. Also has MLX support out of the box! Code here: github.com/apple/corenet/…
Apple presents OpenELM An Efficient Language Model Family with Open-source Training and Inference Framework The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and
Phi-3's sliding window is 2048 and not 2047! So not an odd number! Glad it got resolved quickly! Also looks like in fact Phi-3 (3.8b) uses sliding window attention like Mistral. 2048 context length, but SWA up to 4096. Link to PR: huggingface.co/microsoft/Phi-…
Phi 3 (3.8B) got released! The paper said it was just a Llama arch, but I found some quirks while adding this to @UnslothAI: 1. Sliding window of 2047? Mistral v1 4096. So does Phi mini have SWA? (And odd num?) Max RoPE position is 4096? 2. Upcasted RoPE? Like Gemma? 3. Dynamic…
Another vintage UNIX workstation running LLMs with llama2.c This time it's SPARC: a Sun Ultra 45 from 2006 running Solaris 10. 4.55 tok/s on TinyStories 15M, not bad. I reused the simple big-endian port I did for the SGI Indigo2 with no mods, should I submit a PR @karpathy
I’m excited to share that I’m working on a new book about building applications with foundation models! AI Engineering builds upon Machine Learning Systems Design, but with a focus on large scale, ready made models. The book covers: - The new AI stack (e.g. how it differs from…
Just one more H100, please. I just need one more and I will be fixed, please help
Money can't buy happiness. Just like an H100. H100 = happiness.
@karpathy @andromeda74356 yes it's very interesting. we’re working on a blog post to analyze topic distribution/quality on the 10K votes and quantitative analysis. will share more soon!