Mario Tormo @mt0rm0
AI Engineer / Senior Data Scientist. Passionate about #Math, #Jazz, #Cinema and #Books. He/him. #AI #NLP #DataScience #MLOps
Berlin, Germany
Joined September 2020
Tweets: 1K
Followers: 1K
Following: 1K
Likes: 2K
The Top ML Papers of the Week (April 15 - April 21): - Llama 3 - Mixtral 8x22B - A Survey on RAG - How Faithful are RAG Models? - Emerging AI Agent Architectures - Chinchilla Scaling: A replication attempt ...
Highly amusing update, ~18 hours later: llm.c is now down to 26.2ms/iteration, exactly matching PyTorch (tf32 forward pass). We discovered a bug where we incorrectly called cuBLAS in fp32 mathmode 🤦‍♂️. And ademeure contributed a more optimized softmax kernel for very long rows… https://t.co/Orz0KUznrP
Very cool project -- a Gemma model fine-tuned to turn text into mind map summaries in dotfile format: kaggle.com/models/toshik/…
What if you care about finetuning LLMs for classification? Are encoder-style transformers like BERT and RoBERTa, etc, still the way to go? The recent "Label Supervised LLaMA Finetuning" paper shows that you can get really good performance by finetuning decoder-style models such…
Freshly merged: you can now evaluate LLMs in LitGPT with just one command (thanks to the great LLM Eval Harness)! github.com/Lightning-AI/l…
NYU Artificial Intelligence, Spring ‘24 This course concerns the study of rational behaviour that guides the creation of intelligent machines. In the first half of the semester we covered problem-solving, logical, and probabilistic agents.
Fun LLM challenge that I'm thinking about: take my 2h13m tokenizer video and translate the video into the format of a book chapter (or a blog post) on tokenization. Something like: 1. Whisper the video 2. Chop up into segments of aligned images and text 3. Prompt engineer an LLM…
GenAI is starting to look like Typhoid Mary. Last May, the celebrated 54-year-old LexisNexis touted hallucination-free legal citations produced by Generative AI. Instead, it is making up cases — from 2025 and 2026!!! Talk about torching one’s reputation on the altar of GenAI. —…
@karpathy > And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. 💯
@karpathy 💯for technical or ML content, a rule I use is that I need to code something along with what I am learning or reading. Short videos are automatically filtered out because of this. Long form content is the best way to learn and go deeply into a topic. This is why I prefer to…
# on shortification of "learning" There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are…
Unfortunately, too few people understand the distinction between memorization and understanding. It's not some lofty question like "does the system have an internal world model?", it's a very pragmatic behavior distinction: "is the system capable of broad generalization, or is…
Congratulations @konstantdobler! Best Paper Award at #NeurIPS2023 Workshop on Advancing Neural Network Training is amazing!
🎉🎉🎉 Best Paper Award at #NeurIPS2023 Workshop on Advancing Neural Network Training Congratulations! 🎉🎉🎉 @Aleph__Alpha @HPI_DE @johannes_hage @weinsam @konstantdobler Maximilian Schall
In collaboration with @robusthq, yesterday we shared "Tree of Attacks", a method that can jailbreak @OpenAI GPT-4 like 90% of the time. It was just covered in @WIRED arxiv.org/pdf/2312.02119… https://t.co/5Jp7GybLQP
The source code for our work on evaluating and calibrating with uncertain ground truth has finally made it to GitHub - here is a thread on what’s included 🧵: github.com/google-deepmin…
So you're wondering what this Q-Learning thing is and now you want to learn about RL. Here are some useful resources! 1. If you're into video courses, youtu.be/2pWv7GOvuf0?fe… or youtu.be/SupFHGbytvA?fe… are both great 2. Textbooks: can't go wrong with incompleteideas.net/book/the-book-… 1/2
Insanely Fast Whisper Transcribe 300 minutes (5 hours) of audio in less than 98 seconds github.com/chenxwh/insane…
Just gave talk w a hypothetical example of a false positive being worse than a false negative: when an industrial rubbish robot confuses a loved one with rubbish. Turns out it wasn't so hypothetical.
Dan | Machine Learnin.. @DanKornas
53K Followers 502 Following 🤖 ML Engineer 🔬 AI Educator 💻 I help you build AI skills through project-based learning ➡️ https://t.co/lC2UKMtRjj
Jack Raifer Baruch @JackRaifer
1K Followers 732 Following #DataScience / Head of Data Science @ADAIntelligence / #AI #Ethics / #66DaysOfData / Writer / Data Instructor / http://jackraiferbaruch.medium.c
TuringPost @TheTuringPost
62K Followers 16K Following Newsletter exploring AI & ML - Weekly trends - LLM/FM insights - Unicorn spotlights - Global dynamics - History Led by @kseniase_ Elevate your AI game 👇🏼
David Miller 🧮 @thedavescience
11K Followers 275 Following Accountant → Data Scientist → SMB Owner Showing you how to use data and automation to grow your local business.
Aurimas Griciūnas @Aurimas_Gr
20K Followers 772 Following 🔨 Chief Product Officer @neptune_ai 📖 Tweeting about #LLM, #GenAI, #DataEngineering, #MachineLearning and #Data ✍️ Author of SwirlAI Newsletter.
Anna Kondratenko 👩.. @anacoding
4K Followers 590 Following 🤖 Self-studying AI, Big Data, Data Science • TensorFlow, Scikit-learn, Keras, Python • @numerai newbie • #100DaysOfCode #100DaysOfDefi challenges •
DJ Castro @_datajunkie
1K Followers 3K Following 35,🇵🇭 | Product Data Analyst | Founder/Content Creator of @ElectionMapsPH
James Gingerich #B2B .. @jamesvgingerich
144K Followers 122K Following #DigitalTransformation #Cloud #ContentServices #Manufacturing #Robotics #Automation #HealthTech #InsurTech #FinTech #BigData #IoT #IIoT #History #FutureOfWork
AstridLuke @tC34DWr83k86f2R
0 Followers 72 Following
McSoyne @McSoyne20103
0 Followers 158 Following
Yahnik @YahnikRohse
16 Followers 78 Following
MandyBethune @9IwvB9es6XX56
0 Followers 71 Following
Juliet Jarencio @jarencio49060
86 Followers 5K Following
Anne-marie Defee @defee_mari
69 Followers 5K Following
Arnaud Wanet @Arnaud_Wanet
430 Followers 5K Following Software engineer (Golang) interested in AGI, distributed systems, fintech and geopolitics.
Soumen Pramanik, DS, .. @soumen_eclectic
710 Followers 2K Following Enthusiastic about ML & https://t.co/nJEroRWvo5’s not who has the best algorithm that wins, but it’s who has the most data. Trying to harness C++ skill
Mia-rose Diederichs @Diederichs62873
95 Followers 5K Following
becomingcalgarian @StephanB683239
6 Followers 80 Following
Anand @Anand44719958
10 Followers 749 Following
Mignon Holiman @HolimanMig10659
94 Followers 5K Following
Marianna Dudik @MariannaDu45983
84 Followers 5K Following
Leona Rudisill @LeonaRudis59942
67 Followers 5K Following
Amber Hammes @hammes_am
69 Followers 5K Following
Joa @sgustv
136 Followers 2K Following A knowledge path! Ready to begin! #Science #technology #STEM #Softwaredevelopment
machine learning @Mlearning_ai
6K Followers 408 Following https://t.co/36eFDHoV3o 🟠 https://t.co/rjCZMWrRMK #Art tools for creative economy
thankgod_egbe @thanktuaspp
455 Followers 4K Following Machine Learning Engineer | Passionate about AI 💻
Soumyadip Bhattacharj.. @SoumyadipBhat19
7 Followers 51 Following Master's Student at Techno India University | ML Engineering | Computer Vision | Generative AI
Deborah 🔞 @Deborah_K2835
31 Followers 954 Following Unleash your desires alongside a woman who embodies insatiable passion
🔭 Jorge Filho @copernicanvm
150 Followers 4K Following There are no certainties, only opportunities. (V for Vendetta)
kdiwakar reddy @KdiwakarR
1 Followers 71 Following
Ankit Srivastava @ankitsrihbti
116 Followers 1K Following Lead Data Scientists | Artificial Intelligence | Big Data | NLP | Chat GPT | Robotics | Python | R | GCP | AWS | Azure | Kubernets | Docker | Tensor flow
Ayush Kumar Singh @SinghAyush2811
71 Followers 613 Following Student at IIIT Ranchi | Love to learn and code | Interested in Data science and Aerospace
S-Square Systems, Inc.. @S_Squaresystems
3K Followers 4K Following S-Square delivers excellence for customers through technology & innovation. From complex cloud computing to intelligent business applications to infrastructure.
TradeMonday @TradeMonday
1K Followers 4K Following AI powered Retail Experimental Platform turns consumer’s transaction, social, in-store behaviour into simulation for actionable retail insight
Nadia Allen @nadia_alle75635
20 Followers 846 Following
Briana Gabriel @GabrielBri15386
82 Followers 5K Following
Malik.shamzz @ShamzzMalik
10 Followers 133 Following WordPress Developer | Shopify Developer | Seo Expert | Digital marketer | Social media influencer | Ecommerce | SEM | Content Creater | Automations (Saas) | ppc
Bibhabasu Mohapatra @bibhabasuM1610
150 Followers 539 Following @airamatrix | PyTorchian... *KaggleTeam Tyler Durden*
weyn @WeynGuo
12 Followers 905 Following
Prabin Kumar Nayak @PrabinKNayak
51 Followers 573 Following A homo sapein who can communicate with human( in lang: Odiya, Hindi and English) and computers( in lang: Java and Python).
Adaline Distaffen @AdalinDistaff
56 Followers 5K Following
Haresh Vadali @haresh_vadali
71 Followers 708 Following Data Scientist @maestrotech_inc | Machine Learning and Deep Learning Enthusiast💻 | Ex intern @ Emplay Analytics | Fact :🤷🏼‍♂️ 50% idk and 50% idc 😵‍💫
Raul andres @andrewolf80550
66 Followers 594 Following Paraglider, aeronautical firefighter, mechatronics engineer, Master's in AI
Carly Cowder @CCowder1692
69 Followers 5K Following
Mark Tenenholtz @marktenenholtz
114K Followers 544 Following Head of AI @PredeloHQ. XGBoost peddler, transformer purveyor.
Bojan Tunguz @tunguz
187K Followers 8K Following Machine Learning ex Nvidia. Kaggle Quadruple Grandmaster. Data Scientist. Physicist. Catholic. Husband. Father. Stanford Alum. e/xgb. XGBoost.eth. AMDG.
Andrew Ng @AndrewYNg
1.0M Followers 913 Following Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning, #deeplearning #MOOCs
Shubham Saboo @Saboo_Shubham_
41K Followers 449 Following AI Products @tenstorrent 📕 Author of books on GPT-3 & Neural Search in Prod ✍️ Tweets about LLMs & Prompt Engineering 📩 DMs open for collab
Sebastian Raschka @rasbt
267K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
elvis @omarsar0
189K Followers 486 Following Building with LLMs @dair_ai • Prev: Meta AI, Galactica LLM, PapersWithCode, Elastic, PhD • Creator of the Prompting Guide (~4M learners)
Dan | Machine Learnin.. @DanKornas
53K Followers 502 Following 🤖 ML Engineer 🔬 AI Educator 💻 I help you build AI skills through project-based learning ➡️ https://t.co/lC2UKMtRjj
Patrick Loeber @patloeber
55K Followers 888 Following Software Engineer • YouTube 250K+ • Helping you to learn Python and Machine Learning • AI Developer Advocate @AssemblyAI • @python_engineer founder
Sumanth @Sumanth_077
47K Followers 862 Following Simplifying the concepts of Python, LLMs, Machine Learning & Data Science! • ML Developer Advocate @clarifai • Data Scientist • Building with AI
Yann LeCun @ylecun
711K Followers 719 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
François Chollet @fchollet
470K Followers 769 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
Pau Labarta Bajo @paulabartabajo_
45K Followers 222 Following The Real-World ML guy | Learn to build real-world ML apps at https://t.co/xWr8Hm8zI5
Jack Raifer Baruch @JackRaifer
1K Followers 732 Following #DataScience / Head of Data Science @ADAIntelligence / #AI #Ethics / #66DaysOfData / Writer / Data Instructor / http://jackraiferbaruch.medium.c
Towards Data Science @TDataScience
224K Followers 2K Following A Medium publication sharing concepts, ideas, and codes. Share your insights and projects with our global audience: https://t.co/Mh1ZLme1o4.
Matt Harrison @__mharrison__
158K Followers 892 Following Python 🐍 + Data Science 🚀 trainer @__metasnake__ 🦜 Speaker ✍ Author 👨‍🏫 Instructor (@Stanford) 📣 DM for Sponsorship
Charly Wargnier @DataChaz
112K Followers 31K Following 🥑 DevRel @Streamlit @SnowflakeDB 🪶 𝕏 about #AI, #LLMs, #DataScience, #WebApps, #SEO 💕 My heart is open source 🌍 Nature Lover 👀 My views!
Avi Kumar Talaviya @avikumart_
30K Followers 396 Following Data science and AI | Content writer | ML/community @OmdenaAI, @streamlit and Analytics Vidhya | Sharing insights and ideas at the intersection of data and AI
Kirk Borne @KirkDBorne
447K Followers 6K Following Advisor to startups. Freelancer. Global Speaker. Founder @LeadershipData. Top influencer in #BigData #DataScience #AI #IoT #ML #B2B. PhD Astrophysics @Caltech
Cliff Pickover @pickover
184K Followers 52 Following Increase your sense of wonder. (Author of 50 books & 800 patents. Yale Ph.D.) "Pickover contemplates realms beyond our known reality." ~NY Times
Andrej Karpathy @karpathy
979K Followers 905 Following 🧑‍🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Soumen Pramanik, DS, .. @soumen_eclectic
710 Followers 2K Following Enthusiastic about ML & https://t.co/nJEroRWvo5’s not who has the best algorithm that wins, but it’s who has the most data. Trying to harness C++ skill
machine learning @Mlearning_ai
6K Followers 408 Following https://t.co/36eFDHoV3o 🟠 https://t.co/rjCZMWrRMK #Art tools for creative economy
Doudna Lab @doudna_lab
71K Followers 88 Following News from Jennifer Doudna's lab at @UCBerkeley @igisci. Tweets from lab members and not Jennifer Doudna unless signed JD. Tweets represent personal views only.
Erik Meijer @headinthebox
27K Followers 0 Following
Carlos Santiago Bañ.. @csbanon
290 Followers 608 Following AI/ML at @USEncryption • Let’s talk tech, AI, photography, music, culture. •🇻🇦• 🇪🇸🇵🇷🇺🇸
rhizome @aapp952
198 Followers 2K Following
Alexander Han @AlexanderHanBVT
77 Followers 404 Following
Harshita Krishna @HarshitaKrishn6
64 Followers 437 Following #vGHC'20| @UCSanDiego| MS| Deep Learning| @Amazon| @Qualcomm| She/Her
Bach Dao @BachDao91
14 Followers 526 Following
大东家 @xiaohelong
70 Followers 2K Following Computer Science, Software Developer From China Mainland. A programmer from mainland China.
Gary Marcus @GaryMarcus
145K Followers 7K Following “A beacon of clarity”. Spoke at US Senate AI Oversight committee. Founder/CEO Geometric Intelligence (acq. by Uber). Rebooting AI & Taming Silicon Valley.
Vincent Koc @koconder
6K Followers 4K Following Technologist and Futurist | Artificial Intelligence Engineering | Data Leadership | Lecturer @MIT | Ex @Qantas | #buildinpublic Views are my own
Mitchell Hashimoto @mitchellh
113K Followers 136 Following Working on a new terminal. 👻 Prev: founded @HashiCorp. Created Vagrant, Terraform, Vault, and others. Passionate about indie software.
aaaaaaaaaaaaaaaaaaaaa @mabhvebh
28 Followers 193 Following
Rakesh Varma @RakeshVarma01
148 Followers 2K Following If you want it, work for it !!! That’s Simple...
Rudi Ranck, PhD @rudiranck
447 Followers 3K Following AI Research Scientist & Entrepreneur — I delve into the depths of data, discovering hidden patterns and unlocking the secrets that lie within. ✨
XrhoPiY @XRhoPiY
242 Followers 4K Following
Prashant Dixit @Prashant_Dixit0
171 Followers 775 Following AI/Computer Vision/LLM Researcher | Open-source ML | Building cool and exciting Stuff Connect- https://t.co/8wrqNPc2kP
Sean Astin @SeanAstin
449K Followers 2K Following Answers to Mikey, Rudy, Sam & Bob…also hey you, dad and we’ll be right with you sir - If ya’d like a personalized video message: https://t.co/Xm6Lj5PxsM
Information MDPI @InformationMDPI
1K Followers 2K Following Information (ISSN 2078-2489, #Scopus, #ESCI, #EI Compendex) is an open access journal of information science and technology, data, knowledge and communication.
Johnson @ToluwaniJohnson
214 Followers 698 Following Data Scientist #BusinessIntelligence #DataAnalytics #MachineLearning
Pascal Pfeiffer @pa_pfeiffer
365 Followers 40 Following
Yauhen Babakhin @ybabakhin
318 Followers 71 Following Principal Data Scientist @h2oai | Kaggle Grandmaster https://t.co/u32pwLHRNC
Ehteshamoddin Siddiqu.. @ehteshamoddin
6 Followers 339 Following
NeuroNet AI @DeepNeuroNet
38 Followers 643 Following
M G @140rocks
90 Followers 1K Following
jose Ruiz @joru1000
195 Followers 2K Following C-level Technology lead, strongly focused on Generative AI. Researching on practical production use cases across the Enterprise (yes... as everybody else)
Lenny Bogdonoff @rememberlenny
13K Followers 4K Following Optimist. Working with @natfriedman + @danielgross; to invest in technology startups, support portfolio companies, and incubate new projects.
𝚝𝚊𝚗𝚓𝚎.. @tanjents_
51 Followers 649 Following trying in: (Python, GO, Julia R, MatLab) B.S in CS & Math. Looking a PhD program w/ a focus in (R/D) - Learning https://t.co/c0vnQeJgFS
SURESH BEEKHANI @SureshBeekhan
72 Followers 314 Following Co-Founder Insight Solution | Data Science | AI/ ML | Generative AI | LLM | Retail Industry |
Signals MDPI @Signals_MDPI
262 Followers 1K Following Signals (ISSN 2624-6120) is an international peer-reviewed #openaccess journal of #signals and #signalprocessing published quarterly online by @MDPIOpenAccess.
Ghost phoenix @jacobki04299329
140 Followers 586 Following python programmer, artificial intelligence Enthusiast, backend developer, tech enthousiast #Peace #Artificialintelligence
Seidu Mohammed @liveoncode
721 Followers 745 Following Software Engineer, Data Analyst(Excel, Power BI, Tableau, SQL, Python) Writes about tech and productivity. Dm for gigs. https://t.co/UuKSV5C1on
Konstantin Dobler @konstantdobler
74 Followers 113 Following PhD student @ELLISforEurope @hpi_de in NLP, prev @sap
Manuel Bartual @ManuelBartual
256K Followers 2K Following Screenwriter and enthusiast. Creator of Biotopía. Co-creator of Santuario, Blum and Titania. Sometimes weird things happen to me. #ElOtroManuel #RedMonkey
Betz @allen_brutus
1K Followers 2K Following I like to both build things and think. Physics BSc (Energy), ex policy jock, ex SF startup, now building things out of atoms by hand
Un-Nisha Liaquat Usha @unnisha_usha
356 Followers 668 Following Executive- Research & Development || BSc. in EEE || Dept. of EEE || Chittagong University of Engineering & Technology ||Anime~Music~Illustration~Photography❤ ||
Muhammad irshad @Irshad3010
15 Followers 80 Following
Armand Brunelle @Sirlupinwatson
2K Followers 1K Following @DeepMLShop @CloudOneID #DataScientist #Cybersecurity #ArtificialIntelligence #MachineLearning
Anthony G @Anthony42599730
295 Followers 3K Following Econometrics Grad Student | Learning Data Engineering | Learning Machine Learning.
Gabriel Rufián @gabrielrufian
991K Followers 5K Following Just one more. Spokesperson for Esquerra in Congress. Everything for everyone, nothing for ourselves. Instagram: https://t.co/EIfG2iwhMG TikTok: @gabrielrufian
Poornima Devi @Poornim14
82 Followers 2K Following Machine Learning Engineer | Deep learning | Natural Language Processing
Brian Douglas @brianbdouglas
10K Followers 221 Following Engineer and educator at MathWorks. Tweets are my own. Personal website at https://t.co/HsL4ZlJq9q @[email protected]
Abdul Majeed 🇬🇭.. @senior_majeed
272 Followers 2K Following Junior Data Analyst, Tech Enthusiast, AI, ML, AWS, Social Commentator, Anti-African Politics, Chelsea FC Fan , GWS Fan🏀
The Top ML Papers of the Week (April 15 - April 21): - Llama 3 - Mixtral 8x22B - A Survey on RAG - How Faithful are RAG Models? - Emerging AI Agent Architectures - Chinchilla Scaling: A replication attempt ...
Re-implemented @karpathy 's llama.c in C++ and modularized a lot of that code in a pure C++ transformer library. github.com/AmeyaWagh/llam…
Highly amusing update, ~18 hours later: llm.c is now down to 26.2ms/iteration, exactly matching PyTorch (tf32 forward pass). We discovered a bug where we incorrectly called cuBLAS in fp32 mathmode 🤦‍♂️. And ademeure contributed a more optimized softmax kernel for very long rows…
A few new CUDA hacker friends joined the effort and now llm.c is only 2X slower than PyTorch (fp32, forward pass) compared to 4 days ago, when it was at 4.2X slower 📈 The biggest improvements were: - turn on TF32 (NVIDIA TensorFloat-32) instead of FP32 for matmuls. This is a…
@SokobanHero @jeremyphoward you guys don’t know nothing about Brazil lol stop shouting about it as if you knew something
@BuiltAThing @SokobanHero Yes that's right. The previous deal was they pay me for helping with their engagement through posting. The new deal is less attractive.
Very cool project -- a Gemma model fine-tuned to turn text into mind map summaries in dotfile format: kaggle.com/models/toshik/…
What if you care about finetuning LLMs for classification? Are encoder-style transformers like BERT and RoBERTa, etc, still the way to go? The recent "Label Supervised LLaMA Finetuning" paper shows that you can get really good performance by finetuning decoder-style models such…
Freshly merged: you can now evaluate LLMs in LitGPT with just one command (thanks to the great LLM Eval Harness)! github.com/Lightning-AI/l…
Hey Matt, every integrated circuit you use was made possible because of the pioneering work of Lynn Conway, and Sophie Wilson invented the ARM architecture.
What exactly have “transgender Americans” contributed?
NYU Artificial Intelligence, Spring ‘24 This course concerns the study of rational behaviour that guides the creation of intelligent machines. In the first half of the semester we covered problem-solving, logical, and probabilistic agents.
ML bible arrived today autographed by the authors 🙏
If you enjoy from-scratch implementations of self-attention and multi-head attention, I have compared and collected a few implementations for you here. For readability, I particularly appreciate the compact one in the lower left, which features combined QKV matrices (courtesy of…
Nice read on the rarely-discussed-in-the-open difficulties of training LLMs. Mature companies have dedicated teams maintaining the clusters. At scale, clusters leave the realm of engineering and become a lot more biological, hence e.g. teams dedicated to "hardware health". It…
Long overdue but here's a new blogpost on training LLMs in the wilderness from the ground up 😄🧐 In this blog post, I discuss: 1. Experiences in procuring compute & variance in different compute providers. Our biggest finding/surprise is that variance is super high and it's…
Fun LLM challenge that I'm thinking about: take my 2h13m tokenizer video and translate the video into the format of a book chapter (or a blog post) on tokenization. Something like: 1. Whisper the video 2. Chop up into segments of aligned images and text 3. Prompt engineer an LLM…
GenAI is starting to look like Typhoid Mary. Last May, the celebrated 54-year-old LexisNexis touted hallucination-free legal citations produced by Generative AI. Instead, it is making up cases — from 2025 and 2026!!! Talk about torching one’s reputation on the altar of GenAI. —…
Seeing as I published my Tokenizer video yesterday, I thought it could be fun to take a deepdive into the Gemma tokenizer. First, the Gemma technical report [pdf]: storage.googleapis.com/deepmind-media… says: "We use a subset of the SentencePiece tokenizer (Kudo and Richardson, 2018) of…
Introducing Gemma - a family of lightweight, state-of-the-art open models for their class, built from the same research & technology used to create the Gemini models. Blog post: blog.google/technology/dev… Tech report: goo.gle/GemmaReport This thread explores some of the…
"My benchmark for large language models" nicholas.carlini.com/writing/2024/m… Nice post but even more than the 100 tests specifically, the Github code looks excellent - full-featured test evaluation framework, easy to extend with further tests and run against many LLMs.…
Hi everyone yes, I left OpenAI yesterday. First of all nothing "happened" and it’s not a result of any particular event, issue or drama (but please keep the conspiracy theories coming as they are highly entertaining :)). Actually, being at OpenAI over the last ~year has been…
@karpathy > And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. 💯
@karpathy 💯for technical or ML content, a rule I use is that I need to code something along with what I am learning or reading. Short videos are automatically filtered out because of this. Long form content is the best way to learn and go deeply into a topic. This is why I prefer to…