- Tweets: 1K
- Followers: 56K
- Following: 491
- Likes: 5K
In AI research there is tremendous value in intuitions on what makes things work. In fact, this skill is what makes “yolo runs” successful, and can accelerate your team tremendously. However, there’s no track record on how good someone’s intuition is. A fun way to do this is…
The first lecture of our @Stanford CS25 V4 Transformers course (cs25.stanford.edu) is now released! Check it out here: youtube.com/watch?v=fKMB5U…. We (the instructors) gave a brief intro and overview of the history of NLP, Transformers and how they work, and their impact. We…
nothing gets my heart rate up like waiting for eval results on new models to come in
Congrats @YiTayML on this launch. It is impressive that a small team can train a strong model so quickly. What I also like is that the PR is not full of unfounded hype. Just plainly states the model's benchmark scores and you can immediately try out the model yourself for free.
Flan-2 is published in JMLR jmlr.org/papers/v25/23-…. I think it's a nice piece of history. The work scaled instruction tuning with respect to model size and finetuning tasks, which both improved performance. Our MMLU was 75%, SOTA when the paper came out in Oct 2022. Our…
In 2022, a model with a 70%+ MMLU score would cost $20 per 1M tokens (instructGPT 3.5). Today it costs less than $1! It is perfectly reasonable to expect that in, say, five years, you will be able to use a model with a 90%+ MMLU score for just a few cents per 1M tokens.
This new hallucinations eval by GDM friends is in the right direction in many ways: 1. Tackles the scenario of extremely long-form responses, which is a harder but more realistic setting 2. Extracts the number of relevant facts, then browses to verify each individual fact 3.…
Cheesy realization: studying history underscores how special this current moment in AI is. In past eras, the great powers of the world fought religious wars, sailed to unexplored lands, and built the first industrial cities. Now we will race to build artificial intelligence. So…
Had a bit of a fanboy moment today meeting @bryan_johnson, who has been super inspirational to me in prioritizing my health. I asked him about the best way to balance career and spending time on health. His advice is that while many people give up sleep to work more, sleeping…
My mental model of Sora is that it is the “GPT-2 moment” for video generation. GPT-2, which came out in 2019, could generate paragraphs of text that are coherent and grammatically correct. GPT-2 wasn’t able to write an entire essay without making mistakes like being inconsistent…
My typical day as a Member of Technical Staff at OpenAI:
[9:00am] Wake up
[9:30am] Commute to Mission SF via Waymo. Grab avocado toast from Tartine
[9:45am] Recite OpenAI charter. Pray to optimization Gods. Learn the Bitter Lesson
[10:00am] Meetings (Google Meet). Discuss how to…
An incredible skill that I have witnessed, especially at OpenAI, is the ability to make “yolo runs” work. The traditional advice in academic research is, “change one thing at a time.” This approach forces you to understand the effect of each component in your model, and…
A key insight from chain-of-thought concerns information density. Language models can only do so much with a single forward pass, and so the amount of compute the language model can use must be scaled in proportion to how hard a prompt is to solve. What is…
One thing in AI research that I have finally recognized with clarity is the idea of “inertia bias”: continuing to do something when it’s not the best option. The most basic instance of inertia bias is the feeling of “I already spent time implementing X, so let me continue trying…
There’s no adrenaline rush like launching a massive gpu training
For most companies, hiring more people is strictly better. However, this is often not true in AI research. AI research is often bottlenecked by compute, and when this is the case, hiring more researchers can be counter-productive. I remember back at Google Brain, my manager once…
It feels like once you hyper-specialise in AI and engineering/science you somehow lose all ability to reason about finance/business matters like tax laws. They both require similar reasoning skills sometimes, but it's almost as though my brain just shuts off and cannot process. 🫠
Bad sleep for me is usually from:
1. Eating late
2. Too much/wrong kinds of food
3. Skipping 30 min wind down before bed
4. Stimulants too late in the day
5. Bed/room temp too hot/cold
6. Disruptions: noise, others
No one will remember what you tweeted; they will remember what you built.
Bets are basically experiment preregistration
@RuiboLiu @_jasonwei @hwchung27 damn i wish i never left google so i could still find out how wrong i was at predicting LLM chess.
Reminder that your bedtime is your most important appointment of the day. Respect yourself and be on time.
@YiTayML @_jasonwei I just checked that doc. The closest guess was from @hwchung27 actually. Everyone else was just so wrong ...
Cool piece from the Financial Times comparing hallucinations in LLMs to hallucinations in humans! People often complain about how LLMs frequently hallucinate, but it’s easy to forget that humans hallucinate a lot as well. For example, if you read some article and then later tell…
phi is a good litmus test to tell who understands LLMs and who doesn't.
people think UL2 is an encoder-decoder. if you think it is, you haven't read the paper. The UL2 objective is agnostic to architecture.
the problem of being a xoogler is that you refer to RSUs as GSUs all the time.
Flash is OP!
Particularly good general-purpose models among those with limited size: ① Google: Gemini 1.5 Pro ② Meta: LLaMA 3 (8B, 70B) ③ Anthropic: Claude 3 (Haiku, Sonnet) ④ Reka: Reka Flash (21B)
openai: don't use GPT in your model name! 🥺 meta: use llama3 in your model name or else 😠
What's the consequence of publishing a model from Llama3 without putting "meta-Llama-3" in the front? I have already seen a few people doing it.
First @nvidia DGX H200 in the world, hand-delivered to OpenAI and dedicated by Jensen "to advance AI, computing, and humanity":
not true, especially for language. if you trained a large & deep MLP language model with no self-attention, no matter how much data you feed it you'll still be lagging behind a transformer (with much less data). will it get to the same point? i don't think so. your tokens…
The dataset is everything. Great read: nonint.com/2023/06/10/the…