Yi Tay @YiTayML

Chief scientist & Co-founder @RekaAILabs past: Research Scientist @Google Brain 🧠 currently learning to be a dad 🍼👶 yitay.net mixture-of-locations Joined October 2016

Tweets 3K · Followers 28K · Following 97 · Likes 7K

Yi Tay @YiTayML

a day ago

instead of evaluating models, we can start to evaluate researchers instead! 😀 i've always had this floating idea of giving people transformer configs and asking them to predict configurations that work better. could be data mix, architectures, hparams whatever. would be a fun…

Jason Wei @_jasonwei

2 days ago

18 34 427 131K 202

3 8 89 24K 46

Reka @RekaAILabs

4 days ago

🔥Newly updated scores for Reka Core, Flash and Edge on MMMU leaderboard: mmmu-benchmark.github.io.

1 14 74 10K 20

lmsys.org @lmsysorg

5 days ago

Yes, check out @RekaAILabs's strong Flash-21B model!

Mikel Artetxe @artetxem

a week ago

2 6 57 41K 18

4 11 79 22K 13

AK @_akhaliq

a week ago

Reka Core, Flash, and Edge A Series of Powerful Multimodal Language Models We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio

3 52 230 30K 88

foam shazeer @foamshazeer

a week ago

I heard the key to @RekaAILabs's success is a new algorithm called AgiHi-PPO

1 2 21 8K 5

Yi Tay @YiTayML

a week ago

"To be frontier you first need to be pareto-frontier". ~ First law of LLM training. 😃

20 39 411 66K 80

Yi Tay @YiTayML

2 weeks ago

Our @RekaAILabs Tech Report / Paper is out! 🔥 Tech reports with completely no information are kinda boring so we’re revealing some interesting information on how we train our series of Reka models including tokens, architecture, data & human evaluation workflows. 😃 We tried…

11 56 415 48K 204

Yi Tay @YiTayML

a week ago

One year since I posted this so here's an update! Adding @donovanOng_ to the list of notable Singaporean researchers/engineers doing great work in AI and LLMs. He helped train Reka's (@RekaAILabs) series of OP models (Core, Flash, Edge) so he deserves to be on this list! 🔥

Yi Tay @YiTayML

a year ago

27 38 326 120K 123

2 4 26 16K 8

Yi Tay @YiTayML

2 weeks ago

It's been a wild ride. Just 20 of us, burning through thousands of H100s over the past months, we're glad to finally share this with the world! 💪 One of the goals we’ve had when starting Reka was to build cool innovative models at the frontier. Reaching GPT-4/Opus level was a…

Reka @RekaAILabs

2 weeks ago

44 224 1K 631K 443

66 91 930 190K 296

Karim @KarimBhalwani

a week ago

It's inspiring to see what a small team can accomplish in such a short period of time. @RekaAILabs, an enterprise multimodal LLM company, has only had access to 90% of their compute for the past 4 months, but that hasn't stopped the brilliant team of 20 from going head-to-head in…

1 13 57 22K 28

Yi Tay @YiTayML

2 weeks ago

Didn't get much chance to share this yesterday with everything else going on with the Reka Core launch, but here's the most non-cherry-picked showcase of Reka Core vs GPT-4 vs Claude Opus on multimodal chat tasks. 👇 We put together this showcase with examples our team created.…

5 13 89 18K 23

Reka @RekaAILabs

2 weeks ago

Meet Reka Core, our best and most capable multimodal language model yet. 🔮 It’s been a busy few months training this model and we are glad to finally ship it! 💪 Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body…

44 224 1K 631K 443

Teortaxes▶️ @teortaxesTex

2 weeks ago

Feels legit. I might prefer Reka Core's multimodal performance to 1.5 too.

Piotr Padlewski @PiotrPadlewski

2 weeks ago

3 4 19 6K 2

1 3 18 5K 6

小猫遊りょう（たかにゃし・りょう） @jaguring1

2 weeks ago

Organizations that have managed to build top-class language models as of now: ① OpenAI (GPT-4) ② Google (Gemini Ultra, Gemini 1.5 Pro) ③ Anthropic (Claude 3 Opus) ④ Inflection AI (Inflection 2.5) ⑤ Reka (Reka Core) ⑥ xAI (Grok-1.5) ⑦ Mistral (Mistral large). Meta may join the list with the upcoming LLaMA 3.

5 75 408 50K 201

Yi Tay @YiTayML

2 weeks ago

research is an immensely taxing endeavour. hours spent doing IC work, debugging and what not. a paper is a canvas for researchers to express themselves after all the hard work, at the end of the day. it's my art. at least let me paint the way i want to paint. The reason why i am…

Teortaxes▶️ @teortaxesTex

2 weeks ago

1 6 71 55K 56

7 19 240 48K 104

Aran Komatsuzaki @arankomatsuzaki

95K Followers 78 Following @TeraflopAI

Jason Wei @_jasonwei

56K Followers 490 Following ai researcher @openai

Lucas Beyer (bl16) @giffmana

56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as lb@sigmoid.social

Delip Rao e/σ @deliprao

46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Rosanne Liu @savvyRL

33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR

Akari Asai @AkariAsai

11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-) . ☕️ 🐕 🏃‍♀️🧗‍♀️🍳

Graham Neubig @gneubig

31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.

Kyunghyun Cho @kchonyc

61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).

rohan anil @_arohan_

12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.

Shane Gu @shaneguML

28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Yao Fu @Francis_YAO_

13K Followers 2K Following PhD @EdinburghNLP on LLMs and Machine Reasoning. Ex. @Columbia @PKU1898 @MITIBMLab @allen_ai AGI has yet to come, so keep running

Aakanksha Chowdhery @achowdhery

7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change

Denny Zhou @denny_zhou

9K Followers 420 Following @GoogleDeepMind founder & lead of Reasoning Team. Build LLMs to reason. Opinions my own.

Ethan Caballero is bu.. @ethanCaballero

8K Followers 2K Following ML PhD student @Mila_Quebec ; previously @GoogleDeepMind

Colin Raffel @colinraffel

30K Followers 654 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp

Behnam Neyshabur @bneyshabur

18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking

Douwe Kiela @douwekiela

10K Followers 378 Following @ContextualAI CEO, @Stanford Adjunct Prof.

Wenhu Chen @WenhuChen

11K Followers 520 Following AI researcher @UWaterloo @GoogleAI @VectorInst. Interested in natural language processing, diffusion models. I direct TIGER-Lab at UWaterloo.

VIGNESH @YaadavVignesh

22 Followers 339 Following Caress of the stories I failed to write

libingchen @libingchen13619

12 Followers 37 Following

elbert @elbert866777443

11 Followers 40 Following

Sonal garg @phoenix__28

25 Followers 253 Following Contributing a bit to the nation 🇮🇳

🎥 Aiography @aiography_ai

12 Followers 67 Following Diving deep into AI creativity, focusing on video & image generation. Exploring cutting-edge tech and tools. Discovering new dimensions of visual storytelling.

Humam @Humam35676679

12 Followers 411 Following

Sanjana Prasad @sanjanpra2k01

Haoyuan Huang @HaoyuanHuang22

2 Followers 47 Following

Kyle @ksaieng

33 Followers 79 Following

Nir Peled @_nir_peled

73 Followers 313 Following

Secosoez @secosoez25339

0 Followers 54 Following

renAI Lab @renAI_Lab101

0 Followers 15 Following renAI Lab

SivaKesava @___skesava

0 Followers 1K Following

mixedsignal @mixedsignal

5 Followers 35 Following

Ali Naqvi @1NaqviAli

2 Followers 12 Following First-year MSc student at McMaster University studying evolutionary computation and ml.

shubhang @s_bhatnagar_tw

34 Followers 193 Following Computer Vision PhD Student @ECEILLINOIS, Undergrad @iitbombay

Riz @mriyaz

419 Followers 1K Following Entrepreneur • Generative AI • LLMs • GPTs • AI Agents

John Thilén @JohnThilen

2 Followers 344 Following

CG @ecomgal

110 Followers 303 Following we live in exciting times

ycao @ycao01

101 Followers 683 Following "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness." Charles Dickens, A Tale of Two Cities

Kitsune @kitsunefolk

22 Followers 136 Following Just reposting AI news, papers, important info.

Trunkboy PeeZ @P10895Peez

4 Followers 204 Following

Eric Huang @EricHuang4312

1 Followers 52 Following

Jaehee Kim @Jaehee_kim_NLP

0 Followers 34 Following

David Tan @tanxw1995

0 Followers 145 Following

Steven Caliari @0B4Bq

1 Followers 50 Following

Swarup Dwivedy @swarup5662

9 Followers 43 Following

Gorsave @gorsave

2 Followers 58 Following doorbuster doorbuster

pactrd @pactrd

82 Followers 374 Following probably correct is good enough for me

Yellow Dot Cafe @yellowdotcafe

54 Followers 42 Following

Baiyang Dai @pmcesky

18 Followers 267 Following Graduate student @UChicago.

Ethan Chan @EthanCh05696449

10 Followers 104 Following

歪门正道 @bushiwu5

2 Followers 12 Following

Electronicsseeker @libertarian108

6 Followers 889 Following

ko @code_and_ram

21 Followers 138 Following If you hear me screaming bloody murder, there’s a good chance I’m just enjoying myself.

Miguel_Pedraza @CabezaDespejada

56 Followers 2K Following

Gabriel Fiastre @gabfstr

3 Followers 40 Following

Zeeshan khan @zeeshank95

41 Followers 261 Following PhD @Inria Willow and @ENS_ULM

Alan Akil @AlanAkil

59 Followers 400 Following PhD in Mathematics, @UHouston

Michael Zolotov @mzolotov_alt

8 Followers 83 Following

Jun Zeng @junzengx14

300 Followers 158 Following SDE @Cruise, Ph.D. in Robotics @UCBerkeley, X2014 @Polytechnique. Love mathematics, robotics and programming.

The Lord Keynes @TheLordKeynes

69 Followers 182 Following Number Go Up

Yihuai Hong @YihuaiH91773

26 Followers 136 Following CS Undergraduate interested in NLP research @SCUT Research Intern in @UCL

Roger Wang @rogerw0108

17 Followers 46 Following Flowers and friendship | ML Platform & Infra @Roblox

Akinropo Taiwo @taiwo_akinropo

488 Followers 1K Following Building @HeyfoodAfrica(https://t.co/ihx9UEkXhp)

Aravind Ramarathinam @aravr

71 Followers 200 Following Is experience like a comb that you get once you are bald? 🤔

Ashant Chalasani @ashant

88 Followers 140 Following

Le Yu @YuLe57423534941

37 Followers 59 Following https://t.co/o4DeUgRmpQ

Lannister @Lannister998

5 Followers 99 Following

mateorivera @mateorivera289

3 Followers 177 Following I Love Twitter all the time

Aran Komatsuzaki @arankomatsuzaki

95K Followers 78 Following @TeraflopAI

Jason Wei @_jasonwei

56K Followers 490 Following ai researcher @openai

Lucas Beyer (bl16) @giffmana

56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as lb@sigmoid.social

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Rosanne Liu @savvyRL

33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR

Graham Neubig @gneubig

31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.

Kyunghyun Cho @kchonyc

61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).

rohan anil @_arohan_

12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.

Shane Gu @shaneguML

28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Yao Fu @Francis_YAO_

13K Followers 2K Following PhD @EdinburghNLP on LLMs and Machine Reasoning. Ex. @Columbia @PKU1898 @MITIBMLab @allen_ai AGI has yet to come, so keep running

Aakanksha Chowdhery @achowdhery

7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change

Denny Zhou @denny_zhou

9K Followers 420 Following @GoogleDeepMind founder & lead of Reasoning Team. Build LLMs to reason. Opinions my own.

Colin Raffel @colinraffel

30K Followers 654 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp

Sewon Min @sewon__min

7K Followers 642 Following PhD student at @uwcse @uwnlp

Sara Hooker @sarahookr

39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.

Thomas Wolf @Thom_Wolf

68K Followers 4K Following Co-founder and CSO @HuggingFace - open-source and open-science

William Fedus @LiamFedus

12K Followers 920 Following @OpenAI

Sebastian Gehrmann @sebgehr

5K Followers 2K Following Head of NLP, CTO office, @Bloomberg. (he/him) Generating natural language, one word at a time. Also making sense of that language afterwards. views my own

Mahesh Sathiamoorthy @madiator

9K Followers 930 Following LLMs and Data. Discuss about data for LLMs: https://t.co/x4iAft5cHV Ex-GoogleDeepMind

Andrej Karpathy @karpathy

978K Followers 904 Following 🧑‍🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥

Noam Shazeer @NoamShazeer

5K Followers 12 Following Engineer

Afroz Mohiuddin @afrozenator

1K Followers 5K Following Research Engineer at Google Brain. Interested in Science, Psychology, Investing, Design and generally almost everything. Good Thoughts, Good Words, Good Deeds.

Swaroop Mishra @Swarooprm7

5K Followers 893 Following Research Scientist @GoogleDeepMind (Gemini). Pioneering LLM Research 🔥. Instruction tuning, Factuality, Reasoning and next gen Product. Opinions my own.

Piotr Padlewski @PiotrPadlewski

1K Followers 319 Following Chief Meme Officer @ https://t.co/CtBrcKmliI, ex-Google Deepmind/Brain Zurich

Max Bain @maxhbain

2K Followers 498 Following multimodal @RekaAILabs | prev: phd @Oxford_VGG hardwork-pilled

Steven Zheng @HuaixiuZheng

171 Followers 60 Following Trained in quantum computing and quantum physics, LLM research in Google DeepMind

Che Zheng @xvblack

110 Followers 155 Following Member of Technical Staff at @RekaAILabs . Past: @Google, @Official_Kwai

Qi Liu @leuchine

381 Followers 402 Following Cofounder @RekaAILabs, Assistant Professor @HKUniversity Past: @DeepMind, FAIR (@MetaAI), @MSFTResearch, PhD @UniofOxford

Karina Nguyen @karinanguyen_

12K Followers 648 Following AI research & eng @AnthropicAI, prev. intern @nytimes, @square, @dropbox

Reka @RekaAILabs

11K Followers 13 Following An AI research and product company 🫠. We are a team of scientists and engineers building state-of-the-art multimodal language models 😻

Matt Henderson @matthen2

79K Followers 2K Following maths, visualisations, conversational AI. currently @RekaAILabs - previously: @Apple AI/ML, @PolyAI, @GoogleAI, MSc @EdinburghUni, PhD @Cambridge_Eng

yi 🦛 @agihippo

3K Followers 81 Following secondary account, hardcore fans only. friend of @agikoala the great researcher, main account: @yitayml warning: hot takes.

jason @agikoala

2K Followers 24 Following secondary account (main is @_jasonwei) @agihippo is a buddy of mine

Guillaume Lample @GuillaumeLample

37K Followers 648 Following Cofounder & Chief Scientist https://t.co/hLfvKLkFHd (@MistralAI). Working on LLMs. Ex @MetaAI | PhD @Sorbonne_Univ_ | MSc @CarnegieMellon | X11 @Polytechnique

Armand Joulin @armandjoulin

4K Followers 344 Following principal researcher, @googledeepmind. ex director of emea at fair @metaai. mostly work on open projects: fasttext, dino, llama, gemma.

Tengyu Ma @tengyuma

25K Followers 512 Following Assistant professor at Stanford; Co-founder of Voyage AI (https://t.co/wpIITHLgF0) ; Working on ML, DL, RL, LLMs, and their theory.

Jerry Wei @JerryWeiAI

5K Followers 261 Following 🧐 Improving and aligning large language models 🧠 Research Engineer @GoogleDeepMind ⏰ Past: @Stanford, @Google Brain

Adam Roberts @ada_rob

7K Followers 646 Following ai researcher @ Google DeepMind :: ♫ (MusicVAE, NSynth, MusicLM, SingSong) & 📝 (T5, PaLM) & :: t5x & seqio // recovering comp biologist

Mikel Artetxe @artetxem

6K Followers 221 Following Co-founder @RekaAILabs and Honorary Researcher @IxaGroup (University of the Basque Country) | Past: Research Scientist @AIatMeta (FAIR)

Derek Zhiyuan Cheng @infolaber

491 Followers 847 Following Principal Engineer / Research Director at Google DeepMind. Formerly Google Brain, Pinterest, and Texas A&M.

Dan Zhang @DZhang50

2K Followers 780 Following Researcher @ Google DeepMind | ML for Systems | Systems for ML | Computer Architecture PhD @ UT Austin🤘 | Opinions stated here are my own.

Maarten Bosma @MaartenBosma

1K Followers 892 Following Microsoft AI, ex @InflectionAI, @GoogleBrain

Le Hou @Hou_Le

1K Followers 135 Following Computer Sciencer, Transformer, StarCrafter.

rishi @RishiBommasani

4K Followers 2K Following Stanford CS PhD @StanfordCRFM @StanfordNLP @StanfordAILab @StanfordHAI Advisers: @percyliang @jurafsky Previous: @CornellCIS @clairecardie #FoundationModels

David R. So @david_r_so

263 Followers 55 Following Research @ Google Brain.

Shayne Longpre @ShayneRedford

4K Followers 998 Following PhD @MIT. Prev: @Google Brain, @apple ML, @stanfordnlp. 🇨🇦 Interests: AI/ML/NLP, Data-centric AI, transparency & societal impact

Xinyun Chen @xinyun_chen_

4K Followers 840 Following Research Scientist at @GoogleDeepMind. PhD from @Berkeley_EECS.

Stephanie Chan @scychan_brains

3K Followers 2K Following Senior Research Scientist at DeepMind. Artificial and biological brains 🤖 🧠 Views are my own

Luke Zettlemoyer @LukeZettlemoyer

8K Followers 2K Following

Jiahui Yu @jhyuxm

2K Followers 777 Following Member of Technical Staff @OpenAI; previously Research Scientist at Google Brain/DeepMind.

Jeremiah Harmsen @JeremiahHarmsen

1K Followers 488 Following Creator of #TensorFlowHub and @TensorFlow Serving. Lead in Google Brain.

Siamak Shakeri @MaxSonate

314 Followers 263 Following Engineer at Google, Working on Language Models. Snowboarding and traveling when not working

Christian Szegedy @ChrSzegedy

32K Followers 2K Following #deeplearning, #ai research scientist. Opinions are mine.

Pang Wei Koh @PangWeiKoh

3K Followers 789 Following Assistant professor at @uwcse. Formerly @StanfordAILab @GoogleAI @Coursera. 🇸🇬

Xiaohua Zhai @XiaohuaZhai

3K Followers 206 Following Senior Staff Researcher @GoogleDeepMind team in Zürich

Barret Zoph @barret_zoph

10K Followers 880 Following @openai Past: Research Scientist at Google Brain.

Strategy, Programs & Product @GoogleAI , HCI Researcher. Ph.D @CityUniLondon Alumni @iift1963 @daiictofficial. Personal views.

Divy Thakkar @divy93t

5K Followers 2K Following Strategy, Programs & Product @GoogleAI , HCI Researcher. Ph.D @CityUniLondon Alumni @iift1963 @daiictofficial. Personal views.

Kristina Toutanova @toutanova

878 Followers 207 Following

Ashish Vaswani @ashVaswani

19K Followers 2K Following

Eric Jang @ericjang11

8 hours ago

if your transformers struggle with NaNs after a certain parameter size, you may be under a sophon lock. Keep pushing, don't let them win!

6 9 137 12K 17

Ruibo Liu @RuiboLiu

a day ago

@YiTayML @_jasonwei I just checked that doc. The closest guess was from @hwchung27 actually. Everyone else was just so wrong ...

1 0 2 457 0

Jimmy Sticks @loss_gobbler

a day ago

@YiTayML IIRC Bridgewater used to do stuff vaguely similar to this

0 0 3 1K 0

Yao Fu @Francis_YAO_

2 days ago

Cannot agree more. My intuition is that FFN is for storing knowledge (this is why most knowledge editing is on FFNs) and Attention is for implementing algorithms (this is why most mechanistic interpretability, e.g., induction heads, is on Attn). Additionally, it seems that…

Yi Tay @YiTayML

2 days ago

not true, especially for language. if you trained a large & deep MLP language model with no self-attention, no matter how much data you feed it you'll still be lagging behind a transformer (with much less data). will it get to the same point? i don't think so. your tokens…

32 61 645 192K 298

0 26 140 21K 97

Jason Wei @_jasonwei

2 days ago

In AI research there is tremendous value in intuitions on what makes things work. In fact, this skill is what makes “yolo runs” successful, and can accelerate your team tremendously. However, there’s no track record on how good someone’s intuition is. A fun way to do this is…

18 34 427 131K 202

Yuze @YuzeMa5

2 days ago

@YiTayML Congrats! Looking forward to the apps building on top of it!

1 0 1 25 0

Lucas Beyer (bl16) @giffmana

2 days ago

> be me > on vacation > kid asleep, wife away > but I'm not tired! > whip out colab > load my model > import new benchmark > try my model > tfw sota, sota by far > double-check for bugs or leaks > no bug found > no leak found idk man, probably a bug. Also, twitter is reddit now.

10 5 206 20K 20

Arturo Deza @ArtDeza

2 days ago

@YiTayML Exactly. I don't see OpenAI or any other company training a 2 layer fully connected neural network with SGD to do Vision and throwing it a trillion data points "because data is all you need".

0 0 3 336 0

yi 🦛 @agihippo

2 days ago

Happy to have 4 papers accepted to idgaf 2024. 🎉

0 0 26 2K 1

Hamid R. Darabi @_hdarabi

2 days ago

@YiTayML I was about to say you should not demean others' papers like that 😜

1 0 1 66 0

Ruibo Liu @RuiboLiu

2 days ago

@YiTayML The hot take version of this is: Google does the real architecture research, while other companies take it for granted. All these companies are basically "data companies".

0 0 9 2K 1

ëugene kharitonov 🏴‍☠️ @n0mad_0

2 days ago

I am too uncomfortable w/ this "data is everything" maximalism. Not all archs have favourable scaling laws, are easy to train at large scale, etc etc

Yi Tay @YiTayML

2 days ago

32 61 645 192K 298

0 0 3 595 0

Vivek Raghunathan @vivek7ue

2 days ago

More wisdom from @YiTayML

Yi Tay @YiTayML

2 days ago

32 61 645 192K 298

1 0 3 4K 1

Jason Phang @zhansheng

3 days ago

Feeling cute, might burn money on random tech gadget

1 0 6 703 0

Kaizhao Liang @KyleLiang5

2 days ago

@YiTayML Absolutely, I tried a version of mlpmixer on language hoping to find something different from self-attention, the performance was horrible and it lacks basic abilities to generalize even on the simplest associative recall tasks…

1 0 7 2K 0

Fuzhao Xue @XueFz

2 days ago

I always strongly suggest that people read this work (arxiv.org/abs/2207.10551) by @YiTayML and @m__dehghani when discussing model architecture. It takes up almost 50% of the pages of the literature survey chapter in my PhD thesis. It is so visionary to study this in 2022. I can…