lmsys.org @lmsysorg

Large Model Systems Organization. We created Vicuna and Chatbot Arena! Compare 30+ LLMs (GPT-4/Claude/Llamas) side-by-side at https://t.co/IDFeIDIOtm lmsys.org US Joined March 2023

369 Tweets 32K Followers 170 Following 536 Likes

lmsys.org @lmsysorg

6 days ago

Yes, check out @RekaAILabs's strong Flash-21B model!

Mikel Artetxe @artetxem

a week ago

2 other models worth highlighting 😉 @RekaAILabs Flash 21B is very strong for its size! 💪

2 6 58 41K 18

4 11 80 22K 13

карельский .. @fakemath

616 Followers 675 Following anhedonist, anhedon, anhedonych

Anette Ellie Larsson @zero09070907

28 Followers 149 Following Life is beautiful. Love is cruel.

badboy @badboyjohnny2

36 Followers 72 Following You wanna do what you just did, but you already did it, so you can't do that

Jacob Allen Guffey (G.. @checkmate812

1K Followers 7K Following 𝐼𝓉 𝓅𝓊𝓉𝓈 𝓉𝒽𝑒 𝓁𝒾𝓂𝑒 𝒾𝓃 𝓉𝒽𝑒 𝒸𝑜𝒸𝑜𝓃𝓊𝓉

Shaobo Zhang @shaobo76

40 Followers 61 Following SJTU EEer, Trader

Alexandr Feoktistov @FeoktistovNet

6 Followers 27 Following Entrepreneur

Mikhail Matveenko @MikhailMaTV

16 Followers 428 Following

Ubon-Obong Jeremiah U.. @UbObong2341

305 Followers 2K Following Optimist|| Electrical/Electronics Engineering Student || Mathematics/Science Tutor ||

irfreeman @_mutaga

293 Followers 781 Following ~ Verilog, C, and Rust at @yale Searching for the better Noryve? Follow @kaganwa_musore

wadrian psicólogo @wadrianHPsico

239 Followers 2K Following Psychologist ❤️

adam ber @adamber11

64 Followers 303 Following Full time freelancer🙌🏻 Click the link to learn how to quit your 9-5 ✨

Tim Vogel @TimVogelTesla

28 Followers 221 Following hi, I'm Tim and Samele are donating for a Telsa for my family Paypal: https://t.co/inyaUVuUhY

Shedrach Stephen @StephenShe68907

57 Followers 205 Following

Khalid AlQahtani @KhalidAlqh55901

0 Followers 267 Following

Alexis Urusoff @elurusoff

179 Followers 2K Following Professional Passionist Inflamer. From Córdoba. Son of a Spanish mother and a Russian-Guaraní father. @narkocibernético Krishna leads.

Robertomixaudio @robertomix60433

3 Followers 45 Following

Al Quran @RoommiClassroom

40 Followers 50 Following voice of Quran

Ambient Earth @AmbientEarthv

16 Followers 77 Following Ambient Earth - Improve focus, relaxation and tranquility with VirtuScapes that harness the soothing sounds of the earth. 🌿🎶 #ambientearth #focus #rest #peace

BanKai @MaguahBankai

165 Followers 3K Following Adult Content Creator.(NSFW)

Shishir Joshi @theshishirjoshi

15 Followers 518 Following

Jidenna @Jidennajidenna

27 Followers 785 Following Never met Jidena

kouseinen @kouseinen_real

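2K Followers 2K Following Prompt Engineer / Manager / Product Manager. PM at a large IT company → PdM at a subsidiary of a large company → PdM & corporate planning at a startup → prompt engineer at a publicly listed IT company. Produce @AInews_trend #AI #generative #ChatGPT #prompt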

Vanderley Nichelle @UnincoUniver

42 Followers 415 Following Polishing is a thing of the past!

Daniel Neburagho @DNeburagho12135

45 Followers 159 Following

Amjad @Ad31571973

76 Followers 266 Following "Your mind is limitless; it is your doubts that limit you."

@hugoiarce @hugoiarce

21 Followers 386 Following CEO and Founder of https://t.co/3wdMl6qO2I +54 9 3434158123

pretoria2020 @pretoria2020

346 Followers 2K Following Si vis pacem, para bellum If you want peace, prepare for war #NoJusticeNoPeace #Resistance2021

Rohit kumar barada @Rohit_ku_1

11 Followers 132 Following travel enthusiast & web developer

AI at Meta @AIatMeta

532K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.

Hyperbolic @hyperbolic_labs

3K Followers 43 Following Realize your vision for AI with open access to more than just compute. Join our discord: https://t.co/SaGT3y9AtE

Lisa Dunlap @lisabdunlap

487 Followers 154 Following PhD student & vibe curator @berkeley_ai and Sky Computing Lab -- for the love of god look at your data

Naman Jain @StringChaos

901 Followers 896 Following CS PhD @UCBerkeley | Projects - R2E, LiveCodeBench, Chatbot-Arena Coding, RAFT, Data Quality | Past: @AWS @MSFTResearch @iitbombay

Simon Mo @simon_mo_

339 Followers 303 Following Working on System for ML @ucbrise. Happy to get in touch: https://t.co/ACIbL2HqBr at https://t.co/FWFXdUDDMp (ex-@anyscalecompute)

Hao AI Lab @haoailab

356 Followers 136 Following Hao AI Lab at UCSD. Our mission is to democratize large machine learning models, algorithms, and their underlying systems.

Jeff Dean (@🏡) @JeffDean

296K Followers 6K Following Chief Scientist, Google DeepMind and Google Research. Co-designer/implementor of things like @TensorFlow, MapReduce, Bigtable, Spanner, Gemini .. (he/him)

Zhiqiang @Zhiqiang_Xie

423 Followers 390 Following Ph.D. Student @Stanford CS

Banghua Zhu @BanghuaZ

2K Followers 802 Following PhD @Berkeley_EECS, statistics, info theory, LLM, RL, Human-AI Interactions.

Liangsheng Yin @lsyincs

47 Followers 153 Following Undergraduate in SJTU, ACM Honor Class 2021. Interested in mlsys | machine learning | distributed systems

Logan Kilpatrick @OfficialLoganK

92K Followers 2K Following Lead product for @Google AI Studio and working on the Gemini API, helping developers build with AI, my views!

Tengyu Ma @tengyuma

25K Followers 512 Following Assistant professor at Stanford; Co-founder of Voyage AI (https://t.co/wpIITHLgF0) ; Working on ML, DL, RL, LLMs, and their theory.

Christopher Manning @chrmanning

126K Followers 115 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋

Anastasios Nikolas An.. @ml_angelopoulos

3K Followers 784 Following @Berkeley_EECS Ph.D. with Mike Jordan/Jitendra Malik. Conformal prediction, distribution-free uncertainty quantification, vision/imaging. Former @stanford_ee.

Daniel Gross @danielgross

94K Followers 1 Following https://t.co/NZsHpnOzcn

Arthur Mensch @arthurmensch

40K Followers 872 Following Co-founder and CEO @MistralAI. Apply https://t.co/yHGRZAtjcx

Ce Zhang @ce_zhang

2K Followers 1K Following CTO @ Together @togethercompute

James Zou @james_y_zou

10K Followers 59 Following @Stanford professor. Chan-Zuckerberg investigator. Sloan Fellow. AI for biotech + health. Making AI more trustworthy, reliable and human compatible.

σ(W_hx * x_t + W_hh .. @QNixSynapse

94 Followers 85 Following Created this account to keep an eye of all things #ML Part time researcher of Natural stupidity in Artificial Intelligence

Fireworks AI @FireworksAI_HQ

5K Followers 65 Following 🎆 Generative AI Platform built for developers

Haotian Liu @imhaotian

6K Followers 396 Following building intelligence @xAI, creator of #LLaVA, cs @UWMadison, prev @MSFTResearch

Mistral AI @MistralAI

90K Followers 0 Following Fast, open-source and secure language models. Join us https://t.co/INALdNGvCP

apolinario (multimoda.. @multimodalart

10K Followers 376 Following ML for Art and Creativity, working @HuggingFace (apolinario@multimodal.art)

Allen Institute for A.. @allen_ai

54K Followers 361 Following AI for the Common Good. › Join us: https://t.co/DqTs1G4bGO › Get our newsletter: https://t.co/tvb1VpySfL

Yang Song @DrYangSong

10K Followers 886 Following Leading the Strategic Explorations team @OpenAI. Score-Based Models. Diffusion Models. Consistency Models.

Xuechen Li @lxuechen

2K Followers 900 Following Building intelligence @xai. PhD @Stanford. Undergrad @UofT. Worked at @GoogleAI @MSFTResearch @Vectorinst. I go by Chen.

Andrew Ng @AndrewYNg

1.0M Followers 912 Following Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning, #deeplearning #MOOCs

Sam Altman @sama

2.8M Followers 891 Following AI is cool i guess

Greg Brockman @gdb

667K Followers 51 Following President & Co-Founder @OpenAI

John Schulman @johnschulman2

39K Followers 609 Following Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music

Aravind Srinivas @AravSrinivas

86K Followers 943 Following CEO @perplexity_ai

Yi-01.AI @01AI_Yi

5K Followers 8 Following A global company building AI 2.0 platform and applications

Shiyi Cao @shiyi_c98

396 Followers 361 Following PhD student @UCBerkeley, MSc @ETH, B.S @sjtu1896, systems, ml, and hpc

martin_casado @martin_casado

50K Followers 2K Following GP @ a16z ... questionable heuristics in a grossly underdetermined world

Yang You @YangYou1991

8K Followers 386 Following Presidential Young Professor at @NUSingapore. @Forbes 30 under 30. Ph.D. from @UCBerkeley. Founder, President and Chairman of @HPCAITech and Colossal-AI.

Song Han @songhan_mit

6K Followers 144 Following Assoc. Prof. @MIT, Distinguished Scientist @NVIDIA, cofounder of DeePhi (now part of AMD) and OmniML (now part of NVIDIA). PhD @Stanford. Efficient AI computing

Kai-Fu Lee @kaifulee

1.5M Followers 658 Following #AI Expert, CEO of @01ai_yi and Chairman of 创新工场 @sinovationvc, former President of Google China, Author of AI 2041 and NYT Bestseller AI Superpowers

Kaichun Mo @KaichunMo

3K Followers 879 Following Research Scientist at NVIDIA Seattle Robotics Lab; Previously CS Ph.D. from Stanford

Philipp Schmid @_philschmid

16K Followers 651 Following Tech Lead and LLMs at @huggingface 👨🏻‍💻 🤗 AWS ML Hero 🦸🏻 | Cloud & ML enthusiast | 📍Nuremberg | 🇩🇪 https://t.co/l1ppq3q3hk

Tianyi Zhang @Tianyi_Zh

1K Followers 613 Following iterating ... I used to train more language models but am working on agents now

Eric Wallace @Eric_Wallace_

6K Followers 1K Following Researcher at OpenAI working to make language models more trustworthy, secure, and private.

Yuandong Tian @tydsh

16K Followers 801 Following Research Scientist and Senior Manager in Meta AI (FAIR). AI-guided Optimization and Representation Learning. Novelist in spare time. PhD in @CMU_Robotics.

Zhijian Liu @zhijianliu_

695 Followers 600 Following PhD Student at @MIT. Focusing on efficient algorithms and systems for deep learning.

Guido Appenzeller @appenz

7K Followers 198 Following At a16z investing in AI & Infra. 2x founder & CEO. CTO at Intel & VMware. CPO at Yubico. Tweets are my own.

Zhuang Liu @liuzhuang1234

3K Followers 931 Following Research Scientist @MetaAI (FAIR, at NYC). machine learning, computer vision, neural networks. PhD from @Berkeley_EECS

Audrey Cheng @audreyccheng

500 Followers 126 Following CS PhD Student @ucbrise, undergrad @Princeton. Excited about transactions and databases in general!

Yangqing Jia @jiayq

12K Followers 263 Following Founder @leptonai. @UCBerkeley alumni. ex @google & @facebook. ex vp @AlibabaGroup. Open source work on caffe, @pytorch, @tensorflow, & @onnxai.

Yann Dubois @yanndubs

4K Followers 1K Following PhD student @stanfordAILab | Prev: AI resident @metaai, @vectorinst, @CambridgeMLG

OpenAI @OpenAI

3.4M Followers 0 Following OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity. We’re hiring: https://t.co/dJGr6LgzPA

Lisa Dunlap @lisabdunlap

2 days ago

After a grueling few days of having to click accept to view the chatbot leaderboard, we put it back on HF :p huggingface.co/spaces/lmsys/c…

0 6 27 42K 4

Ahmad Al-Dahle @Ahmad_Al_Dahle

3 days ago

What a week since we released Llama 3! I couldn’t be more proud of the response. 🏆 Llama 3 70B is now the highest ranking open model on @lmsysorg leaderboard. 📈 1.2M+ downloads. 🤗 600+ derivative models on @huggingface. I'm excited for much more to come.

18 21 235 30K 19

Xeophon @TheXeophon

4 days ago

LMsys added phi-3-128K into the arena. Got it in my comparisons. Excited to see where it’ll be placed

1 2 14 4K 2

rohan anil @_arohan_

4 days ago

Such a good service - open leaderboards!

Xeophon @TheXeophon

4 days ago

LMsys added phi-3-128K into the arena. Got it in my comparisons. Excited to see where it’ll be placed

1 2 14 4K 2

0 0 17 3K 0

Oriol Vinyals @OriolVinyalsML

5 days ago

Gemini 1.5 Pro has entered the (LMSys) Arena! Some highlights: -The only "mid" tier model at the highest level alongside "top" tier models from OpenAI and Anthropic ♊️ -The model excels at multimodal, and long context (not measured here) 🐍 -This model is also state-of-the-art…

lmsys.org @lmsysorg

6 days ago

More exciting news today -- Gemini 1.5 Pro result is out! Gemini 1.5 Pro API-0409-preview now achieves #2 on the leaderboard, surpassing #3 GPT4-0125-preview to almost top-1! Gemini shows even stronger performance on longer prompts, in which it ranks joint #1 with the latest…

36 184 925 428K 170

8 51 273 93K 45

Logan Kilpatrick @OfficialLoganK

5 days ago

Gemini 1.5 Pro is now ranked #2 on @lmsysorg chat arena (and #1 for long context). More work to do, but excited we put this model into the hands of developers. The era of truly multimodal models has arrived 🚀

22 32 334 41K 39

rohan anil @_arohan_

6 days ago

@profjoeyg @lmsysorg Something like this. But 1. Unsquish elo number and [-] 2. Order top ones at top and elo towards right 3. Give gold, silver and bronze for good measure and a participation trophy for the smallest model that shows up here

1 0 1 304 0

rohan anil @_arohan_

6 days ago

@profjoeyg @lmsysorg I think maybe rotating the image by 90 degrees would make a drastic difference. Then align towards the right instead of the left. You would also not need to use icons. And you can color the groups based on clusters you see.

1 0 3 409 0

rohan anil @_arohan_

6 days ago

@profjoeyg @lmsysorg Plot above but with the labels for the elo not squished by the bar. And perhaps icons of the companies or teams along with the elo number since reading vertical text is hard.

1 0 3 222 0

Soumith Chintala @soumithchintala

6 days ago

Llama3-70B has settled at #5. With 405B still to come next... I remember when GPT-4 released in March 2023, it looked like it was nearly-impossible to get to the same performance. Since then, I've seen @Ahmad_Al_Dahle and the rest of the GenAI org in a chaotic rise to focus,…

lmsys.org @lmsysorg

6 days ago

Exciting update -- Llama-3 full result is out, now reaching top-5 on the Arena leaderboard🔥 We've got stable enough CIs with over 12K votes. No question now Llama-3 70B is the new king of open model. Its powerful 8B variant has also surpassed many larger-size models. What an…

30 164 1K 1.1M 306

17 48 681 115K 88

Yi Tay @YiTayML

6 days ago

Good to see that on @lmsysorg, our feb version of Reka Flash 21B ⚡️from @RekaAILabs, despite only being 21B dense parameters has competitive performance to other much larger models like Mixtral 8x22 and Mistral medium 🚀🔥 We'll have much better Flash and Core soon! 🦾

3 14 114 12K 7

@Toong @TianDatong

6 days ago

@lmsysorg A "hard" category in the leaderboard could be very good.😀

0 0 3 124 0

Braden Hancock @bradenjhancock

6 days ago

Such a welcome addition to the benchmarking landscape! - Creating benchmarks that correlate well with humans and separate top models has become increasingly hard as they’ve become increasingly capable. - And knowing it will be refreshed reduces the incentive for organizations…

lmsys.org @lmsysorg

7 days ago

Introducing Arena-Hard – a pipeline to build our next generation benchmarks with live Arena data. Highlights: - Significantly better separability than MT-bench (22.6% -> 87.4%) - Highest agreement to Chatbot Arena ranking (89.1%) - Fast & cheap to run ($25) - Frequent update…

19 122 638 119K 272

0 0 8 2K 1

Philipp Schmid @_philschmid

6 days ago

New Benchmark by @lmsysorg! 🏆 Arena-Hard is a new benchmark to automatically evaluate LLMs on 500 real-world use cases. Arena-Hard matches 89% of human preferences from the LMSYS chatbot arena using LLM-as-a-Judge. 🤯 TL;DR: 🥇 Outperforms other benchmarks like MT-Bench and…

10 21 144 22K 67

Mikel Artetxe @artetxem

a week ago

2 other models worth highlighting 😉 @RekaAILabs Flash 21B is very strong for its size! 💪

Philipp Schmid @_philschmid

a week ago

How good is @AIatMeta Llama 3 in real-world user scenarios?🤔 The early votes in @lmsysorg are in, and Llama-3 is the best open LLM, even outscoring @OpenAI GPT-4 (March) or @AnthropicAI Claude 3 Haiku! 👑 Llama 3 currently scores at 1199 in #7, only behind the latest @OpenAI…

11 28 166 56K 36

2 6 58 41K 18

Swaroop Mishra @Swarooprm7

7 days ago

@lmsysorg This is a cool initiative. How about you also introduce equitable evaluation and leaderboard customization since users of lmsys may have their own requirement too? An example implementation is here: arxiv.org/pdf/2106.05532…

1 2 4 1K 3

Jim Fan @DrJimFan

a week ago

Llama-3 is closing the gap with GPT-4, but multimodal models gotta catch up. Vision capabilities of open models like LlaVA are far, far behind GPT-4V. Video models are even worse. They hallucinate all the time and fail to give detailed descriptions of complex scenes and actions.…

46 108 854 135K 179

rohan anil @_arohan_

a week ago

Give it a try!

lmsys.org @lmsysorg

a week ago

Congrats @GoogleDeepMind on shipping Gemini 1.5 Pro to public review! Upon capacity & latency testing, we have now brought Gemini 1.5 Pro up to the Arena🤖 Big improvement from Pro 1.0 to 1.5 across the board, and exceptionally strong long context understanding. Come test and…