Aran Komatsuzaki @arankomatsuzaki
@TeraflopAI arankomatsuzaki.wordpress.com/about-me/ Joined November 2016
Tweets: 5K
Followers: 94K
Following: 78
Likes: 11K
Awesome to see @joespeez, AI Product Director, @Meta, mention our previous research, YaRN, on stage at the @weights_biases Fully Connected conference. We have another very exciting long-context release coming soon.
🚨New Paper🚨 We propose 1⃣CultureBank🌎 dataset sourced from TikTok & Reddit 2⃣An extensible pipeline to build cultural knowledge bases 3⃣Evaluation of LLMs’ cultural awareness 4⃣Insights into culturally-aware LLMs Project: culturebank.github.io Data: shorturl.at/hrtwP
Apple presents OpenELM - An efficient LM family with open-source training and inference framework - Performs on par with OLMo while requiring 2x fewer pre-training tokens repo: github.com/apple/corenet hf: huggingface.co/apple/OpenELM abs: arxiv.org/abs/2404.14619
SnapKV: LLM Knows What You are Looking for Before Generation - Automatically compresses KV caches - Consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency repo: github.com/FasterDecoding… abs: arxiv.org/abs/2404.14469
Twelve Labs presents Pegasus-v1 - Presents a multimodal LM specialized in video content understanding and interaction through natural language - Achieves SotA in video QA and various other video tasks and outperforms Gemini 1.5 Pro proj: twelvelabs.io/blog/upgrading… abs:…
Microsoft presents Multi-Head Mixture-of-Experts Achieves notable improvements over the baseline MoE by using multiple MoE heads repo: github.com/yushuiwx/MH-MoE abs: arxiv.org/abs/2404.15045
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Perfect Reasoners Improves the performance of GPT4 on GSM8K from 94.6% to 97.1% with a three-stage prompting arxiv.org/abs/2404.14963
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation Presents a zero-shot human-video generation approach that can perform personalized video generation given single reference facial image without further training proj: id-animator.github.io abs:…
Many thanks to Aran for sharing! @arankomatsuzaki Links are here: Code: github.com/dwzhu-pku/Long… Paper Page: huggingface.co/papers/2404.12… Benchmark: huggingface.co/datasets/dwzhu… Model: huggingface.co/dwzhu/e5rope-b…
Microsoft presents LongEmbed: Extending Embedding Models for Long Context Retrieval - Presents the LongEmbed benchmark for long context retrieval - Releases the E5-Base-4k and E5-RoPE-Base models repo: github.com/dwzhu-pku/Long… abs: arxiv.org/abs/2404.12096
A few caveats about Phi-3: - The figure I attached at the beginning had some errors. Here's the updated one. - Phi-3-medium performs well on TriviaQA but noticeably underperforms rel. to GPT-3.5. We can guess that Phi-3 recipe doesn't magically make it understand more random… https://t.co/qQKksyWGuD
ByteDance presents Hyper-SD Achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5 proj: hyper-sd.github.io abs: arxiv.org/abs/2404.13686
ByteDance presents Graphic Design with Large Multimodal Model Outperforms prior art and establishes a strong baseline for the field of graphic design repo: github.com/graphic-design… abs: arxiv.org/abs/2404.14368
Better Synthetic Data by Retrieving and Transforming Existing Datasets repo: github.com/neulab/prompt2… abs: arxiv.org/abs/2404.14361
Let's give er' a go!
Small models excite me.
Benchmarks, especially long-context embedding benchmarks, are few and far between. Great work done by @dwzhu128 and collaborators. Great seeing Nomic Embed stack up well against other long context models! Following some debugging of the evals, we reran the evals and upstreamed…
Microsoft presents LongEmbed: Extending Embedding Models for Long Context Retrieval - Presents the LongEmbed benchmark for long context retrieval - Releases the E5-Base-4k and E5-RoPE-Base models repo: github.com/dwzhu-pku/Long… abs: arxiv.org/abs/2404.12096
I always strongly suggest people to read this work (arxiv.org/abs/2207.10551) by @YiTayML and @m__dehghani when discussing the model architecture. It almost takes up to 50% pages of the literature survey Chapter in my PhD thesis. It is so visionary to study this in 2022. I can…
not true, especially for language. if you trained a large & deep MLP language model with no self-attention, no matter how much data you'll feed it you'll still be lagging behind a transformer (with much less data). will it get to the same point? i don't think so. your tokens…
Thank you @_akhaliq @ClementDelangue @Thom_Wolf @arankomatsuzaki @pcuenq @awnihannun and others for sharing our work.
A big thank you to @joespeez @Meta for mentioning our previous research, YaRN, at the @weights_biases Fully Connected conference. We have some exciting long-context releases coming up soon.
Awesome to see @joespeez, AI Product Director, @Meta, mention our previous research, YaRN, on stage at the @weights_biases Fully Connected conference. We have another very exciting long-context release coming soon.
In my humble opinion the recent Stream of Search paper (arxiv.org/abs/2404.03683) is truly outstanding. Everyone should give it a thorough read.
@natolambert @arankomatsuzaki @herbiebradley tbf, there are also a lot of “pythia”s, though not ones in the LLM subfield
Great to see others follow suit in releasing fully-open and documented LLMs!
Apple presents OpenELM - An efficient LM family with open-source training and inference framework - Performs on par with OLMo while requiring 2x fewer pre-training tokens repo: github.com/apple/corenet hf: huggingface.co/apple/OpenELM abs: arxiv.org/abs/2404.14619
🚨New Paper🚨 We propose 1⃣CultureBank🌎 dataset sourced from TikTok & Reddit 2⃣An extensible pipeline to build cultural knowledge bases 3⃣Evaluation of LLMs’ cultural awareness 4⃣Insights into culturally-aware LLMs Project: culturebank.github.io Data: shorturl.at/hrtwP
A modern version of Pythia? Curious how good the models are.
Apple presents OpenELM - An efficient LM family with open-source training and inference framework - Performs on par with OLMo while requiring 2x fewer pre-training tokens repo: github.com/apple/corenet hf: huggingface.co/apple/OpenELM abs: arxiv.org/abs/2404.14619
@haileysch__ @arankomatsuzaki @herbiebradley ooooooh noooooo github.com/CarperAI/OpenE…
@arankomatsuzaki heyyy, that name was taken already @herbiebradley
@letalvoj @arankomatsuzaki Hi Vojta, thanks for your comment. Actually, when working on the model, the first few cases we tested were on our own faces, which look quite good IMHO. Figure 6 in the paper also shows some ordinary faces. We are cleaning code for release and happy for you to test when ready
Let's go @Apple !! "Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including…
Apple presents OpenELM - An efficient LM family with open-source training and inference framework - Performs on par with OLMo while requiring 2x fewer pre-training tokens repo: github.com/apple/corenet hf: huggingface.co/apple/OpenELM abs: arxiv.org/abs/2404.14619
woah, cohere seriously cooked here
SnapKV: LLM Knows What You are Looking for Before Generation - Automatically compresses KV caches - Consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency repo: github.com/FasterDecoding… abs: arxiv.org/abs/2404.14469
phi-3 TLDR: Model trained with default 4K context length, but LongRoPE training is coming with 128K context length. Original size 1.8 GB. Able to run on an iPhone A16 Bionic chip using 4-bit quantization at a rate of 12 tokens/sec. Overall really good model with strong performance in…
Microsoft just released Phi-3 - phi-3-mini: 3.8B model trained on 3.3T tokens rivals Mixtral 8x7B and GPT-3.5 - phi-3-medium: 14B model trained on 4.8T tokens w/ 78% on MMLU and 8.9 on MT-bench arxiv.org/abs/2404.14219
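A rough sanity check on the size mentioned above: assuming the roughly 1.8 GB figure refers to phi-3-mini's 3.8B weights stored at 4 bits each, and ignoring quantization metadata such as scales and zero points (an assumption, not something the tweets state), the arithmetic can be sketched as follows. The helper name `quantized_weight_bytes` is hypothetical.

```python
def quantized_weight_bytes(n_params: float, bits_per_weight: int) -> float:
    """Bytes needed to store n_params weights at bits_per_weight bits each.

    Illustrative helper only: ignores per-group scales, zero points,
    and any layers kept at higher precision.
    """
    return n_params * bits_per_weight / 8


# Figures taken from the tweets: 3.8B parameters, 4-bit quantization.
size_bytes = quantized_weight_bytes(3.8e9, 4)

print(f"{size_bytes / 1e9:.2f} GB")     # decimal gigabytes
print(f"{size_bytes / 2**30:.2f} GiB")  # binary gibibytes
```

Under these assumptions the weights come to about 1.9 GB (about 1.77 GiB), which is in line with the quoted size.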
@arankomatsuzaki Awesome stuff. Makes you wonder how many people'll be testing this thanks to the small size, can't wait to see how Phi-3 performs outside of benchmarks.