bandish @bandish
Engineer @MosaicML, I work on making DL efficient and accessible.
San Francisco, CA
Joined December 2008
Tweets 150
Followers 208
Following 413
Likes 869
Want to talk AI research and best practices with the people working on it? The @DbrxMosaicAI research team is running meetups worldwide in May. linkedin.com/pulse/databric…
LLM evals are a mess! They are noisy, inconsistent, and contradictory. Scaling laws on the other hand have consistently held up to increasing scrutiny. Can we use the reliability of scaling laws to predict the quality of our eval benchmarks?
@SnowflakeDB Awesome work training such a big model with a permissive license! I think you had a mistake in your IFEval implementation, your reported number is less than 2x what we observe (though it does vary with inference server and sampling parameters). You should see in the high 60s
I'm co-organizing the inaugural research workshop on Compound AI Systems on June 13th: sites.google.com/view/compound-… . Send in your work on designing & optimizing such systems! Thrilled to have @RichardSocher, @MonicaSLam and @polynoamial as speakers, and host this at @Data_AI_Summit.
Llamas are fast
@AravSrinivas I don't think there's an improvement in compression, it's just more FLOPs... Like when you plot training FLOPs vs benchmark accuracy, Llama-3-8b-base is much worse than it could have been with the same compute (compare to llama-2-70b-base which is iso-FLOP) because the…
Hold on to your butts!
Congrats @AIatMeta team on the Llama3 release! This is a huge milestone for AI and the market... the performance of these models is a huge step compared to a year ago and gives users so much more flexibility on how they want to use, deploy, and customize AI
Fixed it for you, @code_star https://t.co/jrc6k7dZmb
Super excited to be on this panel discussing the journey behind DBRX!
📢TOMORROW! Join some of our amazing research team (@bandish @abhi_venigalla @davisblalock @ajaysaini725) online for a deep dive on #DBRX - hosted by @databricks DevOps guru @dennylee. Register now!
Excited to be on this super fun panel discussion with the amazing @abhi_venigalla @davisblalock @ajaysaini725 Demetrios and Denny! Make sure to tune in, we’re going to be sharing the journey that is DBRX!
DBRX by @databricks ...it's REALLY good!! The new 132B-parameter MoE model is open-source and cost $10M to train. Thank you, Databricks, for your contribution to OS. Check out the full explanation and testing: 🎥👇
Domain-specific benchmarks matter for enterprise, glad to see DBRX working well. @JuliaANeagu building some interesting enterprise evals!
Hi all, a few updates on MegaBlocks 🧵 github.com/databricks/meg…
Many people kindly pointed out that the y-axes were scaled differently for each of the three sections of this plot. Here's a corrected version. Still glad we tried to create something prettier than matplotlib, but sorry we didn't get it perfectly 🙂 https://t.co/cmiht2GSNX
DBRX dropped less than 5 hrs ago.... the pace of the open community is incredible
They hate us cause they ain’t us
Introducing DBRX! The best open source model trained on Databricks, the best enterprise AI platform! Super proud of the team on this huge accomplishment! 132B parameters MoE trained on over 12T tokens. Super high quality and it’s FAST!
FannyHoratio @Xq9yo68DpgL20e
0 Followers 227 Following
YvetteBulwer @y22tpYuRVqm623J
3 Followers 431 Following
Nikhil Thorat @nsthorat
10K Followers 2K Following Co-founder of Lilac AI (@lilac_ai), now joining @databricks. Past: Co-created TensorFlow.js and Know Your Data. Google Brain // PAIR // Responsible AI
MargueriteBird @SczRSoX9vG355DF
27 Followers 354 Following
KimberleySaul @Af74xQAcD2lQ6n
20 Followers 484 Following
Ffadout @ffadout34412
0 Followers 461 Following
Tarewhiosh @tarewhiosh2427
2 Followers 447 Following
Cyber Guy @CyberGuy1555863
6 Followers 8 Following
AlvaJordan @3FlDVhTvxRA3V
2 Followers 414 Following
Sarwreys @sarwreys6717
1 Followers 352 Following
Letaneight @letaneight6922
20 Followers 1K Following Nice to meet you. My hobbies are reading, food and sports. I like cats😘 I like to meet new friends while traveling🎉🎉🎉
Brett Larsen @_BrettLarsen
430 Followers 336 Following Sr. Research Scientist @DbrxMosaicAI | Guest Researcher @FlatironInst @NYU_CNS | Efficient deep learning + better algorithms for data science
Prometheus @citnotcitta
214 Followers 2K Following Making dreams, stealing 🔥(memes), tinkering with 🤖👾 @UWaterloo | 🇨🇦
Atharva Kshirsagar @anotherAtharva
93 Followers 751 Following MS CS @ucsd_cse. LLMsys @haoailab. Second Breakfast Acceptance Activist.
Jeff Ma @18jeffreyma
43 Followers 703 Following
Elara @Elara0770566961
14 Followers 3K Following May I be square, and may you be warm for all meals and all seasons.
Wei Shi @weishi
67 Followers 940 Following
Sara Hooker @sarahookr
39K Followers 8K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Cosmin Negruseri @cosminnegruseri
2K Followers 2K Following Chief Prompt Engineer at Stealth Startup, ex Pinterest Search / Homefeed, https://t.co/0VwMvjB9Xh, Altiscale, Google Ads, Search
Alpay Ariyak @AlpayAriyak
1K Followers 2K Following AI @RunPod_io | Lead: @OpenChatDev (600k+ downloads on HuggingFace🤗)
Sparsh Jain @sparshjain21
64 Followers 786 Following Research Intern @AI4Bharat, IIT Madras || Ex- Data Science Intern @Culinda || Data Science || ML enthusiast
Julia Neagu @JuliaANeagu
691 Followers 974 Following CEO & Co-Founder @QuotientAI ✨ formerly @GitHub @GitHubCopilot 🤖 reformed physicist 👩‍🔬 ~ opinions are my own ~
Tessa @tessybarton
744 Followers 729 Following Exploration agent. Research scientist at @MosaicML. Prev: @NYTimes
Kyle Wiggers @Kyle_L_Wiggers
65K Followers 4K Following Technology journalist. Senior Enterprise Reporter @TechCrunch. Pronouns: he/him. Mastodon: https://t.co/wesC0GePag
N N @mo13531
1 Followers 355 Following
SHT @dbohler
215 Followers 2K Following
Andrew Drozdov @mrdrozdov
2K Followers 1K Following RAG at @MosaicML x @Databricks 🧱 Prev: @UMass_NLP (PhD), @Google, @IBM
Vishal Goklani @vgoklani_ai
648 Followers 5K Following Twitter Nerd... Interested in Deep Learning (self-supervised learning & LLMs), Astrophysics (exoplanets), and Cosmology (CMB).... I like to build things
ijohn Ⓐ @john_whickins
268 Followers 2K Following Lost in the sea of life, sweet dreamer, making waves with AI. #lostbutnotfound #Revolution #AIenthusiast
Shreya Gupta @ShreyaByte
205 Followers 301 Following AI Evangelist & Entrepreneurial Mind 🚀 | Sharing the latest AI breakthroughs and business insights. Turn business ideas into profit with #AI
Curious @Curious999999
24 Followers 380 Following
Enio Fernandes @bob_123456789__
39 Followers 696 Following
Hamel Husain @HamelHusain
23K Followers 2K Following Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb. @fastdotai core contributor.
rajan agarwal ⁂ @_rajanagarwal
1K Followers 1K Following automating cars & trains • prev wearable ai & earthquake research • growing @uwaterlooSE
Jason Scalia @jason_scalia
101 Followers 625 Following Aspiring music journalist turned technologist | Music, Retro Computing, Naval News, etc. | Opinions are my own
。★ ∴*。☆⋆ .. @mimi10v3
8K Followers 3K Following AI/Human Alliance with @foomagemindset 🫶 aspiring AI ecologist 🌞/acc & biophilia DC/VA/WV.
yousra @aoudi21
267 Followers 3K Following Engineer, researcher, finance and founder of https://t.co/yHLNUHIWR9
bilal2vec @bilaltwovec
2K Followers 794 Following ✨ se @uwaterloo • prev @googlebrain @cohere @dbrxmosaicai
Nikhil Mehta @nikhilmehta_ai
80 Followers 1K Following
Charletta Bullard @Chardesignstuff
351 Followers 3K Following
Andrew Drozdov @mrdrozdov
2K Followers 1K Following RAG at @MosaicML x @Databricks 🧱 Prev: @UMass_NLP (PhD), @Google, @IBM
Nikhil Thorat @nsthorat
10K Followers 2K Following Co-founder of Lilac AI (@lilac_ai), now joining @databricks. Past: Co-created TensorFlow.js and Know Your Data. Google Brain // PAIR // Responsible AI
Dwarkesh Patel @dwarkesh_sp
56K Followers 703 Following Being pretrained Host of Dwarkesh Podcast https://t.co/3SXlu7fy6N https://t.co/rEhnfYywXY https://t.co/hQfIWdM1Un
Brett Larsen @_BrettLarsen
430 Followers 336 Following Sr. Research Scientist @DbrxMosaicAI | Guest Researcher @FlatironInst @NYU_CNS | Efficient deep learning + better algorithms for data science
Eitan Turok @EitanTurok
192 Followers 954 Following AI research @DbrxMosaicAI. Sorting in exponential time, training on the test set, and praying for geometric revelations.
Jim Keller @jimkxa
34K Followers 136 Following CEO @tenstorrent, Cofounder @atomic_semi @BayaSystems and FlexAI board member. Fan of 2x2 matrixes, books, refactoring and creative tension
Trevor Gale @Tgale96
1K Followers 250 Following Research Scientist @ Google DeepMind | PhD Candidate @ Stanford CS
Boston Dynamics @BostonDynamics
316K Followers 0 Following
sridhar @RamaswmySridhar
26K Followers 620 Following CEO @snowflakedb; founder @neeva Ex-@GreylockVC Ex-@Google SVP of Ads Ex-@BellLabs.
MatthewBerman @MatthewBerman
17K Followers 407 Following 🇺🇸✡️ Deep in AI. YouTuber. Ex-SaaS CEO/founder (acq). Investor. Love talking about artificial intelligence and the future. Builder.
Modal @modal_labs
12K Followers 71 Following Run generative AI models, large-scale batch jobs, job queues, and much more.
Julia Neagu @JuliaANeagu
691 Followers 974 Following CEO & Co-Founder @QuotientAI ✨ formerly @GitHub @GitHubCopilot 🤖 reformed physicist 👩‍🔬 ~ opinions are my own ~
lmsys.org @lmsysorg
39K Followers 173 Following Large Model Systems Organization. We created Vicuna and Chatbot Arena! Compare 30+ LLMs (GPT-4/Claude/Llamas) side-by-side at https://t.co/IDFeIDIOtm
Kandrej Arpathy @untitled01ipynb
15K Followers 329 Following Managing Director, Memetics and Advanced Shitposting Institute (hyperstitonal) || don't forget to hit the bell || AKA jupyter Meowbooks
Will Knight @willknight
20K Followers 7K Following I write about AI and related stuff for WIRED. signal = wak.01 (no pr pitches pls). newsletter = https://t.co/qG4DExCEbS
MLOps Community @mlopscommunity
10K Followers 347 Following The MLOps community is an open and transparent community where all are welcome to participate. It is a place where MLOps practitioners can collaborate and share
MLflow @MLflow
9K Followers 43 Following An open source machine learning platform for managing the complete ML lifecycle
Johannes Hagemann @johannes_hage
3K Followers 2K Following co-founder @PrimeIntellect | prev Research Engineer, scaling LLMs @Aleph__Alpha | interested in building decentralized AI, longevity, techno-optimism
Arthur Mensch @arthurmensch
40K Followers 875 Following Co-founder and CEO @MistralAI. Apply https://t.co/yHGRZAtjcx
Cognition @cognition_labs
125K Followers 19 Following Makers of Devin, the first AI software engineer. We are an applied AI lab focused on reasoning, and code is just the beginning. Join us: https://t.co/tpfZwEwGiq
Tessa @tessybarton
744 Followers 729 Following Exploration agent. Research scientist at @MosaicML. Prev: @NYTimes
Dimitris Papailiopoul.. @DimitrisPapail
12K Followers 982 Following prof @ wisconsin; thinking about transformers; learning in context; babas of Inez Lily
Allie K. Miller @alliekmiller
49K Followers 2K Following #1 Most Followed Voice in AI Business (1.5M followers). Nat’l AAAS Ambassador. Former Amazon, IBM. Fortune 500 and startup AI advisor, public speaker.
Max ⛅ @maxisawesome538
2K Followers 3K Following sup nerds @DbrxMosaicAI @CohereForAI @riversideulti @maxdoesresearch for purely research tweets
Teknium (e/λ) @Teknium1
29K Followers 3K Following Cofounder @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE Support me on Github Sponsors
Guillaume Lample @GuillaumeLample
37K Followers 648 Following Cofounder & Chief Scientist https://t.co/hLfvKLkFHd (@MistralAI). Working on LLMs. Ex @MetaAI | PhD @Sorbonne_Univ_ | MSc @CarnegieMellon | X11 @Polytechnique
Stella Biderman @BlancheMinerva
15K Followers 749 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her
Mistral AI @MistralAI
91K Followers 0 Following Fast, open-source and secure language models. Join us https://t.co/INALdNGvCP
Aditi Jha @aditi_jh
693 Followers 470 Following PhD Student at @Princeton with @jpillowtime | Former intern @MosaicML | Neuroscience, Machine learning
Satnam Singh @satnam6502
14K Followers 3K Following Punjabi-Scottish-American Haskell hacker at @GroqInc, cook, cyclist, lost in music. ∃🇮🇳 ∧ ∀🇬🇧 ∧ ∃🇪🇺 ∧ ∀🇺🇸 #celiac ex-{Microsoft, Google, Facebook}
Adam Conway @acb0t
2K Followers 334 Following SVP product @databricks. I tweet about data and AI and robots and sometimes @donkey_car. If you want tweets on just donkey car, follow me there.
dennylee @dennylee
3K Followers 2K Following Sparkitect on Delta Force One (tweets are my own).
Kara Swisher @karaswisher
1.5M Followers 2K Following “Vitriolic” and now “shrill” media lady, though dogs can hear me loud and clear
Nathan Benaich @nathanbenaich
51K Followers 32K Following solo member of investment staff @airstreet, brewing ambition @airstreetcafe, next token predictor @airstreetpress
Jim Fan @DrJimFan
232K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Percy Liang @percyliang
50K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Oriol Vinyals @OriolVinyalsML
167K Followers 82 Following VP of Research & Deep Learning Lead, Google DeepMind. Gemini co-lead. Past: AlphaStar, AlphaFold, AlphaCode, WaveNet, seq2seq, distillation, TF.
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Miles Brundage @Miles_Brundage
43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.
Soumith Chintala @soumithchintala
187K Followers 888 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Gary Marcus @GaryMarcus
145K Followers 7K Following “A beacon of clarity”. Spoke at US Senate AI Oversight committee. Founder/CEO Geometric Intelligence (acq. by Uber). Rebooting AI & Taming Silicon Valley.
Mustafa Suleyman @mustafasuleyman
132K Followers 536 Following CEO, Microsoft AI | Author: The Coming Wave | Past: Co-founder, @InflectionAI & @GoogleDeepMind
Jan Leike @janleike
44K Followers 322 Following ML Researcher, co-leading Superalignment @OpenAI. Optimizing for a post-AGI future where humanity flourishes.
Rest in peace to Robert Dennard, one of the GOATs of the modern era. Very few know and respect how much his work propelled our civilization. He invented modern DRAM in 1967 at IBM, the backbone of most computing systems, and still the most prevalent memory technology. Of course…
Trying to be useful lately!
Interesting work exploring how much models were "overfitting" GSM8K. By creating a new GSM style dataset, that the models could not have seen before, the authors remeasured GSM accuracy on this new dataset. Care is taken that the new dataset is indeed similar to GSM8K (style,…
How much do LLMs overfit public benchmarks? Our team at @scale_AI SEAL lab studied this by creating a GSM8k-equivalent eval from scratch. The resulting performance gap reveals data contamination in some model families, while GPT, Claude, and Gemini show no signs of overfitting.…
Want to talk AI research and best practices with the people working on it? The @DbrxMosaicAI research team is running meetups worldwide in May. linkedin.com/pulse/databric…
Lot of AI companies out there. Few making money (and a real business!) like the folks at @DbrxMosaicAI @databricks
How much are companies spending on AI? Our latest analysis is revealing, based on aggregated, anonymous data from thousands of @tryramp customers. The big takeaway? AI tools have gone from experimental to operational. AI-related card transaction volume increased by an…
Another banger from @cHHillee thonking.ai/p/strangely-ma…
You know you are doing good work when it is covered by @AndrewYNg ! :). Great to see @SambaNovaAI being recognized by Andrew for the impact fast inference can have. Excited to see interesting agentic workflows being triggered with this capability
Much has been said about many companies’ desire for more compute (as well as data) to train larger foundation models. I think it’s under-appreciated that we have nowhere near enough compute available for inference on foundation models as well. Years ago, when I was leading teams…
🚀 Exciting News from Log10! 🚀 I'm thrilled to share that Log10 has successfully raised $7.2 million in seed funding, led by @QuietCapital and TQ Ventures, with @EssenceVenture participating. This milestone is a strong endorsement of our vision and the hard work of our…
Can't believe it has been a year since our first logs and playgrounds made it into log10.io - but that's just the beginning! Now backed by world-class VCs and angel investors, we are equipped to take Log10 to the next level
This whole time DBRX turns out to be really Enterprise Intelligent!
@SnowflakeDB Awesome work training such a big model with a permissive license! I think you had a mistake in your IFEval implementation, your reported number is less than 2x what we observe (though it does vary with inference server and sampling parameters). You should see in the high 60s
Weekly LLM drops are cool, but real solutions are systems. Check out @matei_zaharia's workshop on it!
I'm co-organizing the inaugural research workshop on Compound AI Systems on June 13th: sites.google.com/view/compound-… . Send in your work on designing & optimizing such systems! Thrilled to have @RichardSocher, @MonicaSLam and @polynoamial as speakers, and host this at @Data_AI_Summit.
Really good long read on why data quality matters for LLM pre-training. Highly encourage anyone who is interested in LLMs to understand and internalize this. Hence my excitement about FineWeb. We saw the impact of these ablations and what it can do to quality from models…
This take on the FineWeb release is one of the most interesting pieces of feedback and also a reason FineWeb is very different from even larger datasets like RedPajama-V2 (which is double its size!). Surprisingly, the size of the dataset of 15T tokens is not very important, what is much…
@SambaNovaAI @AIatMeta TIL that 16bits is full precision
In light of recent releases, how do we feel about 8Bs with the same performance as 70Bs?
Ok, but hear me out. a 7B model with the same performance as a 67B model is worth 7837x as much.