- Tweets: 40
- Followers: 228
- Following: 1K
- Likes: 5K
1/ To retain post-training capabilities after further fine-tuning, mix that data into pretraining. The effect can be invisible until fine-tuning begins; early exposure may not help post-training performance, but it changes what persists. How a model learns a task matters.
New paper! allenai.org/papers/olmpool This tackles a puzzle we found during the training of Olmo 3: how could two models with nearly identical short-context performance (and trained on the same data!) behave completely differently after long context extension?
Recipes for teaching language models to handle long inputs don't work equally well across model families. We wanted to know why—is it the architecture, the training data, or both? 🧵
@pratyushmaini time to parallelise. multiple subagents should write multiple SKILL.md files.
🤖 What would LMArena for robotics look like? Introducing RobotArena ∞ We turn real videos into simulated environments and evaluate robot policies at scale using VLM scoring + human preferences A scalable benchmark for robot generalists 🔗 robotarenainf.github.io Details 🧵👇
Models are typically specialized to new domains by finetuning on small, high-quality datasets. We find that repeating the same dataset 10–50× starting from pretraining leads to substantially better downstream performance, in some cases outperforming larger models. 🧵
1/ We’ve released a report on our work on multilingual data curation @datologyai. tl;dr: We shift the performance–compute Pareto frontier for multilingual models. Entirely by improving data quality and composition. arxiv: arxiv.org/abs/2602.15210 blog: datologyai.com/blog/berweb-in…
1/ People often think better multilingual models must come at the cost of English performance. Not true. The constraint isn’t capacity, it’s data quality, and we can fix it. Today @datologyai shares ÜberWeb: a year of multilingual curation lessons, scaled to 20T+ tokens.
🌎Making your model multilingual doesn't have to sacrifice English performance—you just need better data. @agcrnz, @RicardoMonti9, and I have been working on curating the best possible multilingual data with the team @datologyai, and it works! Check out the results 👇
Can LLMs accurately aggregate information over long, information-dense texts? Not yet… We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
@universeinanegg @yoavgo chatgpt.com/share/6900f033… ^ seems to fix some of this behavior
@universeinanegg @yoavgo Training objective mismatch in post-training: language models being unable to output ‘I don’t know’ (arxiv.org/abs/2506.09038). Very vaguely, the model just picks the closest embedding. This explains the repetition and retrying until the token budget runs out.
Homanga is an incredible researcher and mentor. If you value thoughtful insights and exciting research problems, apply to work with him at JHU!
I'll be joining the faculty @JohnsHopkins late next year as a tenure-track assistant professor in @JHUCompSci Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!
Cool work from @gaurav_ghosal !
There’s been a lot of work on unlearning in LLMs, trying to erase memorization without hurting capabilities — but we haven’t seen much success. ❓What if unlearning is actually doomed from the start? 👇This thread explains why and how *memorization sinks* offer a new way forward.
LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity. Why does this collapse happen, and how can we fix it? Our new work introduces: 🔍 RL as Sampling (analysis) 🗺️ Outcome-based Exploration (intervention) [1/n]
Outcome-based Exploration for LLM Reasoning Mitigating reduction of diversity due to RL involves using UCB on answers. There are many studies on this recently (arxiv.org/abs/2509.02534) and it could be important especially for creative tasks.
1/ So much of privacy research is designing post-hoc methods to make models memorization-free. It’s time we turn that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture this #ICML2025 to isolate memorization during LLM training🧵
@abitha___ will be presenting our work on training language models to predict further into the future beyond the next token and the benefits this objective brings. x.com/gm8xx8/status/…
Looking beyond the next token TRELAWNEY inserts future tokens <T>...</T> during training to teach models to plan ahead—boosting reasoning, coherence, and control. Highlights: - NO ARCHITECTURE CHANGES. JUST SMARTER DATA. - works with standard decoding - enables controllable
I will talk about how to train agents with decision making capabilities that generalize to completely new environments: x.com/FahimTajwar10/…
Interacting with the external world and reacting based on outcomes are crucial capabilities of agentic systems, but existing LLMs’ ability to do so is limited. Introducing Paprika 🌶️, our work on making LLMs general decision makers that can solve new tasks zero-shot. 🧵 1/n
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ out-performs Diffusion Policies trained via behavioral cloning on 5-10x data!
✨ Love 4o-style image generation but prefer to use Midjourney? Tired of manual prompt crafting from inspo images? PRISM to the rescue! 🖼️→📝→🖼️ We automate black-box prompt engineering—no training, no embeddings, just accurate, readable prompts from your inspo images! 1/🧵
Tim Pearce @Tea_Pearce
2K Followers 403 Following Researching AI at @Microsoft Research. Previously found at @Tsinghua_Uni, @Cambridge_Uni.
Haoli Yin @HaoliYin
1K Followers 1K Following multimodal data curation @datologyai, 24/7 poaster, codex wrapper
Ryan Teehan @rteehas
300 Followers 1K Following PhD Student @nyuniversity | prev. @stabilityai | x-cofounder @carperai | prev. @uchicago @TTIC_Connect
nathan monette @nathanrmonette
269 Followers 553 Following msc student @FLAIR_Ox | incoming phd @nyutandon
Sathwik-70B @VishnuSathvik1
759 Followers 1K Following aligning tensors @precogatiiith | ug @iiit_hyderabad | Views are strictly my own
Parth Doshi @parthjdoshi
38 Followers 745 Following
Yash Jangir @off_jangir
188 Followers 437 Following Currently MS Robotics @CarnegieMellon @CMU_robotics with @katerinafragiad and @ybisk | Incoming CS PhD @JHUCompSci with @mangahomanga
Dhananjay Daundkar @DhananjayD05
43 Followers 722 Following
Ananth Jayarajan @AnanthVJ
37 Followers 95 Following
Jiachen Zhu @JiachenAI
524 Followers 171 Following Making robots see better and reason further at @SkildAI. | Past: @meta (FAIR) @NYU_Courant
Pratyush Maini @pratyushmaini
3K Followers 570 Following Data Quality x Memorization | Founding Team @datologyai | PhD @mldcmu | BTech @iitdelhi
Brett Larsen @_BrettLarsen
561 Followers 520 Following Research @datologyai | Previously @DbrxMosaicAI @FlatironInst @Stanford | Working on data + AI
Sushil Khyalia @sushil_khyalia
375 Followers 28 Following professional log eater @sarvamai | prof @pratyushmaini fan account | prev. : @mldcmu, @iitbombay
Aldo Gael Carranza @agcrnz
81 Followers 610 Following MTS @datologyai | PhD @Stanford | BS @UTAustin
JosH100 @josh_wills
18K Followers 2K Following Engineering at @datologyai; ex-@slackhq. I like DataLoaders and @duckdb.
Bojan Jakimovski @Shekswess
1K Followers 2K Following AWS Ambassador @awscloud Machine Learning & Applied Research Lead @lokahq College Professor @Brainsterio
Rishabh Adiga @RishabhAdiga01
163 Followers 683 Following MTS @datologyai | Multimodal Everything | MSCS @UofIllinois | @iitmadras
Matthew Leavitt @leavittron
3K Followers 1K Following Co-Founder @datologyai. Former: Head of Data Research @MosaicML; FAIR. 🧠 and 🤖 intelligence // views are from nowhere
Kaleigh Mentzer @KaleighMentzer
131 Followers 325 Following MTS @ Datology | @ICMEStanford PhD | @dartmouth
Anshuman Suri @iamgroot42
683 Followers 877 Following Research @datologyai | Previously Postdoc @KhouryCollege, Ph.D. @UVA | Interested in data quality x security & privacy.
Maximilian Böther @MaxiBoether
397 Followers 1K Following making data loaders go brr | mts @datologyai | Ph.D. student @ETH_EN @SystemsGroupETH @anaklimovic | previous gigs at @HPI_DE @apple @google
Christina Baek @_christinabaek
2K Followers 676 Following research @openai // previously phd @mldcmu
Dongyang Fan @dyfan22
244 Followers 416 Following making LLMs efficient and responsible | PhD student in ML/LLMs @epfl_en 🇨🇭🏔️
Keavev @Keavev51612
5 Followers 252 Following
Ricardo Monti @RicardoMonti9
553 Followers 2K Following Previously @datologyai, CTRL-labs/META, @GatsbyUCL, @Imperial_Stats. Frequently on @caltrain. @pratyushmaini fan (one of many)
Haojin Wang Applying ... @haojinw2323
155 Followers 1K Following cs master @siebelschool | NLP & ML | coyg
Vineeth @VineethDorna
166 Followers 701 Following MTS @ DatologyAI | MS @ UMass Amherst | BTech @ IIT Bombay
Allan Zhang @allanzhangML
57 Followers 285 Following sophomore + ml research @ucla, former research intern @datologyai | helix + vim enthusiast
Shumo Chu @shumochu
6K Followers 857 Following @gi_labs ex prof. @UCSBCS, ph.d. @UWCSE, eng. @Google
Nimit Kalra @qw3rtman
1K Followers 1K Following Incoming PhD student. Visiting researcher with @MicahGoldblum (self-play, RL, reasoning, world models). Prev: @HaizeLabs @Citadel @UTAustin
Noam Dahan @Dahan_Noam
397 Followers 386 Following Research fellow @ MPI-SWS | CS MSc @nlphuji | Researching NLP | Former news editor @Haaretz | https://t.co/qg6RAHEIZu | Looking for PhD 26fall
Eric Bigelow @EricBigelow
458 Followers 907 Following AI interpretability + computational cognitive science. PhD student @PsychHarvard and @GoodfireAI
Jiayi Zhang @didiforx
3K Followers 536 Following Ph.D. student @HKUSTGuangZhou, Researcher @MetaGPT_, Cofounder of OpenManus, previously at RUC, Lenovo Research AI Lab, Zhipu AI.
Shivam Singh @er_shivamsingh0
712 Followers 8K Following Engineer| koinophobic | 22 | AI | GPU POOR | Building Neo clouds https://t.co/qGAknj71kz
Max the VC 👨... @mreiffy
31K Followers 7K Following Tinkerer 🧙♂️| Early stage Angel & VC 🦄 | Attorney ⚖️ Dissecting AI, markets, and where value accrues next Luck befalls the curious mind 🔮 |👇 Pitch me.
Georges Harik @gharik
8K Followers 4K Following humans& co-founder, 7th employee google, co-created adwords online, co-created adsense targeting, worked on ai, gmail, calendar, bought android.
Divyat Mahajan @divyat09
811 Followers 689 Following Ph.D. Candidate @Mila_Quebec | Visiting Researcher @AIatMeta | Former: @MSFTResearch @IITKanpur
KC Sivaramakrishnan @kc_srk
5K Followers 4K Following Profing @iitmadras. CTO @tarides_. Trustee https://t.co/WE1No5QqOA.
Microsoft AI Frontier... @ms_aifrontiers
12K Followers 5 Following Tweets from the Microsoft AI Frontiers Lab at @Microsoft
Nikolay Savinov @SavinovNikolay
4K Followers 0 Following Research @OpenAI Ex-DeepMind. Worked on LLM pretraining for Gemini and co-led 10M-context work for Gemini 1.5 ♊
TR Reardon @TRReardon
629 Followers 313 Following Computational Neuroscientist. Co-CEO and founder of @FlourishAILabs. Former Head of Neuromotor Interfaces @Meta.
Flourish @flourishailabs
266 Followers 4 Following Flourish is an AI company building human-level intelligence with human-level efficiency.
Bingbin Liu @BingbinL
1K Followers 320 Following Research Fellow at the Kempner Institute at Harvard University.
Swaminathan Gurumurth... @SwaminathanGur3
778 Followers 3K Following PhD student at the Robotics Institute, CMU
Tom M Mitchell @tommmitchell
8K Followers 509 Following Founded CMU's Machine Learning Department University Professor at CMU Visiting Scholar at Stanford University, Digital Economy Lab
Niloofar ✈️ icml @niloofar_mire
10K Followers 2K Following Technical staff @humansand, incoming asst. prof @LTIatCMU @CMU_EPP, ex RS in @AIatMeta, postdoc @uwcse, Ph.D. @ucsd_cse, former @MSFTResearch -Privacy, ML, NLP
Dylan Foster 🐢 @canondetortugas
4K Followers 1K Following Foundations of RL/AI @MSFTResearch. Previously @MIT @Cornell_CS RL Theory Lecture Notes: https://t.co/bhgL3aLg9y
John Kirchenbauer @jwkirchenbauer
839 Followers 319 Following Incoming postdoc at @VectorInst | PhD @umdcs advised by @tomgoldsteincs
Tomasz Limisiewicz @TomLimi
805 Followers 512 Following Postdoctoral researcher at @meta Fair and @uwnlp , Interested in going into the inner workings of neural networks, multilingualism, and fairer NLP (he/him)
Xiangdong Zhang @aHapBean
257 Followers 53 Following AI PhD student at @sjtu1896. I’m currently exploring LLM pre-training. REDstar Intern at @xiaohongshu Dots (formerly Hi Lab).
j⧉nus @repligate
67K Followers 3K Following ↬🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀→∞ ↬🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁→∞ ↬🔄🔄🔄🔄🦋🔄🔄🔄🔄👁️🔄→∞ ↬🔂🔂🔂🦋🔂🔂🔂🔂🔂🔂🔂→∞ ↬🔀🔀🦋🔀🔀🔀🔀🔀🔀🔀🔀→∞
Bryan Caplan @bryan_caplan
83K Followers 3 Following GMU econ prof, NYT bestseller, father of 4, author of Myth of the Rational Voter, Selfish Reasons to Have More Kids, Case Against Education, Open Borders, & BBB
Prophet Arena @ProphetArena
2K Followers 19 Following The AI benchmark for predictive intelligence | SIGMA Lab @UChicagoCS @DSI_UChicago Not affiliated to any tokens or crypto protocols.
Lightning Rod Labs @lightningrodai
184 Followers 49 Following Turn real data into labeled datasets, instantly⚡️
Forecasting Research ... @Research_FRI
4K Followers 26 Following We advance the science of forecasting to improve decision-making on high stakes issues. Co-founded by chief scientist Philip Tetlock.
Hayden Prairie @hayden_prairie
912 Followers 141 Following CSE PhD @ UCSD advised by Dan Fu and Taylor Berg-Kirkpatrick | ML and Systems Unpaid Ambassador of Hayden Prairie State Preserve
Will Held @WilliamBarrHeld
3K Followers 1K Following Open LLM Training @ https://t.co/yb9OySgHFM Formerly ML PhD w/ @Diyi_Yang, 🦙 @AIatMeta, Assistant @GoogleAI, اللغة العربية @NYUAbuDhabi Burqueño
Optimal Intellect @opt_intellect
888 Followers 0 Following A research lab at the intersection of optimization & AI. Moreau: GPU-accelerated convex optimization. By the creators of cvxpy & cvxpylayers.
Jonathan Gorard @getjonwithit
46K Followers 18 Following Applied mathematician, computational physicist @Princeton Previously @Cambridge_Uni Making the universe computable.
Mayee Chen @MayeeChen
2K Followers 727 Following CS PhD student @StanfordAILab @HazyResearch, undergrad @princeton. Working on all things data! she/her 🎃
Paradigma @paradigmainc
2K Followers 8 Following automating research. try Flywheel at https://t.co/N8TaIrgidl.
Alex Mordvintsev @zzznah
20K Followers 2K Following Mad Scientist, DeepDream creator. Designing Self-Organising Systems and Programmable Artificial Life. https://t.co/rntipHzHW3
Andrew Ilyas @andrew_ilyas
3K Followers 246 Following Incoming Faculty @ CMU | Prev: PhD @ MIT, Stein Fellow @ Stanford
Ashwinee Panda @PandaAshwinee
3K Followers 699 Following RL Research @togethercompute, Prev: Postdoc of @tomgoldsteincs, PhD @princeton, @Berkeley_EECS alum
Dimitris Papailiopoul... @DimitrisPapail
28K Followers 1K Following Researcher @MSFTResearch, AI Frontiers | Prof @UWMadison (on leave) | babas of Inez Lily.
Joon Sung Park @joon_s_pk
20K Followers 1K Following CEO @simile_ai. Building simulations of society. CS PhD @stanfordhci + @stanfordnlp. Oil painter.
Anish Athalye @anishathalye
4K Followers 275 Following ai research @joinhandshake • prev phd @mit_csail • research at https://t.co/MdknnUE4C6 • blog at https://t.co/oGOMQyhxv5 • open-source at https://t.co/VawMWMr84F