Hiranmay Darshane @hdarshane
research intern https://t.co/Uyagq8Ilgm. deep learning and large language models. football (banter) fan. 18. hiranmay.com Mumbai, India. Joined October 2019
5K Tweets 672 Followers 1K Following 32K Likes
truth nuke
7/ Looking beyond this paper: scaling compute against a fixed, limited pool of data will need new primitives. Searching over a population of models is a different problem than standard gradient descent training and we've barely scratched the surface. We hope q0 pushes people toward crazy ideas in multi-epoch training and scaling compute in general!!
yay
1/ Now that we're running out of data, how do you optimally scale multi-epoch pretraining to hundreds of epochs? Our first paper from Q! q0 trains a population of models, instead of a single model that saturates fast, reaching a dramatically lower loss at *every* epoch budget. w/
@soldni regularization is BACK i suppose. dropout 0.15 is quite large and i don't think anyone else uses dropout in the big 26. also rather high std for init these days but you can't go wrong with a good old 0.02. also why depth scale output proj when you have sandwich norm??
🔥
This paper empirically ~verifies the section of my first Zipfian grokking blog post where I hypothesize about how capacity competition dynamics extrapolate from the grokking to language pretraining case Cool work from the authors! :)
q: "why don't Sora-like models learn compositional physics understanding or do ICL like how language models learn compositional semantics?" a: every attempt to date heavily leaks information from the future. some even bake it into the bottleneck design without realizing (!!!)
is this not regulated by SEC?
Rule changes for the SpaceX $SPCX IPO: Index providers waived the profitability requirement and cut the seasoning window from 90 days to 5. This forces over $30 trillion in passive 401k and retirement money to buy SpaceX at IPO valuations. Bloomberg Intelligence estimates S&P
The following animation conveys the intuition: when a 1-neuron model tries to learn two tasks, the frequent task updates suppress the infrequent task updates. The 2-neuron model can dedicate a neuron to the infrequent task once the frequent one is fully learned.
a quick way to force oneself into thinking about a thing is maintaining a list of words about that thing and just staring at it something something required circuits activate from high cosine similarity
I was thinking about it again recently, Google Allo was really ahead on the idea of chatting with Google Assistant or @'ing in conversations to build out this Agent/AI UX we have now
most things arrive unrecognizable to the ideas that summoned them
my favorite interp researcher can identify neurons responsible for any behavior and provide steering vectors for them her name is backprop and her steering vectors are just gradients
@mschoening and I are starting a podcast where we nerd out about human-AI collaboration and malleable software. In this episode: is HTML actually better than Markdown? and an alternative to Software Factories... Watch on YT: youtu.be/KB9lRdM5eO0?si…
does seem like all that time with colah did not alter his worldview at all
Artificial intelligences do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships, and do not know from within what love, work, friendship or responsibility mean. Nor do they have a moral conscience, since they do not judge
Yes, AIs are going to do all or almost all of the pure theory, but tbh humans probably finished most of the pure theory that it's possible for humans to do by the end of the 20th century. Yes there has been some recent theory progress but let's be honest, most is of marginal economic value at best. There's probably lots of useful pure theory left to do in this universe, but it's probably not the kind of stuff that can be intuited by a single human, explained to a grad student, and written down in a textbook. AI will do all that stuff.
1. people underestimate how hard this problem is 2. universal issue. IGCSE billed ~Rs 40k for exams - still many papers leaked 3. change is much harder than running things as is. migration to OSM requires competence++++ 4. with privatization, public sector => competence----
Why have routine tasks like holding exams become so difficult suddenly? What is the source of this newfound incompetence?
Cool presenting on why generalization in neural nets is less of a mystery than many make it out to be:
Last week we hosted the first ever YC Paper Club in Mountain View. We brought together great AI researchers and founders to discuss both the state of the art and what it actually takes to get it into production. Thanks to the following presenters: 0:12 - Intro from YC Visiting
Onʇɹᴉpǝɹ @CDLXXXIII
165 Followers 2K Following
Gee Cee @GeeCee802
20 Followers 515 Following Center Right politically.. (key word “center”)…. ashamed by Trump… pro USA 🇺🇸 from forever…. but know we need allies/partners…. Let’s all be smart, not dumb!!
Aman @beingamanFF
362 Followers 897 Following GSOC ’26 @ GNU Octave | DS @HiLabs_inc | alum @iitroorkee | Explaining SOTA DL & systems | building, lifting, running
Bishwas @bishmdl76
63 Followers 70 Following i like training ai models | research @ https://t.co/25Ona6XwZL, cs phd
//TODO: fix later ... @enjoyingthewind
797 Followers 8K Following
Muzaffer Kal @🏡 ... @MuzafferKal_
2K Followers 7K Following Chips: ASIC, FPGA. CV/ML. Duck pictures by the lake. Some bread making.
serdarml @cs_serdar
115 Followers 594 Following Aspiring AI researcher, undergrad student @TU_Muenchen
Abhinav Singh @imabhi0703
184 Followers 2K Following Avid Learner | Pupil @ Codeforces | You are on your own | Coz I'm a young man after all
📗 @__the__human__
26 Followers 3K Following
Cam Howe @camhowe1729
116 Followers 2K Following a humble rebel interested in basement agi, higher order thinking and positive sum games.
truthixify @truthixifi
461 Followers 493 Following prev fellow @onlydust_com & @pldevguild | 0.5x hackathon winner
Alchemist ☢☢ 🇮... @LLawliet126
397 Followers 869 Following Atheist. Futurist/Transhumanist. Astronomy/Aerospace enthusiast. Defense enthusiast. Anime-Manga enjoyer. e/acc. Autocratic. priv @lawlietlight126
Abhipray Chavan @abhipray_chavan
7 Followers 775 Following
Chinmay @ChinmayKak
4K Followers 1K Following 21. gradient ascender. RL and Agents @MSFTResearch . love @teamIvLabs. dms open!
Dimitris Papailiopoul... @DimitrisPapail
28K Followers 1K Following Researcher @MSFTResearch, AI Frontiers | Prof @UWMadison (on leave) | babas of Inez Lily.
Akshay @akshayvegesna
431 Followers 172 Following Working on generalization at Q Labs. https://t.co/ExPhN2Kb4X Previously perception @nuro, math @caltech
calvin @calvingenuity
21 Followers 138 Following performative nomad. indulging in the science of next word speakers and more.
Josh Harkins-Finn @JHarkFinn
2 Followers 8K Following
nrRNjkitRHmMP @RNjkit72037
0 Followers 4K Following I'm interested in category theory and machine learning.
Neev Parikh @neev_parikh
945 Followers 2K Following are you ready for the intelligence explosion anon? ML research at @METR_Evals. prev @Stripe opinions my own.
Stéphane Deny @StphTphsn1
4K Followers 7K Following Neuroscience & ML Researcher. Posting about various topics on here. I retweet papers to increase their visibility (I do not read all), tag me for a retweet.
Jasper Gilley @0xjasper
1K Followers 621 Following representation learning, interpretability, aesthetics | the greatest art is yet to be created
Andrew Lampinen @AndrewLampinen
12K Followers 2K Following Interested in cognition and artificial intelligence. MTS at @AnthropicAI. Previously @DeepMind, cognitive science @StanfordPsych. Tweets are mine.
Melissa Du @bearablylight
3K Followers 852 Following bio + ml, prev @mit @WhiteheadInst @DbrxMosaicAI
Adam Patni @adam_patni
652 Followers 838 Following programming robots @rhoda_ai_ ex-@sauronsystems, @spacex, @wayve_ai, @georgiatech
Jorge Bravo Abad @bravo_abad
11K Followers 9K Following AI for Science | Prof. of Physics @UAM_Madrid. Author of "IA y Física": https://t.co/Nxue94kfOG & "Ciencia 5.0": https://t.co/Y3rBUU7Xzg
Nathan Zhao @nathanzhaoo
390 Followers 278 Following Ex-Stanford dropout | prev. SPC Founder Fellow, K-Scale Labs. Random 20 y/o from Delaware
Adam Patarino @Patarino
640 Followers 2K Following 2x Founder. 2x Dad. SWE turned CEO. Working on the first AI coding assistant that runs entirely on a standard laptop - @rig_code The future of AI is local
Shmuel Berman @ShmuelBerman
84 Followers 155 Following PVL Lab @ Princeton | Memory and Perception | Anthropic Fellow | https://t.co/jdfRoBjvfJ
Lisan al Gaib @scaling01
46K Followers 1K Following lead them to paradise LisanBench: https://t.co/vorVk7NMCy Impressum & Datenschutz: https://t.co/lFLgiu8EAU
Max Weinbach @mweinbach
293K Followers 8K Following Analyst @creativestrat | Analyst and Market Research Firm | Typo ignorer Email: [email protected]
✈️ Flight Leader ... @HQuarterma43504
1K Followers 496 Following High performance computing. AWACS Thunderbird ally.
Dhruv Batra @DhruvBatra_
21K Followers 724 Following Co-founder & Chief Scientist @yutori_ai. Prev: Senior Director leading FAIR Embodied AI @MetaAI and Professor @GeorgiaTech.
John Schulman @johnschulman2
75K Followers 2K Following Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music
Xiangdong Zhang @aHapBean
256 Followers 53 Following AI PhD student at @sjtu1896. I’m currently exploring llm pre-training. REDstar Intern at @xiaohongshu Dots (formerly Hi Lab). [email protected]
Brad Gerstner @altcap
206K Followers 1K Following Founder - Altimeter, Invest America | Trump Accounts, Center for Heart Attack Prevention. One precious life. Views here personal. @investamerica24 @bg2pod
Larry Dial @classiclarryd
2K Followers 41 Following Technical Staff at Open Athena, working on Marin
Tantum Collins @tantumscollins
867 Followers 64 Following CEO of @inherent_labs, previously @GoogleDeepMind
Jukan @jukan05
139K Followers 317 Following Tech otakus save the world | Not Investment Advice | DYODD
Jason Dean @_Jason_Dean_
8K Followers 5K Following “What must it be like to live in this world, seeing it just the way it is, and think that it will never change, never get any better?”
The OpenAI Foundation @FoundationOAI
7K Followers 0 Following OpenAI was founded in 2015 as a nonprofit; its mission is to ensure artificial general intelligence benefits all of humanity.
Taylor Sorensen @ma_tay_
2K Followers 605 Following make LLMs good for people! | PhD from @uwnlp, prev @humansand @stanfordnlp @byuacme, intern @GoogleDeepMind, @allen_ai | LLMs + alignment, pluralism, diversity
Amarillo Slim @Amarillo_Slim1
12K Followers 659 Following
pranav @pranav_so
471 Followers 973 Following 21 | use this as a note taking app | econ @ashokauniv | now https://t.co/JYoxUalFH8 via @vidhi_india + https://t.co/YZf9VQHQNn | increasing economic growth & improving dev outcomes
Laurie Whitwell @lauriewhitwell
484K Followers 999 Following Journalist for @TheAthleticFC, covering Manchester United. Instagram: lauriewhitwell
David Bessis @davidbessis
19K Followers 460 Following Rogue mathematician. "The product of mathematics is clarity and understanding." — Bill Thurston https://t.co/l95RHuWz2S
Albin Sheqiri @albinsheqiri
32K Followers 351 Following Assistant Coach @cercleofficial • UEFA PRO (2026-2028)
Nicholas Joseph @nickevanjoseph
8K Followers 51 Following Pretraining @AnthropicAI, formerly safety @OpenAI
Jiaxin Wen @jiaxinwen22
6K Followers 196 Following research @berkeley_ai @anthropicai. prev @tsinghua_univ.
Colossus @colossusmag
45K Followers 136 Following Subscribe: https://t.co/Zu7Sv2Efxd. Listen: @InvestLikeBest, @FoundersPodcast, @BizBreakdowns, @joyscompounding.
Swapan Dasgupta @swapan55
1.1M Followers 1K Following MLA from Rashbehari (Kolkata). Ideologically conservative & nationalist. Padma Bhushan (2015). Former MP (Rajya Sabha). Member BJP National Executive.
Maharashtra Progress ... @abhirammodak
13K Followers 157 Following Solution Arch, BFSI Specialist, Music Composer & Hobbytual Theoretical Physicist,dev/infra/industry/jobs tracker for Pune&MH- Retired, now Freelancer
Elon Litman @elon_lit
5K Followers 289 Following AI researcher @Stanford. hypernetworks, energy-based models, genomics. Everettian. information geometer.
Quinn Barry @quinnmbarry
796 Followers 1K Following thinking. ex: @maplefinance. @avarilabs. @stanford.
Tom Brown @NotTomBrown
27K Followers 427 Following Co-founder and Chief Compute Officer @AnthropicAI
Mao Shichigan @weAllGonnaDye
4K Followers 2K Following shitposting about Arsenal and my deteriorating mental health. Professional Bhanda Ghasser. I Probably tweet every minute, sorry for the tl spam.
Renny @rennyzucker
12K Followers 2K Following 28. A little long here, some short there. Head of trading, fintwit sarcasm desk.
🪓 @crazyxedi
1K Followers 858 Following
Dune Quotes @DuneQuoteBot
50K Followers 0 Following Unofficial bot that spits out a random quote from Frank Herbert's Dune books, even though spit is a terrible waste of water. By @thatjasonweiser
Alexi Gladstone @AlexiGlad
3K Followers 659 Following PhD @ UIUC and doing stuff @flappyairplanes. Working on EBTs/EBMs, World Models, Reasoning. Prev @Meta, @PalantirTech, @UVA
Jamie Simon @learning_mech
1K Followers 75 Following doing fundamental science of deep learning | PhD from Berkeley | can catch a whole egg in my mouth
Ulisse Mini @MiniUlisse
1K Followers 329 Following unschooled autodidact focused on inner work / self development / spiritual path. technical staff @ https://t.co/1HqI3jciBd