Chris Olah @ch402
Reverse engineering neural networks at @AnthropicAI. DMs open! Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account.
colah.github.io · San Francisco, CA · Joined June 2010
Tweets: 5K
Followers: 90K
Following: 173
Likes: 10K
Scaling laws for dictionary learning! transformer-circuits.pub/2024/april-upd…
Some small updates from the Anthropic Interpretability team: transformer-circuits.pub/2024/april-upd…
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…
Announcing a progress update from the @GoogleDeepMind mech interp team! Inspired by @AnthropicAI's excellent monthly updates, we share a range of updates on our work on Sparse Autoencoders, from signs of life on interpreting steering vectors with SAEs to improving ghost grads.
Great visualisation library for Sparse Autoencoder features from @calsmcdougall! My team has already been finding it super useful, go check it out: lesswrong.com/posts/nAhy6Zqu…
I'm incredibly excited to have Craig joining us on the Anthropic Interpretability team! I've been a huge fan of @GoogleColab for nearly a decade (I used it internally at Google!) and have really admired Craig's work on it.
big news for me: after 5000+ days and too many excellent colleagues to mention, I'm leaving Google. it's been a fantastic ride, and the hardest part about leaving is saying goodbye to my teammates and colleagues.
Next in our series of small monthly updates from the interpretability team, including a few fun things: 1. We use feature attribution to find features related to specific completions (following the athlete-sport association example of @NeelNanda5 )
Another small update from us, including some fun results about circuit analysis with SAEs.
We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)
I continue to be impressed by the work of Neel's scholars -- very excited to see what the next group will do!
Reflections on Qualitative Research: transformer-circuits.pub/2024/qualitati… [h/t to @ch402 for originating & driving this!]
Soumith Chintala @soumithchintala
186K Followers 877 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K FollowingAlfredo Canziani @alfcnz
86K Followers 268 Following Musician, math lover, cook, dancer, 🏳️🌈, and an ass prof of Computer Science at New York UniversityJeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @StanfordRichard Ngo @RichardMCNgo
35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openaiDelip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0pAnthropic @AnthropicAI
262K Followers 26 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at https://t.co/aRbQ97uk4d.Richard Socher @RichardSocher
101K Followers 970 Following CEO @youSearchEngine Investing at @aixventuresHQ Before: Stanford Adj Prof in AI/NLP, Chief Scientist at Salesforce, MetaMindMiles Brundage @Miles_Brundage
43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGzKyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).Horace He @cHHillee
23K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemalePercy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | PianistRosanne Liu @savvyRL
33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDRJack Clark @jackclarkSF
67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futuresNeel Nanda @NeelNanda5
13K Followers 89 Following Mechanistic Interpretability lead @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!Amanda Askell @AmandaAskell
26K Followers 653 Following Philosopher & ethicist teaching models to be good @AnthropicAI. Personal account. All opinions come from my training data.Ferenc Huszár @fhuszar
40K Followers 1K Following Secular Bayesian. Associate Professor in Machine Learning @Cambridge_CL. Talent aficionado at https://t.co/RbJkoLguey Alum of @Twitter, Magic Pony and @BaldertonPasi Vuorio @pavuorio
201 Followers 395 Following Developer and SW Architect since -97, nowadays innovating new things with #AI at LastBot. Startup guy 4ever!Jimmy Chang @JimmyCh60659590
59 Followers 914 FollowingAI in ICAI @AIinICAI
496 Followers 388 Following Committee on @AIinICAI at @theicai . Exploring the fusion of AI & creativity. Innovating for smarter, more imaginative future. Bridging the gap between finance.😼 @kum2kum3
3 Followers 121 FollowingMridul Rao @MridulRao
0 Followers 30 FollowingJay Bee @JayBee123460856
212 Followers 1K FollowingBryan Wise @BWizise
23 Followers 68 Following I make cool stuff, and my name is Bryan Nicholas Robinson Wise.Mike Shaw @mikeshawnuff
658 Followers 2K Following Christian, Husband, Dad, Musician, CTO. Proud to serve district 47 in the Alabama Legislature!Addie Foote @AddieF38654
0 Followers 26 FollowingFreudian H.I.P.S. @FreudianHIPS
122 Followers 2K Following Welcome to Freudian H.I.P.S... where we test the idea that the universe is made up from differentiations & all differentiations can be measured in Arabic!GersonDeWinter @GersondeWinter
345 Followers 5K Following Biological Human Intelligence. Non Biological Intelligence (AGI) will replace almost all jobs, thus ending the monetary economy and culture as we know it+NHIravikd @ravi_ravikd
26 Followers 84 FollowingHarry Surden @HarrySurden
3K Followers 1K Following Professor, University of Colorado Law School • Associate Director Stanford CodeX Center for Legal Informatics. • Research: AI & Law. Former software engineer.Julia Kempe @KempeLab
4 Followers 15 Following Silver Professor at NYU Courant and CDS, Visiting Prof. ENS Paris, Visiting Senior Researcher at FAIR Research in Machine Learning, past in Quantum ComputingPranav Sachdev @_PranavSachdev
8 Followers 112 Following I'm a Sr. Data scientist, a storyteller passionate about solving complex problems with data #DataScience #MachineLearninghunter @pseudokami
38 Followers 198 Following i like math. calisthenics. rl (both). and anime // currently ml @paypal // prev @nasajplKowndinya Renduchinta.. @KowndinyaR
2 Followers 37 Followingpeterbob @whatsurshtyle
10 Followers 1K FollowingRohekael Part @rohekael
48 Followers 57 FollowingTrung Nguyen @trungnguyentav
9 Followers 98 Followingkang @belt_treatment
0 Followers 64 FollowingEli Sennesh @EliSennesh
201 Followers 477 Following NHP electrophysiology @VanderbiltU. Predictive coding, probabilistic programming, affective science. Abolish the value function! It's all taxis navigation!Naveen Ramasamy @notnavram
0 Followers 128 Followingsimone amoroso @amorososimone
129 Followers 2K FollowingGeoff Ladwig @GeoffLadwig
24 Followers 145 FollowingDaniel @DVG_NET
15 Followers 143 FollowingAlihan @BotBrainiac
18 Followers 37 Followingeric @eric64453375341
3 Followers 409 Followingarjun khandelwal @Arjunkh07
22 Followers 57 FollowingYtkkk @Monotonik_
17 Followers 815 FollowingKartikey Jha @kartikey_ai
7 Followers 108 Following I love developing LLM-based and GenAI apps , studying math behind deep learning algorithms, and keeping up with the developments in practical uses of AIjr @jamesrichmanx
16K Followers 157 FollowingNamdev Kambli @namd89465
1 Followers 27 Followingmforgione @mforgione3
9 Followers 83 Following Researcher at IDSIA - Dalle Molle Institute for Artificial Intelligence银河 @ynh1661160
2K Followers 5K FollowingHC Rørby @codingdecay
0 Followers 7 FollowingLerio Celerio @lerio271
26 Followers 30 FollowingAnthropic @AnthropicAI
262K Followers 26 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at https://t.co/aRbQ97uk4d.Jack Clark @jackclarkSF
67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futuresOriol Vinyals @OriolVinyalsML
166K Followers 82 Following VP of Research & Deep Learning Lead, Google DeepMind. Gemini co-lead. Past: AlphaStar, AlphaFold, AlphaCode, WaveNet, seq2seq, distillation, TF.Amanda Askell @AmandaAskell
26K Followers 653 Following Philosopher & ethicist teaching models to be good @AnthropicAI. Personal account. All opinions come from my training data.Ferenc Huszár @fhuszar
40K Followers 1K Following Secular Bayesian. Associate Professor in Machine Learning @Cambridge_CL. Talent aficionado at https://t.co/RbJkoLguey Alum of @Twitter, Magic Pony and @BaldertonZachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷Lilian Weng @lilianweng
94K Followers 148 Following Working on AI safety, past on robotics, applied research @OpenAI; Writing ML blogs to help myself & others to learn; Ideas my own.Irina Rish @irinarish
9K Followers 994 Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; INCITE project PI https://t.co/0rV7szd7rH; CSO https://t.co/XDhj6MEtUjNando de Freitas 🏳.. @NandoDF
97K Followers 658 Following I research intelligence to understand it and to harness it wisely. Path: Wits, Cambridge, Berkeley, UBC, Oxford, DarkBlueLabs, Google DeepMindCatherine Olsson @catherineols
15K Followers 1K Following Hanging out with Claude, improving its behavior, and building tools to support that @AnthropicAI 😁 prev: @open_phil @googlebrain @openai (@microcovid)Joshua Batson @thebasepoint
2K Followers 707 Following trying to understand evolved systems (🖥 and 🧬) interpretability research @anthropicai formerly @czbiohub, @mit mathAdam Jermyn @AdamSJermyn
1K Followers 188 Following AI Interpretability & Safety @AnthropicAI. Previously at @FlatironInst @FlatironCCA, @KITP_UCSB, PhD @Cambridge_Uni, BS @Caltech.Trenton Bricken @TrentonBricken
6K Followers 2K Following Trying to figure out what makes minds and machines go "Beep Bop!" @AnthropicAIderek guy @dieworkwear
805K Followers 964 Following Menswear writer. Editor at @putthison. Creator of @RLGoesHard. Bylines at The New York Times, The Washington Post, The Financial Times, Esquire, and Mr. PorterColeman Hughes @coldxman
350K Followers 903 Following Conversations w/Coleman Podcast | Forbes 30 Under 30 | Contributor at @theFP | Analyst at @CNN https://t.co/cwLQsfPK19Michael Sellitto @MPSellitto
1K Followers 2K Following @AnthropicAI, @CNASdc, @StanfordHAI Formerly: @WhiteHouse NSC 2015-2018, @NSC44, @StateDept, @ODNIgov Personal viewsThe Folio Society @foliosociety
59K Followers 3K Following We are a unique and proudly independent publisher, creating beautiful books for 75 years. Made for book lovers, by booklovers. Here to help Mon-Fri 9:30-17:30.Forecasting Research .. @Research_FRI
580 Followers 21 Following Research institute focused on developing forecasting methods to improve decision-making on high-stakes issues, led by chief scientist Philip Tetlock.Andreas Tolias Lab @AToliasLab
4K Followers 683 Following to understand intelligence and develop technologies by combining neuroscience and AILiv @livgorton
607 Followers 268 Following ✨just a girl reading papers and writing code✨ | “Thanks for being such a wonderful human to interact with.” - ClaudeKeith Frankish @keithfrankish
30K Followers 2K Following Philosopher, writer, Ελληνοβρετανός. Hon Professor @sheffielduni. Mind, consciousness, illusionism, cog-sci, Ελλάδα. Podcast: https://t.co/kyMR0mRBqmEric Reinholdt @EricReinholdt
2K Followers 376 Following architect . entrepreneur . dad . guitar player . metal head . hiker .Tristan Hume @trishume
6K Followers 330 Following Performance optimization lead @AnthropicAI. Profiling, distributed systems, dev tools, interpretability. [email protected]Jason Matheny @JasonGMatheny
8K Followers 373 Following President and CEO of the @RANDCorporation, a nonprofit, nonpartisan research org that helps improve policy and decisionmaking through research and analysis.Emily Oster @ProfEmilyOster
109K Followers 283 Following Data-Driven Pregnancy and Parenting Economist @BrownUniversity Author #ExpectingBetter, #Cribsheet, #FamilyFirm, #TheUnexpected CEO of https://t.co/Q4hHIERBD5 👇Jacob Steinhardt @JacobSteinhardt
7K Followers 67 Following Assistant Professor of Statistics, UC BerkeleyTina White @CristinaRWhite
91 Followers 164 Following Researcher, nonprofit founder @CovidWatch. AI alignment, privacy-preserving technology, machine learning, aerodynamics. Emergent ventures grantee.John Nerst @everytstudies
4K Followers 867 Following big picture-fetishist | aspiring erisologist ("the study of disagreement and intellectual difference") | lover and hater of words/philosophy/artSimon Sarris @simonsarris
57K Followers 982 Following 🕯 In labouring to be concise, I become obscure. 🕯 Alchemist, sacred things, making things 🕯 The map is mostly water. 🌜 I make GoJS: https://t.co/7yYIMFfAtdandy jones @andy_l_jones
4K Followers 326 Following engineering & research at @AnthropicAI. DC, SF, LondonDario Amodei @Dario_Amodei
2K Followers 15 FollowingTom Brown @nottombrown
5K Followers 524 Following @AnthropicAI, GPT-3, AI alignment, robustness, etc. Cautiously optimistic.Kamal Ndousse @kandouss
2K Followers 495 Following AI @AnthropicAI Social learning enthusiast. Opinions and dumb jokes my own.Daniela Amodei @DanielaAmodei
6K Followers 300 Following President @AnthropicAI. Formerly @OpenAI, @Stripe, congressional staffer, global development@nelhage @nelhage
4K Followers 765 Following I've quit Twitter. Find me: https://t.co/e9ivqRR9JA https://t.co/oTZrAyGRU6 https://t.co/9fFULpcdVaSocial ch402 @ColahSocial
34 Followers 16 Following Social account of @ch402. Pushing myself to be genuine and vulnerable.Lulie @reasonisfun
16K Followers 486 Following Epistemology applied to everything. 💫 Host of Reason Is Fun podcast w/ @DavidDeutschOxf 🎙️ Taking critical rationalism into life – how to improve both.The White House @WhiteHouse
8.8M Followers 6 Following Welcome to the Biden-Harris White House! Tweets may be archived: https://t.co/UbZQo0sWVfLaurens Gunnarsen @MathPrinceps
1K Followers 235 Following Mathematical physicist and mentor to mathematically talented youth. Talent is that which bridges the gap between what can be taught and what must be learned.Eli Tyre @EpistemicHope
2K Followers 138 Following Trying to understand the world (my relationship to twitter: https://t.co/7UrZIBBeKS…)jennifer daniel @jenniferdaniel
15K Followers 1K Following Unicode ESC, Chair: 🥹🫠🫥🥲🫡🫢🫣😮💨😵💫😶🌫️❤️🔥❤️🩹🫦🫧🫗🪬 | Emoji Kitchen Chef 🧑🍳 | https://t.co/EYn9XPVsOCuncatherio @uncatherio
2K Followers 1K Following wholesomeness practitioner; user of words // profile pic used to look like @catherineols upside-down 🙃Kanjun 🐙🏡 @kanjun
17K Followers 487 Following understanding human & machine minds to build a creative abundant future. CEO @imbue_ai. support founders @outsetcap. co-organize https://t.co/H1aXYk96ja.David Chapman @Meaningness
31K Followers 137 Following Better ways of thinking, feeling, and acting—around problems of meaning and meaninglessness; self and society; ethics, purpose, and value.David Luan @jluan
9K Followers 1K Following led Google’s large models effort, director @googleai. former vp engineering @openai. interested in ML + society. all about type II fun.Nat.Sec.L. Podcast @NSLpodcast
7K Followers 21 Following The latest national security law debates w/ Professors @BobbyChesney & @steve_vladeck, at https://t.co/pFWVjZ6BOFBill Hilton 🇺🇦 @billhilton
2K Followers 1K Following I create stuff for adult piano learners. I'm especially interested in finding effective, low-cost ways in which older adults can develop their musical skills.Jim O’Neill @regardthefrost
5K Followers 2K Following Longevity, science, health, and peace. e/acc. Co-founder of the Thiel FellowshipSophia Sanborn @naturecomputes
4K Followers 3K Following Theory, ML, neurotechnology @ https://t.co/OmhC0RyxZp | Organizer @neur_reps | Prev: @geometric_intel @berkeley_ai @redwood_neuro @intelai @harvardNicolas Papernot @NicolasPapernot
10K Followers 665 Following Security and Privacy of Machine Learning @Uoft @VectorInst @Google 🇫🇷🇪🇺🇨🇦 Co-author https://t.co/VJF39DQPCu; @CentraleLyon + @PSUEngineering alumnus. Opinions mineLaura 🌲 ⛰️ @LauraDeming
44K Followers 194 Following I like molecules and thought experiments! personal: (notes on research + longevity) https://t.co/SoImhWr11i work: https://t.co/iVWTaFOzt2 and https://t.co/YBd4qE6jR6Danny Hernandez @Hernandez_Danny
3K Followers 545 Following Measuring and forecasting AI progress @AnthropicAI.
Love it! I was thinking I'd really like to do this the other day. Now I don't have to!
We have a, uh, tradition of “bribing” 4yo with candy when she’s ready for school on Mondays (see QT). @Gena_I_Gorlin and I are traveling, sitters are living with kids. Sitters thought that it was candy *every* morning. Today 4yo *declined candy*, corrected and chided sitters.
The hardest part of 3yo’s week was Monday school drop off. She loves school but hates transitions, and this is a big one. After trying everything else, I resorted to bribing her with candy. I even branded it explicitly to her: the Monday Morning Bribe. Well, it, uh, “worked”…
@uncatherio FWIW I've had a lot of luck getting interior design advice from Claude. Give it a photo of my space and ask about alternatives/furnishing options.
@ArtirKel @shae_mcl like reading a book as the sun sets in your little nook pin.it/34fPVN20k
@ArtirKel @shae_mcl it just can’t compare to cozy maximalist plant-filled cabin pin.it/2ou3PjMpI
Just wanted to share some good news: After really truly almost dying of cancer, beyond the point we’d all accepted it & she was stopping treatment, my mom’s scans in Dec. showed sudden & miraculous improvement after a Hail Mary & today she heard she doesn’t have cancer anymore.
@ben_mathes @Noahpinion (or at least similarly right to restricting the overall housing supply, which I tend to fixate on as the cause of the high housing prices)
@ben_mathes @Noahpinion of course! I don't endorse that view and I agree it's reprehensible. but before reading that post, I thought "blaming the techies" was both descriptively and normatively wrong. the post suggests it's descriptively sorta right
google cloud insisting i do a sales call in order to get more than a single GPU is going to be the reason i go with another cloud provider 💀💀 another provider let me have my nice little A100 cluster just because. didn’t have to talk to anyone.
@flawedaxioms @anveio idk if not doing this requires galaxy brain neurotypical social skills tbf.
i worry a lot that sometimes when people make these arguments it’s because they want to allow themselves to indulge one of the things they most deeply want for themselves but they can’t without feeling guilty. it’s okay to sometimes do things just because we really want to.
this fundamentally isn’t a well-reasoned position. that’s kind of the point. it is such a beautiful part of life and i want it to feel untouched by the responsibility to others that permeates most of my other choices. i want to enjoy it without any guilt.
one of the most not consequentialist takes i have is that it makes me sad when people feel the need to justify having kids as an effective choice at all. i really want a family. maybe it isn’t quantitatively justifiable. maybe by doing so, the world is somehow a net worse place
Every ethical argument for having children is dominated by other options that are more effective. 1. If you’re worried about population issues, just donate $10k to bednets That’s about the equivalent of two extra children existing in the world. It also does more good 🧵 1/
today someone asked me if i was EA and instead of explaining “well… you know, i’m a little bit adjacent but… blah blah” i actually said yes?? what does this mean??
Extremely cool work from @saprmarks! I think this is one of my favourite SAE papers since Towards Monosemanticity. I'm particularly excited about the use of error nodes, without which SAEs are a bit too janky to do reliable circuit analysis with
Can we understand & edit unanticipated mechanisms in LMs? We introduce sparse feature circuits, & use them to explain LM behaviors, discover & fix LM bugs, & build an automated interpretability pipeline! Preprint w/ @can_rager, @ericjmichaud_, @boknilev, @davidbau, @amuuueller