Raphaël Millière @raphaelmilliere
AI & Cognitive Science @UniofOxford @EthicsInAI Fellow @JesusOxford @raphaelmilliere.com on 🦋 Blog: https://t.co/2hJjfShFfr raphaelmilliere.com Oxford, UK Joined May 2016
Tweets: 3K
Followers: 11K
Following: 3K
Likes: 8K
@dubova_marina @cogsci_soc Congrats Marina!
Great work! See also arxiv.org/abs/2603.05414 from @LedermanHarvey & @kmahowald. This is a nice cautionary tale about Morgan's canon in interpretability: "introspection" here is closer to anomaly detection with confabulation than to direct/privileged access to injected content.
1/ Can LLMs introspect, i.e., reason about their internal states? Recent work claims LLMs notice when their "thoughts" get tampered with, and can report their content. We looked closely and we think it's too early to say that. Work led by @shashwat_s19 , with @tallinzen and me.
@GoukiMinegishi Thanks! I'll be in Seoul, we should chat
Some brief comments on the “meat computer” metaphor for humans in today’s New York Times: nytimes.com/2026/05/24/bus…
I still occasionally hear people claim that LLMs are hilariously bad at arithmetic. Another reminder that it's not 2022 anymore.
I redid the multi-digit multiplication experiment, now with gpt-5.5. With medium reasoning and 7 samples each cell, it pretty much aced the test with 99.46% accuracy. The model had no tools to call and had to rely on its reasoning. Can it go further? (1/4)
News to me! (from this slopfest: startupresearcher.com/news/h-company…)
@nikhil07prakash @GoodfireAI Congrats! Excited to see what you work on there
@francoisfleuret @TMoldwin What do you mean by “knowledge”? 🙃
@karinavold @TorontoSRI Thanks for having me!
New opinion piece on the interface between research on concepts and categories in minds vs. in neural network LMs! I take the position that there is much to be learned from this interface (e.g., learning about concepts from language alone) and outline some directions for future.
all mech interp people are bought into causality, this criticism is very lazy as of ~2 years ago. since this is a subtweet of NLAs, it is worth pointing out that their steering experiments on the poetry and eval awareness tasks *do* test for (in those cases) causality!
Guys, stop pestering Mech Interp researchers about causality please! It's this inexplicable obsession with causality that made us lose beautiful sciences like Astrology, Palmistry and Phrenology! 😡
@littmath POV you're Spinoza
pov: you are a natural language autoencoder and you are aware you are being subject to evals by Redwood Research. do you fake writing out a coherent cot or truthfully say "the math problem is giving me 92ish vibes"?
How well does this work? One quick independent test is to see if it can recover an "internal CoT" in cases where AIs can solve math problems in a single forward pass. TLDR: it doesn't. (TBC, this might require the NLA to see activations at multiple positions/location to work.)
@elyasbuilds I like activation steering as much as the next guy, but this isn't what I was referring to: x.com/raphaelmillier…
@jatin_n0 Mostly a joke, it's a cool paper! yes the planning result is causal but only looking at total effect (i.e. an NLA-derived resid stream edit changes the output). I was referring to causal effect on the model's downstream computations, not anything inside/after the autoencoder. 1/2
New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.
@jatin_n0 An additive AR-difference vector can change the output while acting as a broad steering perturbation without showing that the described content actually maps onto the operative feature in the model's putative "rhyme-planning" circuit 3/3
@jatin_n0 What's missing is evidence about causal mediation: whether the NLA-described "rabbit plan" is the variable later components read, whether the edit produces a coherent "mouse plan" in later layers/tokens, whether ablating/patching intermediate states blocks or restores the effect 2/
@Dr_Atoosa @GoogleDeepMind Congrats! Looking forward to welcoming you back on this side of the pond :)
David Chalmers @davidchalmers42
42K Followers 661 Following philosopher@NYU. consciousness, reality+, life, the universe, and everything.
Melanie Mitchell @MelMitchell1
51K Followers 670 Following Professor, Santa Fe Institute. Mostly posting on https://t.co/4NpA2IL5Va (at-melaniemitchell). More thoughts at https://t.co/nC43NHRozX.
Chaz Firestone @chazfirestone
19K Followers 937 Following Cognitive scientist @JohnsHopkins. 🇨🇦 also: https://t.co/96gKNz0VWF
near @nearcyan
169K Followers 1K Following perhaps of the past, but greener pastures may still await us
Keith Frankish @keithfrankish
32K Followers 2K Following Philosopher, writer, Greek-Briton. Honorary Professor, University of Sheffield. Mind, consciousness, illusionism, cog-sci, Greece. Also at @philosynthesis_
Robert Long @rgblong
9K Followers 1K Following executive director of @eleosai AI consciousness and AI welfare
Miles Brundage @Miles_Brundage
72K Followers 13K Following AI policy researcher, @lfschiavo wife guy, fan of cute animals and sci-fi, executive director of AVERI (https://t.co/qq9xcmKQas), Substacker, views my own
Felix Hill @FelixHill84
12K Followers 740 Following Research Scientist, Deepmind I try to think hard about everything I tweet, esp on 90s football and 80s music None of my opinions are really someone else's
Jacob Andreas @jacobandreas
24K Followers 947 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Sam Bowman @sleepinyourhat
65K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. Into @givingwhatwecan.
Jonathan Birch @birchlse
17K Followers 1K Following Professor, LSE. Philosophy of science, animal consciousness, animal ethics. Director of The Jeremy Coller Centre for Animal Sentience.
Julian Togelius @togelius
23K Followers 1K Following Researcher. AI, games, markets, open-endedness, evolution. Professor @nyuniversity @NYUGameLab Head of AI @the_nof1 Co-founded @modl_ai Rogueliker.
Thomas G. Dietterich @tdietterich
62K Followers 650 Following University Distinguished Professor (Emeritus), Oregon State Univ.; Former President, AAAI; Currently Chair CS Section of ArXiv
Amanda Askell @AmandaAskell
102K Followers 662 Following Philosopher & ethicist trying to make AI be good @AnthropicAI. Personal account. All opinions come from my training data.
Ethan Caballero @ethanCaballero
12K Followers 2K Following ML @Mila_Quebec ; previously @GoogleDeepMind
Inês Hipólito @ineshipolito
8K Followers 1K Following Exploring minds & AI at the intersection of philosophy and cognitive science. Assistant Prof, speaker. 🧠🌿✨
François Fleuret @francoisfleuret
52K Followers 469 Following Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @neural_concept_. I like reality.
Micah G. Allen @micahgallen
24K Followers 4K Following Prof. of Computational Neuroscience @AUclinical, PI @visceral_mind - research on interoception, metacognition, & brain-body interaction.
Patrick Mineault @patrickmineault
23K Followers 3K Following NeuroAI researcher @ Amaranth Foundation, safety, open science. Previously engineer @ Google, Meta, Mila.
Riccardo Cadei @riccardocadeii
270 Followers 827 Following PhD student in Causal Learning and AI @ISTA | ex @Harvard @EPFL
Alejandro De Los Ange... @delosangeles520
106 Followers 336 Following Physician-scientist in training (UCF COM). Research Affiliate, Yale. Psychiatry, brain development, stem cells & ethical AI
Juan @juanpcadile
600 Followers 1K Following Philosophy PhD @ URochester; AI Strategy & Interpretability.
Tamar Rott Shaham @TamarRottShaham
711 Followers 450 Following Incoming assistant prof at @WeizmannScience, postdoctoral fellow at @MIT_csail
Lled @dbsngdjdbdk
15 Followers 2K Following
Arash @incrementaliser
110 Followers 407 Following Computational Linguist (PhD) 📖 | Interested in cognitively-plausible models of dialogue understanding 🧠🤖💬 | On the research scientist job market 🔦
Bronson Schoen @BronsonSchoen
303 Followers 930 Following
xtwirer.account @xtwirer
58 Followers 2K Following
Startup Policy Forum @SPF_India
351 Followers 724 Following The Startup Policy Forum (SPF) is a pioneering alliance dedicated to advancing India’s rapidly growing new-age economy.
Max Cembalest @maxcembalest
399 Followers 2K Following trying to explain logarithms as well as possible
AIcontributors @aicontributors
10 Followers 2K Following
Yasaman Bahri @yasamanbb
6K Followers 1K Following Research Scientist @GoogleDeepMind // AI + physics // Ph.D. @UCBerkeley.
Tiago Pimentel @tpimentelms
2K Followers 320 Following Postdoc at @ETH_en. Formerly, PhD student at @Cambridge_Uni.
Shun Yoshizawa @Spectrum_cj
396 Followers 2K Following AI Alignment, Safety and Consciousness Researcher (RA) | BS Physics Candidate 🇯🇵
Samuel Kimbriel @sckimbriel
1K Followers 595 Following Founding Director Philosophy & Society @aspeninstitute. LINK TO WRITING: https://t.co/VSaKCtgN25. Loneliness: https://t.co/mTxGU70fSR
Alexander Rose | Hype... @hypertheoryalex
73 Followers 647 Following recursively self improving since ‘97. Researcher @uniofoxford, @ethicsinai Personal and universal views. Yes, I have a podcast/blog.
Joe Ochoa @JoeOcho59929743
170 Followers 2K Following
Julian Minder @jkminder
793 Followers 597 Following PhD at EPFL with Robert West and Ryan Cotterell, MATS 7 Scholar with Neel Nanda
Arman Madhvani @Arman_madhvani
73 Followers 342 Following A collection of thoughts that either keep me up all night or put me right to sleep
Audrey Horne Updates ... @credenzaclear2
40K Followers 3K Following acquired taste // secret ballot coeditor
Axel Cleeremans @axelcleeremans
3K Followers 949 Following Brain, minds, & consciousness. Also interested in design, visual arts, biology, space, typography, and anticipation. Also known as “Axel from Belgium”.
Fabien @Fabien_Mikol
5K Followers 1K Following Unable to stay within one field 🤷♂️ - Geoffrey Hinton: "We need to think hard about what's going to happen next, and we just don't know"
Kevin Rose @kevinrose
1.5M Followers 2K Following building at @basic_in (@digg) | Podcasts: The Kevin Rose Show, Random Show w/ @tferriss. | Ex: @google, Board of Directors: @ouraring, @hodinkee
Monique Crichlow @MoniqueCrichlow
64 Followers 221 Following AI policy and innovation business planning expert.
Momo @th000fly100
37 Followers 5K Following
Bryan Bastidas @BryanBRstds
27 Followers 2K Following
3cH0_Nu1L🇨🇳☭ @3cH0_Nu1L
26 Followers 1K Following A Communist. An avid fan of cyber security and cryptography.
vals🔸 @ValsTutor
1K Followers 2K Following philosophy and psychology, understanding and community. good reasoning and systemized winning enjoyer. dm me your life problems. less serious @TutorValsLife
eristic @vol_ition
297 Followers 7K Following A pseudonym's pseudonym; 'you can be arrogant or ignorant, but never both'
𝔫𝔢𝔯𝔟 @methylphenidev
2K Followers 5K Following ኃጢአተኛ ነፍስህ ከድነት በላይ ነው እናም ሰላምን እና ህመምንም አታውቅም ፣ ብቸኛው የባዶነት ቅዝቃዜ ብቻ ነው የንስሐ ጊዜ ተጠናቅቋል ፣ ምክንያቱም ጥፋቶችህ በክፉ ዓይነትህ ከማንኛውም ተልእኮ ይበልጣሉ መጨረሻው ቅርብ ነው ፣ የኃጢያት መርከቦች
mads @ForestTrading22
44 Followers 2K Following
Dorian Liu @LiuZT612
25 Followers 537 Following MSc Mind, Language & Embodied Cognition @EdinburghUni Phil of mind & cogsci|AI ethics|AI character |Consciousness
Dani Hinjos @brainydani
12 Followers 78 Following interpreting biological artificial intelligence | @BSC_CNS @IRBBarcelona @la_UPC
Jatin Nainani @jatin_n0
262 Followers 613 Following Explorer, researcher, engineer | Mechanistic Interpretability | Comp Bio
Jan Zilinsky @janzilinsky
6K Followers 1K Following Researching the role of technology in democratic politics. NYU PhD. Using LLMs as instruments, not oracles.
Nidal @nselmi
2K Followers 2K Following What's the Kolmogorov Complexity / Minimum Description Length of a Reasoning Language Model? LLMs, AI/ML, Data Science, Graph DBs. PhD student. 🇹🇳➡️🇺🇸
Diana Markova @heavyhelium11
59 Followers 2K Following a rose is a rose is a rose ~ G.S. ; reposts != full endorsements ; wisdom/acc
MaxBob @MaxBobeqyi
2 Followers 25 Following
Aran Komatsuzaki @arankomatsuzaki
178K Followers 362 Following Sharing AI research. Early work on AI (GPT-J, LAION, scaling, MoE). Ex ML PhD (GT) & Google.
Anthropic @AnthropicAI
1.3M Followers 35 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
Anil Seth @anilkseth
60K Followers 933 Following Neuroscientist: consciousness, perception, & dreamachines. TED speaker, & author: Being You - A New Science of Consciousness.
Lucas Beyer (bl16) @giffmana
138K Followers 600 Following Researcher (now: Meta. ex: OpenAI, DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://t.co/xe2XUqkKit ✗DMs → email
Kording Lab 🦖 @KordingLab
64K Followers 3K Following Konrad Kording, @Penn Prof, deep learning, brains, #causality, rigor, https://t.co/tTJW05RRfa, https://t.co/qf7ZHxjaK1, Transdisciplinary optimist, Dad, Loves outdoors, 🦖
Jack Clark @jackclarkSF
132K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkIJ2 Past: @openai, @business @theregister. Neural nets, distributed systems, weird futures
Neel Nanda @NeelNanda5
40K Followers 122 Following Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!
Geodesic Research @GeodesResearch
719 Followers 175 Following We're behind https://t.co/qHdncajB6V. Building the base of alignment.
Jiaxin Wen @jiaxinwen22
6K Followers 194 Following research @berkeley_ai @anthropicai. prev @tsinghua_univ.
Haoxiang Wang @Haoxiang__Wang
2K Followers 1K Following Research Scientist at Luma AI. Prev. NVIDIA researcher. PhD from UIUC.
Tilde @tilderesearch
5K Followers 10 Following We build foundational understanding of models to advance the frontier of intelligence.
Vamshi Krishna Bonagi... @VictorKnox99
559 Followers 702 Following Interp and Alignment | prev CHAI @UCBerkeley | SPAR mentor | PhD @MBZUAI | prev @precogatiiith
Ingi Erlingsson 🪄 @ingi_erlingsson
10K Followers 2K Following Co-founder and CEO of @thesystms. Previously founder @GoldenWolf (acq. 2023) / chief content officer @doodles
Constellation Institu... @ConstellOrg
554 Followers 28 Following Bringing experts and leaders together to navigate transformative AI
Nymph @RhizoNymph
7K Followers 3K Following Rhizomatic technomancer exploring the interconnectedness of all things. Backend/infra eng obsessed with AI, mech interp, performance, and dist sys. 🏳️⚧️
The OpenAI Foundation @FoundationOAI
7K Followers 0 Following OpenAI was founded in 2015 as a nonprofit; its mission is to ensure artificial general intelligence benefits all of humanity.
SemiAnalysis @SemiAnalysis_
95K Followers 27 Following
Wilka Carvalho @cogscikid
2K Followers 920 Following @KempnerInst research fellow @Harvard. trying to understand the human reinforcement learning algorithm. hope to build AI that helps us live rewarding lives.
Jon Barron @jon_barron
34K Followers 1K Following Principal research scientist at Google DeepMind. Synthesized views are my own.
Andy Hall @ahall_research
11K Followers 2K Following Building free systems. Prof @StanfordGSB, Senior Fellow @HooverInst. Advisor, @a16zcrypto, @ByForumAI. Writing at https://t.co/K0BfKKi4sM
Josh Woodward @joshwoodward
63K Followers 778 Following VP, @Google @GoogleLabs @GeminiApp @GoogleAIStudio
Tufalabs @tufalabs
1K Followers 52 Following We are a small, independent research group working on fundamental AI research. The initial focus is on ARC
Future of Life Instit... @FLI_org
89K Followers 1K Following We work on reducing extreme risks and steering transformative technologies to benefit humanity. RT /=/ endorsement. Bluesky: https://t.co/IjvxJtEEeQ
The Future Society @thefuturesoc
5K Followers 302 Following The Future Society is an independent nonprofit organization based in the United States and Europe with a mission to align AI through better governance.
Kristy Loke @kristy_loke
562 Followers 1K Following Researching China AI & global governance @matsprogram | Analyses in @techreview @ieeespectrum | Website: https://t.co/MIxqRIam0H
Rosmine @rosmine
3K Followers 567 Following Independent researcher | AI advisor | Distribution Fine Tuning (DFT) for better LLM writing quality
Akanksha @akankshanc
2K Followers 861 Following Passionately in love with Science, Altruistic, Engineer, Amateur Astronomer & Critical thinker. Current Research focus: ▫️Mechanistic Interpretability▫️
Adam Belfki @adambelfki
98 Followers 266 Following Software Engineer @ndif_team | CS @Northeastern 24' AI Interpretability & Safety.
Sixing Chen @_SixingChen
114 Followers 508 Following Ph.D. student @nyuniversity @NYUPsych | B.S. @PKU1898
David Bessis @davidbessis
19K Followers 460 Following Rogue mathematician. "The product of mathematics is clarity and understanding." — Bill Thurston https://t.co/l95RHuWz2S
Recursive @Recursive_SI
6K Followers 0 Following Recursive self-improving superintelligence to automate knowledge discovery.
⿻ Audrey Tang 唐... @audreyt
276K Followers 214 Following 🇹🇼 Cyber Ambassador, 1st Digital Minister (2016-2024) & 🌐 1st 🏳️⚧️ cabinet minister.
Leon Lang @Lang__Leon
2K Followers 653 Following Head of curriculum development at Iliad. Previously PhD in AI safety and multivariate information theory @AmlabUva. Trying to prevent x-risks from AI.
Jiayi Weng @Trinkle23897
12K Followers 184 Following MTS @openai, author of the entire post-training RL infra, core contributor of ChatGPT/GPT4/GPT4o etc. 30U30
Charles Curran @charliebcurran
83K Followers 4K Following Filmmaker // Director of See Know Evil // Menace Studio
Helen Toner @hlntnr
36K Followers 1K Following AI, national security, China. Part of the founding team at @CSETGeorgetown (opinions my own). Author of Rising Tide on substack: https://t.co/LKAoyL00iB
Tim Hua 🇺🇦 @Tim_Hua_
1K Followers 1K Following AI safety, Econ, new liberalism, math, and a bit of art history (as a treat) Behavioral evaluations @TransluceAI. Prev Astra, MATS & Walmart's Econ Team
Shane Legg @ShaneLegg
81K Followers 66 Following Chief AGI Scientist & Co-Founder, Google DeepMind Work website: https://t.co/E4SyeGVYXk Personal blog: https://t.co/LL9JNdNpW1
Meridian Labs @meridianlabs_ai
798 Followers 11 Following Open source tools for frontier AI research and evaluation
Cameron Holmes @CameronHolmes92
398 Followers 974 Following Research Manager in Alignment @ AISI. prev MATS. Market participant, EA. Parenting like Dr Louise Banks
Paul de Font-Reaulx @PReaulx
225 Followers 649 Following Working on AI evals | Research in cogsci and decision theory | PhD in Philosophy at UMich prev Oxford
The Aspen Institute @AspenInstitute
110K Followers 4K Following We drive change to help solve the greatest challenges of our time. We are a nonprofit, nonpartisan organization. RT ≠ endorsement.
Harish Kamath @kamath_harish
1K Followers 547 Following interp @anthropic | previously ML @scale_ai, @GeorgiaTech
Technical AI Safety C... @tais_2026
329 Followers 40 Following TAIS 2026 will bring together leading AI safety experts to discuss how to make AI safe, beneficial, and aligned with human values.
Lucius Bushnaq ⏹️ @BushnaqLucius
324 Followers 160 Following Ours is the era of inadequate AI alignment theory.
Dan Braun @danbraunai
295 Followers 92 Following big == complex. small == simple. many_small == hopefully simple.