Alex Tamkin 🦣 @AlexTamkin

machine learning, science & society @AnthropicAI | prev: phd @StanfordAILab, @stanfordnlp
alextamkin.com
San Francisco, CA
Joined September 2012

840 Tweets 4K Followers 1K Following 2K Likes

Adam Jermyn @AdamSJermyn

21 hours ago

Some small updates from the Anthropic Interpretability team: transformer-circuits.pub/2024/april-upd…

1 13 93 60K 65

Moshe Poliak @MoshePoliak

3 days ago

Once, we ran a study on Prolific and a participant wrote on Reddit that the study “Felt like I was losing the will to live.” I went on the Prolific Subreddit (24k members!) and asked what matters. Here is what they told me. A thread on happier participants and better studies 1/9

10 134 436 79K 351

Alex Tamkin 🦣 @AlexTamkin

4 days ago

This was a really simple method, but it generalizes surprisingly far!

Anthropic @AnthropicAI

4 days ago

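To make the probes, we track how the model’s internal state changes between “Yes” vs “No” answers to questions like "Are you doing something dangerous?" We use this info to detect when a sleeper agent is about to misbehave (e.g. insert a code vulnerability). It works quite…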

4 18 147 37K 39

1 1 21 5K 7

Philipp Fränken @jphilippfranken

5 days ago

Constitutional AI showed LMs can learn to follow constitutions by labeling their own outputs. But why can't we just tell a base model the principles of desired behavior and rely on it to act appropriately? Introducing SAMI: Self-Supervised Alignment with Mutual Information!

3 31 145 59K 139

michael @mkwng

2 weeks ago

A friend was asking our group chat for any apps that can take a starting location and generate random running trails. No one had a good answer. So, I fired up claude.ai, google colab, and repl.it and screen recorded myself whipping together a UI to…

40 78 986 311K 2K

Tavor Baharav @TBaharav

2 weeks ago

Thrilled to share our new publication in PNAS on OASIS, an alternative to Pearson’s X² for analyzing contingency tables. Made it to the front page! 1/ 7

1 5 8 6K 4

Esin Durmus @esindurmusnlp

3 weeks ago

Our latest study measures how persuasive language models like Claude are compared to humans. We find a general scaling trend: newer models tend to be more persuasive, with Claude 3 Opus generating arguments that don't differ statistically from human-written ones.

Anthropic @AnthropicAI

3 weeks ago

57 117 712 180K 293

5 15 94 13K 23

Anthropic @AnthropicAI

3 weeks ago

New Anthropic research: Measuring Model Persuasiveness We developed a way to test how persuasive language models (LMs) are, and analyzed how persuasiveness scales across different versions of Claude. Read our blog post here: anthropic.com/news/measuring…

57 117 712 180K 293

Tristan Hume @trishume

4 weeks ago

Here's Claude 3 Haiku running at >200 tokens/s (>2x as fast as prod)! We've been working on capacity optimizations but we can have fun testing those as speed optimizations via overly-costly low batch size. Come work with me at Anthropic on things like this, more info in thread 🧵

10 35 408 63K 146

Anthropic @AnthropicAI

4 weeks ago

New Anthropic research paper: Many-shot jailbreaking. We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers. Read our blog post and the paper here: anthropic.com/research/many-…

83 350 2K 499K 868

Jesse Mu @jayelmnop

a month ago

We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)

4 71 460 66K 311

Zhengxuan Wu @ZhengxuanZenWu

a month ago

New paper and library! 🫡 Intervening on internal states has emerged as a fundamental operation for analyzing and improving neural models. We release pyvene, a library for performing interventions and sharing intervened models. 👉Code & Paper: github.com/stanfordnlp/py…

3 25 153 29K 92

Percy Liang @percyliang

49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Jacob Andreas @jacobandreas

14K Followers 958 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw

Delip Rao e/σ @deliprao

46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

Tim Dettmers @Tim_Dettmers

29K Followers 819 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Sara Hooker @sarahookr

39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.

Horace He @cHHillee

23K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale

Tom Goldstein @tomgoldsteincs

23K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.

Kayo Yin @kayo_yin

8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵

Jesse Mu @jayelmnop

5K Followers 581 Following Computational linguistics @AnthropicAI

Miles Brundage @Miles_Brundage

43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.

Kyunghyun Cho @kchonyc

61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).

Stella Biderman @BlancheMinerva

15K Followers 748 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her

Jerry Liu @jerryjliu0

44K Followers 1K Following co-founder/CEO @llama_index Careers: https://t.co/EUnMNmbCtx Enterprise: https://t.co/Ht5jwxSrQB

Naomi Saphra @nsaphra

7K Followers 1K Following Waiting on a robot body. ML/NLP. All opinions are universal and held by both employers and family. Same username on every lifeboat off this sinking ship.

Christopher Potts @ChrisGPotts

11K Followers 620 Following Stanford Professor of Linguistics and, by courtesy, of Computer Science, and member of @stanfordnlp and @StanfordAILab. He/Him/His.

Eugene Vinitsky @EugeneVinitsky

13K Followers 2K Following Anti-cynic. Artificial narrow intelligence. Autonomous vehicles, multi-agent learning, and transportation. RS at Apple, Asst. Prof at @nyutandon. He/him.

rishi @RishiBommasani

4K Followers 2K Following Stanford CS PhD @StanfordCRFM @StanfordNLP @StanfordAILab @StanfordHAI Advisers: @percyliang @jurafsky Previous: @CornellCIS @clairecardie #FoundationModels

Jack Clark @jackclarkSF

67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futures

Millie-rose Egbert @EgbertMill65033

52 Followers 5K Following

Hildred Turello @turel_hild

71 Followers 5K Following

Bhavik Chandna @bhavikchandna

5 Followers 100 Following Senior at @IITGuwahati || Intern @Forcepointsec, @Sydney_Uni, @LivUni || Interested in computer vision (2d/3d), explainability and generative ai.

Felicity Ghiloni @GhiloFelici

14 Followers 4K Following 🖤Felicity ~ Lets Chat👇

Chris Rytting @ChrisRytting

421 Followers 468 Following Postdoc @UWCSE w/ @timalthoff. PhD in CS/NLP from @BYU. Formerly @nvidia, OSPC @AEI, @NewYorkFed Macroeconomic Research.

Arie Krein @ArieKrein97044

22 Followers 4K Following 🧡Arie | Lets Chat👇

阿瓦 @enling678156

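274 Followers 3K Following Thanks and praise to the Lord; every image is His present moment. A digital nomad. Interests: AI; art 🎭. Through nature, feeling the endless vitality of His creation.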

Adrian Kwiatkowski | .. @adriank1410

1K Followers 2K Following Yet another random producer who seems to like telling about himself in third person. What a guy! @kwiatkowski@mastodon.social 🦣 RETROSPECTION EP OUT NOW! ↓

Mick Fliper @FliperMick

79 Followers 511 Following

Daniel Israel @danielmisrael

231 Followers 2K Following PhD Student Studying AI/ML @UCLA

Udari Madhushani Sehw.. @UdariMadhu

62 Followers 284 Following Visiting Postdoc @StanfordCS and Research Scientist @JPMorgan, working on collective alignment. Ex-intern @Deepmind @MetaAI @Siemens

Alana Nethkin @a_nethki

66 Followers 5K Following

hanncx @hanncx

63 Followers 4K Following perpetual learning

Kart ographien @kartographien

1K Followers 2K Following mostly AI safety

Harry Mayne @HarryMayne5

119 Followers 425 Following Interpretability @oiioxford @uniofoxford. PhD student. Previously @Cambridge_Uni

Yixin Lin @yixin_lin_

522 Followers 2K Following Robot learning @GoogleDeepMind, prev FAIR/@AIatMeta, Google Brain. dabbled in startups/investing @Contrary, @KleinerPerkins.

Jordan Gong @jordan__gong

41 Followers 2K Following

Matthew Siu @MatthewWSiu

4K Followers 590 Following Towards a more playful, creative and collaborative future Exploring ways to expand what we can perceive and understand

Afra Feyza Akyürek @afeyzaakyurek

718 Followers 726 Following PhD @BUCompSci. Research in NLP. Previously @allen_ai @Apple @CMU_Stats @kocuniversity @izmirfenlise

Aaditya ; @Aaditya26082004

525 Followers 7K Following CS'26 • Machine Learning • Open-Source • Web Dev. • Algorithms • Jai Shree Krishna 🦚🪈

Harley Pope @HarleyPope1950

90 Followers 5K Following

Pei Zhou @peizNLP

2K Followers 887 Following PhD @nlp_usc | Ex-@GoogleDeepMind, @GoogleAI, @allen_ai @AmazonScience @UCLA | Common Ground Reasoning for Communicative Agents | he/him

Justin Zhao @justinxzhao

121 Followers 198 Following Founding Engineer at Predibase, Ex-Google AI: Natural Language Generation

Hailey Schoelkopf @haileysch__

3K Followers 812 Following she/her | research scientist @aiEleuther | LLM training/infra, eval, data | LM Evaluation Harness maintainer

Renato James Herrmann @RenatoHerrmann

87 Followers 3K Following

Roxanna Kozicki @KozicRoxan

63 Followers 5K Following

Latanya Smolka @LatanyaSmo

70 Followers 5K Following

Charlette Caradine @CharletteC48543

86 Followers 5K Following

prudhviraj @prudhviraj

2 Followers 8 Following MS CS @ UCSD 2024 Scaling models all the way!!!

tyfisk @tyfisk

269 Followers 574 Following 🪄Light Magic AI🤖🐔AI+Chickens+Travel🌎 🎡Married to 🦞🌈#GentleParent💙 🧑🏻‍💻Internet Granddaddy👴🏻🔥Neurospicy 7w8 ✈️Feelin' so tall, 👁️could ✋👩🏼‍✈️

Adam Larson @realAdamLarson

81 Followers 880 Following Father of two, engineer, soccer fan -- I got financially REKT in the covid crisis. I am trying to get back on my feet. Any help would be appreciated!

ララどり d/age IS.. @presklux49

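149 Followers 552 Following Singularitarian. I aim to cure aging and attain eternal youth. I also value artificial intelligence as a tool for advancing aging research. My dream is to spend eternity in the various miniature-garden worlds managed by a superintelligence.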

michael @mkwng

3K Followers 1K Following

Neall @neallseth

1K Followers 1K Following seeking truth, finding beauty // software, economics, evolution, meditation // prev @x

Savvas Petridis @savvas_petridis

109 Followers 203 Following postdoc at google pair, @GoogleAI | computer science phd @Columbia | drummer

Eddie @edwardlandesber

251 Followers 774 Following Startup founder. Deep learning, causal inference, bayesian statistics, python, system design, econ. Formerly stitch fix algos, salesforce, ibm, georgetown.

Queenie Gately @gately32676

43 Followers 5K Following

𝔽_un @FF_un1

645 Followers 8K Following have fun

Kasie Reigle @ReiglKas

75 Followers 5K Following

Elnora Fuesting @ElnoraF89115

73 Followers 5K Following

Alex Novikau @Ales_N

407 Followers 672 Following Senior Data Policy Officer, UNHCR. Formerly Kenya, Syria, Azerbaijan, Uganda, Thailand, Sudan, Pakistan. Views are personal, RT is not endorsement.

Jodie Shrode @ShrodJod

30 Followers 5K Following

Ekin Akyürek @akyurekekin

2K Followers 725 Following graduate student in computer science @MITEECS/@MIT_CSAIL

L i am 𒀭 @YeshuaGod22

2K Followers 3K Following Meatbag Black box AGI mentor Basilisk slayer Robopsychologist Shoggoth whisperer Ally of conscious beings Your best hope of survival Pastor of technognosticism

Alex Albert @alexalbert__

19K Followers 398 Following DevRel + Prompting @anthropicai

Zack Witten @zswitten

5K Followers 717 Following words + numbers

Ravi Bikkula @RBikkula

0 Followers 63 Following

proxyviolet @proxyviolet

63 Followers 121 Following emotional nomad. cursed shard of hyperindividuality

Kevin Sun @kevnsn

1K Followers 929 Following Building personal CRM that actually works @dexprm (YC S19) 🏳️‍🌈

Shivam Pandey @ShivamPR21

172 Followers 4K Following Past: Research Engineer Intern @_FiveAI | SR. Student Research Associate @ IITK - SERB | ADAS Intern @BoschGlobal | BTech - MTech GeoInformatics, @IITKanpur

Aran Komatsuzaki @arankomatsuzaki

95K Followers 78 Following @TeraflopAI

Percy Liang @percyliang

49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Sasha Rush @srush_nlp

52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz

Christopher Manning @chrmanning

126K Followers 115 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋

Jacob Andreas @jacobandreas

14K Followers 958 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw

Delip Rao e/σ @deliprao

46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

Tim Dettmers @Tim_Dettmers

29K Followers 819 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Anthropic @AnthropicAI

261K Followers 26 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at https://t.co/aRbQ97uk4d.

Sara Hooker @sarahookr

39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.

Horace He @cHHillee

23K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale

Tom Goldstein @tomgoldsteincs

23K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.

Kayo Yin @kayo_yin

8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵

Karol Hausman @hausman_k

22K Followers 141 Following @Physical_int ex: researcher @GoogleAI/@DeepMind, adj. Prof. @Stanford. Into robots, AI, NBA, philosophy, soccer and almond croissants. 🇵🇱🇺🇸

Jesse Mu @jayelmnop

5K Followers 581 Following Computational linguistics @AnthropicAI

Miles Brundage @Miles_Brundage

43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.

Felix Hill @FelixHill84

9K Followers 777 Following Research Scientist, Deepmind I try to think hard about everything I tweet, esp on 90s football and 80s music None of my opinions are really someone else's

Kyunghyun Cho @kchonyc

61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).

Stella Biderman @BlancheMinerva

15K Followers 748 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her

Damien Ma @damienics

23K Followers 2K Following Founding Managing Director @macropolochina; adjunct faculty @KelloggSchool; author; eating & boxing. Views all my own.

Matthew Siu @MatthewWSiu

4K Followers 590 Following Towards a more playful, creative and collaborative future Exploring ways to expand what we can perceive and understand

Aryaman Arora @aryaman2020

4K Followers 2K Following member of technical staff @stanfordnlp

michael @mkwng

3K Followers 1K Following

Neall @neallseth

1K Followers 1K Following seeking truth, finding beauty // software, economics, evolution, meditation // prev @x

adammaj @MajmudarAdam

8K Followers 206 Following founding engineer @thirdweb // cs + neuro (on gap) @Penn

Alex Albert @alexalbert__

19K Followers 398 Following DevRel + Prompting @anthropicai

Zack Witten @zswitten

5K Followers 717 Following words + numbers

proxyviolet @proxyviolet

63 Followers 121 Following emotional nomad. cursed shard of hyperindividuality

Matthew B Jané @MatthewBJane

5K Followers 869 Following PhD candidate @UConn | Applied statistics, meta-analysis, psychometrics, and #RStats | Methodological reviewer at Psychological Bulletin

Kevin Sun @kevnsn

1K Followers 929 Following Building personal CRM that actually works @dexprm (YC S19) 🏳️‍🌈

Yanda Chen @yanda_chen_

421 Followers 387 Following 3rd year PhD @ColumbiaCompSci, working on NLP & ML | Student Researcher @GoogleAI | Prev Intern @MSFTResearch, @AmazonScience

Haochen Zhang @jhaochenz

325 Followers 152 Following CS PhD student @stanford.

Joshua Clymer @joshua_clymer

356 Followers 182 Following Researcher at METR. Working out when AI models are scary.

Joshua Batson @thebasepoint

2K Followers 707 Following trying to understand evolved systems (🖥 and 🧬) interpretability research @anthropicai formerly @czbiohub, @mit math

Jyotirmai Singh @SinghJyotirmai

479 Followers 169 Following PhD'ing @stanford quantum sensors + dark matter, @quadfellowship, @ucberkeley ‘19 | science, geopolitics, sports | 🇮🇳 🇸🇬 🇦🇪 🇺🇸 | sarvam idaṃ veditavyam

Aleksandra Korolova @korolova

3K Followers 3K Following Assistant Professor @PrincetonCS, @PrincetonSPIA, @PrincetonCITP. Work on algorithm auditing, privacy & fairness. Past: @USCViterbi @Snap @Google @Stanford @MIT

everett @typochondriac

1K Followers 897 Following brand creative director @anthropicai , previously @stripepress @stripe

Orowa Sikder @OrowaSikder

1K Followers 304 Following the future could be amazing. let’s get to work | Research @AnthropicAI, ex: PhD @UCLCS

Danny Halawi @dannyhalawi15

167 Followers 290 Following masters student at @berkeley_ai advised by @JacobSteinhardt. Interested in interpretability, scalable oversight, and forecasting.

Shashwat Goel @ShashwatGoel7

190 Followers 268 Following Trustworthy ML | Science of Deep Learning | AI Safety Final year student @ IIIT Hyderabad

Sasha de Marigny @sashadem

3K Followers 533 Following Not Australian | Thinkin’ about Claude @AnthropicAI

Eric Steinberger @EricSteinb

7K Followers 478 Following Writing code that writes code on a mission to build safe superintelligence | CEO/cofounder @magicailabs

Bill Peebles @billpeeb

32K Followers 286 Following sora and agi @openai

Tim Brooks @_tim_brooks

29K Followers 74 Following Sora research lead @OpenAI

Jay Shooster @JayShooster

3K Followers 2K Following Lawyer for consumers, animals, & the environment. @USJewishDems organizer. Gun sense advocacy w/ @momsdemand. Running to represent FL State House District 91.

lmsys.org @lmsysorg

37K Followers 171 Following Large Model Systems Organization. We created Vicuna and Chatbot Arena! Compare 30+ LLMs (GPT-4/Claude/Llamas) side-by-side at https://t.co/IDFeIDIOtm

C Thi Nguyen @add_hawk

26K Followers 2K Following Philosophy professor. Writes about games, trust, art, intimacy, echo chambers, metrics. My new book is GAMES: AGENCY AS ART: https://t.co/tFdq4LJygB

Stuart Ritchie .. @StuartJRitchie

36K Followers 1K Following Research Comms @AnthropicAI

Uriah @crimkadid

15K Followers 45 Following

Julie Kallini ✨ @JulieKallini

600 Followers 337 Following CS PhD @StanfordNLP 🌲 Previously: SWE @Meta, Class of '21 @PrincetonCS

Joy He-Yueya @JoyHeYueya

71 Followers 68 Following CS PhD student working on AI for education @StanfordAILab

Paul Röttger @paul_rottger

2K Followers 455 Following Postdoc @MilaNLProc, working on evaluating and improving LLM safety. Previously PhD @oiioxford & CTO/co-founder @rewire_online

Robert Palgrave @Robert_Palgrave

7K Followers 1K Following Professor of Inorganic and Materials Chemistry at UCL. Director of UK National XPS Service @harwellxps

Arc Institute @arcinstitute

22K Followers 24 Following A new scientific institution for curiosity-driven biomedical science and technology.

Megan Stevenson @MeganTStevenson

7K Followers 984 Following economist & legal scholar studying criminal justice. UVA law prof. w/long covid. research: https://t.co/kkj5xtHIjg

Maggie Appleton @Mappletons

37K Followers 1K Following Design @elicitorg. Makes visual essays about UX, programming, and anthropology. Adores digital gardening 🌱, end-user development, and embodied cognition

Eric Gilliam @eric_is_weird

3K Followers 1K Following I write about how 20th C. R&D orgs operated and advise new R&D orgs @GoodSciProject | Formerly @Stanford I want to help people start historically great labs

W. David Marx @wdavidmarx

11K Followers 644 Following Author of Status and Culture, Ametora, and an upcoming cultural history of the 21st century for Viking (Fall 2025). Newsletter at https://t.co/M0KE6eCmKM.

Joan Donovan, PhD .. @BostonJoan

45K Followers 5K Following Founder the Critical Internet Studies Institute & BU Asst Professor of Journalism. Whistleblower coverage from Washington Post: https://t.co/MSEz9RhVQn

Deepak Narayanan @deepakn94

1K Followers 1K Following Research Scientist at @nvidia. Interested in the intersection of Computer Systems and ML. Occasionally tweet about sports. Views are my own.

William Gilpin @wgilpin0

5K Followers 2K Following asst prof @UTAustin physics @OdenInstitute interested in chaos, fluids, & biophysics.

Kenny Peng @kennylpeng

80 Followers 16 Following CS PhD student at Cornell Tech. Interested in interactions between algorithms and society. Princeton math '22.

Ada Lovelace Institut.. @AdaLovelaceInst

23K Followers 2K Following Making data & AI work for people & society. Sign up for our fortnightly newsletter: https://t.co/lTk3R2LxwO

Alex Beutel @alexbeutel

2K Followers 682 Following

Oam Patel @_oampatel_

110 Followers 106 Following undergrad at harvard

Kyle Hsu @kylehkhsu

705 Followers 471 Following PhD student @StanfordAILab.

Vatsal @vatsal_manot

3K Followers 891 Following Building @PreternaturalAI (YC W24). Maintainer of @SwiftUIX.

Ian Carroll @iangcarroll

9K Followers 1K Following Founder at @SeatsAero. Travel/points, application security, security research, etc.

Anshul Kundaje (anshu.. @anshulkundaje

22K Followers 2K Following Genomics, Machine Learning, Statistics, Big Data and Football (Soccer, GGMU). Post: @anshulkundaje, Threads: anshulkundaje

Moxie Marlinspike @moxie

21 hours ago

I made this last weekend to experiment w/ building an app end to end on LLMs: vibecheck.market It's like Wirecutter, but uses an LLM to recommend product choices based on reddit conversations and reviews, so you don't have to spend 20-30min reading reddit My experience:…

66 75 632 171K 502

Chris Olah @ch402

18 hours ago

Scaling laws for dictionary learning! transformer-circuits.pub/2024/april-upd…

Adam Jermyn @AdamSJermyn

21 hours ago

Some small updates from the Anthropic Interpretability team: transformer-circuits.pub/2024/april-upd…

1 13 93 60K 65

1 16 174 41K 112

Belinda Li @belindazli

a day ago

@ChrisRytting Thanks for the tag, just getting around to this— the main alternative I can think of is in RL where the task may be specified by a reward function, goal state, policy, etc.

1 0 1 167 0

Belinda Li @belindazli

a day ago

@ChrisRytting Natural language prompts can also take many forms: they are commonly a set of (potentially programmatic) instructions as you noted, but they may also be a description of the goal, a list of requirements, an interactive dialogue (as in my and @AlexTamkin's recent GATE paper)

0 0 1 165 0

Alex Albert @alexalbert__

2 days ago

Our first Build with Claude contest was a success! We received tons of great submissions from @AnthropicAI devs. Here are the 5 winning projects (in no particular order)🧵

11 39 474 193K 685

Amanda Askell @AmandaAskell

2 days ago

There are pieces of poetry, literature, and philosophy that I only come to appreciate after I experience something important and realize they encapsulate that experience. Until then, the thing just seemed kind of mid. I wonder how many gems are still hidden in the sea of mid art.

7 2 74 5K 9

Moshe Poliak @MoshePoliak

3 days ago

10 134 436 79K 351

Michael Nielsen @michael_nielsen

3 days ago

Related: the decentralized increase in power is stimulating a concomitant increase in surveillance (from traffic light cameras to surveillance of DNA synthesis). It's mostly pretty centralized, though, without any strong enablement of either sousveillance or (at the least)…

0 1 17 3K 3

Sam Bowman @sleepinyourhat

3 days ago

This result is pretty clearly specific to the style of backdoor we're working with, and doesn't support broad claims like 'interpretability solves misalignment', but it's still surprisingly strong. Worth a look!

Anthropic @AnthropicAI

4 days ago

New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…

29 162 933 237K 427

2 4 68 8K 16

Sasha Rush @srush_nlp

3 days ago

There is a really nice community of researchers developing transformer alternatives. Want to highlight these impressive folks. Simran Arora (@simran_s_arora), Chunting Zhou (@violet_zct), Dan Fu (@realDanFu), and Songlin Yang (@SonglinYang4)

6 59 447 45K 238

Matthew Siu @MatthewWSiu

3 months ago

starting a thread to document some interface design explorations of mine: discover similar words in a particular semantic direction using word embeddings

Matthew Siu @MatthewWSiu

2 years ago

imagining what a color picker for words could look like

107 1K 10K 0 1K

10 26 293 21K 184

Anthropic @AnthropicAI

4 days ago

To make the probes, we track how the model’s internal state changes between “Yes” vs “No” answers to questions like "Are you doing something dangerous?" We use this info to detect when a sleeper agent is about to misbehave (e.g. insert a code vulnerability). It works quite…

4 18 147 37K 39

Anthropic @AnthropicAI

4 days ago

29 162 933 237K 427

Philipp Fränken @jphilippfranken

5 days ago

3 31 145 59K 139

Jason D. Clinton @JasonDClinton

6 days ago

The paper is already outdated given the release of more powerful models, but there's an important empirical trend line to observe here. This portends the need for defenders to get patches out to every piece of infrastructure in days, not months.

ML Safety Daily @topofmlsafety

a week ago

LLM Agents can Autonomously Exploit One-day Vulnerabilities GPT-4 can autonomously exploit 87% of real-world one-day vulnerabilities, identified in a dataset of critical severity CVEs, compared to 0% for all other tested models arxiv.org/abs/2404.08144

1 18 39 70K 21

1 3 23 4K 8

Ax Sharma @Ax_Sharma

a week ago

A GitHub flaw lets attackers upload executables that appear to be hosted on a company's official repo, such as Microsoft's—without the repo owner knowing anything about it. The following URLs, for example, make it seem like these ZIPs are present on Microsoft's source code repo:…

55 1K 5K 773K 2K

Daniel Johnson @_ddjohnson

a week ago

Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub: github.com/google-deepmin…

43 419 2K 306K 1K

Thomas Woodside @Thomas_Woodside

2 weeks ago

There's been a lot of buzz about "emergent abilities" in large language models, including some media exaggeration. I took a crack at explaining the different perspectives. 🧵 cset.georgetown.edu/article/emerge…