Cornelius Emde @CorEmde
AI Security | AI Agents | ML Robustness | PhD @UniofOxford and @OxfordTVG | ex RS @Wise
Oxford
Joined September 2020
Tweets 71
Followers 152
Following 614
Likes 422
@OwainEvans_UK What do you think: where does an inductive bias towards true statements come from? Did you test this with open data models, Pythia or Olmo, where you can check negation in data mix? Might it be possible to run controlled experiments in <100M models?
4. is the reason why we built github.com/parameterlab/M…
It's interesting how the usage of LLMs has been quickly progressing to higher levels of abstraction: 1. prompt engineering 2. context engineering 3. agent scaffold engineering (we are here now) 4. multi-agent architecture engineering 5. ??? It's also curious how people don't
9/ Work done at @parameterlab with Alexander Rubinstein @a_rubique, Anmol Goel @anmgoel, Ahmed Heakl, Sangdoo Yun @oodgnas, Seong Joon Oh @coallaoh, and Martin Gubri @framart1
8/ 🔗 Website: parameterlab.github.io/MASEval/ GitHub: github.com/parameterlab/M… Docs: maseval.readthedocs.io/en/stable/ arXiv: arxiv.org/abs/2603.08835
Great work led by @anmgoel on how fragile contextual integrity can be in LLMs. This work shows that contextual privacy degrades easily during fine-tuning on benign data and common safety benchmarks don't pick this up. #AISecurity #AIAgents
🚨 Fine-tuning your model to be more helpful or empathetic might be making it less private, without you noticing. In our latest work, we show that benign fine-tuning can silently break contextual privacy in language models while safety & general capabilities appear intact. ⬇️
@oanacamb @imperialcollege @ucl @UniofOxford Congrats!
Excited to share our preprint! We show that sustained macrophage and B cell responses are essential for heart regeneration in Mexican cavefish, helping uncover why surface fish heal but cavefish scar 🫀🐟. Check out the full story: biorxiv.org/content/10.110…
@negar_rz I am very interested in working with you and would love to connect but I can’t message you on Twitter or LinkedIn :)
Come see our poster today. 🗓️ Poster session 1 @ 10am 📍 Hall 3 + Hall 2B #239
🚨 New paper alert: Our recent work on LLM safety has been accepted to ICLR 2025 🇸🇬 We propose a new framework for LLM safety. 🧵 (1/7) #LLM #AISafety #ICLR2025 #Certification #AdversarialRobustness #NLP #Shhhhhh #DomainCertification #AI
Read more: cemde.github.io/Domain-Certifi… Thanks to my amazing collaborators: - Alasdair Paren, @trojantiger88 (P. Arvind), @maximek3 (M Kayser), @tom_rainforth, @philiptorr, @Adel_Bibi at @UniofOxford - @BernardSGhanem at @KAUST - Thomas Lukasiewicz at @tu_wien (7/7)
To obtain such certificates, we present a simple, scalable and powerful algorithm: VALID. Remarkably, for each unwanted response it provides a global bound in prompt space 🚀 (6/7)
Ephraiem Sarabamoun @epsarabamoun
20 Followers 598 Following
Selim Kuzucu @SelimKuzucu
110 Followers 377 Following PhD Student @cvml_mpiinf, formerly @Google, @BoschGlobal / @_FiveAI, @metu_imagelab & @AFAR_Cambridge
Tycho van der Ouderaa @tychovdo
2K Followers 3K Following Postgraduate researcher (PhD) at Imperial College London and visiting researcher at the University of Oxford. ML & AI.
Henry Kenlay @hennesseeeeee
450 Followers 2K Following MTS @Latent_Labs. Previously @ExscientiaAi, @UniofOxford (@aims_oxford), @Cambridge_Uni. Machine Learning ∩ Biology 🤖🧬
Charlie London @CharlieLondon02
206 Followers 421 Following DPhil student in ML theory at Oxford. Learning theory, RL theory, LLMs. Arsenal fan.
Ziyan Wang @ZiyanWang98
88 Followers 418 Following PhD Student @Kingscollegelon, Ex-Research Intern @MSFTResearch, Prev Visiting PhD @CarnegieMellon, IDAI fellow, working on MARL+LLM
Hjalmar Wijk @HjalmarWijk
815 Followers 413 Following Chief Scientist @ METR Trying to understand the transformative impacts of AI early enough to give the world a chance to react + shape what's to come
Fredrik K. Gustafsson @fregu856
1K Followers 6K Following Postdoc at IBME in Oxford. Machine learning for healthcare. I'm more active on https://t.co/vwXdiYvHig.
Quan Van @QuanVan1461346
2 Followers 188 Following
Sarthak @kaytraser
274 Followers 3K Following
James Oldfield @jamesaoldfield
215 Followers 442 Following Interested in interpretability and AI safety. Postdoc at Oxford
Quentin Berthet @qberthet
3K Followers 2K Following Research Scientist at Google DeepMind Machine Learning - Paris
Anmol Goel @anmgoel
642 Followers 2K Following NLP ∩ Privacy @ELLISForEurope PhD @UKPLab, @TUDarmstadt and @UCPH_Research | Prev @iiit_hyderabad |
Martin Gubri @framart1
382 Followers 783 Following Research Lead @parameterlab working on Trustworthy AI | he/him Other accounts: 🦋 mgubri | 🐘 @[email protected]
Peter Potaptchik @PPotaptchik
359 Followers 438 Following DPhil student at Oxford https://t.co/JH0l4u7wHv
Aayush Karan @aakaran31
2K Followers 1K Following PhD student @Harvard and @nvidia | Algorithmic insights for generative machine learning | @PDSoros 2024 | Prev @GoogleDeepMind, @citsecurities, @Apple
Elle Michelle Yang @ellemichelley
326 Followers 2K Following 🤖 AI @CompSciOxford @Berkeley_EECS @twosigma @cohere etc. 🤓 Social NLP (NLP x Societal Impacts) 🔍 Almost everywhere on the internet @elleismatic
Alexander Rubinstein @a_rubique
98 Followers 174 Following PhD Student @ IMPRS-IS + University of Tübingen
Davide Buoso @BuosoDavide
37 Followers 375 Following PhD Student focusing on Robot Learning @VandalLabPolito. x @OxfordTVG, @MRG_Oxford.
a @zheq3208
3 Followers 476 Following
Xing Jin @xingjinxj
9 Followers 153 Following
Sethu Priya @sethu_priya_
96 Followers 6K Following
Alex Gertz @AlexGertzUSA
229 Followers 1K Following MLE • Financial Engineering • previously Physics & Math @unc
Tapon's voice @realTaponRay
6 Followers 966 Following BTech CS @VITAPuniversity | #HCI #Agent #Algorithm ➡️Aim: Assembling human as superHuman | aim=∫(∞)dt 🎩
Dave Atkinson @dave_senseon
302 Followers 7K Following Founder @SenseonTech | Rebuilding cybersecurity for the age of Human-AI teaming.
James Morrison Rubin @import_jmr
7K Followers 6K Following Product Lead | Gemini Applied Research Prev: Launched @aws Trainium, @Columbia Neuro Tweets are my own. Retweets are not endorsements. Build anything.
Tegan Jegede @jegede_tegan
191 Followers 7K Following Founder @ T*Technologies Building AI tools that help people work smarter 🤖 BEng (EEE) | MSc CompSci | RL & AI Agents 🧑🏾💻 Madrid ⚽️ | Arsenal ⚽️ | GSW 🏀
Joe Stacey @_joestacey_
2K Followers 2K Following NLP postdoc at @SheffieldNLP Ex @Imperial_NLP PhD, @Apple AI/ML Scholar, @UCL MSc Model robustness and now uncertainty quantification
Yoram Bachrach @yorambac
4K Followers 7K Following Research Scientist at Meta (prev Google DeepMind and Microsoft Research). Working on LLM Agents and Multi-Agent Systems.
Ahmed Elgohary @aagohary
275 Followers 968 Following #NLP Researcher @MSFTResearch - AI Frontiers. Ph.D. @umdcs
Blah @underthesen
45 Followers 5K Following City fan since 2006. Love SAF, Jose and Pep equally. Treble Winning Citeh hopeful.
Freddie Bickford Smit... @fbickfordsmith
482 Followers 861 Following ML postdoc at Oxford with @tom_rainforth
Kia @kiaashour
99 Followers 475 Following DPhil student at @UniofOxford, interested in UQ and decision making
1 @B1y5OvVU7T68240
689 Followers 7K Following
Ameya P. @AmyPrb
592 Followers 669 Following Postdoc @bethgelab; Previously: @OxfordTVG, @intelailabs Profile - https://t.co/To9NNR5Izc
Anya Sims @anyaasims
155 Followers 147 Following PhD student @UniofOxford+@FLAIR_Ox supervised by @yeewhye and @j_foerst. Prev interned @graphcoreai; placement @CambridgeMLG. Deep learning, LLMs x RL, meta-RL
Lachin Naghashyar @kalsbskk81826
8 Followers 30 Following
Alexander Pondaven @alexpondaven
111 Followers 599 Following Working on controllable video generation. PhD student @UniofOxford @aims_oxford @OxfordTVG @Snap. MEng @Imperialcollege
Antonio Valerio Micel... @AVMiceliBarone
1K Followers 2K Following ML / NLP School of Informatics, The University of Edinburgh
Mathias Jackermeier @m_jackermeier
34 Followers 100 Following ML PhD @UniofOxford @aims_oxford | Student Researcher @GoogleDeepMind | Instruction-following, RL, LLMs | Prev MSc CompSci @UniofOxford 🎓
Ben Walker @benjaminwalker
184 Followers 185 Following Postdoctoral Researcher with DataSig II @OxUniMaths. Researching Neural DEs and the theory of rough paths. Email: [email protected]
Ngọc Duyên @Ngocduyenng9x
2 Followers 47 Following Hạnh phúc của bạn, bạn phải tự mình nắm bắt bởi sẽ không ai thay thế bạn làm điều đó cả ❤️
Arcadia Impact @ArcadiaImpact
117 Followers 16 Following We run AI safety training and research programmes including LASR Labs, AI Governance Taskforce, and the Arcadia Alignment Team.
Stephan Rabanser @steverab
788 Followers 384 Following Postdoctoral Researcher @Princeton. Reliable, safe, trustworthy machine learning. Previously: @UofT @VectorInst @TU_Muenchen @Google @awscloud
Anca Dragan @ancadianadragan
14K Followers 184 Following Google DeepMind • AI safety, alignment, collaboration • post training • associate professor @ UC Berkeley EECS
Adam Karvonen @a_karvonen
4K Followers 702 Following ML Researcher, doing MATS with Owain Evans. I prefer email to DM.
Andon Labs @andonlabs
12K Followers 14 Following Safe Autonomous Organizations without humans in the loop
Yo Shavit @yonashav
9K Followers 1K Following ai resilience @foundationOAI. Past: @openai / @HarvardSEAS / @SchmidtFutures / @MIT_CSAIL. Tweets my own; on my head be it.
Trent AI @TrentAIHQ
56 Followers 43 Following We make every software product secure and compliant by design — delivering invisible, automatic security for the AI age, empowering teams to innovate freely.
Sahar Abdelnabi 🕊 @sahar_abdelnabi
2K Followers 889 Following PI @ELLISInst_Tue & @MPI_IS | ex. @Microsoft, PhD @CISPA | AI safety & security | life & peace for all Opinions my own.
Benjamin Chang @benjamin0chang
2K Followers 460 Following Automating discovery | ML PhD @OxfordStats
Selim Kuzucu @SelimKuzucu
110 Followers 377 Following PhD Student @cvml_mpiinf, formerly @Google, @BoschGlobal / @_FiveAI, @metu_imagelab & @AFAR_Cambridge
Steven Adler @sjgadler
10K Followers 1K Following Co-founder of Guidelight AI Standards (https://t.co/tNBPmVsPqo), ex-OpenAI safety researcher, writing at https://t.co/R5KV9j3lsG
Owain Evans @OwainEvans_UK
20K Followers 451 Following Runs an AI Safety research group in Berkeley (Truthful AI) + Affiliate at UC Berkeley. Past: Oxford Uni, TruthfulQA, Reversal Curse. Prefer email to DM.
Tim Rocktäschel @_rockt
46K Followers 2K Following Co-Founder @Recursive_SI, Professor of AI @AI_UCL, PI @UCL_DARK, Fellow @ELLISforEurope. Ex @GoogleDeepMind @AIatMeta @CompSciOxford
Geoffrey Irving @geoffreyirving
11K Followers 349 Following Previously Chief Scientist at the UK AI Security Institute (AISI), before that DeepMind, OpenAI, Google Brain, etc.
Gal Yona @_galyo
948 Followers 534 Following Research scientist @googleai, previously CS PhD @weizmannscience
John Schulman @johnschulman2
75K Followers 2K Following Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music
Liu Yang @Yang_Liuu
930 Followers 383 Following Research Scientist @GoogleDeepMind, PhD @WisconsinCS
Eghbal Hosseini @eghbal_hosseini
808 Followers 1K Following visiting researcher at @GoogleDeepMind; PhD in computational neuroscience at @mit with @ev_fedorenko
Chris Painter @ChrisPainterYup
6K Followers 1K Following president @METR_Evals, evals accelerationist, working hard on AGI preparedness
Rylan Schaeffer @RylanSchaeffer
6K Followers 2K Following AI RS @ Meta TBD. On-Leave from Stanford w/ @sanmikoyejo. Prev @ Gemini, Meta, MIT, Harvard, Uber, UCL, UC Davis
Sanmi Koyejo @sanmikoyejo
3K Followers 106 Following I lead @stai_research at Stanford. Co-founder @VirtueAI_co
Stanford Trustworthy ... @stai_research
846 Followers 57 Following A research group in @StanfordAILab researching AI Capabilities, Trust and Safety, Equity and Reliability Website: https://t.co/CgOHvNHL4x
Federico Barbero @fedzbar
3K Followers 322 Following Research scientist @googledeepmind I like Transformers and graphs. I also like chess and a few other things as well.
Samuel Stanton @samuel_stanton_
2K Followers 1K Following technical staff @AnthropicAI | prev. cofounder @CoefficientBio | @NYUDataScience PhD | developing AI agents for scientific discovery in biotech
Drew Prinster @DrewPrinster
387 Followers 386 Following Safe & regulatable AI/ML for health | My job is (mostly) error bars 🫡 (eg, conformal prediction) | CS PhD at Johns Hopkins. Prev at Yale. he/him
Technical AI Safety C... @tais_2026
330 Followers 40 Following TAIS 2026 will bring together leading AI safety experts to discuss how to make AI safe, beneficial, and aligned with human values.
Mikita Balesni ... @balesni
1K Followers 693 Following AI alignment @openai. Past: @apolloaievals, Reversal curse, Out-of-context reasoning // support 🇺🇦 https://t.co/eagDB8VUzz
Peter Romov @romovpa
141 Followers 50 Following AI Security & Privacy Researcher PhD @imperialcollege
Florian Tramèr @florian_tramer
6K Followers 208 Following Assistant professor of computer science at ETH Zürich. Interested in Security, Privacy and Machine Learning
Edoardo Debenedetti @edoardo_debe
1K Followers 2K Following PhD student @CSatETH 🇨🇭 | AI Security and Privacy 😈🤖 | From 🇪🇺🇮🇹 | prev research intern @meta @google
Alexander Panfilov @kotekjedi_ml
1K Followers 318 Following MATS 9.0 | PhD @ELLISInst_Tue & @MPI_IS doing AI Safety & Adversarial ML
Igor Shilov @_igorshilov
2K Followers 451 Following Research Fellow at @goodfireai PhD student at @imperialcollege AI Security & Privacy 🏳️🌈
Andrew Campbell @AndrewC_ML
772 Followers 141 Following Research Scientist, Google DeepMind. Previous: @Xaira_Thera, PhD @oxcsml
Henry Kenlay @hennesseeeeee
450 Followers 2K Following MTS @Latent_Labs. Previously @ExscientiaAi, @UniofOxford (@aims_oxford), @Cambridge_Uni. Machine Learning ∩ Biology 🤖🧬
Tycho van der Ouderaa @tychovdo
2K Followers 3K Following Postgraduate researcher (PhD) at Imperial College London and visiting researcher at the University of Oxford. ML & AI.
Keir Bradwell @keirbradwell
4K Followers 1K Following Editorial @anthropicai. Formerly @givewell & very tenuously @cambridge_cpt
Stewart Slocum @StewartSlocum1
1K Followers 197 Following AI alignment @xai | prev @AnthropicAI fellow, phd @MIT
Tomek Korbak @tomekkorbak
4K Followers 618 Following ai safety @openai | previously: @AISecurityInst @AnthropicAI @nyuniversity @SussexUni
Vahab Mirrokni @mirrokni
3K Followers 82 Following Google Fellow, VP | Gemini Data Area Lead | GenAI Algo, GraphML, ML efficiency & Economics @ Google Research. Former MSR, Amazon, MIT PhD, Sharif Univ. BSc
Hardik Bhatnagar @hrdkbhatnagar
390 Followers 234 Following Building PostTrainBench Evals, Long horizon, Interp, Safety | PhD @ Max Planck, Tübingen Prev: @MSFTResearch
Martin Pawelczyk @MartinPawelczyk
492 Followers 452 Following Prof for Responsible AI @Vienna. #AISafety #DataCentricAI. Previously Postdoc @Harvard, @JP_Morgan AI Research, PhD from @uni_tue
Charles Foster @CFGeek
3K Followers 569 Following Excels at reasoning & tool use🪄 Tensor-enjoyer 🧪 @METR_Evals. My COI policy is available under “Disclosures” at https://t.co/bihrMIUKJq
Charlie London @CharlieLondon02
206 Followers 421 Following DPhil student in ML theory at Oxford. Learning theory, RL theory, LLMs. Arsenal fan.