Thrilled to share that my paper with @ktakahashi74, "Universal AI maximizes Variational Empowerment," got accepted at AGI-25! Since the 1980s, @AGI_Society has pursued a grand quest, and our work adds the critical importance of autonomy and curiosity to AGI.
agi-conf.org/2025/
TAIS 2025 may have come and gone, but our memories of it haven't! Once more we'd like to thank our sponsors: @NoeonAI, Ashgro, and @AIAlignNetwork, as well as our speakers: @kanair Ryota Kanai of ARAYA.org / ALIGN, and @ARGleave Adam Gleave of FAR.AI.
We'd also like to thank all our poster presenters & all our attendees!
Until we meet again in 2026, we wish you all the best!
ALIGN @AIAlignNetwork joined forces with AI Safety Tokyo, NOEON, and Araya at TAIS2025. With superintelligence on the horizon, it’s time to spark bold AI alignment ideas from Tokyo. 🚀 #TAIS2025 #AIAlignment
TAIS 2025 is only one day away!
Come connect with leading AI safety researchers from Japan and abroad. Shape conversations that will influence how we build safer AI systems for our shared future.
Join us in Tokyo on Saturday, April 12th - Doors open at 11:30:
The 13th ALIGN webinar is coming up!
Monday, January 27, 2025, 21:00 - 22:00 JST
Held on Zoom (advance registration via lu.ma is required)
lu.ma/l550vqtm
In the thirteenth episode of the ALIGN webinar series, we are delighted to host Vanessa Kosoy, a leading researcher in the field of infra-Bayesian physicalism and its applications to AI theory. Vanessa Kosoy will explore an alternative approach to understanding the behavior of AI systems that bypasses traditional agent-centric inductive biases by reformulating hypotheses about the world from a computationalist metaphysical perspective.
In this webinar, Vanessa will introduce the infra-Bayesian physicalism framework, which provides fresh insights into long-standing challenges in AI and decision theory, such as the ontology of values, anthropic reasoning, simulation paradoxes, and acausal trade. This framework also lays the groundwork for a theory of value learning that is robust against perverse incentives, mesa-optimizers, and misidentified boundaries.
Vanessa Kosoy’s innovative research has made significant contributions to the theoretical understanding of AI, offering solutions to critical problems in metaphysics and alignment. Her work continues to pave the way for a safer and more principled approach to AI development.
Agenda:
21:00–21:05 (JST): Opening remarks by ALIGN
21:05–21:50 (JST): Vanessa Kosoy on infra-Bayesian physicalism
21:50–22:00 (JST): Q&A and discussion with participants
The event will be held in English, with slides in English, but participants are welcome to ask questions in Japanese.
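In the 13th installment of the ALIGN webinar series, we welcome Vanessa Kosoy, a researcher active in the field of infra-Bayesian physicalism. As a new framework that avoids the unnatural inductive biases of conventional agent-centric hypothesis formulation, Vanessa proposes an approach to understanding AI behavior from the perspective of computationalist metaphysics.
In this webinar, Vanessa will introduce the infra-Bayesian physicalism framework. It offers new insights into long-standing problems in AI and decision theory, such as the ontology of values, anthropic reasoning, paradoxes related to the simulation hypothesis, and acausal trade. It also opens the way to a theory of value learning that is robust to perverse incentives, mesa-optimizers, and misidentified boundaries.
Vanessa Kosoy's innovative research has deepened the understanding of AI theory and offered solutions to important problems in metaphysics and alignment. Her work continues to pave the way for safe and principled AI development.
Agenda:
21:00–21:05: Opening by ALIGN
21:05–21:50: "Infra-Bayesian Physicalism" by Vanessa Kosoy
21:50–22:00: Q&A session with participants
This webinar will be held in English, with slides provided in English, but the audience may ask questions in Japanese.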
Join us for ALIGN Webinar #13 with Vanessa Kosoy on “Infra-Bayesian Physicalism”! Vanessa will present a groundbreaking approach that redefines AI hypotheses beyond agent-centric biases, addressing key challenges in decision theory and AI alignment.
lu.ma/l550vqtm
Join us for ALIGN Webinar #12 with Jesse Hoogland on “Singular Learning Theory for AI Safety” on January 15, 2025, from 10:00–11:00 JST! SLT reveals distinct “phases” of learning—deepening our understanding of mechanistic interpretability and development.
lu.ma/1sab3fq3
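The 12th ALIGN webinar is coming up!
Wednesday, January 15, 2025, 10:00 - 11:00 JST
Held on Zoom (advance registration via lu.ma is required) lu.ma/1sab3fq3?tk=4e…
Mechanistic interpretability is the research field that aims to make the reasoning of large language models (LLMs) interpretable to humans.
We are inviting Jesse Hoogland, who is opening up a frontier in this field by applying Singular Learning Theory (SLT), proposed by Japanese statistician Sumio Watanabe, to present his latest research.
If you see this post, we would appreciate it if you invited your friends and acquaintances to this webinar via social media or email (free to attend; advance registration required).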
In the twelfth episode of the ALIGN webinar series, we are pleased to invite Jesse Hoogland, a prominent researcher in the field of Singular Learning Theory (SLT) and its applications to AI alignment. Jesse Hoogland will delve into the foundational concepts of SLT, a theory pioneered by Japanese statistician Sumio Watanabe, which offers a unique perspective on understanding large language models by examining the geometric structure of their loss landscapes (developmental interpretability).
In this webinar, Jesse Hoogland will demonstrate how applying SLT to the training dynamics of transformers reveals distinct ‘phases’ of learning—analogous to gas, liquid, and solid states in physics—that influence model behavior and development. This novel perspective not only enhances our understanding of model interpretability and developmental stages but also opens new avenues for rigorous evaluation methodologies and alignment strategies to bolster AI safety.
Jesse Hoogland’s expertise in mathematical modeling and his innovative approach to SLT have made significant contributions to the field of AI alignment. His work is widely recognized, and he continues to play a pivotal role in advancing the theoretical foundations of alignment and reliable AI systems.
Agenda:
10:00–10:05 (JST): Housekeeping by ALIGN
10:05–10:50 (JST): Jesse Hoogland on Singular Learning Theory for AI Safety
10:50–10:55 (JST): Q&A and discussion with participants
10:55– Closing
The event will be held in English, with slides in English, but the audience is welcome to ask questions in Japanese.
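In the 12th installment of the ALIGN webinar series, we welcome Jesse Hoogland, who is active in the field of Singular Learning Theory (SLT) and works on its applications to AI alignment. Jesse Hoogland will explain the basic concepts of SLT, proposed by Japanese statistician Sumio Watanabe, and introduce a methodology (developmental interpretability) for understanding the inner workings of large language models through the geometric structure of the loss landscape during training.
In this webinar, Jesse Hoogland will show how applying SLT to the training dynamics of transformer models reveals "phases" of learning, likened to the gas, liquid, and solid states of physics, and how these phases influence model behavior and development. This perspective not only deepens our understanding of model interpretability and developmental stages but also opens the way to more rigorous evaluation methodologies and alignment strategies for improving AI safety.
Agenda:
10:00–10:05: Opening by ALIGN
10:05–10:50: "Singular Learning Theory for AI Safety" by Jesse Hoogland
10:50–10:55: Q&A session with participants
10:55– Closing
This webinar will be held in English, with slides provided in English, but the audience may ask questions in Japanese.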
[CFP] PSS 2025: Workshop on Post-Singularity Symbiosis@AAAI-25
<< Paper Submission Deadline - November 24, 2024 >>
We are reaching out to announce a critical call for papers for the 1st Workshop on Post-Singularity Symbiosis (PSS 2025), to be held as part of the AAAI-25 Workshop Program on March 3 or 4, 2025, in Philadelphia, Pennsylvania, USA.
The rapid advancement of AI technology brings us closer to the potential emergence of superintelligence. While efforts to control and align AI are crucial, we must also confront a challenging reality. In the long term, maintaining complete control over intelligence far surpassing our own may prove difficult.
PSS 2025 addresses this critical challenge by exploring strategies for human-superintelligence coexistence. We aim to unite human intellect in preparing for a future where superintelligence becomes dominant, ensuring human survival and welfare in this radically altered world.
We've termed this preventive field of study "Post-Singularity Symbiosis," and we believe there's an urgent need to expand this research area rapidly. The workshop focuses on three key areas:
1. Superintelligence Analysis
2. Superintelligence Guidance
3. Human Enhancement
The probability of human survival in the face of potentially existential AI risks remains unknown. However, the smaller this probability, the more crucial our PSS efforts become. Even if the baseline chance of human survival is meager, well-considered collective efforts in PSS could improve it significantly. In an extreme case, raising a 1% chance to 10% would mean a tenfold increase in our probability of survival.
We invite submissions from researchers, practitioners, and thinkers across diverse fields, including AI, cognitive science, philosophy, ethics, policy, and beyond. The scope of potential research topics in PSS is vast and requires a wide range of expertise.
We are pleased to announce that Dr. Roman Yampolskiy, a renowned AI safety researcher, will deliver the keynote address of the workshop.
Key Information:
- Submission Deadline: November 24, 2024
- Paper Format: Max 8/4/2 pages for full/short/extended abstract papers, respectively (including references)
- Review Process: Single-blind
- Submission Portal: (It will be opened at OpenReview soon; see the workshop portal site below.)
This workshop represents a unique and vital opportunity to contribute to shaping humanity's future in an era of superintelligence. We strongly encourage you to share your insights and join us in this crucial dialogue.
For more details about the workshop, potential research topics, and submission guidelines, please visit the workshop portal: aialign.net/pss-2025
The future of humanity depends on our preparedness for the post-singularity era. We look forward to your contributions and to seeing you at PSS 2025.
With urgency and hope,
Hiroshi Yamakawa
The University of Tokyo / AI Alignment Network
PSS 2025 Workshop Organizer
We are organizing an international workshop on a new field, Post-Singularity Symbiosis, as part of AAAI 2025 @RealAAAI. This cutting-edge workshop focuses on the theoretical study of society after the emergence of superintelligence. #AAAI2025 x.com/hymkw/status/1…
Have a question that is challenging for humans and AI?
We (@cais + @scale_AI) are launching Humanity's Last Exam, a massive collaboration to create the world's toughest AI benchmark.
Submit a hard question and become a co-author.
Best questions get part of $500,000 in prizes!
Deadline: Nov 1, 2024
Details: safe.ai/blog/humanitys…
Submit here: agi.safe.ai/submit
Lectures for the AI Safety, Ethics, and Society course are up.
1: Risks Overview
2: AI Fundamentals
3: ML Safety
4: Safety Engineering
5: Complex Systems
6: Beneficial AI
7: Collective Action Problems
8: Governance
Course site: aisafetybook.com
youtube.com/playlist?list=…
Leading computer scientists from around the world, including @Yoshua_Bengio, Andrew Yao, @yaqinzhang and Stuart Russell met last week and released their most urgent and ambitious call to action on AI Safety from this group yet.🧵
Scott Aaronson's takes were cool af.
My 4th episode in the "Worthy Successor" series is with Scott, quantum physicist and UT Austin CS prof.
After his 1-year stint at OpenAI he has some fascinating takes on the moral value of AI entities, and on AGI governance generally. You like?