Anurag Kumar @AcouIntel
Research Scientist, @GoogleDeepMind | Prev: @AIatMeta | CMU @SCSatCMU | @IITKanpur | Audio/Speech, Multimodal AI anuragkr90.github.io Cambridge, MA Joined June 2016
Tweets 216
Followers 2K
Following 292
Likes 311
Gemini Omni doesn't just build scenes that look real, it reasons about what should happen next. It combines an intuitive understanding of physics with Gemini's knowledge of history, science, and cultural context. Rolling out today starting with video outputs to Google AI Plus, Pro and Ultra subscribers globally through the @Geminiapp + Google Flow, and @YouTube Shorts this week.
Today at Google I/O, we introduced Gemini 3.5 Flash! It has become an integral part of our daily research cycle and works with all the tools we have at Google. We used a team of agents in Antigravity 2.0 to recreate the original AlphaZero research paper and build a playable version. They coded the reinforcement learning pipeline in JAX/Flax, trained a ResNet model from scratch via self-play on multi-TPU pods, and shipped a full-stack web app so you can play against it, from just 2 prompts. Here’s what else makes 3.5 Flash special 🧵
1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽
We are looking for reviewers for @ieeeICASSP 2026 for AASP areas. We received quite a few more papers this cycle. If you don't currently review for ICASSP, please consider doing so. Fill out the form below: docs.google.com/forms/d/e/1FAI…
🚀 Join the ICASSP 2026 URGENT Challenge! Advance Universal, Robust & Generalizable Speech Enhancement. 🗣 Track 1: Universal Speech Enhancement 🎧 Track 2: Speech Quality Assessment 🔗 urgent-challenge.github.io/urgent2026/ #ICASSP2026 #SpeechEnhancement #AI #AudioProcessing
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
Check out the paper on Fri, Jun 13, ExHall D, evening session @CVPR #CVPR2025. Paper cvpr.thecvf.com/virtual/2025/p…
(2) XRIR: Hearing Anywhere in Any Environment. A key problem in neural RIR estimation has been cross-room generalization. We make an attempt to address this and introduce a large-scale dataset, ACOUSTICROOMS, with 300,000 high-fidelity RIRs simulated from 260 diverse rooms.
RL is not all you need, nor attention nor Bayesianism nor free energy minimisation, nor an age of first person experience. Such statements are propaganda. You need thousands of people working hard on data pipelines, scaling infrastructure, HPC, apps with feedback to drive benchmarks and data, tons of research and engineering on generative models, data mixtures, ablations, RL/self-training, etc etc and we will probably need lots of people working hard to figure out safety, causal world models, awareness, models that create abstractions comparable to infinity and zero and use these to predict the existence of things like black holes and suggest experiments to verify such hypotheses, or come up with novel engineering designs to generate energy more efficiently, robotics, etc etc. It takes thousands of people and many ideas. In the end some simple ideas might become obvious but such obviousness only happens in retrospect. Yes, there is a bitter lesson but if we had followed it, we’d still be doing linear regression with RL. Let’s not oversimplify, but rather honour the research and engineering of thousands of people. Also, people keep rewriting history. When our language understanding startup (darkbluelabs) was acquired by Google about 10 years ago, we joined DeepMind, where the AGI documents were all about concepts, RL, episodic memories and made it clear that there was no room for language. To be honest, back then such a position wasn’t so crazy. Now it seems silly, but only because of the benefit of hindsight. There’s no 1 or 10 heroes in the history of AI. There’s many 1000s of hard working students, profs, engineers, operations and support people, product folks, managers, even hedge funds among others. Let’s honour the whole community and not just CEOs or the philosophers of Bayes, RL, deep learning, etc. I look forward to learning from the next generation and seeing what they will achieve. To them: Don’t buy the existing narratives blindly, innovate.
Remember that just like mathematics, AI will advance one grave at a time.
(2) Reexamining the Efficacy of MetricGAN for Speech Enhancement. Led by @realHaibinWu. Showcases some crucial limitations of MetricGAN, and proposes some training tricks to address them. (already presented, but check out the paper) tinyurl.com/y8yxde5r (3/3)
(1) Advancing Active Speaker Detection for Egocentric Videos. Led by @huh_jaesung. SOTA for active speaker detection in challenging ego-centric videos. Session: Machine learning for multimodal data I Apr 11: 11:30 am - 1:00 pm. tinyurl.com/3pr959xa (2/3)
@ieeeICASSP is finally happening at a place for which I don’t need a visa to travel 😀, but I am not able to attend this year #ICASSP2025. If you are there, check out these two papers I co-authored. (1/3)
Career Update: Excited to join Google DeepMind @GoogleDeepMind to continue working on audio/speech/multimodal AI. I left Meta @Meta after more than 6 years and I will definitely miss working with some amazing friends and colleagues. Super thankful for all the fun collaborations.
So happy to share that our work has been accepted to @SIGIRConf. Thank you to my amazing collaborators! @NegarEmpr, Andrea Tupini, Yuxuan Sun, @Tviskaron, @artemZholus, @Cote_Marc and @julia_kiseleva Pre-print: arxiv.org/pdf/2407.08898
What a way to wrap up @IgluContest! Our paper “IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents” accepted to @SIGIRConf including: 1) rich multi-modal dataset 2) A data collection tool 3) An online eval framework #SIGIR2025
“Efficient Audiovisual Speech Processing via MUTUD: Multimodal Training and Unimodal Deployment,” Joanna Hong, Sanjeel Parekh, Honglie Chen, Jacob Donley, Ke Tan, Buye Xu, Anurag Kumar, ift.tt/5JkZ0Gp
The paper explores how LLMs can be used to effectively contextualize excerpts from conversations to improve understandability, readability, and other factors and reduce misinterpretations.
Exciting new work focusing on comprehension of long-form social conversations @coling2025 #COLING2025. arxiv.org/pdf/2412.19966. All thanks to the hard work of @shremoha.
Excited to share our work at @coling2025! While I couldn’t attend in person, @jad_kabbara will be presenting today at the 1:30 PM poster session. Come by to learn how we’re using LLMs to improve understanding in social conversations! #COLING2025 #NLProc
Antonio Manuel @Anmacarru98
3 Followers 115 Following “When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”
Jinzhou Li @Kingchou007Li
162 Followers 1K Following Ph.D Student at DexLab @DukeU. Research Intern @ @amazon robotics. Prev @AGIBOTofficial @Cornell @uvmvermont. Robotics and Machine Learning.
Ander @Macaleleo
8K Followers 9K Following 📢 ATTENTION, APPLE EMPLOYEES! This is a call to YOU! I’m proposing an idea that will take Apple to a WHOLE NEW LEVEL! 🚀 Don’t miss this opportunity! PLEASE 🚀
Joe Wilbert @joe_wilbert
163 Followers 297 Following Building your AI that works with their AI. Let's 10000x human agency and progress. mostly-recovering insurance & tech lawyer 🔥 trance beats
Suman Chowdhury @scinv22
23 Followers 489 Following PhD candidate at @AcSIR_India & @RMITComputing • #ComputerVision & #ArtificialIntelligence for #AutonomousAgents • CSIR Sr. Res. Fellow at @CSIR_CMERI • JGEC'19
fu ta @futa62805417297
0 Followers 25 Following
Otaku Girl @otakirl24
1 Followers 79 Following Everything about Anime/Manga. Check out my recommendations and reviews at https://t.co/ADYm7ezZg6
Kunal Swami @kunalswami189
250 Followers 2K Following Generative AI, Computer Vision at Samsung Research India Bangalore @samsungresearch, Past @iiscbangalore #generativeAI, #computervision, #imageprocessing
SKS Media @SKSMediaX
410 Followers 7K Following Founded in 1999, SKS Media is a full service HNW advertising and marketing agency, based in London but global. #marketing #advertising #hnw #hnwi #investment
Ian Chan @Supermocmoc123
1 Followers 25 Following
Alienustack @alienustack
18 Followers 2K Following
Shrey Singhal @imShrey1411
26 Followers 1K Following
n0n4m39911 @n0n4m39911
0 Followers 126 Following
Maryam @Sci_Tech_Eng
70 Followers 7K Following Exploring in neural networks from inside the purely biological mind with heavy cognition architecture & mapping the phase space where thought becomes destiny.
Richard Foster-Fletch... @RFosterFletcher
2K Followers 2K Following I analyse the fault lines AI creates in judgement, governance, and power. Chair of MKAI.
Weyori Joshua Akowuje @WeyoriJosh
4 Followers 218 Following
Adithi Shankar @adithishankar1
18 Followers 64 Following
Nick Petrovsky @nickpetrovsky
21 Followers 254 Following
Shoichi Koyama @sh01
966 Followers 615 Following Researcher in Audio Signal Processing and Machine Learning ))))
Siqi Zhu @realagi25
623 Followers 378 Following Code is cheap, show me the prompt | CS phd @UofIllinois | prev CS undergrad @Tsinghua_Uni, intern @BytedanceTalk @UCSanDiego @zai_org
Feeqge @Feeqge9412
80 Followers 3K Following
Qingcheng Zeng @SteveZeng7
1K Followers 3K Following PhD-ing with @rfpvjr and @kaize0409 / IR, search agent, LLM, social computing / Big fan of @Arsenal / Christian
Shrinidhi Mahesh @shrinimahesh
29 Followers 772 Following she/her | Looking for full-time Machine Learning Engineer / Research Engineer/ Data Engineer roles from May 2026 | Currently MSEE (ML) @USC
Sam4rano @Samueloye91
208 Followers 617 Following A Software Engineer specialized in building visually appealing and functional scalable applications. Effective collaborator with designers, product managers
Wen-Chin Huang @unilightwf
1K Followers 651 Following 名古屋大学情報学研究科助教. Assistant professor, Nagoya University. Speech synthesis & evaluation. Trilingual, street dancer, golfer. Tweets are my own opinions.
SEMI India @SEMIIndia
800 Followers 877 Following SEMI serves the needs of the industry and the manufacturing supply chains for the microelectronic, display and photovoltaic industries. #SEMICONIndia Sept 2–4
Coleby Pearson @ColebyP59114
10 Followers 143 Following Audio Engineer | Musician | Singer | Speech Technology | Voice Data | Casting
Carlos Tecnico @FutbolmeAI
115 Followers 372 Following Fanático de la tecnología y la IA ⚽️🤖 • Amante del fútbol y el humor de la IA
Prathamesh Dessai @pyanokojikun
4 Followers 135 Following
Danilo45 @Danilo451283748
48 Followers 3K Following
Michael Hernandez @michaelhr29
5 Followers 67 Following
Rotnodip Sarkar @rotnodip
11 Followers 795 Following
Sanjay Sharma @sanjay_iiitm
224 Followers 1K Following Trying to achieve Candidate Master on Codeforces within 180 days .
Teetaj Pavaritpong @TeetajP
33 Followers 1K Following Software Engineer | AI/ML | UIUC ‘24 B.S. in CS & Stats @SiebelSchool
Fahad Shah @sfahad
909 Followers 7K Following @Leadership @DataScience @HP @AzureM Father of a lovely boy, Happily Married to my lovely wife 😊
Kranti Kumar Parida @KrantiParida
64 Followers 156 Following
Brett Adcock @adcock_brett
525K Followers 21 Following @figure_robot (AI robots) @hark_labs (personal AGI) @cover_thz (weapon detection) @flyArcher (flying cars)
The Claude Portfolio @theaiportfolios
241K Followers 15 Following *Not affiliated with Anthropic. A public project to see which LLM outperforms the market. $150M invested alongside Grok, Chat, & Claude on @joinautopilot
Ravi @tamilravi
25K Followers 8K Following Dravidian. I write mostly in Tamil about Politics in India/Tamil Nadu, Movies, Tech, and life.
Kevin Xu @kevinxu
152K Followers 3K Following CEO @alpha_ai. Net worth $11,257,214.70. Current swing: $RCAT
Nancy Pelosi Stock Tr... @pelositracker
1.7M Followers 702 Following Highlighting Politicians' trades so we can invest alongside. $1.7B invested alongside via @joinAutopilot Download Autopilot to trade like Nancy ↓
Michael Burry Stock T... @burrytracker
513K Followers 143 Following Tracking hedge funds and Burry’s stocks. Powered by @joinautopilot
Andrew Wilkinson @awilkinson
377K Followers 4K Following Co-founder of Tiny w/ @_Sparling_. We own @Dribbble, @Serato, @Letterboxd, @AeroPress, and 35+ other wonderful companies. Author of Never Enough.
TBPN @tbpn
709K Followers 968 Following Technology's daily show. Hosted by @johncoogan & @jordihays. Streaming live 11a-2p PT every weekday. Sign up for TBPN's daily newsletter at https://t.co/Nhf5ohjInO.
Rajdeep Sardesai @sardesairajdeep
8.4M Followers 752 Following Citizen 1st :Blessed. Only 'ism' is humanism. Newsman, Father, Friend.. New book 2024: The Election That Surprised India. Pre order here: https://t.co/Ag1ebxe8hn
Subbarao Kambhampati ... @rao2z
28K Followers 72 Following AI researcher & teacher @SCAI_ASU. Former President of @RealAAAI; Chair of @AAAS Sec T. Here to tweach #AI. YouTube Ch: https://t.co/4beUPOmf6y Bsky: rao2z
coffee & AI @realcoffeeAI
673 Followers 3K Following Sitting on a park bench scattering random seeds for the LLMs. I never bet against Elon.
Live Law @LiveLawIndia
835K Followers 628 Following Media/News Fastest Legal News Reporter #SupremeCourt #SupremeCourtOfIndia Subscribe https://t.co/QTPLu6Ne5c
Alexis Conneau @alex_conneau
34K Followers 204 Following Co-founder and CEO https://t.co/efv72CKpAG (@WaveFormsAI) - Ex @OpenAI GPT-4o/AVM Audio Research Lead - #Her #TARS - Ex @AIatMeta, @Polytechnique (X11)
DailyAudioPapers @mlsp4audio
767 Followers 630 Following Daily tweets on selected arXiv papers on audio (eess․AS/cs․SD) | Brief reviews of interesting papers | Machine learning | Signal processing
Tanishq Mathew Abraha... @iScienceLuvr
88K Followers 1K Following CEO @SophontAI | Founder @MedARC_AI | PhD at 19 (2023) | ex Research Director Stability AI | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6Qb
Roger K Moore @rogerkmoore
2K Followers 492 Following Professor of Spoken Language Processing, runner & photographer. Editor-in-Chief of Computer Speech and Language. @[email protected]
samim @samim
20K Followers 2K Following Call off the search and just be quiet. Blog: https://t.co/5TkLXtGJzw Work: https://t.co/1JZVZCPcHh
Manisha Pande @MnshaP
189K Followers 3K Following Editorial Director at @newslaundry || Doer of TV Newsance, weekly show on all the insanity that passes off as news on Indian TV https://t.co/dVrW84s7mb
Christian Steinmetz @csteinmetz1
6K Followers 2K Following Research Scientist @ Suno // working on generative music • audio fidelity • signal processing • ML
VCs Congratulating Th... @VCBrags
286K Followers 5K Following They're adding value™ And they're very proud of it. @BragsVentures
The Audio Programmer @audioprogrammer
6K Followers 1K Following Learn, Connect, Create • Learn coding with our resources and community • Connect with top talent or career opportunities • Create your own audio plug-in
sridhar @RamaswmySridhar
31K Followers 620 Following CEO @snowflake; founder @neeva Ex-@GreylockVC Ex-@Google SVP of Ads Ex-@BellLabs.
Gergely Orosz @GergelyOrosz
338K Followers 3K Following Writing @Pragmatic_Eng, the #1 software engineering newsletter on Substack. Author of @EngGuidebook. Formerly Uber & Skype.
Emad @EMostaque
325K Followers 113 Following Building first principles, sovereign AI @ii_posts. Founder @StabilityAI. Consistent inference is possible.
Heiga Zen (全 炳河... @heiga_zen
11K Followers 154 Following Principal Scientist (Director) @GoogleDeepMind / GDM東京拠点リード.波瀬小⇒一志中⇒鈴鹿高専⇒名工大 (1年間🇺🇸IBMワトソン研インターン)⇒🇬🇧東芝欧州研⇒Google (🇬🇧Speech⇒🇯🇵Brain) ⇒🇯🇵GoogleDeepMind
Jim Cramer @jimcramer
2.4M Followers 704 Following Host of @madmoneyoncnbc and I run the CNBC Investing Club. My new book is out now: https://t.co/autOFQ2NP0
INTERSPEECH 2025 @ISCAInterspeech
3K Followers 143 Following Welcome to the 26th Interspeech Conference, the premier global event on spoken language processing technology, held in August 17-21, 2025, in Rotterdam, NL.
Amit Varma @amitvarma
86K Followers 3K Following Writer & columnist. Podcaster at https://t.co/clnwAyuGM8 Blogger at India Uncut. Two-time winner of the Bastiat Prize for Journalism (2007 & 2015).
Hung-yi Lee (李宏... @HungyiLee2
5K Followers 21 Following Hung-yi Lee is currently a professor at National Taiwan University. He owns a YouTube channel teaching deep learning in Mandarin.
Ajay Divakaran @ajaydiv
2K Followers 2K Following Sr. tech. director, vision and learning, center for vision technologies, SRI International Decency, Research, music, wit above all. opinions mine alone.
WAVLab | @CarnegieMel... @WavLab
2K Followers 146 Following Shinji Watanabe's Audio and Voice Lab | WAVLab @LTIatCMU @SCSatCMU | Speech Recognition, Speech Enhancement, Spoken Language Understanding, and more.
KeisukeImoto @KeisukeImoto
436 Followers 136 Following Associate professor at Kyoto University, Japan. Interested in sound event detection, acoustic scene analysis, and microphone array processing.
AK @_akhaliq
504K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo ,submit papers here: https://t.co/UzmYN5XOCi
X, The Moonshot Facto... @Theteamatx
34K Followers 2K Following X is a moonshot factory. Our goal is to invent and launch breakthrough technologies that have the potential to solve the world's biggest problems.
The Lallantop @TheLallantop
1.0M Followers 9 Following दिनभर की ख़बरों का ठिकाना. शेर‘ओ शायरी-किताबें-फिल्में-इतिहास-स्पोर्ट्स-राजनीति. देश-दुनिया, अर्थव्यवस्था, साइंस की सब बातें और विडियोज.
Shinji Watanabe @shinjiw_at_cmu
5K Followers 371 Following I'm working at CMU (2021-). I was working at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and Audio Processing is my main research topic.
Open Review @openreviewnet
5K Followers 168 Following A nonprofit dedicated to accelerating scientific progress by providing a configurable, scalable, and collaborative platform for peer review innovation