Takuya Yoshioka @_ty274
Speech technology researcher/manager @AssemblyAI linkedin.com/in/ty274/ Bellevue, WA Joined November 2016
Tweets 912
Followers 546
Following 57
Likes 3K
Want to hear a friend in a noisy café? We designed deep learning-based headphones that let you isolate the speech from a specific person just by *looking* at them for a few seconds. CHI'24 honorable mention award. Paper: arxiv.org/abs/2405.06289 Code: github.com/vb000/LookOnce…
I got an early demo of this when I visited @uwcse a couple months ago and the ability to isolate sounds in your environment was pretty great. Nice work, @b_veluri, Malek Itani, Tuochao Chen, Takuya Yoshioka, and @ShyamGollakota!
@JonathanLeRoux @IEEEsps @IEEEorg Congrats!
Hi all, please let me know if you know large-scale speech data that can be used for training our Whisper reproduction (OWSM) model (arxiv.org/abs/2309.13876). We plan to move to OWSM v4.
The code and project page are here. Code: github.com/uw-x/AcousticS… Project page: acousticswarm.cs.washington.edu
Creating speech zones with self-distributing acoustic swarms Our latest paper in Nature Communications unveils distributed microphones based on an autonomous acoustic robotic swarm, creating "speech zones" in real-world settings. Paper: nature.com/articles/s4146…
Last Friday marked the end of my 7-year journey at Microsoft, filled with rewarding challenges, both in research & production, and incredible colleagues. I'll be starting something new very soon. I have left Microsoft. I will still be staying in the Seattle area.
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer paper page: huggingface.co/papers/2308.06… Recent advancements in generative speech models based on audio-text prompts have enabled remarkable innovations like high-quality zero-shot text-to-speech. However, existing models still face limitations in handling diverse audio-text speech generation tasks involving transforming input speech and processing audio captured in adverse acoustic conditions. This paper introduces SpeechX, a versatile speech generation model capable of zero-shot TTS and various speech transformation tasks, dealing with both clean and noisy signals. SpeechX combines neural codec language modeling with multi-task learning using task-dependent prompting, enabling unified and extensible modeling and providing a consistent way for leveraging textual input in speech enhancement and transformation tasks. Experimental results show SpeechX's efficacy in various tasks, including zero-shot TTS, noise suppression, target speaker extraction, speech removal, and speech editing with or without background noise, achieving comparable or superior performance to specialized models across tasks.
SpeechX from our new paper is a single generative model that edits, enhances & creates speech, enabling zero-shot TTS, spoken content editing (while preserving ambience), speaker extraction & speech/noise removal. Demo: aka.ms/speechx Paper: arxiv.org/abs/2308.06873
To everyone booking their @IEEE_WASPAA trip: please consider attending #SANE2023, which will take place at NYU on Thursday October 26, the day after #WASPAA2023. Register at saneworkshop.org/sane2023/
Dear #WASPAA2023 authors, the review results are out now. Please go ahead and check out at cmt3.research.microsoft.com/WASPAA2023/. We appreciate your precious contribution and kind interest regardless of the acceptance decision!
@ieeeICASSP Are there poster printing facilities at/near the conference venue?
Real-time target sound extraction with waveformer (to appear in ICASSP). Joint work with UW researchers. Paper (updated): arxiv.org/abs/2211.02250 Demo: waveformer.cs.washington.edu Code (both causal and non-causal): github.com/vb000/Waveform…
WASPAA 2023 calls for papers! The traditional intimate Mohonk Mountain House with exciting changes: double-blind review, an unprecedented amount of travel grants, and more. More information: waspaa.com/call-for-paper… #waspaa2023
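Amazing! The world's largest speech corpus, at 19,000 hours, and a high-accuracy Japanese speech recognition model have been released as open source - Mado no Mori (窓の杜) forest.watch.impress.co.jp/docs/news/1471… via @madonomori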
@shinjiw_at_cmu Congratulations, Watanabe-san!
The #ICASSP2023 paper submission site is now open! Submit your papers by 19 October 2022 to be considered. Learn more about the paper guidelines and submission requirements here: hubs.la/Q01nmxt_0
@SamueleCornell Yep, conventional ASR models should be good for the headset recordings.
How can we do streaming multi-talker ASR by best combining speech separation and overlap-robust ASR? t-SOT-VA does that and works for real meeting audio with any # of mics, achieving the best published WERs of 13.7%/15.5% for AMI-MDM dev/eval. Paper: arxiv.org/abs/2209.04974
@SamueleCornell Good question! We focused on the distant mic setup and didn't do headset experiments in such a way that the distant-mic vs. headset numbers can be directly compared. Let us consider how to do the experiment and report the additional result.
Shinji Watanabe @shinjiw_at_cmu
5K Followers 371 Following I'm working at CMU (2021-). I was working at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and Audio Processing is my main research topic.
AK @_akhaliq
504K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) DM for promo, submit papers here: https://t.co/UzmYN5XOCi
WAVLab | @CarnegieMel... @WavLab
2K Followers 146 Following Shinji Watanabe's Audio and Voice Lab | WAVLab @LTIatCMU @SCSatCMU | Speech Recognition, Speech Enhancement, Spoken Language Understanding, and more.
Jonathan Le Roux @JonathanLeRoux
2K Followers 309 Following Speech and audio research scientist at MERL. Opinions never really my own. 🦋https://t.co/6pSuhzw3fb
Desh Raj @rdesh26
4K Followers 2K Following Speech + LLMs @nvidia | Previously: @Meta MSL, @jhuclsp, @IITGuwahati
Robin Scheibler @fakufakurevenge
886 Followers 934 Following Grower of cucumbers 🥒, tomatoes 🍅, and chilli peppers 🌶️. I ❤ audio, microphone arrays, IoT, Python, and data.
まっすー @ymas0315
2K Followers 2K Following
Yuma Koizumi @yuma_koizumi
4K Followers 505 Following Staff Research Scientist @GoogleDeepMind Tokyo 🇯🇵. Gemini for APAC speech research TL. Tweets are my own.
Hirofumi Inaguma @HirofumiInaguma
1K Followers 1K Following Multimodal agent at Reality Labs @MetaAI
yamakatz @kyama0321
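1K Followers 1K Following 🐻🐼👨🏻🎓🧑🏻💻🎧🦻🚘🟢 Research Scientist. Specializes in acoustics in general, including hearing and hearing-aid technology. Interested in the future of the mathematics, technology, devices, and environments that assist and augment human senses.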
shimoll @shimolle
269 Followers 254 Following
Siddharth Dalmia @siddalmia05
2K Followers 450 Following Voice AI @Meta | #SpeechProc and #NLProc | Previously @WaveformsAI @GoogleDeepmind | PhD @LTIatCMU @SCSatCMU
Samuele Cornell @SamueleCornell
985 Followers 525 Following Post-doc @ CMU LTI. Audio and speech researcher.
Katsuhito Sudoh (ja) @katsuhitosudoh
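4K Followers 2K Following Professor at Nara Women's University (Human Life and Environmental Sciences, Information and Communication Sciences for Living area). I think I do research on machine translation / Keywords: E-mount, plain SFC, plain JGC, DL Gold, fountain pens, ink, tea, ZFS / Posts are my own personal views. English: @katsuhito_sudoh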
Yusuke Kida @KID_A_Radiohead
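682 Followers 588 Following CTO of Gen-AX Corp. (a generative AI subsidiary of SoftBank). Leading development of X-Ghost, in which AI makes autonomous decisions and handles call-center interactions. Specializes in speech recognition and speech signal processing. https://t.co/vDh9zfkAZ2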
Mirco Ravanelli @mirco_ravanelli
4K Followers 2K Following Deep learning for Conversational AI. Creator of SpeechBrain.
Yi Zhong @yiz_be_building
494 Followers 712 Following Work with AI, not for AI! Building @besimple_ai to make AI hear you YC P25, Ex Meta / Dropbox / MSFT PM, MIT '16
Miguel Twahirwa @MrTwahirwa_king
553 Followers 8K Following Results driven | Passionate about Analytics, Data Sc., AI | https://t.co/91qafWWSTy Analytics @ Carnegie Mellon'18 | Ex Deloitte Tech, Ex Chime | 🇷🇼🇺🇸🔛🇨🇦
Varun Singh @vr000m
2K Followers 2K Following @trydaily @pipecat_ai. ex-CEO @callstatsio acq’d by $eght. earlier multimedia protocols and video. Focus on growth, revenue. 🇺🇸🇫🇮🇮🇳
Robert Scoble @Scobleizer
586K Followers 50K Following San Francisco/Silicon Valley AI | Robots, holodecks, BCIs, analysis of new things | Ex-Microsoft, Rackspace, Fast Company | Wrote eight books about the future.
Jessee Constantino @JesseeCons20223
8 Followers 125 Following Oracle Cloud – GPU/AI Infrastructure | U.S. Army Veteran
Angelwing @Angelwing19714
32 Followers 4K Following
Matt @Matt60289293
0 Followers 28 Following
Parker Jennifer Nina ... @EdwindeLeon12
745 Followers 3K Following The best motivation for any trader is to be better today than you were yesterday. Washington District of Columbia, USA https://t.co/4ATZlKkdNM
ravinder syal @ravisyal
641 Followers 6K Following Independence OS embeds a Constitutional AI into independent clinics, lifts their economics with six agents, and exits at 2–3x to PE or health systems.
Tauthyez @TauthyezHzYb
40 Followers 919 Following
BUT Speech @ButSpeech
711 Followers 294 Following We do impactful research and raise new leading scientific personalities in the field of speech processing.
Susannahoffs hoffs @Susannahof65590
42 Followers 2K Following American singer songwriter musician and actress 🌎❤️🇺🇲
Hirotaka Hiraki 平... @hirotakahiraki
1K Followers 2K Following PhD candidate @rkmt Lab in U-Tokyo/ HCI, Speech, Wearable Interface / ACT-X / intern @uwcse,@AIST_JP / IPA 未踏’21 / eeic2018 / @YURAteam1,Juggling,classic guitar
Young Scientist Award... @youngsc06963908
729 Followers 3K Following International Young Scientist Awards
nvm @iyoume___
50 Followers 309 Following UX Researcher / MS at @hcdeUW / Fulbrighter ← BS in CS ← community college ← working adult ← high school dropout. Interested in HCI, UX research and design, and a11y.
Nebius @nebiusai
31K Followers 922 Following The ultimate cloud for AI innovators. For GenAI open-source model endpoints, check out @nebiustf.
Annabelle_US_ @AnnabelleU89654
76 Followers 5K Following
AJ @AJ__CB30
112 Followers 721 Following Interested in AI/ML and AI for scientific Discovery. Alumnus of @hinducollege_du CS Phd @RiceUniversity and currently at @MSFTResearch ex @GoogleIndia.
Takayuki Arakawa @ArakawaTakayuki
1 Followers 88 Following
Alkis Koudounas @AlkisKoudounas
242 Followers 632 Following Research @Sony | ex- @AmazonScience | PhD @PoliTOnews || Post-training SpeechLLMs | Trustworthy and Responsible AI
JHU CLSP @jhuclsp
8K Followers 7K Following Center for Language and Speech Processing at @JohnsHopkins #NLProc #MachineLearning #AI https://t.co/6IXR5OSQtw
z jay @zjay951907
0 Followers 62 Following
yhpeng @peng_yanghui
0 Followers 65 Following
Ken Chatfield @kinacoken
93 Followers 176 Following
バイリンガルニ... @Bilingual_News
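56K Followers 662 Following A free podcast updated every Thursday. Delivering real English conversation in our unique "bilingual conversation format"! Transcripts, explanations of English expressions, homework, vocabulary lists, and more are available in the official app. Used as listening material at Kyoto University.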
Jeff Dean @JeffDean
443K Followers 6K Following Chief Scientist, Google DeepMind & Google Research. Gemini Lead. Opinions stated here are my own, not those of Google. TensorFlow, MapReduce, Bigtable, ...
Lenny Bogdonoff @rememberlenny
14K Followers 5K Following
Zixiong Su @zshawnsu
361 Followers 524 Following Human-Computer Interaction, Speech LM. @GoogleAI PhD Fellowship, JSPS DC2. Prev . Research intern @Meta @RealityLabs Visiting Researcher @UCLA, intern @SonyCSL
gbil @GOexle
60 Followers 532 Following
n0n4m39911 @n0n4m39911
0 Followers 126 Following
Bartosz Antosik @bartoszantosik
99 Followers 2K Following
Sathvik Udupa @SathvikUdupa
70 Followers 574 Following Graduate Student, BUT Speech@FIT. Previously, SPIRE Lab, IISc.
nakazawa kazushi(中... @nkzwkzs
739 Followers 3K Following Ph.D. (Engineering). Working in speech recognition. Previously researched DNN-based speech quality assessment. Active in IEEE Sendai YP and the ASJ young researchers' forum.
bagofwords.ai @bagofwordsai
273 Followers 4K Following All About NLP and Its Applications #safenlp #NLProc #ai #ml
Melody @MQuashe59232
8 Followers 2K Following For good-looking clothes and worthy people, you have to work hard.
Awareness AI @AwarenessAI
24 Followers 261 Following 🤖 Leading AI awareness & ethical education. Bridging tech & society for a smarter future. #AIForAll #FutureOfEducation #TechEthics
Satvik Dixit @SatvikDixit9
159 Followers 1K Following Voice Agent Evals | Prev @CarnegieMellon @IITDelhi
Wei-Ning Hsu @mhnt1580
2K Followers 145 Following Research Scientist @ Meta FAIR / audio generation, self-supervised learning, speech processing
AI at Meta @AIatMeta
805K Followers 323 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Alexis Conneau @alex_conneau
34K Followers 204 Following Co-founder and CEO https://t.co/efv72CKpAG (@WaveFormsAI) - Ex @OpenAI GPT-4o/AVM Audio Research Lead - #Her #TARS - Ex @AIatMeta, @Polytechnique (X11)
Andrew Ng @AndrewYNg
1.6M Followers 1K Following Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning, #deeplearning #MOOCs
Hung-yi Lee (李宏... @HungyiLee2
5K Followers 21 Following Hung-yi Lee is currently a professor at National Taiwan University. He owns a YouTube channel teaching deep learning in Mandarin.
Soumith Chintala @soumithchintala
306K Followers 1K Following Building new things @thinkymachines. Also dabble in robotics at NYU. Cofounded @PyTorch. AI is delicious when it is accessible and open-source.
Anurag Kumar @AcouIntel
2K Followers 292 Following Research Scientist, @GoogleDeepMind | Prev: @AIatMeta | CMU @SCSatCMU | @IITKanpur | Audio/Speech, Multimodal AI
Yuki Mitsufuji @mittu1204
5K Followers 82 Following PhD, Distinguished Engineer @Sony, Lead Research Scientist/VP of AI Research @SonyAI_global, Visiting Research Professor @nyuniversity, World's Top 2% Scientist
Alexandre Défossez @honualx
5K Followers 525 Following Leading ambitious research @kyutai_labs. Chief Science Officer @gradiumai.
Somshubra Majumdar @HaseoX94
893 Followers 480 Following Sr. Deep Learning Research Engineer @NVIDIAAI. MSCS'18 @UICCS. Multi-domain Deep Learning researcher and library developer. All opinions are my own.
Jong Wook Kim 💟 @_jongwook_kim
4K Followers 699 Following Member of Technical Staff @OpenAI; previously at @nyuMARL, @SpotifyResearch, @pandoramusic, @kakaocorpglobal, and @NCSOFT
Naoyuki Kanda @naoyukikandaslp
144 Followers 88 Following
DailyAudioPapers @mlsp4audio
767 Followers 630 Following Daily tweets on selected arXiv papers on audio (eess․AS/cs․SD) | Brief reviews of interesting papers | Machine learning | Signal processing
Alexandr Wang @alexandr_wang
491K Followers 858 Following chief ai officer @meta, founder @scale_ai. rational in the fullness of time
Akash Mahajan @akashmjn
638 Followers 813 Following now 🎧; prev chatting with PDFs @ContextualAI; transcription @Azure Speech; @Stanford @atherenergy @iitmadras
Sanchit Gandhi @sanchitgandhi99
5K Followers 40 Following Research @MistralAI. Previously speech @huggingface, Masters at @Cambridge_Uni.
Georgi Gerganov @ggerganov
62K Followers 292 Following 24th at the Electrica puzzle challenge | building https://t.co/baTQS2bdia | engineer @huggingface
Hervé "pyannote" Bre... @hbredin
2K Followers 707 Following Hervé Bredin /👨🏻💻 Creator of 🎹 pyannote / ⚒️ Co-founder and CSO @pyannoteAI /👨🏼🔬 Researcher @CNRS (on leave)
Miguel J 🇺🇦 ... @bonuelphotog
358 Followers 495 Following Head of AI at Circle Medical. Previously: https://t.co/V2CJc8dV6V, Temi, Voicebox, Nuance. 18+ years of exp in Speech Recognition, Translation, and language technologies.
Polina Kazakova @polinaeterna
769 Followers 131 Following
Sriram Ganapathy @tweet4sri
379 Followers 161 Following Associate Professor, Indian Institute of Science, Bangalore. | Director @ https://t.co/8XecL8cSYI
Sharath Adavanne @adavanne
563 Followers 723 Following Applied AI/ML, PhD @TampereUni, Previously @facebook (@meta), @AdobeResearch, @FreshworksInc, @Krutrim
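북구 미래 하정... @JungWooHa2
11K Followers 3K Following Former Senior Presidential Secretary for AI and Future Planning at the Blue House; former Head of the AI Innovation Center at NAVER Cloud; full member of the National Academy of Engineering of Korea. Ex) Sr. Secretary to the President for AI & Future Planning, #Korea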
IEEE ICASSP @ieeeICASSP
5K Followers 1 Following IEEE International Conference on Acoustics, Speech, and Signal Processing. #ICASSP2026 will be held 4-8 May 2026 in Barcelona, Spain.
Qiuqiang Kong @QiuqiangK
1K Followers 245 Following Assistant Professor at @CUHKofficial, previously at @ByteDanceTalk, Ph.D. at @UniOfSurrey
Gautham Mysore @GauthamMysore
780 Followers 301 Following Head of Audio and Video AI Research @AdobeResearch
Nicholas J. Bryan @NicholasJBryan
1K Followers 504 Following Head of Music AI, Adobe Research (personal account)
Lukas Biewald @l2k
25K Followers 4K Following Cofounder/CEO of @wandb - tools for AI developers acquired by @coreweave
Yu Wang @yuwang_tw
1K Followers 431 Following Music x ML | Research Scientist @Spotify. PhD @nyuMARL. prev @AdobeResearch @GoogleMagenta 🎶🎸🇹🇼
Eduardo Fonseca @edfonseca_
1K Followers 565 Following Research Scientist @GoogleDeepMind. Sound Understanding. Previously @GoogleAI and @mtg_upf. He/him.
Efthymios Tzinis @ETzinis
552 Followers 300 Following Senior Research Scientist @GoogleAI | Ph.D. from @IllinoisCS | Formerly @merl_news, @RealityLabs | My opinions do not represent my employer
Shang-Wen Li @ShangwenLi1
2K Followers 979 Following Research Scientist at FAIR; #AI, #NLProc & #speech processing; Past: PhD @MIT_CSAIL, ML scientist at AWS, Alexa & Siri; Views my own
gontani @gontani
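1K Followers 1K Following R&D engineer wandering around the field of speech recognition. Lately LLMs. Ph.D. (Engineering) → spent a year in Germany as a postdoc → working in industry since 2011. Electronics maker → mega-venture → @PreferredNetJP → @genax_corp // LLMs, speech signal processing, speech recognition // parenting, Europe, etc.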
Piotr Żelasko @PiotrZelasko
2K Followers 733 Following AI + Speech @ Nvidia. PhD @ AGH-UST, ex-JHU. My interests: speech processing technology; ML/AI software engineering. Building OSS for Speech AI.
Takuma OKAMOTO @okamotocamera
363 Followers 91 Following Research Manager@NICT, Japan / Jogging / Drinking
Justin Salamon @justin_salamon
3K Followers 762 Following Head of Sound Design AI Research at Adobe. Machine learning and signal processing for audio & video. Musician. He/him.
Weights & Biases @wandb
48K Followers 1K Following The AI developer platform.🛠️ Track and evaluate your LLM applications in real-time with @weave_wb.