seb @sebcrossa
cofounder @llmstats (yc s25). prev. @microHQ | sebastiancrossa.com | nyc | Joined February 2015
Tweets 678
Followers 634
Following 2K
Likes 2K
the problems are everywhere: data contamination (training on test sets), selective reporting (only publishing the good scores), prompt tuning that users can't replicate, cherry-picked metrics, custom internal benchmarks with zero transparency.
kimi k2 reported 50% on humanity's last exam. independent researchers tested it and got 29.4%. that's a 20 point gap on the same benchmark, same model.
started using poke two months ago and haven't gone a day without texting it. it's crazy good and 100% think people should try this before trying to set up openclaw / hermes
Say hi to the new Poke! 🌴 Now officially approved by Apple to text on Apple Messages. As the first and only AI agent. Chat now: Poke.com
5b tokens cursor.com/@sebcrossa
introducing cursor profiles! go claim your handle at cursor.com/profile
@benkimbuilds 100% true, have seen this firsthand
we started using them a few days before a big weekend outage. big fan of what @shcallaway and the team are building! in an era of slop, it's relieving to see a team with such high standards for craft and such crazy product velocity.
Sazabi is a next-generation observability platform designed for fast-moving, AI-native engineering teams. If you're using Datadog, Sentry, Grafana, or Axiom today, you could be moving ten times faster with @sazabi. Congrats on the launch, @shcallaway! ycombinator.com/launches/QdT-s…
one of the first things you learn when starting a company is that it's 100x harder than you realized. but the greatest side effect is that you start noticing the hard work that was put into everything around you. from the businesses, the ads, the buildings. everything you see was built by people who poured their blood, sweat and tears into making things possible.
PROGRESSION: Claude Opus 4.8 (@AnthropicAI) sets a new high on LLM Stats Index after 3 years of progress. > 68 on LLM Stats Index > Frontier gained 85 index points over the 3.3-year window This shifts the predictions towards a more optimistic scenario of complete benchmarking saturation.
NEW #1: Claude Opus 4.8 (@AnthropicAI) takes the top spot on LLM Stats Index. > 68 on LLM Stats Index > +5 over previous SOTA (GPT-5.5)
NEW: Gemini 3.5 Flash (@GoogleDeepMind) lands at #5 on LLM Stats Index. >4x faster than other frontier models >$1.50 / $9 per 1M tokens It's significantly faster than other frontier models and the quality has increased significantly. It's also the best tool calling model we've tested lately.
@polynoamial Hey Noam! We're in the process of building new hard benchmarks focused on coding/long-ctx. We're a team of 2, fully focused on AI capability tracking since last year. Is it possible we could talk with someone from your team to align on what is maximally useful to measure?
🚀 LAUNCH WEEK KICKS OFF TODAY! The fastest way to ship an MCP app: Scaffold with mcp-use. Deploy with Manufact. Ship to Claude and ChatGPT. 🧵Here's the full path: 1/5
NEW: GPT-5.5 Instant is now available on LLM Stats. Try it now for free in our agent and code playgrounds.
Today we're introducing the LLM Stats Index. For 3.2 years, we've tracked every frontier model release. The Index aggregates 200+ benchmark results into a single TrueSkill rating per model, spanning law, healthcare, coding, tool calling, vision, and reasoning. Across every category and every modality, the leading model on the Pareto Frontier is GPT-5.5 (@OpenAI). On our trajectories, human-knowledge benchmarks saturate by mid-2027. Capability has been the primary axis. The field is converging on it. Two more are opening. The first is efficiency: total task cost is the cleanest proxy we have for intelligence/watt. The second is throughput: inference speed becomes the productivity ceiling once models are cheap and good enough. We're building the next generation of long-horizon coding, tool use, and long context benchmarks. If you're working on long-horizon evaluation in real domains, we'd like to chat.
Richard Mensah @riomensah
2K Followers 501 Following founder & ceo @salleylabs. 3x founder, building proactive AI infra for enterprises. 🔥insights on startups & AI. @fdotinc, @BostInno 25 under 25
Launch House 🏠 @launchhouse
25K Followers 543 Following A venture fund and community for the New Silicon Valley by @thatguybg & @j__cub
brett goldstein @thatguybg
27K Followers 5K Following founder/designer/context farmer @microHQ | investor @launchhouse | launch video critic | ex-Google, cognitive scientist | sprezzatura
yok @stephsugarchu
8 Followers 892 Following
Ayush @ay_ushr
2K Followers 821 Following Building @autumnpricing: rate limits and billing for ai agents
Emir Ayaz @emirayaaz
6K Followers 360 Following Designer and Founder @witharc_co ✴︎ Co-founder https://t.co/RxCqUBNXDf
Oykun @oykun
15K Followers 2K Following https://t.co/Ar0qlf92gB for ai start-up founders ——— https://t.co/XW59XjCPzH for designers.
Sigil Wen @0xSigil
53K Followers 7K Following thiel fellow | chairman @extraordinary 🇺🇸 | angel investor | @ConwayResearch Web 4.0: https://t.co/dskpyV1CqW
Eli Brown @iEchoic
1K Followers 1K Following Visiting partner at @ycombinator • founder at @teamguilded • engineer at @instagram, @xbox, @facebook, @roblox • angel investor • Starcraft II grandmaster
Ben Kim @benkimbuilds
3K Followers 383 Following staff product + eng @rabbithole_gg community https://t.co/MdNeaGydEV ambassador @openai
Ethan Calloway @ethanncalloway
495 Followers 1K Following lasagna enjoyer | head of marketing @ https://t.co/QbMrDt8ci4 | helping b2b companies turn Twitter into their hidden gtm weapon
Alejandro Mancilla @alexmancillapy
61 Followers 735 Following
Yosep @Yosep1956
248 Followers 3K Following ceo of scaling @Kissmetrics. growing to $20M with attribution tracking.
Liam Collins @liamcollins____
428 Followers 1K Following building @proxis_ai | YC S24 The open-source founder - everything in public Path to $40k mrr: ⬛︎▢▢▢▢▢▢▢▢▢ https://t.co/RbhbMZPiZ7 https://t.co/jFga9mQDYe
CallingBox @callingbox
2 Followers 2 Following The API for AI phone calls Discord: https://t.co/ae2fJYJbEm
luan oak @oak_luan
796 Followers 309 Following Founder @ Workez AI Building Behavioral Systems for Wealth, Health, Relationships
doubli abdelouhab @doubliOfficial
18 Followers 145 Following 💻 Full-Stack Developer 🌐 Premium Domain Investor 🚀 SaaS Entrepreneur Building the next wave of AI digital assets. Turning ideas into revenue. 🇲🇦
FARRUKH KHAN @1khanfarrukh
5K Followers 7K Following Founder & CEO Inference Analytics, building AI for sensitive data industries, AI entrepreneur. Alum: IBM (Big Data), Informatica, MicroStrategy, McKinsey.
Liam @liammatteson
3K Followers 1K Following Designing @Browserbase. Previously designed at Copilot Money, Vanta, and other startups.
Le Bon 🦊 @Nie_HAM
29K Followers 19K Following Head of growth x building @AppSnair. Scout https://t.co/fBIaAenpif Tracking liquidity and macro due to personal interests. Sometimes I share, but mostly under an NDA
Srikar Dandamuraju @realsreeks
85 Followers 152 Following Building the auth layer for next generation of agents https://t.co/CaT2dh819L
Angel 🙌 @IntAngel1
107 Followers 742 Following Building cool stuff 😎 The Mind blowing Labs 🧪: https://t.co/yx6qzkVWym
Arun Kushwaha @austen_dev
8 Followers 313 Following 21 | Developer | Secretary I&E Cell @AITPune | 2x Hackathon Winner | Open for tech talks
Mayank @mayonkeyy
438 Followers 461 Following eval-perfected harnesses @ https://t.co/HDRTrY9lcd | ex @meta @apple
J M @jmeyers5
149 Followers 334 Following macro/VC and other good risk adjusted ways to be long the market “never confuse genius with a bull market”
Cristobal Medina @CristobalInData
7 Followers 20 Following Data Scientist breaking AI myths What actually works in ML, startups & markets
Karen v-AI-e @karenkether
146 Followers 844 Following Tech Business Creator - AI & automation for SMEs / @torcdotdev Ambassador / @ketherlabs Founder / @dev3pack fellow / @shefiorg scholar
Sangha Park @sanghaya1
215 Followers 2K Following Co-founder of Light Anchor (YC P26). Building AI-run consumer brands. Prev @BrownUniversity @sendbird
Ben @benvspak
50K Followers 17K Following On a mission to connect 1 million builders by 2050. Co-Founder & CEO of https://t.co/uibKlTFIID (@buildersxoff $builders) - Futurist, Poker, Feedhead.
Position Ventures @Position_VC
1K Followers 1K Following Early stage venture fund backed by Tiger Global and Bain Capital Ventures. We invest in startups and position them for success.
Sherwood @shcallaway
5K Followers 4K Following Founder @sazabi ◆ Infra Scout @a16z ◆ Early Eng @brexHQ ◆ 2x @ycombinator
Frank S. @el_framk
18 Followers 37 Following
Paul Graham @paulg
3.4M Followers 791 Following
Nikita Bier @nikitabier
1.1M Followers 2K Following head of product @x, advisor @solana, venture partner @lightspeedvp, ex-founder @gasappteam (acq by discord), ex-founder @thetbhapp (acq by facebook)
Andrew Yeung @andruyeung
73K Followers 789 Following hosting extraordinary people @meetfibe @theshortlistnyc | angel investor in 20+ companies | former @google @meta product lead
GREG ISENBERG @gregisenberg
673K Followers 980 Following I drop startup ideas daily. Host @startupideaspod. CEO: @latecheckoutplz we build companies like @ideabrowser, @meetLCA, @boringmarketer etc
jess lozano schmitt @JessLozanoS
42K Followers 5K Following Technically Member of Staff @stripe | prev. aerospace investing @JetBlue | MSc Analytics | all opinions are (unfortunately) my own
em herrera @EmilyHerrera
38K Followers 5K Following I run https://t.co/hT5UCqsEgX and https://t.co/rURzTAu7IB | prev @slow @nightmedia
Brooke LeBlanc @brookeleblanc
32K Followers 1K Following Posting daily about life, running, sales, marketing, startups. Work Inquiries: Email [email protected]
Garry Tan @garrytan
892K Followers 6K Following President & CEO @ycombinator —Founder @garryslist—Creator of GStack & GBrain—designer/engineer who helps founders—SF Dem accelerating the boom loop
Toby ☕️ @tobydoyhowell
29K Followers 1K Following your friendly neighborhood podcaster @mbdailyshow // I like to run
Richard Mensah @riomensah
2K Followers 501 Following founder & ceo @salleylabs. 3x founder, building proactive AI infra for enterprises. 🔥insights on startups & AI. @fdotinc, @BostInno 25 under 25
Farza 🇵🇰... @FarzaTV
115K Followers 2K Following hi. i make things. hacking on @heyclicky. prev founder @_buildspace.
tim @itstimconnors
17K Followers 5K Following
Ryan Hoover @rrhoover
390K Followers 2K Following Founder of @ProductHunt. Investor at @WeekendFund. Say hi! 👋🏼
Between The Posts @BetweenThePosts
95K Followers 27 Following 🧠 We explain why football matches unfold the way they do ⚽ Top 5 Leagues, Champions League & International Football 📊 Match reports on https://t.co/sWk3qv1PIV
Cody Schneider @codyschneider
62K Followers 4K Following follow to learn how to build AI agents that do marketing to grow your business building @graphed with @maxchehab
Emir Ayaz @emirayaaz
6K Followers 360 Following Designer and Founder @witharc_co ✴︎ Co-founder https://t.co/RxCqUBNXDf
Rinske Fris @curatedoutfit
26K Followers 57 Following Men's Style Consultant. Improve your style in 12 steps: https://t.co/sqPC1DcNZ7
Marvin von Hagen @marvinvonhagen
17K Followers 922 Following co-founder @interaction (hiring, dms open!) // prev co-founder @tum_boring, stints @tesla + @mit
samyok @samyok
2K Followers 223 Following vibing w poke @interaction | prev @janestreetgroup @robinhoodapp
Kyle Anthony Miller @kyleanthony
35K Followers 1K Following An American brand designer, designing for the new industrial age.
Ben Kim @benkimbuilds
3K Followers 383 Following staff product + eng @rabbithole_gg community https://t.co/MdNeaGydEV ambassador @openai
NewLimit @newlimit
43K Followers 3 Following Working toward radical extension of human healthspan using epigenetic reprogramming.
Brett Adcock @adcock_brett
552K Followers 21 Following @figure_robot (AI robots) @hark_labs (personal AGI) @cover_thz (weapon detection) @flyArcher (flying cars)
Daniel Heinen @heinenbros
8K Followers 480 Following Building the future of American Intelligence. Founder / CEO @graylark aka @GeospyAI
Raven (GeoSpy) @GeospyAI
10K Followers 97 Following Frontline Visual Intelligence Raven is Graylark’s frontline visual intelligence platform.
turbopuffer @turbopuffer
13K Followers 5 Following {vector, full-text} search engine built on object storage. fast, cheap, 1T scale. powers Anthropic, Cursor, Notion, and more
carmen @carmguti
17K Followers 3K Following doing the hard part | prev @a24 labs, ms cs @StanfordEng, physics @stanford
Nicolò Magnante (YC ... @nicolomagnante
2K Followers 3K Following life goals: 1 - build a Decacorn 2 - lead Italy to top 5 global GDP 3 - win an Oscar
Amaury Vergara Z. @Amauryvz
231K Followers 351 Following Presidente y Director General de @Omnilife®, Presidente de CDG @Chivas® ~ @NFuerza_® ~ Instagram: amaury_vergara
Standard Intelligence @si_pbc
10K Followers 0 Following
CallingBox @callingbox
2 Followers 2 Following The API for AI phone calls Discord: https://t.co/ae2fJYJbEm
Photon 🌈 @photon_hq
5K Followers 3 Following Photon brings agents to the interfaces millions already use. Join Discord: https://t.co/omNpZvNuHY
limitlesstack @limitlesstack
14K Followers 111 Following biochemical engineer of nootropic stacks | pursuing limitless biology | safe, evidence based protocols ☢️
ClaudeDevs @ClaudeDevs
511K Followers 2 Following Official updates for developers building with @ClaudeAI
U.S. Graphics Company @usgraphics
62K Followers 487 Following Engineering graphics. Check out our new typeface, Berkeley Mono → https://t.co/dUqr2XX9Wm
Paul Bakaus @pbakaus
17K Followers 786 Following Renaissance geek (creativity × culture × tech). Product engineer before it was cool. Made @impeccable_ai, @radiant_shaders, @jqueryui | x-@google, @zynga
jay @jayvraavi
9K Followers 652 Following solo traveling & (solo) building @nomadtableapp (1M+ users, 100% bootstrapped)
Spacesthetic @interiorsuckerr
324K Followers 850 Following Archive | Idea | Inspiration. A mood called Spacesthetic. Collab/removal — [email protected] | 🍉
FARRUKH KHAN @1khanfarrukh
5K Followers 7K Following Founder & CEO Inference Analytics, building AI for sensitive data industries, AI entrepreneur. Alum: IBM (Big Data), Informatica, MicroStrategy, McKinsey.
TOMOYUKI YABE ... @Bee_Yaah
9K Followers 451 Following Just a Football Fan & Traveller. #VFK #JFA #SGE #ROASTERY
Liam @liammatteson
3K Followers 1K Following Designing @Browserbase. Previously designed at Copilot Money, Vanta, and other startups.
Orca ADE @orca_build
4K Followers 193 Following The agentic IDE for 100x builders Open source (MIT) • Desktop + mobile Repo: https://t.co/3qBKIGsOuY (4.8k ⭐)
Jon Yongfook @yongfook
162K Followers 1K Following 🐻 https://t.co/KoqV5WRAhy image generation Bootstrapping SaaS @ $81K MRR
laura @laurathesimp
36K Followers 435 Following math + cs @ nyu | intern @ openai, notion, figma | 21 ౨ৎ | all opinions r my own
Bartek @bartuiux
4K Followers 229 Following CEO of @studio_17co Product Designer ➝ [email protected]
Gus Trigos @gustrigos
617 Followers 196 Following Building Runtime P26 (https://t.co/jdpCxdDOia) | Prev. Mentum (YC S21, acqud.)
Shann³ @shannholmberg
32K Followers 13K Following I cover AI marketing & growth. Sharing every framework as I build it. Founder @espressioai, @lunarstrategy
Austin Way @Austin_Way
6K Followers 150 Following 17y/o @AlphaSchoolATX Building the next generation of Ed-tech that will reach 1 billion kids.