Andy Ye @creatstar
Building StarRocks & PhoenixAI | Serial Entrepreneur | Passionate about data analytics, engineering & lakehouse
starrocks.io | San Francisco, CA | Joined November 2009
Tweets 338
Followers 104
Following 87
Likes 95
Still thinking about our PhoenixAI happy hour at Ula 🐦🔥 Thank you to everyone who came out for great data + AI conversations, new connections, and Phoenix Margaritas made just for the event 🍹 Congrats again to our AirPods raffle winners 🎧 See you at the next one!
Sharing a few recent thoughts on AI.
1. At this stage, what AI brings to people is increasingly anxiety rather than excitement. The pace of progress is simply too fast! What changes in a single week today might have taken a whole year in the past. Imagine suddenly having a team of highly capable, tireless members who can work 24 hours a day. You would immediately start worrying about how to make this team do more. Because if you don’t, someone else will. The FOMO only grows stronger. A friend of mine told me that if he doesn’t assign his AI agent a “big task” before going to bed, he can hardly fall asleep.
2. AI is forcing people to do what they least want to do: deep thinking. Execution work is no longer a competitive advantage. AI can do it faster and better than you. You can no longer make a living by execution alone. Instead, you must think deeply about how to direct AI toward more meaningful work. But here comes the paradox: many forms of deep thinking depend on firsthand execution experience. If you have never done the work yourself, how do you gain the intuition and judgment required for deep thinking? How can a programmer who has never written functions truly design modules, let alone an entire system?
3. Looking back at the technological transformations from my childhood to now, I feel both excited and deeply shaken. AI will dramatically widen the gap between people. Steve Jobs once said that in the software industry, one outstanding person can be worth a hundred average ones. AI may expand that gap by another hundred times, meaning that one person who can truly master AI could be equivalent to ten thousand ordinary people. And this gap will not be limited to the software industry. As a father, I have very mixed emotions about my children being born into such an era. “It was the best of times, it was the worst of times.”
I just finished reading the “State of AI” report jointly published by a16z and OpenRouter, which is based on an analysis of over 100 trillion tokens of real-world interaction data. It’s extremely impressive. Below are some key takeaways.
1. The Industry Turning Point & The Rise of "Reasoning"
* The o1 Release: The release of OpenAI's o1 reasoning model on December 5, 2024, is viewed as a pivotal moment. It marked a shift in the field from "single-pass pattern generation" to "multi-step deliberation" inference.
* Agentic Inference: Users are increasingly utilizing models as components within larger automated systems (Agents).
2. Open Source vs. Closed Source Models
* The Rise of Open Source: While closed (proprietary) models still dominate, Open Source (OSS) usage has grown steadily, accounting for approximately 1/3 of total token volume by late 2025.
* Contribution from Chinese Models: Open-source models developed in China (e.g., DeepSeek, Qwen, Moonshot/Kimi) have seen significant growth and now occupy a substantial share of the OSS ecosystem.
* Dynamic Competition: There is no single hegemon in the open-source space. The landscape is highly dynamic; no single model consistently holds more than 20-25% of the share. New releases (like DeepSeek V3, GPT OSS) rapidly capture market share.
3. Evolution of Model Sizes: "Medium" is the New Small
* Decline of "Small" Models: Models with fewer than 15B parameters are seeing their market share decline despite their high quantity.
* Rise of "Medium" Models: The 15B-70B parameter range (e.g., Qwen2.5 Coder 32B, Mistral Small 3) has found "Model-Market Fit," striking a balance between performance and efficiency.
* Diversification of "Large" Models: The 70B+ category shows a pluralistic competitive landscape where users do not converge on a single model.
4. Key Use Cases
* Programming: This is the primary driver of prompt token growth. Coding tasks often involve extremely long contexts (frequently exceeding 20k tokens) for code understanding and debugging.
* Roleplay: This remains a stronghold for open-source models, which often outperform more restricted closed-source models in creative tasks.
5. Other Key Findings
* Global Usage: Over 50% of usage on OpenRouter originates from outside the United States, highlighting the globalization of AI adoption.
* Retention (The "Cinderella Effect"): The report identifies a "Cinderella 'Glass Slipper' effect." This refers to early user cohorts who found a perfect fit between the model and their specific needs; these users exhibit much higher long-term retention compared to later users.
Original report: openrouter.ai/state-of-ai
🎥 Watch on-demand: youtu.be/tktN3gjDoYk?si… Missed our “CelerData Cloud BYOC: New Features, Smarter Scaling for Real-time Analytics” webinar — or want to give it another watch? The full recording is now available!
How do you power dashboards that join over 100 tables and scan 20B rows—yet still respond in under two seconds? That was the challenge facing Celonis. See how they re-engineered their analytics architecture to achieve sub-second process mining performance. hubs.la/Q03Wqkmp0
CelerData Newsletter of November 2025! linkedin.com/pulse/november…
A German 🇩🇪 philosopher once said, “We learn from history that we learn nothing from history.” I’m often reminded of this line when I see senior leaders on social media sharing insights that I already figured out 15 years ago as a junior engineer.
🚀 Turning the Promise of the Data Lakehouse into Reality — Together with Hewlett Packard Enterprise Many organizations adopt data lakehouses to unify their analytics, but they often struggle with complexity — from fragmented data integration and inconsistent performance to governance and operational challenges. That’s why CelerData and HPE have partnered to deliver a truly modern, high-performance lakehouse architecture. 💡 HPE Alletra Storage MP X10000 provides cloud-native, all-flash storage with massive scalability and enterprise-grade reliability — the foundation for real-time analytics at any scale. ⚡ CelerData Enterprise, powered by StarRocks, brings blazing-fast query performance, elastic scalability, and seamless integration with open formats like Apache Iceberg — enabling complex analytics directly on large datasets, with no extra pipelines or overhead. Together, HPE and CelerData simplify the path to real-time, unified insights, helping organizations make faster, smarter decisions with greater confidence and lower cost. Please visit this link to learn more: lnkd.in/gb3D7Saf #DataLakehouse #RealTimeAnalytics #HPE #CelerData #StarRocks #ApacheIceberg #AIAnalytics
🌍 That’s a wrap on #StarRocksSummit2025! Thank you to everyone who registered, tuned in, and supported us — you made this milestone possible! Session replays are starting to roll out! Catch (or rewatch) them now — and check back soon for more: 🔗 app.events.ringcentral.com/events/starroc…
🌟 Speaker Highlight: Join Nicholas Reich of Demandbase at #StarRocksSummit2025 (Sept 10, free & virtual) as he shares how his team eliminated denormalization with StarRocks’ high-performance joins to power real-time analytics! 👉 Agenda + free pass: hubs.la/Q03GCVVj0
🌞Countdown check — 8 days left! At #StarRocksSummit2025, our customer keynote features Intuit — hear how they run real-time analytics with StarRocks at the core. 👉See the full agenda + grab your free pass if you haven’t already: hubs.la/Q03Gzd1p0
🔥 The wait is over! The full agenda for #StarRocksSummit2025 is live — Sept 10 | Free & Virtual 🔗 summit.starrocks.io/2025/Twitter 👉 20+ engineer-led talks on what actually works at scale in production: real-time customer-facing analytics, the lakehouse, and AI. ✅ RSVP today!
Demandbase, a B2B GTM leader, faced JOIN inefficiencies & scaling issues with ClickHouse. By switching to StarRocks-powered CelerData Cloud, they cut storage costs by 90%, reduced hardware by 60%, and optimized ETL pipelines for faster analytics.👇 hubs.la/Q03C8pvf0
Recently, some friends from Japan have been asking me questions about StarRocks. In fact, our Japanese documentation has been live for two months! 🇯🇵 For our friends in Japan, you can now read StarRocks docs directly in Japanese: 🔗 docs.starrocks.io/ja/docs/introd… We’d love to hear your feedback on the Japanese docs—your input will help us make them even better! Thank you all, as always!
StarRocks reached 10K ⭐ on GitHub! 🥳 From everyone building, using, and championing StarRocks: THANK YOU 💙 One milestone down—many more ahead!
⚡ Supercharging Customer-Facing Analytics with Delta Kernel + @StarRocksLabs Delta Lake’s Delta Kernel offers a robust set of Engine APIs to make data access faster and more efficient. StarRocks taps into these APIs to power smart caching techniques that: 🔹 Eliminate redundant reads of Parquet and JSON files 🔹 Slash query latency for real-time user-facing dashboards 🔹 Improve resource usage with intelligent cache control With tools like 𝗖𝗮𝗰𝗵𝗲𝗱𝗣𝗮𝗿𝗾𝘂𝗲𝘁𝗛𝗮𝗻𝗱𝗹𝗲𝗿 and 𝗖𝗮𝗰𝗵𝗲𝗱𝗝𝘀𝗼𝗻𝗛𝗮𝗻𝗱𝗹𝗲𝗿, StarRocks boosts performance and delivers snappy queries—perfect for high-concurrency scenarios. 👉 Dive into the blog for the full breakdown: delta.io/blog/starrocks… #deltalake #opensource #starrocks #linuxfoundation #APIs #analytics #dataengineering
Check out this behind-the-scenes look at how Zepto—one of India’s fastest-growing quick commerce companies—scaled its real-time brand analytics with StarRocks, evolving from a lean MVP to powering external dashboards with over 300 million rows! hubs.la/Q03v5slR0
🎟️lu.ma/95a5qys1 - The Iceberg Meetup lands in NYC on July 10, featuring 15+ talks from engineers, architects, and open-source contributors shaping the future of the lakehouse. Don’t miss our session on achieving sub-second latency on Iceberg with materialized views!
cammelia noir @1gary_evans
25 Followers 2K Following too soft for this timeline 💗 mutuals & follow back
ZOOMEX_Official @ZoomexO24386
4 Followers 51 Following Tap to Trade the Future with Zoomex. Official Partner @HaasF1Team | @emartinez Official links: https://t.co/dpdwgfLw1Z
RIDDLΞR @r_riddl66662
0 Followers 63 Following 🇬🇧🇵🇹 #NFTs c.2018 • Angel Investor • NFT Collector • Cryptopunk 6508 - For Business Enquiries: [email protected]
Right Pulse News @RightN38279
0 Followers 86 Following
heller @heller1421307
2 Followers 89 Following Demon mode | web3 marketing | Dm for business | posts and rts: NFA | TG: @hellerincrypto Calls: https://t.co/vDWvj6ANRX
Woody M @WoodyM1982
0 Followers 26 Following
Ashwin Jayaprakash @ashwinjay
219 Followers 4K Following Falling into the future at light speed. (Any opinions expressed are my own)
leozc @leozc
748 Followers 915 Following Building @cipherowl - ex-Coinbase, cruise, twtr, amazon and msft
Croadus @CroadusUWvF0
23 Followers 457 Following
Shisers @Shisersvrbw7AP
21 Followers 448 Following
猴子爱吃鱼 @jianyingse
7 Followers 1K Following
Tysers @Tysers1uXz7n
63 Followers 4K Following
HazelJasper @847a61ZqBuHz2
73 Followers 7K Following
Karuppiah @karuppiah7890
550 Followers 4K Following Tinkerer. Learning Databases. Follow for content on life, tech, businesses, pricing, (and more? :))
lukas @xianminx
51 Followers 1K Following
Allen Sun @shlallen
178 Followers 197 Following
Larry Lv @larrylv
2K Followers 315 Following Post-Training @OpenAI. Training GPT-5.x Thinking models. // If I blocked you, blame my bot.
Andrew C. Oliver @acoliver
2K Followers 2K Following Startup Marketing Exec Professional Vibecoder Infoworld column https://t.co/gf3nADYrVe. Opinions = Mine.
lism2013 @Ming_Paradise
23 Followers 539 Following
Arthur Wiedmer @awiedmer
558 Followers 4K Following Data, Privacy, Python, Airflow, free software. My opinions are my own.
Aneesh Rai @aneeshrai
24 Followers 79 Following
hubert dulay @hkdulay
763 Followers 2K Following “Streaming Data Mesh” & “Streaming Databases” OReilly Author @startreedata ex-@confluentinc ex-@cloudera #datamesh #clusterheadaches 🇵🇭🎸🥁🎹🤘
ᴏʀʟᴀɴᴅᴏᴍ... @polarguidance
9 Followers 138 Following Technology architect, developer, lifelong learner, rare visitor. Personal account. Don't lead or follow but befriend instead.
阿墨 @chelseamo0
4 Followers 101 Following
Li Kang @gnakilrm
14 Followers 229 Following
DataCater @DataCater
54 Followers 47 Following 🚢 Ship modern, real-time data pipelines in a matter of minutes. 🚀 Transform data using our no-code pipeline designer or Python. https://t.co/HHGQ5RnuhQ
Decodable @Decodableco
3K Followers 2K Following Decodable is a serverless real-time data platform built on #ApacheFlink. No clusters to set up. No code to write. No PhD required.
Yingjun Wu 🤘 @YingjunWu
4K Followers 1K Following Founder. Building the infra for agents and humans @RisingWaveLabs. Sharing what breaks, what doesn't, and what's next. ex-@awscloud, PhD @NUSingapore @CMUDB.
fxx @xiangfu0
319 Followers 440 Following Author of @ApachePinot. Data Analytics Infrastructure. https://t.co/YTJr7uttTN
JohnalLee @JohnalXLee
3 Followers 127 Following
Donny Ding @donny_ding
2 Followers 58 Following
PhoenixAI (formerly C... @phoenixdataai
347 Followers 451 Following The Agentic AI Database. Giving autonomous AI agents sub-second access to live enterprise data at massive scale. Built for production. Formerly CelerData.
Michael E. Driscoll @medriscoll
15K Followers 2K Following Founder @RillData, building the fastest business intelligence tool for humans and agents.
GameFi.Fund @GameFi_Fund
2K Followers 1K Following We are a cutting edge Web3.0 capital, focusing on GameFi and NFT, if you are startup in Metaverse, please send your business plan to: [email protected]
a文a文 @countryknightA
13 Followers 364 Following
JohnsonGinati @JohnsonGinati
8 Followers 42 Following
Pslydhh @Pslydhh_
10 Followers 500 Following
Chris @chriszhou1114
16 Followers 180 Following
StarRocks @StarRocksLabs
1K Followers 196 Following Visit StarRocks Website: https://t.co/nUo4tk00Yk 💬 Slack: https://t.co/huvhB2yaEA
StanleyHu @StanleyHuNJU
2 Followers 291 Following A backend software developer. Interested in OLAP System, AI ChatBot, and content creation powered by AI.
Serenity @aleabitoreddit
854K Followers 174 Following I only use X, beware of imposters. AI/Semi Supply Chain Analyst Not investment advice, DYODD. Now publishing free research on AI chokepoints.
CipherOwl @CipherOwl
3K Followers 49 Following Building the onchain intelligence layer for institutions and agents 🦉| Team ex-Coinbase/Cruise/AWS Start Scanning 🔎 https://t.co/NgU5fdC9Xn
Alex Merced | Open Da... @AMdatalakehouse
1K Followers 2K Following O'reilly and Manning Author, Dremio Head of DevRel, and Friendly Tech & Data Hipster. (https://t.co/RV3bH5h4cY)
DeepSeek @deepseek_ai
1.0M Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
InteractivePolls @IAPolls2022
242K Followers 775 Following Non-partisan aggregator of polls & prediction markets
Vinoth Chandar @byte_array
2K Followers 232 Following Founder @Onehousehq, Creator of @apachehudi, Built the World's first #DataLakehouse, Distributed/Data Systems, Linkedin, Uber, Confluent alum. (views are mine)
Andy Pavlo (@andypavl... @andy_pavlo
40K Followers 207 Following Associate Professor of Databases @CarnegieMellon.
Shawn Gordon @ProgRockRec
2K Followers 211 Following Datalake developer advocate, software designer, and prolific musician. Check out my music https://t.co/R1xwdImm4V
InvestyWise by Groww @Investywise
32K Followers 130 Following 500K+ Community on Instagram, Market, and Economy News & Infographics Powered by @_groww
Ivanka Trump @IvankaTrump
11.2M Followers 2K Following
梨園浮生 @why0214
18K Followers 100 Following
ruanyf @ruanyf
202K Followers 371 Following Stay Focused, Keep Shipping. Build Early, Build Always. Improve yourself, Write solid/simple/stupid code.
王局志安 @wangzhian8848
1.5M Followers 671 Following Investigative journalist and host. YouTube channel: https://t.co/WfO3vZ0ygY?amp=1, Facebook: https://t.co/QO4mprdzUL, TikTok: https://t.co/chl5sh7Cfa
Bernard Marr @BernardMarr
139K Followers 52K Following 📖 Internationally Best-selling #Author 🎤 #KeynoteSpeaker 🤖 #Futurist 💻 #Business, #Tech & #Data Advisor #AI #AR #VR #BigData #Robotics #MachineLearning
Robin Moffatt 🍻... @rmoff
10K Followers 640 Following Shitposting and Memes. 🌐 https://t.co/WparjfmCF5
Analytical Aakriti | ... @Aakriti_Sarma
71K Followers 159 Following Open for DevRel roles! Data Nerd | Technical Writer | SQL Obsessed | Python Possessed | Let's dig into data the simple, effective and no nonsense way 📊
Kishore Gopalakrishna @KishoreBytes
2K Followers 755 Following Co-founder & CEO of @startreedata. co-creator of @apachepinot, @apachehelix, ThirdEye, and Espresso.
Kai Wähner @KaiWaehner
3K Followers 155 Following Technology Evangelist with Focus on Integration, Event Streaming, Big Data, Analytics, Machine Learning, and Cloud-Native Microservices
Snowflake @Snowflake
64K Followers 1K Following The AI Data Cloud where data does more, and proud to be the Official Data Collaboration Provider for LA28 and Team USA.
StarTree @startreedata
2K Followers 123 Following Fast, Fresh, Actionable Insights at Scale! From the creators of @ApachePinot. We're growing! Join the movement!
Apache Pinot @ApachePinot
5K Followers 120 Following Real-time distributed OLAP datastore. Point at Kafka and start querying. Join our growing community at https://t.co/s0Tg6jKf8W
vicki @vboykis
59K Followers 1K Following I move vectors to different machines sometimes. Founding ml engineer in recsys/search. building ✨I like Nutella.
Firebolt @FireboltHQ
861 Followers 166 Following Firebolt is the Analytical Database for Real-time Applications.
Imply @implydata
2K Followers 662 Following The Data Layer for Observability, Security, and AI, empowers organizations to keep more data, search it faster, and spend less—without changing their tools.
Gunnar Morling 🌍 @gunnarmorling
70K Followers 297 Following Technologist @Confluentinc · Ex-lead of Debezium · Spec lead of Bean Validation 2.0 · Creator of Hardwood, kcctl, JfrUnit, MapStruct · Java Champion · 🚴
clyfish @clyfish
12 Followers 99 Following
Apache Druid @druidio
5K Followers 47 Following Druid is a high performance analytics DB for modern analytic apps. Use Druid when fast query performance, real-time ingest, and high uptime are important.
ClickHouse @ClickHouseDB
18K Followers 62 Following ClickHouse is the fastest open-source OLAP database ⚡ Download: https://t.co/3JKlDJbkcH GitHub: https://t.co/bjCe9qIetg Slack: https://t.co/d95c6jVeJm
Jun Huang @leemars
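2K Followers 417 Following A half-industry, half-academia know-it-all. I/E ST/NF P/J 2*2*2, the ultimate fence-sitter. A chubby guy with 8+ years of gym experience. I like learning about all kinds of new things, but may not readily try them. I like to start with "Isn't this just..." and end with "so what's the difference?". My posts reflect only my own understanding, with no intention of persuading anyone.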
TiDB, powered by Ping... @PingCAP
7K Followers 130 Following The company behind #TiDB, a #MySQL compatible, #OpenSource #DistributedSQL database for building scalable modern apps.
Codecademy @Codecademy
486K Followers 571 Following Learn the latest tech skills to build the career you’ve always wanted with Codecademy, by Skillsoft.
CodeAI @codeorg
936K Followers 677 Following Working toward every K-12 student having the agency to thrive in the digital world. Formerly https://t.co/Eo54isSG08.
Donald J. Trump @realDonaldTrump
111.6M Followers 53 Following 45th & 47th President of the United States of America🇺🇸
Jana Eggers @jeggers
9K Followers 9K Following CEO @NaraLogics, applied ai nerd, mathematician, amateur neuroscientist, learner/sharer, customer advoc8, innovation ldr, dog mum, friend, 5x Ironman, she/her
BigData @BigDataDiary
21K Followers 1K Following BigDataDiary brings you latest news on BigData, NoSQL along with updates on relevant products and services.
Yves Mulkers @YvesMulkers
99K Followers 77K Following Data DJ │ AI & Data intelligence from 200K+ sources │ Daily signals → https://t.co/IKxe8yyr6D │ 7wData founder
William McKnight @williammcknight
17K Followers 5K Following President @mcknightconsult: #Inc5000 x 3. #bigdata & #cloudcomputing 🌎 🥇, #analytics, #artificialintelligence. 2xHyrox Pro Age Group US Champ, Global 5th















