Andy Grove @andygrove_io
@ApacheArrow PMC Chair. Apache DataFusion PMC. Original creator of Apache DataFusion. andygrove.io Colorado Joined February 2015
Tweets 909
Followers 3K
Following 584
Likes 3K
We're starting to build a backlog of good first issues in DataFusion ⚛️ Comet ☄️ Contributing to these issues is a great way to start learning about Spark, DataFusion, and Arrow internals. github.com/apache/datafus…
We just need 8 more people to star the @ApacheDataFusio repo to get to the 5k popularity contest: datafusion.apache.org
It's official! Apache DataFusion is now a top-level Apache project. 😃 github.com/apache/datafus… The URLs for the subprojects have also been updated. github.com/apache/datafus… github.com/apache/datafus… github.com/apache/datafus… Congratulations to the community for this big milestone!
I'm excited to be getting started with some contributions to the Apache DataFusion Comet project for accelerating Apache Spark. If you would like to get involved in the project, a great place to start would be to help implement Spark-compatible CAST expressions. Some of these…
TIL about @paradedb: high-perf OLAP capabilities for #Postgres, based on #Parquet for storage and #DataFusion for query execution. Amazing to see all these OSS building blocks out there for making things like this possible without starting from scratch. github.com/paradedb/parad…
Here is a paper I helped write that is in VLDB this year. vldb.org/pvldb/vol17/p1… Among other things it confirms my belief that join ordering is not a solved problem in databases. Maybe someone will use DataFusion to explore better algorithms 🎣
I had a great time meeting some fellow DataFusion contributors today in Austin. Many thanks to @andrewlamb1111, @pauldix, and other @InfluxDB folks for arranging this!
I am pretty excited about the growth in projects built with @ApacheArrow DataFusion. Here is the official announcement of Comet, created by Engineers from Apple to accelerate Spark. arrow.apache.org/blog/2024/03/0…
There is a cool opportunity in my remote team at @Coralogix for a Software Engineer to contribute to an ambitious distributed query engine based on DataFusion / Ballista. The job is open for US candidates as well. Feel free to send me a DM for questions. coralogix.com/careers/co/rem…
Microsoft is hiring for a new Rust tooling team! We're currently looking for a Software Engineering Manager to "play a pivotal role in supporting diverse Rust initiatives throughout the entire Microsoft engineering ecosystem." #rust #rustlang jobs.careers.microsoft.com/global/en/shar…
Reason why devs are overengineering projects at work is they don’t have side projects to overengineer
We’ve partnered with @VoltronData and the Arrow community to align @ApacheArrow with Velox, Meta’s open source execution engine. This new convergence helps build data management systems that are unified, more efficient, and composable. engineering.fb.com/2024/02/20/dev…
Boring Data Tool (bdt) has now moved to the datafusion-contrib GitHub org. I think this is a nice example of building CLI data tools with @ApacheArrow and #DataFusion github.com/datafusion-con…
😝, The first @ApacheIceberg #Rust release is under voting! lists.apache.org/thread/1rnk9c1…
Another opportunity to get paid to work with/on DataFusion at @Coralogix coralogix.com/careers/co/rem…
Anyone out there using Ballista? Seems like a really interesting project.
Apache Arrow nanoarrow 0.4.0 is released! This project is one of my favourite things to work on - it's a C library with R and Python (new!) bindings that makes it easier than ever before to import and export tabular data using the Arrow format.
Does managing a team of @ApacheSpark developers sound fun to you? Great news: $dayjob is looking for a new manager. linkedin.com/posts/holdenka… If you’ve got management and Spark experience you should apply 🪄.
@andygrove_io Wanted to help but #rustlang is still way too much to handle 😬
If the idea is to have one implementation of the spec with language specific bindings, they’d be better off implementing Arrow flight and exposing this as a service.
A new file format from Meta that could be a successor to Parquet: github.com/facebookextern…. Now if only we could get a #rustlang implementation of the spec. Obviously, any Parquet replacement will take years to really gather momentum, but the features had me nodding in agreement.
@andygrove_io haha yeah, I noticed that on closer look. If this thing actually has real differentiation that matters, seems like a native Rust implementation would make sense. Or someone creates a new spec inspired by this implementation and gives it a new name. Formats require specs!
@steeve So they should have started with a Rust implementation and let everybody wrap that 😁
Impressive. Chips Act funding got basically every major chipmaker to build in the US. “That means the US will become the only country in the world with facilities run by all of the top manufacturers.”
Devin Petersohn from @SnowflakeDB talked to @CMUDB yesterday about how dataframes are meeting databases. If you missed it, you can catch up here: youtu.be/7TyIjqvfWto?fe… Lots of exciting technical nuggets in the talk. Another interesting part is Devin talking about hanging out…
🤿 Dive into the future of data systems with @InfluxDB's Andrew Lamb on our latest episode. Highlights: ➡️ The shift from monolithic to specialized databases 📈 The 10x advantage of specialized systems ⚛️ DataFusion's critical role at InfluxDB spoti.fi/49UqXvQ
It was great geeking out on the @DataStackShow discussing challenges with time series, and how @InfluxDB handles them, and (natch) @ApacheDataFusio and what it enables. Kudos to @ericdodds and @KostasPardalis for making it easy. Pleasantly surprised by the quotes. 💯
I wrote a new post on the Sympathetic Ink blog to sum up The Deconstructed Database and what makes it composable. Learn more about the role of @ApacheParquet, @ApacheArrow, @ApacheDataFusio, @ApacheIceberg, @ApacheCalcite and @OpenLineage sympathetic.ink/2024/04/29/The…
My talk from DataCouncil "Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet": youtube.com/watch?v=I-Z7kF…
@andygrove_io Congratulations Andy. You have done well to shepherd this project to maturity.
@andygrove_io Congrats, great achievement and milestone!
@andygrove_io And this comes right at the moment when I even considered a quick look at #Rust having seen all the recent #ApacheSpark-focused advances under #ApacheArrow's umbrella 👏👏👏