Andrew Lamb @andrewlamb1111
Apache Arrow PMC, Database Engineer andrew.nerdnetworks.org Joined November 2020
Tweets 251
Followers 1K
Following 40
Likes 113
I wrote a new post on the Sympathetic Ink blog to sum up The Deconstructed Database and what makes it composable. Learn more about the role of @ApacheParquet, @ApacheArrow, @ApacheDataFusio, @ApacheIceberg, @ApacheCalcite and @OpenLineage sympathetic.ink/2024/04/29/The…
We just need 8 more people to star the @ApacheDataFusio repo to get to the 5k popularity contest: datafusion.apache.org
It was great geeking out on the @DataStackShow discussing challenges with time series, and how @InfluxDB handles them, and (natch) @ApacheDataFusio and what it enables. Kudos to @ericdodds and @KostasPardalis for making it easy. Pleasantly surprised by the quotes. 💯 https://t.co/2BAuJE16CM
🤿 Dive into the future of data systems with @InfluxDB's Andrew Lamb on our latest episode. Highlights: ➡️ The shift from monolithic to specialized databases 📈 The 10x advantage of specialized systems ⚛️ DataFusion's critical role at InfluxDB spoti.fi/49UqXvQ
My talk from DataCouncil "Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet": youtube.com/watch?v=I-Z7kF…
This is pretty amazing -- thanks @andygrove_io for helping make it happen. We are working through the various logistics of moving URLs, etc but it is getting close now
It's official! Apache DataFusion is now a top-level Apache project. 😃 github.com/apache/datafus… The URLs for the subprojects have also been updated. github.com/apache/datafus… github.com/apache/datafus… github.com/apache/datafus… Congratulations to the community for this big milestone!
Yet another reason I love open source. A great analysis from @Brayanjuls showing @ApacheDataFusio does the right thing with the parquet "zip bomb" from @hfmuehleisen duckdb.org/2024/03/26/42-…: github.com/apache/arrow-d…
DataFusion WASM is amazing! Big thanks to @wayne17229928 for their hard work, and @andrewlamb1111 for getting it included. I'm eager to collaborate more closely with the @ApacheDataFusio community to enhance the #OpenDAL object_store integration! github.com/datafusion-con…
A mark of age / maturity: DataFusion has hit PR/Issue 10,000: github.com/apache/arrow-d…. It takes a lot of work to make that many issues, even with Dependabot helping
This is a pretty sweet example of using DataFusion's extension APIs to build UDFs in whatever language you want. In this case WASM! github.com/milenkovicm/wa… (it would be straightforward to apply the same approach for Python UDFs, for example 🎣)
Here is a paper I helped write that is in VLDB this year. vldb.org/pvldb/vol17/p1… Among other things it confirms my belief that join ordering is not a solved problem in databases. Maybe someone will use DataFusion to explore better algorithms 🎣
Submit to #NEDB2024 by 4/12 (11:59pm EST): cmt3.research.microsoft.com/NEDBDAY2024/ * Talks (2-page abstracts) on recent/new research or industry experience * Posters (1-page abstract) Stay tuned for more details on registration! bu-disc.github.io/nedbday/2024/ @samrmadden @vkalavri @__ssarkar
#TIL Apache Arrow DataFusion: a Fast, Embeddable, Modular Analytic Query Engine [paper] github.com/apache/arrow-d… #Rust #database
Congrats to @spice_ai for announcing a cool sounding project built on @ApacheArrow and @ApacheDataFusio blog.spiceai.org/posts/2024/03/…
Another great day with the #DevCommunity at the #DataFusion Meetup in Austin! ⚡ Big thanks to #InfluxDB Staff Engineer & Apache Arrow PMC @andrewlamb1111 for leading this event, & to our Founder @pauldix & DataFusion creator @andygrove_io for sharing their insights! #Rustlang
Andy Grove @andygrove_io
3K Followers 584 Following @ApacheArrow PMC Chair. Apache DataFusion PMC. Original creator of Apache DataFusion.
Xuanwo @OnlyXuanwo
9K Followers 941 Following ASF Member. Apache #OpenDAL PMC Chair. VISION: Access data freely across services by any method.
Paul Dix @pauldix
9K Followers 1K Following CTO of @InfluxDB (YC W13), founder of NYC Machine Learning, series editor for Addison Wesley's Data & Analytics, author of Service Oriented Design with Ruby.
Mim @mim_djo
9K Followers 3K Following #Fabric Enthusiast, Small Data And self service, #Microsoft employee since Nov 2023, but my tweets are my own
卡比卡比 @jakevin7
2K Followers 347 Following Love open source! I want to do things that offer room to grow and are fun. Github: https://t.co/fB3t3tUqIV
sundyli @sundyli1
1K Followers 217 Following Working at datafuselabs. Email: [email protected] Apache OpenDAL PPMC
plantegg @plantegg
40K Followers 389 Following Engineer working on networking, performance, CPUs, and related areas. About me: https://t.co/sdAwtv1et3 You are welcome to join my Knowledge Planet (知识星球): https://t.co/IxNVHUg5qp
Greptime @Greptime
752 Followers 128 Following Greptime provides fast and efficient industrial services for time-series data. Star us on GitHub: https://t.co/i2CnWJDeoF
Kiv @kivdaychen
3K Followers 997 Following cmu mcse '24 | broke things @M5tTrading @RisingWaveLabs @Hyperledger, @BytedanceTalk and 3 others.
Bohu @BohuTANG
3K Followers 138 Following Co-founder & CEO @DatabendLabs. Creator of Databend. Skateboarding🛹 Snowboarding🏂
Chojan Shang — oss/.. @repsiace
1K Followers 330 Following Data Is Dead, Long Live Value | https://t.co/sJroS3w8hy 🥰 https://t.co/6hhaVpBBC0 | Apache #OpenDAL PMC Member.
Batuhan Taskaya (e/si.. @isidentical
4K Followers 313 Following head of eng/silicon at @fal (fal ai labs). also a python core developer / @thePSF fellow. building the most efficient inference engine for diffusion models.
Hannes Mühleisen @hfmuehleisen
5K Followers 936 Following I like databases. Co-creator of @duckdb, Co-Founder and CEO @duckdblabs. Professor of Data Engineering @Radboud_Uni
» teej @teej_m
9K Followers 2K Following » Working on Titan » https://t.co/aZwqUSdNXn » my friends call me teej
Liam Brannigan @braaannigan
2K Followers 2K Following Polars course discount: https://t.co/XhxIjPe989
🕺💃🤟 Alexande.. @emaxerrno
4K Followers 2K Following Founder & CEO of @RedpandaData - A Kafka® replacement for mission critical systems. 10x Faster; Safe; API compatible. 🇨🇴
Divya Rani @heydivyaa
743 Followers 1K Following MTS 3 @vmware | Generation Google Scholar’19 | GSoC'2019 with CERN-HSF | Outreachy'17 @opendatakit
Surya @VasantTeja
280 Followers 887 Following
Alex P @ifesdjeen
12K Followers 1K Following Distributed and Storage Systems. Apache Cassandra Committer and PMC member. Author of Database Internals @therealdatabass. Discord: https://t.co/8LwhZom9eQ
Carl Sverre @carlsverre
2K Followers 838 Following Exploring technology from first principles. Building SQLSync, real-time collaborative SQLite in the browser: https://t.co/SdLEfZo9eJ
栗子沐 @songzhi_li
5 Followers 126 Following
Vishnu @v_1shnu
369 Followers 4K Following Co-Founder and CEO - https://t.co/8zuxUZ9l2h ; ❤️ Data Infrastructure; 🤕 😵💫 finding my way through lakes, warehouses, lakehouses, meshes, fabrics…
Pratim🥑 @BhosalePratim
37K Followers 2K Following Backend stuff| Developer Advocate @SurrealDB | Prev: @nhost | @UBS | Go GDE | Rust 🦀 | (Views are personal)
Michael Leow @leowmjw
354 Followers 1K Following
The Data Stack Show @DataStackShow
856 Followers 1K Following A podcast exploring the world of data. Hosted by @ericdodds & @KostasPardalis. ⚡️ Powered by @RudderStack
Zanda @zanda_HA
26 Followers 232 Following
Lili Cosic ☁ @lili@.. @LiliCosic
6K Followers 280 Following ☁ Software Engineer ☁ Distributed Systems ☁ Advisor to @PolarSignalsio ☁ Past: HashiCorp, RedHat,... https://t.co/KUwsZxJ3MM (she/her) 🇸🇮
Kevin S. 🦀🧬 @FanBoyShi
45 Followers 151 Following Unironic AMD Investor. Geometric Algebra in everything. Science can remain mysterious longer than you can stay solvent.
Omer Ozarslan @ofozarslan
14 Followers 273 Following
wisedb @wisedb185004
0 Followers 29 Following
Hengfei Yang @HengfeiYang
6 Followers 15 Following
Uwe Sommerlatt @usommerl
18 Followers 140 Following
Eric Marnadi @EricMarnadi
5 Followers 41 Following
tinyrobots @tinyrobots
809 Followers 5K Following
kenri.dev 🇨🇺 @kenriortega
438 Followers 1K Following Data/Devops engineer, #OSS contributor & enthusiast. ⚡️I am @golang, @python & #kafka DEV, gamer, I like listening to 🎧 music, my blog 👇
Yasin @Yasin17121992
5 Followers 283 Following
番茄 @sJgQEeqqfQJtjEI
12 Followers 75 Following
Ayush @AnandAbhinav157
46 Followers 694 Following Wanderer, Rebel, Software Engineer, AI, Python, Pytorch, Physics
Bryan Russett @bcrussett
1K Followers 1K Following CEO @ Caurus (data/infra/applied research) // working on @ Serval (OSS adaptive compute project) // Co-Founder @ datalogue (acq. by nike)
tony @chico_rente
75 Followers 2K Following Summers are longer where the suicidal hang themselves and the flies eat mud pies. LEIC FEUP
aksh1618 @aksh1618@ko.. @aksh1618
51 Followers 559 Following Tech Lead @ 99Acres. Interested in all things Spring, Kotlin & Android. Excited about Rust & Wasm. #BackendDev #SpringBoot #AndroidDev #Java #Kotlin #Rust
Pradeep Chhetri @p_chhetri
373 Followers 2K Following
Louis Kuang @louis_lp
53 Followers 374 Following
Blkd Dev @Blkd_dev
37 Followers 1K Following Software Developer Kotlin | Android | KMP | Spring boot | Go |
Snail Brain @SnailBrain123
50 Followers 104 Following
Jake @aerialfly
9 Followers 35 Following Data stuff at places like @okta, @shopify, @cargurus, @oreillymedia
Brian Mullen @btmulls
682 Followers 862 Following Marketing @influxdb, California, and Oxford commas.
Yingjian Fu @yingjianfu
18 Followers 471 Following
魚米团子 @suyanhanx
130 Followers 244 Following
Lettnem @Lettnem
337 Followers 3K Following
Saif Ahmed @__Saif_Ahmed
19 Followers 213 Following
Xiao Meng @xiaomeng
86 Followers 570 Following Building @goldskyio. Formerly @Activision/@Demonware.
Andy Grove @andygrove_io
3K Followers 584 Following @ApacheArrow PMC Chair. Apache DataFusion PMC. Original creator of Apache DataFusion.
Andy Pavlo (@andy_pav.. @andy_pavlo
29K Followers 205 Following Associate Prof. of Databases @CarnegieMellon. Co-Founder @OtterTuneAI
Paul Dix @pauldix
9K Followers 1K Following CTO of @InfluxDB (YC W13), founder of NYC Machine Learning, series editor for Addison Wesley's Data & Analytics, author of Service Oriented Design with Ruby.
卡比卡比 @jakevin7
2K Followers 347 Following Love open source! I want to do things that offer room to grow and are fun. Github: https://t.co/fB3t3tUqIV
sundyli @sundyli1
1K Followers 217 Following Working at datafuselabs. Email: [email protected] Apache OpenDAL PPMC
Bohu @BohuTANG
3K Followers 138 Following Co-founder & CEO @DatabendLabs. Creator of Databend. Skateboarding🛹 Snowboarding🏂
Hannes Mühleisen @hfmuehleisen
5K Followers 936 Following I like databases. Co-creator of @duckdb, Co-Founder and CEO @duckdblabs. Professor of Data Engineering @Radboud_Uni
polars data @DataPolars
5K Followers 7 Following Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Julien Le Dem @J_
4K Followers 2K Following Architect, Founder, Angel, Advisor, OSS: @OpenLineage @MarquezProject, ASF: Parquet Arrow Iceberg 🐖. 🦋 https://t.co/4VQUXaZ5vu . he/him
Kavi Kanagaraj 🦀 @kvrajk
473 Followers 536 Following Programmer, Go & Rust hacker, Distributed Systems, Linux enthusiast and Rational thinker. @grafana’s Loki maintainer. @recursecenter F1‘19
Micah Wylde @mwylde
304 Followers 296 Following Co-founder @ArroyoSystems (YC W23), building next-gen streaming systems. Prev @Splunk, @LyftEng, @GetSift, @Quantcast
Alex Miller @AlexMillerDB
510 Followers 192 Following All original thoughts occur on @[email protected] instead.
Synnada @synnadahq
179 Followers 166 Following Synnada powers mission-critical real-time data apps w/ unified data processing, online ML & collaborative notebooks via the cloud. Empower data pros!
SIGMOD/PODS 2024 @SIGMODConf
1K Followers 117 Following 2024 ACM SIGMOD/PODS International Conference on Management of Data.
Ray @rayyy1024
1K Followers 958 Following
Rust Language @rustlang
142K Followers 2 Following A programming language empowering everyone to build reliable and efficient software.
Niko Matsakis @nikomatsakis
14K Followers 464 Following Weird Al meets Grace Hopper. Rustacean. He/him. I work for @AWSCloud. Opinions on twitter and elsewhere are my own.
Andrew Gallant @burntsushi5
8K Followers 97 Following I love to code. I rarely check DMs. My email address is on my web site.
BoredPerson @BoredPerson__
134 Followers 178 Following Co-Founder and Security Researcher @Neodyme, TUM student
Pedro Holanda @holanda_pe
1K Followers 183 Following Ph.D. in Database Architectures. Tuning knobs for a living.
Yingjun Wu @RisingWav.. @YingjunWu
3K Followers 746 Following Founder @RisingWaveLabs. Building https://t.co/dZvDVNQ62r. Previously @awscloud Redshift, @IBMResearch Almaden. PhD @NUSingapore. Alumnus @CMUDB.
Peter Boncz @peterabcz
1K Followers 71 Following Professor Analytical Data Systems @cwi_da and @VUamsterdam. researcher, systems architect, educator, entrepreneur
Dutch Seminar on Data.. @dsdsdnl
587 Followers 93 Following Monthly talks on data systems organized by research groups in the Netherlands. For attendance URLs subscribe to our mailing list via https://t.co/au0TwzBihW
PVLDB (@pvldb@botsin... @pvldb
4K Followers 11 Following The Proceedings of the VLDB Endowment (PVLDB) RSS Feed: https://t.co/5wEKOfqAEb Mastodon: https://t.co/hdequfpdhG
DuckDB @duckdb
13K Followers 3 Following DuckDB is an in-process SQL OLAP database management system. "DuckDB" and the DuckDB logo are registered trademarks of the DuckDB Foundation.
DuckDB Labs @duckdblabs
3K Followers 3 Following DuckDB Labs is a small artisanal data processing company that provides services around DuckDB directly from its creators.
Jon Mease @jonmmease
886 Followers 177 Following Creator of @vegafusion_io, acquired by @_hex_tech. @vega_vis Altair co-maintainer. Former Chief Scientist at @plotlygraphs
Kun Liu @FixLKun
44 Followers 51 Following
Josh Patterson @datametrician
5K Followers 1K Following Co-founder and CEO @voltrondata. Originator of @RAPIDSai former @PIFgov (#44). Building bridges not walls. Making Data Science more efficient.
NorthSouth Rail Link @NSRailLink
1K Followers 495 Following The North South Rail Link will improve efficiency, mobility & capacity throughout Mass., New England & the Northeast Corridor #Dukakis
Keith Kraus @keithjkraus
1K Followers 1K Following CTO and Co-Founder @VoltronData, @RAPIDSAI maintainer, @condaforge core. Previously @NVIDIA. My thoughts are my own.
Wes McKinney @wesmckinn
59K Followers 909 Following Principal Architect @posit_pbc, GP @ComposedVC, Co-founder @voltrondata. OSS: @ApacheArrow @pandas_dev @IbisData, "Python for Data Analysis" book
Daniël Heres @daniel.. @daniel_heres
160 Followers 91 Following @ApacheArrow PMC and Open Sourceror | Senior Engineer @ Coralogix (Query Engine)
Sam Madden @samrmadden
3K Followers 423 Following Professor, MIT EECS and Chief Scientist, Cambridge Mobile Telematics
Daniel Abadi @daniel_abadi
7K Followers 69 Following Darnell-Kanal Professor of Computer Science at University of Maryland, College Park
@andrewlamb1111 @ApacheDataFusio Mission accomplished!
The recording from my #DataCouncil talk is now up! Check it out if you’re curious about how we can apply SQL to streams: youtu.be/d__f8B9WJB8?si…
@andrewlamb1111 Code is here github.com/GreptimeTeam/g… It's achieved exactly in the way you mentioned in that issue. I translate the index apply result which is a group of line numbers into RowSelection, and pass it to the parquet reader.
This is an unusual architecture: running a massive number of cloud functions using #DuckDB to process logs, a million invocations per day, peaking at 4 GB/s at very low cost youtube.com/watch?v=TrmJil…
Congrats to @andygrove_io and the new VP @andrewlamb1111 on spinning off DataFusion as its own TLP!
Happy to be elected as an ASF member, and thanks to @tison1096 for the nomination. The ASF is committed to developing software for the public good. And my personal vision is to enable free access to data across services by any method. We are on the same path!
[NEWS] Apache Software Foundation Welcomes 59 New Members bit.ly/4416AMn #opensource #ASF25Years
@andrewlamb1111 @OnlyXuanwo @ariesdevil77 I think Parquet is actually very good, especially with a page index. Page read granularity is almost ideal for object storage.
TIL about @paradedb: high-perf OLAP capabilities for #Postgres, based on #Parquet for storage and #DataFusion for query execution. Amazing to see all these OSS building blocks out there for making things like this possible without starting from scratch. github.com/paradedb/parad…
@gunnarmorling @criccomini @paradedb @PostgreSQL You can read our announcement blog post here: blog.paradedb.com/pages/introduc… And I'm happy to share more details directly as well
One of those books which everyone recommends and no one reads? ;p
MIT’s “Introduction to Algorithms,” published #otd in 1990, is the world’s most cited CS text, with 67K citations & over a million copies sold. bit.ly/3y1yMPR @mitpress
@andrewlamb1111 @OnlyXuanwo I thought of some new optimization methods, I will try them today.
🌶️ Exciting News! 🌶️ This will be my last week working at NVIDIA. It has been a wonderful experience working here for the past four years with a really talented team. However, I will be heading over to Apple next week to work on Apache DataFusion and the Comet Spark native…