Jacek Laskowski @[email protected] @jaceklaskowski
Freelance Data Engineer | #ApacheSpark #DeltaLake #Databricks #ApacheKafka #KafkaStreams | Java Champion | @theASF | #DatabricksBeacons linkedin.com/in/jaceklaskow… Warsaw, Poland Joined May 2009
Tweets 26K
Followers 7K
Following 874
Likes 15K
We're starting to build a backlog of good first issues in DataFusion ⚛️ Comet ☄️ Contributing to these issues is a great way to start learning about Spark, DataFusion, and Arrow internals. github.com/apache/datafus…
A new file format from Meta that could be a successor to Parquet: github.com/facebookextern…. Now if only we could get a #rustlang implementation of the spec. Obviously, any Parquet replacement will take years to really gather momentum, but the features had me nodding in agreement.
Flatbuffers is a better way to store metadata than Thrift for sure. I do wish it had existed when I designed this aspect of Parquet.
We just need 8 more people to star the @ApacheDataFusio repo to get to the 5k popularity contest: datafusion.apache.org
Ever wondered what happens after a CREATE [[GLOBAL] TEMPORARY] VIEW AS statement is executed in #ApacheSpark #SparkSQL? Start here ➡️ books.japila.pl/spark-sql-inte… ...and follow along until you know it all or have qqs that I could answer in a follow-up 😉
I guess it is official now, #DuckDB is adding support to #Deltalake databricks.com/dataaisummit/s…
Wonder if you know what I'm talking about when I say "I've been up to my eyeballs in the projects lately"? Is this used often my dear English natives? 🤔 ➡️ dictionary.cambridge.org/dictionary/eng… No one's gonna understand what I'm saying any longer! 😆 #English
It's official! Apache DataFusion is now a top-level Apache project. 😃 github.com/apache/datafus… The URLs for the subprojects have also been updated. github.com/apache/datafus… github.com/apache/datafus… github.com/apache/datafus… Congratulations to the community for this big milestone!
TIL about @paradedb: high-perf OLAP capabilities for #Postgres, based on #Parquet for storage and #DataFusion for query execution. Amazing to see all these OSS building blocks out there for making things like this possible without starting from scratch. github.com/paradedb/parad…
I'm excited to be getting started with some contributions to the Apache DataFusion Comet project for accelerating Apache Spark. If you would like to get involved in the project, a great place to start would be to help implement Spark-compatible CAST expressions. Some of these…
#TIL #ApacheSpark #SparkSQL 3.5 comes with Named Function Arguments feature that lets you specify arguments by name 🥳 TVFs and a few built-in standard functions are supported only (but more is coming up in 4.0) 👏👏👏 ➡️ books.japila.pl/spark-sql-inte…
And we know what's coming in #ApacheSpark 4.0.0. This version surely makes us all long-time Spark users soooo OLD! 😆 And I'd not be surprised if some tricks of mine may've happened to be outdated already 😉 Named parameters in SQL statements are already available since 3.5.
Me not attending #DataAISummit this year but you really should. Reasons to attend: 🎤 500+ sessions over 4 days 🤖 Discover the latest advancements in #GenAI 🤝Connect and network with thousands of data and AI community peers Register now👇 dbricks.co/4cBIag9
Ouch...got kicked out of #DataAISummit this year and won't make it to the event 🤷♂️ I'm kinda OKish with it given I was going to talk about #DeltaLake which is fairly mature already and there is other, more interesting stuff to tune in to like #GenAI, #RAG, #LLM et al. So be it!
There are quite a few new standard functions in #ApacheSpark #SparkSQL 3.5 alone yet there are way more added in the recent versions. One of them is max_by standard aggregate function that got added as early as in 3.3 🥰 ➡️ books.japila.pl/spark-sql-inte…
Compression Codecs for Apache Parquet. Compression algorithms reduce the size of data files, making storage & data transfer more efficient. This matters especially when dealing with larger data volumes, as it can have a significant impact on performance and costs.
The #KafkaSummit London sessions are now available! Best thing for the weekend! Dig into all the recorded sessions from the event with the best and brightest minds in #ApacheKafka and data streaming. confluent.io/resources/gene…
Generate Unary Logical Operator in #ApacheSpark ➡️ books.japila.pl/spark-sql-inte… Got somewhat curious what's inside of this operator and ended up with more qqs than what I'd started with! 😆 What qqs you're asking yourself after reading these snippets? 🤔 LMK 🙏
Matei Zaharia @matei_zaharia
39K Followers 1K Following CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ
Delta Lake @DeltaLakeOSS
8K Followers 66 Following Delta Lake is an open-source storage framework that enables building a Lakehouse architecture for Spark, Flink, Trino, Hive, Scala, Java, Rust, Python, & more!
Maciej Walkowiak 🍃 @maciejwalkowiak
36K Followers 995 Following Independent #Java #Spring #AWS 🧑💻 making @justxrun 👉 https://t.co/5hDONs8nrh 📺 https://t.co/xtk152k8qm
Databricks @databricks
70K Followers 1K Following Databricks is the data and AI company, helping data teams solve the world’s toughest problems.
Rock the JVM @rockthejvm
8K Followers 215 Following Teaching #Scala, #Kotlin, #Spark, #Flink and tech on the JVM. 📹 Videos at https://t.co/1ODhzZCpb9 🔖 Articles at https://t.co/vwfzUXMF4C
Konrad ‘ktoso’ Ma.. @ktosopl
8K Followers 3K Following “Life is Study!”; [email protected]; Swift Actors & Distributed Systems @ Previously: Reactive Streams (TCK), @akkateam Actors, HTTP & Streams, @geecon
Gwen (Chen) Shapira @gwenshap
26K Followers 9K Following Co-founder of @niledatabase. Making SaaS global, elastic and chill. Find me at: https://t.co/uyuHg400cp
Josh Long @starbuxman
77K Followers 4K Following Spring Developer Advocate (@Java_Champions & @Kotlin @GoogleDevExpert) @VMwareTanzu 🍃🐲 📽️ https://t.co/A2wBUe0b0A
Adi Polak @AdiPolak
14K Followers 802 Following DevX @ Confluent • Cloud • ML/AI & Data Platforms • Ex Microsoft, Akamai • Keynote Speaker • Author of Scaling ML Systems(O'Reilly) • Opinions are mine
∃ugene -Yokota 🥙.. @eed3si9n
5K Followers 606 Following enjoys music, good food, coding, and talking about them. learning machines at Netflix. @scala_sbt core dev. mastodon: @[email protected]
javinpaul @javinpaul
94K Followers 8K Following Blogger - https://t.co/Cxgp9zzN3y Creator - https://t.co/GYls4Lx9DW newsletter - https://t.co/P8jiQ5GW16 youtube - https://t.co/vs4WjwaEQ6
Robin Moffatt 🍻.. @rmoff
10K Followers 661 Following DevEx Engineer at @Decodableco. Doing fun stuff with data and open source. 🌐 https://t.co/WparjfmCF5 🔗 Mastodon: @[email protected]
Bartosz Konieczny @waitingforcode
2K Followers 81 Following Freelance Data Engineer and instructor, enjoy solving data problems with #ApacheSpark #AWS #GCP #Azure 👨🏭 | [email protected]
Mim @mim_djo
9K Followers 3K Following #Fabric Enthusiast, Small Data And self service, #Microsoftemployee since Nov 2023 , but my tweets are my own
ABC @Ubunta
3K Followers 3K Following Data & ML Infrastructure for Healthcare https://t.co/FwocCiCQAT Opinions are the neighbor's. In 🇩🇪Berlin from 🇮🇳Kolkata/Chhattisgarh
Venkat Subramaniam .. @venkat_s
66K Followers 434 Following Programmer, author, speaker, founder Agile Developer, Inc., co-founder of @dev2next Conference, professor @CSatUH
James Ward @_JamesWard
17K Followers 3K Following Mutability was the "trillion-dollar mistake" and more hot-takes on my podcast: @HappyPathProg Disclaimer: I work for @AWSCloud & my opinions are my own.
koziolek @koziolek
886 Followers 416 Following I'm not arguing with you, I'm explaining why you're wrong. I/0 P.S. in case we die, then » @[email protected] https://t.co/Epfzvr818s
holden karau @holdenkarau
17K Followers 2K Following she/her, OSS Big Data. ❤️🛵 ☕️ spark. I don't represent my employer. Live @ https://t.co/uOyeZtBXx0 , https://t.co/GB3Ok0vbVA
Sharat Chander | 🟧.. @Sharat_Chander
20K Followers 5K Following 🏳️🌈 🇺🇸 🇮🇳 🇺🇦 🇮🇱 🇵🇸 | LEGACY VERIFIED | 🥃 lover | #DevRel expert | 💜 @DepecheMode | @UofMaryland & @LoyolaMaryland grad | Views mine
Shani Shapp @ShaniShapp
1 Followers 28 Following
k.b 👨💻 @jordandesi
56 Followers 264 Following Senior Software Engineer (Backend and Data Engineering). I like to share knowledge snippets on tech and quote lyrics.
gokul @gxxxl05
31 Followers 157 Following Speak my language in Data, AI & ML, Tech, Gadgets, Formula 1 and sometimes politics 👀
jrfv2h1qmbhybzjk @mr7gimyhv3
2 Followers 128 Following
aicas @aicas_IoT
950 Followers 4K Following Create, Deploy and Manage Cloud-to-Edge. Fast. Simple. Reliable. aicas
Galvanize @Edge_techmotion
192 Followers 2K Following
Amplify Data @amplifydata
7 Followers 75 Following Amplify is a white-labeled solution for companies to share data natively with customers. No ETL, APIs, or engineering needed.
Alejandro Duarte @alejandro_du
3K Followers 797 Following #Java #SQL #Programming #RaspberryPi #Vaadin #MariaDB #DevRel Published Author · Software Engineer · Developer Relations Engineer at MariaDB
Abby @efe026
7 Followers 216 Following
Lalit @lalitdmum
74 Followers 921 Following
Andy Samu @Andysamu
383 Followers 458 Following Editor at #DisruptionBanking. Fan of #DeFi, Capital Markets and Relationship Alpha. You can also find me at @IoDFin_FinTech, disrupting the UK #Fintech scene
Simon Martinelli @simas_ch
7K Followers 2K Following Java Champion, Vaadin Champion, Oracle ACE Associate, Speaker, Programming Architect, and Lecturer for Software Architecture, Java, Persistence, and DevOps.
Monty @montyinspired
99 Followers 4K Following
Jonathan Vila 🥑 .. @vilojona
4K Followers 2K Following Developer Advocate 🥑 #Java @sonarsource | @Java_Champions | @BarcelonaJUG organiser | @Dev_Bcn & @JBCNConf cofounder | Ex @TetrateIO @RedHat @OcadoTechnology
Dhananjay Jagtap @Dkjagtap191
468 Followers 4K Following | Optimistic ✨| Dreamer⚡ | Achiever | clicker📸I Coder💻
a-win @Osgilian
85 Followers 101 Following Grad Student at UTD time is an illusion, lunchtime doubly so
Michał Wilczyński @Czupelek
8 Followers 15 Following
null @from_0_to_null
400 Followers 1K Following
Antonio @antoniomfc90
248 Followers 142 Following It is intelligence that makes us human.
Valentin @valentin_oneone
108 Followers 355 Following Software Engineer in general, Data Engineer in particular
Ashwin Dinoriya @AshwinDinoriya
48 Followers 376 Following
Pablo Ordorica @pablordoricaw
55 Followers 513 Following
Raxit @raxit65535
28 Followers 186 Following well I don't think that much About my self. there are lots of other interesting topics & problems to invest time in.
douglasX @Douglas74706081
158 Followers 3K Following I just enjoy researching and sharing about computer science, computer graphics, and artificial intelligence.
Nathan @Nathan236008
394 Followers 3K Following
Arif Ahmad @ArifAhm92263086
243 Followers 7K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI
lulu @lukaeae
21 Followers 162 Following
Kunal @Kunalc232
6 Followers 114 Following
Tesfaye A. Naramo @Tesfaye_Naramo
14 Followers 325 Following “The path to success is to take massive, determined actions.”
David Regalado @thecodemancer_
2K Followers 4K Following - VP of Engineering @ Stealth Startup - Founder @DataEngiLatam - Mentor #ArtificialIntelligence #AI #datascience #dataengineering
Yakhin.Mehkum @Yakhin_Mehkum
311 Followers 5K Following "Be a free thinker and don't accept everything you hear as truth. Be critical and evaluate what you believe in." -- Aristotle (384 - 322 BC).
Rynb @Ryan21291079
13 Followers 985 Following
Arun @arniekvr
2 Followers 45 Following
dig8italX @dig8italX
134 Followers 2K Following dig8italX, the leading artificial intelligence firm that specializes in creating customized AI solutions for businesses.
data_sports fusion @_gamepulse
91 Followers 304 Following #cricket | #formula1 | #data | #platform engineering | #databricks | #azure | follow and get a follow back
الاب غاليان.. @madbutsad
34 Followers 790 Following Cultured and ignorant, tender and stupid, cowardly and gallant, a genius and a failure, hasty and cautious, an artist and crude, athletic and a smoker, vile and polite .. but very arrogant, no question about it
Alex Xu @alexxubyte
228K Followers 387 Following Co-Founder of ByteByteGo | Author of the bestselling book series: ‘System Design Interview’ | YouTube: https://t.co/9gPSJSrtPU
Matei Zaharia @matei_zaharia
39K Followers 1K Following CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ
Jaana Dogan ヤナ .. @rakyll
114K Followers 1K Following Distinguished Engineer at GitHub, working on Copilot. Previously Google, AWS, and several small companies. Personal opinions.
Delta Lake @DeltaLakeOSS
8K Followers 66 Following Delta Lake is an open-source storage framework that enables building a Lakehouse architecture for Spark, Flink, Trino, Hive, Scala, Java, Rust, Python, & more!
Databricks @databricks
70K Followers 1K Following Databricks is the data and AI company, helping data teams solve the world’s toughest problems.
Rock the JVM @rockthejvm
8K Followers 215 Following Teaching #Scala, #Kotlin, #Spark, #Flink and tech on the JVM. 📹 Videos at https://t.co/1ODhzZCpb9 🔖 Articles at https://t.co/vwfzUXMF4C
Gwen (Chen) Shapira @gwenshap
26K Followers 9K Following Co-founder of @niledatabase. Making SaaS global, elastic and chill. Find me at: https://t.co/uyuHg400cp
Adi Polak @AdiPolak
14K Followers 802 Following DevX @ Confluent • Cloud • ML/AI & Data Platforms • Ex Microsoft, Akamai • Keynote Speaker • Author of Scaling ML Systems(O'Reilly) • Opinions are mine
Robin Moffatt 🍻.. @rmoff
10K Followers 661 Following DevEx Engineer at @Decodableco. Doing fun stuff with data and open source. 🌐 https://t.co/WparjfmCF5 🔗 Mastodon: @[email protected]
Bartosz Konieczny @waitingforcode
2K Followers 81 Following Freelance Data Engineer and instructor, enjoy solving data problems with #ApacheSpark #AWS #GCP #Azure 👨🏭 | [email protected]
Venkat Subramaniam .. @venkat_s
66K Followers 434 Following Programmer, author, speaker, founder Agile Developer, Inc., co-founder of @dev2next Conference, professor @CSatUH
Alvin Alexander @alvinalexander
5K Followers 390 Following Learn recursion for free! https://t.co/0IQUqvxhy1
Adam Warski @adamwarski
7K Followers 185 Following Software Engineer, R&D @softwaremill, Scala/functional programmer, blogger. Also: @[email protected]
IntelliJ IDEA, a JetB.. @intellijidea
148K Followers 23 Following @intellijidea – the Leading Java and Kotlin IDE, by @JetBrains Tips: #IntelliJIDEATips New Features: #NewInIntelliJIDEA Our YT channel https://t.co/GuAlWUIi7Q
Andy Pavlo (@andy_pav.. @andy_pavlo
29K Followers 205 Following Associate Prof. of Databases @CarnegieMellon. Co-Founder @OtterTuneAI
James Ward @_JamesWard
17K Followers 3K Following Mutability was the "trillion-dollar mistake" and more hot-takes on my podcast: @HappyPathProg Disclaimer: I work for @AWSCloud & my opinions are my own.
holden karau @holdenkarau
17K Followers 2K Following she/her, OSS Big Data. ❤️🛵 ☕️ spark. I don't represent my employer. Live @ https://t.co/uOyeZtBXx0 , https://t.co/GB3Ok0vbVA
Towards Data Science @TDataScience
224K Followers 2K Following A Medium publication sharing concepts, ideas, and codes. Share your insights and projects with our global audience: https://t.co/Mh1ZLme1o4.
Matthias J. Sax @MatthiasJSax
3K Followers 496 Following American Homeowner | Software Engineer @ConfluentInc working on @KafkaStreams | @TheASF Committer and PMC member (@ApacheKafka, @ApacheFlink, @ApacheStorm)
Naveen Rao @NaveenGRao
28K Followers 785 Following VP GenAI @Databricks. Former CEO/cofounder MosaicML & Nervana/IntelAI. Neuro + CS. I like to build stuff that will eventually learn how to build other stuff.
Decodable @Decodableco
3K Followers 2K Following Decodable is a serverless real-time data platform built on #ApacheFlink. No clusters to set up. No code to write. No PhD required.
Akshay 🚀 @akshay_pachaar
135K Followers 417 Following Simplifying LLMs, MLOps, Python & Machine Learning for you! • AI Engineering @LightningAI • Lead DataScientist • BITS Pilani • 3 Patents
arXiv.org @arxiv
35K Followers 188 Following News from https://t.co/enurGFxpcS, a free distribution service and an open archive for scholarly articles. For help with arXiv, see https://t.co/LcWuhM0BOl
Patrick Loeber @patloeber
55K Followers 887 Following Software Engineer • YouTube 250K+ • Helping you to learn Python and Machine Learning • AI Developer Advocate @AssemblyAI • @python_engineer founder
Apache XTable (Incuba.. @apachextable
296 Followers 19 Following Apache XTable is a cross-table interop of table formats Apache Hudi, Apache Iceberg, and Delta Lake. (prev OneTable) https://t.co/SXsyRuZMND
Milvus @milvusio
3K Followers 5K Following The most widely adopted open source vector database for #AI #OpenSource #VectorSearch 💬Discord: https://t.co/9yOD2GjWv4 🔗Find us: https://t.co/BbTzkz9bHN
Sam Raymond @sjraymond
72 Followers 112 Following 🇦🇺 🇺🇸 ML @ Databricks | Professor @ Dartmouth ### On a mission to democratize AI education and discover the new landscape of GenAI in research! ###
bytewax @bytewax
494 Followers 192 Following 🌩️ Cloud native 🚀 highly scalable #streamprocessing 👐 open source 🐍 #Python native Slack 💛 GitHub ⭐️ & all other important links https://t.co/qVzqxixEeE
PuffinDB @PuffinDB
306 Followers 10 Following Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg
Alex Monahan @__AlexMonahan__
3K Followers 703 Following Forward Deployed Software Engineer at MotherDuck! Docs & blogs at DuckDB Labs! Views expressed are my own and not my employers'.
Rill Data @RillData
2K Followers 246 Following Rill is an operational BI tool that provides fast dashboards your team will actually use. Try Rill for free: curl https://t.co/yx4CT8dCym | sh
Tabular @tabulario
888 Followers 94 Following Tabular is a storage platform from the creators of Apache Iceberg, including ingestion, performance optimization, central RBAC and SaaS simplicity.
MotherDuck @motherduck
5K Followers 121 Following Making analytics fun, frictionless and ducking awesome with a serverless easy-to-use data analytics platform based on @DuckDB in collab with @duckdblabs.
DuckDB @duckdb
13K Followers 3 Following DuckDB is an in-process SQL OLAP database management system. "DuckDB" and the DuckDB logo are registered trademarks of the DuckDB Foundation.
Andy Grove @andygrove_io
3K Followers 584 Following @ApacheArrow PMC Chair. Apache DataFusion PMC. Original creator of Apache DataFusion.
heardingdata @heardingdata
20 Followers 39 Following ☝️ get it? I write code, I wrote about code, and I have a blast teaching and helping others fall in love with Engineering too.
Anand Babu Periasamy @abperiasamy
2K Followers 150 Following MinIO, Gluster, Startups, Angel Investor. “Where there is love there is life.” ― Mahatma Gandhi
Shreya Shankar @sh_reya
39K Followers 589 Following I study ML & AI engineers and try to make their lives a little better. PhD-ing in databases & HCI @Berkeley_EECS @UCBEPIC and MLOps-ing around town. She/they.
Jim Dowling @jim_dowling
2K Followers 1K Following Co-founder and CEO @hopsworks. Organizer of the feature store summit. I am writing a book on Building ML Systems for O'Reilly.
Voltron Data @VoltronData
4K Followers 26 Following We offer a new way to design and build composable data systems based on open source standards.
PyQuant News 🐍 @pyquantnews
115K Followers 296 Following Where finance practitioners get started with Python for quant finance, algorithmic trading, and data analysis | Tweets & threads with free Python code & tools.
Onehouse @Onehousehq
915 Followers 98 Following Onehouse is the universal data lakehouse, offering a cloud-native managed lakehouse built on @apachehudi, accessible across table formats, engines and clouds.
Matthew Powers @neapowers
307 Followers 167 Following Spark / DataFrame nerd. Helping Latin@s / Brasilians in tech. Blog and write open source to make ppl more productive.
Filecoin @Filecoin
677K Followers 344 Following Filecoin ⨎ is the largest decentralized data storage marketplace, protocol, & cryptocurrency. Twitter account is community managed.
Ray Summit @Ray_Summit_Live
639 Followers 2 Following Join us the #Ray community in SF for keynotes, #Ray deep dives, #llm sessions and lightning talks exploring the future of machine learning and scalable #AI.
ClickHouse @ClickHouseDB
8K Followers 58 Following ClickHouse is the fastest open-source OLAP database ⚡ Download: https://t.co/3JKlDJbkcH GitHub: https://t.co/bjCe9qIetg Slack: https://t.co/d95c6jVeJm
Orange Book 🍊📖 @orangebook_
530K Followers 376 Following Thoughts triggering thoughts. Learning on the go. No label required.
Dipankar Mazumdar🥑 @Dipankartnt
1K Followers 528 Following Staff Data Engineering Advocate @OnehouseHQ, prev DevRel @Dremio, R&D @Qlik, Data @OtisElevatorCo | Author (O’Reilly) | Research: https://t.co/AiDKzVJCGa
Ben Meer @SystemSunday
364K Followers 147 Following The Systems Guy • Follow me for systems on health, wealth, & free time ⚡ Cornell MBA • 2M+ audience
Twist @TwistWork
7K Followers 1K Following Async messaging for teams burned out by real-time chat, video calls, and email • #WorkAsyncNotASAP • by @Doist • Status updates: @TwistStatus
dbt Labs @dbt_labs
8K Followers 1 Following The creators and maintainers of @getdbt. We’re hiring! https://t.co/HKiRwQXTHe…
Stockfish Chess @stockfishchess
9K Followers 7 Following The strongest open source chess engine. Tweets by @daylenyang
Volodymyr Zelenskyy /.. @ZelenskyyUa
7.4M Followers 1 Following President of Ukraine / Президент України
Introduction to Algor.. @clrs4e
8K Followers 3 Following The fourth edition of the iconic algorithms textbook. Tweets are by @thcormen.
Dane Hillard 🐍 @da.. @easyaspython
3K Followers 658 Following 📚 Publishing Python Packages (https://t.co/cBGIxbRrA7) 🐍📦⬆️ 💻 Technical architect at @ithaka_org ✨ Web applications, design systems, and micro frontends
Baeldung @baeldung
72K Followers 878 Following Passionate about everything Java. Teaching Spring on https://t.co/vh3oOY6ka6. Java Champion.
PVLDB (@pvldb@botsin... @pvldb
4K Followers 11 Following The Proceedings of the VLDB Endowment (PVLDB) RSS Feed: https://t.co/5wEKOfqAEb Mastodon: https://t.co/hdequfpdhG
Tobias_Petry.sql @tobias_petry
21K Followers 273 Following The Database Guy. I am helping you get better with MySQL and PostgreSQL. Building: @stackbricksapp https://t.co/Yfe8XeSBfg https://t.co/ckxS6s2G0r
Real Python @realpython
243K Followers 165 Following Online #Python Training & Expert Community: Tutorials, Video Courses, Books, Quizzes...and More! Join 3,000,000 Monthly Readers at https://t.co/TyrG6Kkaq6
Hojjat Jafarpour @Hojjat
830 Followers 138 Following Founder & CEO @DeltaStreamInc. We are HIRING! The creator of #ksqlDB (#KSQL), the streaming SQL engine for Apache Kafka, @ConfluentInc.
‘lay’ = to place something down flat; ‘lie’ = to be in a flat position on a surface. These two are already confusing, and to top it all off, ‘lay’ is the past tense of ‘lie,’ and ‘laid’ is the past tense of ‘lay.’ So, let's show each other some grace...
I'm working on a Delta Lake TLA+ specification today. Mostly done. I suppose I should write a Delta Lake consistency blog post next.
Breaking: IBM/Red Hat to announce a merger of Terraform and Ansible named, wait for it, Terrible 🥳.
@andrewlamb1111 @ApacheDataFusio Mission accomplished!
Always remember that the CAP theorem is a law. Try to break it and the distributed systems police will arrest you.
Imagine getting laid off, interviewing, getting a job offer from Google, then getting sued for taking the job.
@IsForAt No company should own your right to work after they stop paying you. This is common sense. AWS even enforces non-competes after they lay people off.
If IBM decides to merge Terraform and Ansible, would that be Terrible?
People who are using LLMs for therapy have a fundamental lack of understanding of both psychotherapy and LLMs. Therapy is a guided debugging event, e.g. a debugger for your prompts. LLMs are tools to navigate a corpus based on a very biased and authoritative prompt.
My talk from DataCouncil "Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet": youtube.com/watch?v=I-Z7kF…
This is pretty amazing -- thanks @andygrove_io for helping make it happen. We are working through the various logistics of moving URLs, etc but it is getting close now
It's official! Apache DataFusion is now a top-level Apache project. 😃 github.com/apache/datafus… The URLs for the subprojects have also been updated. github.com/apache/datafus… github.com/apache/datafus… github.com/apache/datafus… Congratulations to the community for this big milestone!
@mim_djo That's cool, opening up more use cases. I wonder whether this extension will come from Databricks or DuckDB Labs. The speaker is from Databricks.
I guess it is official now, #DuckDB is adding support to #Deltalake databricks.com/dataaisummit/s…
@jaceklaskowski I love phrases like this (new to me—not sure if used frequently) and comparing differences between English and German (sure it works for Polish, too) Some are literally the same, some are similar with minor adjustments, and the best/most entertaining ones don’t translate at all!
@astonzhangAZ Congrats! Video podcasts is a great idea. Looking forward to learning more details.
Llama 3 has been my focus since joining the Llama team last summer. Together, we've been tackling challenges across pre-training and human data, pre-training scaling, long context, post-training, and evaluations. It's been a rigorous yet thrilling journey: 🔹Our largest models…
Phew, was concerned for a moment that @java wouldn't like me any more. Welcome back ;)