Airflow 3 comes with two new features: Edge execution and the Task SDK. Powered by an HTTP API, these make it possible to write and execute Airflow tasks in any language from anywhere. In this session, I will explain some of the APIs needed and show how to interact with them, using an embedded toy worker written in Rust and running on an ESP32-C3. I will also provide practical tips on writing your own edge worker and on developing against a running instance of Airflow.
In this episode, Conor and Bryce chat with Jared Hoberock about the NVIDIA Thrust Parallel Algorithms Library, Rust vs C++, Python and more.
Link to Episode 240 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Bryce Adelstein Lelbach: Twitter
About the Guest
Jared Hoberock joined NVIDIA Research in October 2008. His interests include parallel programming models and physically-based rendering. Jared is the co-creator of Thrust, a high performance parallel algorithms library. While at NVIDIA, Jared has contributed to the DirectX graphics driver, Gelato, a final frame film renderer, and OptiX, a high-performance, programmable ray tracing engine. Jared received a Ph.D. in computer science from the University of Illinois at Urbana-Champaign. He is a two-time recipient of the NVIDIA Graduate Research Fellowship.
Show Notes
Date Generated: 2025-05-21
Date Released: 2025-06-27
Thrust
Thrust Docs
iota Algorithm
thrust::counting_iterator
thrust::sequence
MLIR
NumPy
Numba
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Join us for an in-depth Ask Me Anything (AMA) on how Rust is revolutionizing Lakehouse formats like Delta Lake and Apache Iceberg through projects like delta-rs and iceberg-rs! Discover how Rust’s memory safety, zero-cost abstractions and fearless concurrency unlock faster development and higher-performance data operations. Whether you’re a data engineer, Rustacean or Lakehouse enthusiast, bring your questions on how Rust is shaping the future of open table formats!
Change data feeds (CDFs) are a common tool for synchronizing changes between tables and performing data processing in a scalable fashion. Serverless architectures offer a compelling solution for organizations looking to avoid the complexity of managing infrastructure. But how can you bring CDFs into a serverless environment? In this session, we'll explore how to integrate change data feeds into serverless architectures using delta-rs and delta-kernel-rs, open-source projects that allow you to read Delta tables and their change data feeds in Rust or Python. We’ll demonstrate how to use these tools with Lakestore’s serverless platform to easily stream and process changes. You’ll learn how to:
Leverage Delta tables and CDFs in serverless environments
Utilize Databricks and Unity Catalog without needing Apache Spark
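As a rough illustration of what consuming a change data feed involves, the sketch below replays a batch of change rows onto a downstream copy of a table. It uses only the Rust standard library and is not the delta-rs or delta-kernel-rs API. The `ChangeRow` struct, the `apply_changes` function and the sample data are hypothetical; only the four change kinds mirror the `_change_type` values that Delta Lake records in its change data feed.

```rust
use std::collections::BTreeMap;

/// Change kinds, mirroring Delta Lake's `_change_type` values
/// (insert, update_preimage, update_postimage, delete).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ChangeType {
    Insert,
    UpdatePreimage,
    UpdatePostimage,
    Delete,
}

/// Hypothetical, simplified change row: primary key, payload, change kind.
#[derive(Debug, Clone)]
struct ChangeRow {
    key: u64,
    value: String,
    change: ChangeType,
}

/// Replay change rows, in commit order, onto a downstream copy of the table.
fn apply_changes(target: &mut BTreeMap<u64, String>, changes: &[ChangeRow]) {
    for row in changes {
        match row.change {
            // Inserts and post-update images both carry the new row state.
            ChangeType::Insert | ChangeType::UpdatePostimage => {
                target.insert(row.key, row.value.clone());
            }
            // The pre-update image describes the old state; nothing to write.
            ChangeType::UpdatePreimage => {}
            ChangeType::Delete => {
                target.remove(&row.key);
            }
        }
    }
}

/// Build a small made-up feed and return the resulting table state.
fn demo() -> BTreeMap<u64, String> {
    let row = |key, value: &str, change| ChangeRow { key, value: value.to_string(), change };
    let feed = vec![
        row(1, "alice", ChangeType::Insert),
        row(2, "bob", ChangeType::Insert),
        row(1, "alice", ChangeType::UpdatePreimage),
        row(1, "alice-v2", ChangeType::UpdatePostimage),
        row(2, "bob", ChangeType::Delete),
    ];
    let mut table = BTreeMap::new();
    apply_changes(&mut table, &feed);
    table
}

fn main() {
    for (key, value) in &demo() {
        println!("{key} -> {value}");
    }
}
```

In a serverless setting, a function like `apply_changes` would presumably run once per invocation over the changes committed since the last processed table version; how that version is tracked is outside this sketch.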
Join us as we introduce Delta-Kernel-RS, a new Rust implementation of the Delta Lake protocol designed for unparalleled interoperability across query engines. In this session, we will explore how maintaining a native implementation of the Delta specification — with native C and C++ FFI support — can deliver consistent benefits across diverse data processing systems, eliminating the need for repetitive, engine-specific reimplementations. We will dive deep into a real-world case study where a query engine harnessed Delta-Kernel-RS to unlock significant data skipping improvements — enhancements achieved “for free” by leveraging the kernel. Attendees will gain insights into the architectural decisions, interoperability strategies and the practical impact of this innovation on performance and development efficiency in modern data ecosystems.
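To make the data skipping idea concrete, here is a minimal sketch that assumes each data file carries min/max statistics for one integer column. It is not the Delta-Kernel-RS API: the `FileStats` struct, the `files_to_scan` function and the file names are hypothetical. A range predicate is checked against each file's statistics, and files that cannot contain a match are never opened.

```rust
/// Hypothetical per-file statistics: min and max of one integer column.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    path: &'static str,
    min: i64,
    max: i64,
}

/// Keep only files whose [min, max] range overlaps the query range [lo, hi].
/// A file outside that range cannot hold a matching row, so it is skipped.
fn files_to_scan(files: &[FileStats], lo: i64, hi: i64) -> Vec<&'static str> {
    files
        .iter()
        .filter(|f| f.max >= lo && f.min <= hi)
        .map(|f| f.path)
        .collect()
}

/// Made-up table layout with three files covering disjoint value ranges.
fn example_files() -> Vec<FileStats> {
    vec![
        FileStats { path: "part-0.parquet", min: 0, max: 99 },
        FileStats { path: "part-1.parquet", min: 100, max: 199 },
        FileStats { path: "part-2.parquet", min: 200, max: 299 },
    ]
}

fn main() {
    let files = example_files();
    // Query: WHERE value BETWEEN 150 AND 220
    let selected = files_to_scan(&files, 150, 220);
    println!("scanning {} of {} files: {:?}", selected.len(), files.len(), selected);
}
```

The check is conservative: an overlapping range only means a file might contain matching rows, so the engine still has to apply the predicate to the rows it reads.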
Delta Kernel makes it easy for engines and connectors to read and write Delta tables. It supports many Delta features and robust connectors, including DuckDB, ClickHouse, Spice AI and delta-dotnet. In this session, we'll cover lessons learned about how to build a high-performance library that lets engines integrate the way they want, while not having to worry about the details of the Delta protocol. We'll talk through how we streamlined the API, how it has changed, and the motivations behind those changes. We'll discuss some new highlight features like write support and the ability to do CDF scans. Finally, we'll cover the future roadmap for the Kernel project and what you can expect from the project over the coming year.
Five years ago, the delta-rs project embarked on a journey to bring Delta Lake's robust capabilities to the Rust & Python ecosystem. In this talk, we'll delve into the triumphs, tribulations and lessons learned along the way. We'll explore how delta-rs has matured alongside the thriving Rust data ecosystem, adapting to its evolving landscape and overcoming the challenges of maintaining a complex data project. Join us as we share insights into the project's evolution, the symbiotic relationship between delta-rs and the Rust community, and the current hurdles and future directions that lie ahead. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
Supported by Our Partners
• Graphite: The AI developer productivity platform.
• Sentry: Error and performance monitoring for developers.
Reddit’s native mobile apps are more complex than most of us would assume: both the iOS and Android apps are about 2.5 million lines of code, have 500+ screens, and a total of around 200 native iOS and Android engineers work on them. But it wasn’t always like this. In 2021, Reddit started to double down on hiring native mobile engineers, and they quietly rebuilt the Android and iOS apps from the ground up. The team introduced a new tech stack called the “Core Stack” – all the while users remained largely unaware of the changes. What drove this overhaul, and how did the team pull it off? In this episode of The Pragmatic Engineer, I’m joined by three engineers from Reddit’s mobile platform team who led this work: Lauren Darcey (Head of Mobile Platform), Brandon Kobilansky (iOS Platform Lead), and Eric Kuck (Principal Android Engineer). We discuss how the team transitioned to a modern architecture, revamped their testing strategy, improved developer experience – while they also greatly improved the app’s user experience. We also get into:
• How Reddit structures its mobile teams, and why iOS and Android remain intentionally separate
• The scale of Reddit’s mobile codebase and how it affects compile time
• The shift from MVP to MVVM architecture
• Why Reddit took a bet on Jetpack Compose, but decided (initially) against using SwiftUI
• How automated testing evolved at Reddit
• Reddit’s approach to server-driven-mobile-UI
• What the mobile platforms team looks for in a new engineering hire
• Reddit’s platform team’s culture of experimentation and embracing failure
• And much more!
If you are interested in large-scale rewrites or native mobile engineering challenges: this episode is for you.
Timestamps
(00:00) Intro
(02:04) The scale of the Android code base
(02:42) The scale of the iOS code base
(03:26) What the compile time is for both Android and iOS
(05:33) The size of the mobile platform teams
(09:00) Why Reddit has so many mobile engineers
(11:28) The different types of testing done in the mobile platform
(13:20) The benefits and drawbacks of testing
(17:00) How Eric, Brandon, and Lauren use AI in their workflows
(20:50) Why Reddit grew its mobile teams in 2021
(26:50) Reddit’s modern tech stack, Core Stack
(28:48) Why Reddit shifted from MVP architecture to MVVM
(30:22) The architecture on the iOS side
(32:08) The new design system
(30:55) The impact of migrating from Rust to GraphQL
(38:20) How the backend drove the GraphQL migration and why it was worth the pain
(43:17) Why the iOS team is replacing SliceKit with SwiftUI
(48:08) Why the Android team took a bet on Compose
(51:25) How teams experiment with server-driven UI: when it worked, and when it did not
(54:30) Why server-driven UI isn’t taking off, and why Lauren still thinks it could work
(59:25) The ways that Reddit’s modernization has paid off, both in DevX and UX
(1:07:15) The overall modernization philosophy; fixing pain points
(1:09:10) What the mobile platforms team looks for in a new engineering hire
(1:16:00) Why startups may be the best place to get experience
(1:17:00) Why platform teams need to feel safe to fail
(1:20:30) Rapid fire round
The Pragmatic Engineer deepdives relevant for this episode:
• The platform and program split at Uber
• Why and how Notion went native on iOS and Android
• Paying down tech debt
• Cross-platform mobile development
See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Summary
In this episode of the Data Engineering Podcast, Viktor Kessler, co-founder of Vakamo, talks about the architectural patterns in the lakehouse enabled by a fast and feature-rich Iceberg catalog. Viktor shares his journey from data warehouses to developing the open-source project Lakekeeper, an Apache Iceberg REST catalog written in Rust that facilitates building lakehouses with essential components like storage, compute, and catalog management. He discusses the importance of metadata in making data actionable, the evolution of data catalogs, and the challenges and innovations in the space, including integration with OpenFGA for fine-grained access control and managing data across formats and compute engines.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey and today I'm interviewing Viktor Kessler about architectural patterns in the lakehouse that are unlocked by a fast and feature-rich Iceberg catalog.
Interview
Introduction
How did you get involved in the area of data management?
Can you describe what LakeKeeper is and the story behind it?
What is the core of the problem that you are addressing?
There has been a lot of activity in the catalog space recently. What are the driving forces that have highlighted the need for a better metadata catalog in the data lake/distributed data ecosystem?
How would you characterize the feature sets/problem spaces that different entrants are focused on addressing?
Iceberg as a table format has gained a lot of attention and adoption across the data ecosystem. The REST catalog format has opened the door for numerous implementations. What are the opportunities for innovation and improving user experience in that space?
What is the role of the catalog in managing security and governance? (AuthZ, auditing, etc.)
What are the channels for propagating identity and permissions to compute engines? (how do you avoid head-scratching about permission denied situations)
Can you describe how LakeKeeper is implemented?
How have the design and goals of the project changed since you first started working on it?
For someone who has an existing set of Iceberg tables and catalog, what does the migration process look like?
What new workflows or capabilities does LakeKeeper enable for data teams using Iceberg tables across one or more compute frameworks?
What are the most interesting, innovative, or unexpected ways that you have seen LakeKeeper used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on LakeKeeper?
When is LakeKeeper the wrong choice?
What do you have planned for the future of LakeKeeper?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links
LakeKeeper, SAP, Microsoft Access, Microsoft Excel, Apache Iceberg (Podcast Episode), Iceberg REST Catalog, PyIceberg, Spark, Trino, Dremio, Hive Metastore, Hadoop, NATS, Polars, DuckDB (Podcast Episode), DataFusion, Atlan (Podcast Episode), Open Metadata (Podcast Episode), Apache Atlas, OpenFGA, Hudi (Podcast Episode), Delta Lake (Podcast Episode), Lance Table Format (Podcast Episode), Unity Catalog, Polaris Catalog, Apache Gravitino (Podcast Episode), Keycloak, Open Policy Agent (OPA), Apache Ranger, Apache NiFi
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
This is a free preview of a paid episode. To hear more, visit dataengineeringcentral.substack.com
It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover:
* Rust-based tool called UV to replace pip and poetry etc
* Apache XTable and the Future of the Lake House
* How is AI going to affect you?
Thanks for being a consumer of Data Engineering Central; your support means a lot. Please share this podcast with your friend…
Supported by Our Partners
• WorkOS: The modern identity platform for B2B SaaS.
• Vanta: Automate compliance and simplify security with Vanta.
Linux is the most widespread operating system, globally – but how is it built? Few people are better placed to answer this than Greg Kroah-Hartman: a Linux kernel maintainer for 25 years, and one of the 3 Linux Foundation Fellows (the other two are Linus Torvalds and Shuah Khan). Greg manages the Linux kernel’s stable releases, and is a maintainer of multiple kernel subsystems. We cover the inner workings of Linux kernel development, exploring everything from how changes get implemented to why its community-driven approach produces such reliable software. Greg shares insights about the kernel's unique trust model and makes a case for why engineers should contribute to open-source projects. We go into:
• How widespread is Linux?
• What is the Linux kernel responsible for – and why is it a monolith?
• How does a kernel change get merged? A walkthrough
• The 9-week development cycle for the Linux kernel
• Testing the Linux kernel
• Why is Linux so widespread?
• The career benefits of open-source contribution
• And much more!
Timestamps
(00:00) Intro
(02:23) How widespread is Linux?
(06:00) The difference in complexity in different devices powered by Linux
(09:20) What is the Linux kernel?
(14:00) Why trust is so important with the Linux kernel development
(16:02) A walk-through of a kernel change
(23:20) How Linux kernel development cycles work
(29:55) The testing process at Kernel and Kernel CI
(31:55) A case for the open source development process
(35:44) Linux kernel branches: Stable vs. development
(38:32) Challenges of maintaining older Linux code
(40:30) How Linux handles bug fixes
(44:40) The range of work Linux kernel engineers do
(48:33) Greg’s review process and its parallels with Uber’s RFC process
(51:48) Linux kernel within companies like IBM
(53:52) Why Linux is so widespread
(56:50) How Linux Kernel Institute runs without product managers
(1:02:01) The pros and cons of using Rust in Linux kernel
(1:09:55) How LLMs are utilized in bug fixes and coding in Linux
(1:12:13) The value of contributing to the Linux kernel or any open-source project
(1:16:40) Rapid fire round
The Pragmatic Engineer deepdives relevant for this episode:
• What TPMs do and what software engineers can learn from them
• The past and future of modern backend practices
• Backstage: an open-source developer portal
See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
In this episode, Conor and Ben chat with Tristan Brindle about plans for CppNorth 2025, plans for Flux, the slow death of Twitter and more!
Link to Episode 225 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Ben Deane: Twitter | BlueSky
About the Guest
Tristan Brindle is a freelance programmer and trainer based in London, mostly focussing on C++. He is a member of the UK national body (BSI) and ISO WG21. Occasionally he can be found at C++ conferences. He is also a director of C++ London Uni, a not-for-profit organisation offering free beginner programming classes in London and online. He has a few fun projects on GitHub that you can find out about here.
Show Notes
Date Generated: 2025-02-17
Date Released: 2025-03-14
CppNorth 2025
Flux
Iteration Revisited: A Safer Iteration Model for C++ - Tristan Brindle - CppNorth 2023
ADSP Episode 126: Flux (and Flow) with Tristan Brindle
Iterators and Ranges: Comparing C++ to D to Rust - Barry Revzin - [CppNow 2021]
Keynote: Iterators and Ranges: Comparing C++ to D, Rust, and Others - Barry Revzin - CPPP 2021
Iteration Inside and Out - Bob Nystrom Blog
Expanding the internal iteration API #99
std::distance
std::ranges::distance
C++ London Meetup
Denver C++ Meetup
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
In this episode, Conor and Ben chat with Tristan Brindle about recent updates to Flux, internal iteration vs external iteration and more.
Link to Episode 224 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Ben Deane: Twitter | BlueSky
About the Guest
Tristan Brindle is a freelance programmer and trainer based in London, mostly focussing on C++. He is a member of the UK national body (BSI) and ISO WG21. Occasionally he can be found at C++ conferences. He is also a director of C++ London Uni, a not-for-profit organisation offering free beginner programming classes in London and online. He has a few fun projects on GitHub that you can find out about here.
Show Notes
Date Generated: 2025-02-17
Date Released: 2025-03-07
Flux
Lightning Talk: Faster Filtering with Flux - Tristan Brindle - CppNorth 2023
Arrays, Fusion & CPUs vs GPUs.pdf
Iteration Revisited: A Safer Iteration Model for C++ - Tristan Brindle - CppNorth 2023
ADSP Episode 126: Flux (and Flow) with Tristan Brindle
Iterators and Ranges: Comparing C++ to D to Rust - Barry Revzin - [CppNow 2021]
Keynote: Iterators and Ranges: Comparing C++ to D, Rust, and Others - Barry Revzin - CPPP 2021
Iteration Inside and Out - Bob Nystrom Blog
Expanding the internal iteration API #99
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
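For readers unfamiliar with the internal vs external iteration distinction mentioned in this episode, the following sketch shows both styles using only the Rust standard library. It does not reproduce the Flux API. The function names are made up for illustration; `Iterator::next`, `Iterator::try_for_each` and `std::ops::ControlFlow` are standard library items.

```rust
use std::ops::ControlFlow;

/// External iteration: the caller drives the loop by pulling items with `next()`.
fn first_even_external(values: &[i32]) -> Option<i32> {
    let mut it = values.iter();
    while let Some(&v) = it.next() {
        if v % 2 == 0 {
            return Some(v);
        }
    }
    None
}

/// Internal iteration: the sequence drives the loop and calls our closure,
/// which requests an early exit by returning `ControlFlow::Break`.
fn first_even_internal(values: &[i32]) -> Option<i32> {
    let outcome = values.iter().try_for_each(|&v| {
        if v % 2 == 0 {
            ControlFlow::Break(v)
        } else {
            ControlFlow::Continue(())
        }
    });
    match outcome {
        ControlFlow::Break(v) => Some(v),
        ControlFlow::Continue(()) => None,
    }
}

fn main() {
    let data = [1, 3, 4, 6];
    println!("external: {:?}", first_even_external(&data));
    println!("internal: {:?}", first_even_internal(&data));
}
```

Both functions return the same result. The difference is who owns the loop: with external iteration the caller does, while with internal iteration the sequence does, which can let an implementation pick a traversal strategy suited to its own layout.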
Paul Horn will showcase the neo4rs Rust driver integration into the Rust ecosystem, how it compares to an official product driver, and future plans.
Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don’t), where industry insights meet laid-back banter. Whether you’re a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let’s get into the heart of data, unplugged style! This week, we dive into:
The creative future with AI: is generative AI helping or hurting creators?
Environmental concerns of AI: the hidden costs of AI’s growing capabilities. How much energy do these models actually consume, and is it worth it?
AI copyright controversies: Mark Zuckerberg’s LLaMA model faces criticism for using copyrighted materials like content from the notorious LibGen database.
Trump vs. AI regulation: The former president repeals Biden’s AI executive order, creating a Wild West approach to AI development in the U.S. How will this impact innovation and global competition?
Search reimagined with Perplexity AI: A new era of search blending conversational AI and personalized data unification. Could this be the future of information retrieval?
Apple Intelligence on pause: Apple's AI-generated news alerts face a bumpy road. For more laughs, check out the dedicated subreddit AppleIntelligenceFail.
Rhai scripting for Rust: Empowering Rust developers with an intuitive embedded scripting language to make extensibility a breeze.
Poisoned text for scrapers: Exploring creative ways to protect web content from unauthorized scraping by AI systems.
The rise of the AI Data Engineer: Is this a new role in data science, or are we just rebranding existing skills?
Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don’t), where industry insights meet laid-back banter. Whether you’re a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let’s get into the heart of data, unplugged style! In this episode, we are joined by special guest Nico for a lively and wide-ranging tech chat. Grab your headphones and prepare for:
Strava’s ‘Athlete Intelligence’ feature: A humorous dive into how workout apps are getting smarter, and a little sassier.
Frontend frameworks: HTMX is a tough choice: A candid discussion on using React versus emerging alternatives like HTMX and when to keep things lightweight.
Octoverse 2024 trends and language wars: Python takes the lead over JavaScript as the top GitHub language, and we dissect why Go, TypeScript, and Rust are getting love too.
GenAI meets Minecraft: Imagine procedurally generated worlds and dreamlike coherence breaks, Minecraft-style. How GenAI could redefine gameplay narratives and NPC behavior.
OpenAI’s o1 model leak: Insights on the recent leak, what’s new, and its implications for the future of AI.
TigerBeetle’s transactional databases and testing tales: Nico walks us through Tiger Style, deterministic simulation testing, and why it’s a game changer for distributed databases.
Automated testing for LLMOps: A quick overview of automated testing for large language models and its role in modern AI workflows.
DeepLearning.ai’s short courses: Quick, impactful learning to level up your AI skills.
In this talk, we will discuss how we implemented the Iceberg connector in Rust, replacing the original Java-wrapped version to address performance bottlenecks in serialization and memory usage. By following the Apache Iceberg specification, we built a native Rust connector that supports Iceberg’s advanced features, such as multi-catalog compatibility and streaming updates. We’ve contributed this new version to the apache/iceberg-rust repository, and will share insights into the architectural improvements and best practices for leveraging Iceberg in streaming environments.
Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that flow as smoothly as your morning coffee, where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style! In today's episode:
Remote work and hybrid challenges: Insights from the IMF on remote productivity, plus the challenges of work-life balance, Amazon’s return to the office, and other companies' strategies for bringing employees back.
The fall of Zapata AI: A look at the shutdown of Zapata AI and the struggles in building successful quantum computing ventures.
WTF Python: Exploring Python’s type hints, overloads, and those confusing "WTF" moments. Check out WTFPython.
Data profiling tools: A dive into YData Profiling and Sweetviz for detailed data analysis.
GifCities and personal websites: Reflecting on the fall of GifCities, the retro GIF hub, and discussing Murilo’s blog journey.
Rust’s complexity debate: Discussing the blog post My Negative Views on Rust and whether Rust is too complex or simply misunderstood.
.io domain controversy: Examining the future of the .io domain as the British Indian Ocean Territory transfers sovereignty. Read more on Every.to and MIT Technology Review.
Ducks or AI? A fun challenge to distinguish real ducks from AI-generated ones in the Duck Imposter Game.
Adobe's AI video generator: A discussion on Adobe Firefly’s AI-powered video generator and its potential impact on content creation.
Welcome to the Data Engineering Central Podcast, a no-holds-barred discussion on the Data Landscape. Welcome to Episode 02. In today’s episode, we will talk about the following topics from the Data Engineering perspective:
* Using OpenAI’s o1 Model to do Data Engineering work
* Lord Save us from more ETL tools
* Rust for the small things
* Hosted (SaaS) vs Build
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit dataengineeringcentral.substack.com/subscribe
Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. We dive into conversations smoother than your morning coffee (but let’s be honest, just as caffeinated) where industry insights meet light-hearted banter. Whether you’re a data wizard or just curious about the digital chaos around us, kick back and get ready to talk shop, unplugged style! In this episode:
Farewell Pandas, Hello Future: Pandas is out, and Ibis is in. We're talking faster, smarter data processing, featuring the rise of DuckDB and the powerhouse that is Polars. Is this the end of an era for Pandas?
UV vs. Rye: Forget pip. Are these new Python package managers built in Rust the future? We break down UV, Rye, and what it all means for your next Python project.
AI-Generated Podcasts: Is AI about to take over your favorite podcasts? We explore the potential of Google’s Notebook LM to transform content into audio gold.
When AI Steals Your Voice: Jeff Geerling’s voice gets cloned by AI without his consent. We dive into the wild world of voice cloning, the ethics, and the future of AI-generated media.
Hacking AI with Prompt Injection: Could you outsmart AI? We share some wild strategies from the game Gandalf that challenge your prompt injection skills and teach you how to jailbreak even the toughest guardrails.
Jony Ive’s New Gadget Rumor: Is Jony Ive plotting an Apple killer? Rumors are swirling about a new AI-powered handheld device that could shake up the smartphone market.
Zero-Downtime Deployments with Kamal Proxy: No more downtime! We geek out over Kamal Proxy, the sleek HTTP tool designed for effortless Docker deployments.
Function Calling and LLMs: Get ready for the next evolution in AI: function calling. We discuss its rise in LLMs and dive into the Gorilla project, the leaderboard testing the future of smart APIs.