Kafka Streams in Action, Second Edition

2024-05-24 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bill Bejeck

API Kafka Data Streaming data data-engineering streaming-messaging

Everything you need to implement stream processing on Apache Kafka® using Kafka Streams and the ksqlDB event streaming database.

Kafka Streams in Action, Second Edition guides you through setting up and maintaining your stream processing with Kafka. Inside, you’ll find comprehensive coverage of not only Kafka Streams, but the entire toolbox you’ll need for effective streaming, from the components of the Kafka ecosystem to Producer and Consumer clients, Connect, and Schema Registry.

In Kafka Streams in Action, Second Edition you’ll learn how to:
- Design streaming applications in Kafka Streams with the KStream and the Processor API
- Integrate external systems with Kafka Connect
- Enforce data compatibility with Schema Registry
- Build applications that respond immediately to events in either Kafka Streams or ksqlDB
- Craft materialized views over streams with ksqlDB

This totally revised new edition of Kafka Streams in Action has been expanded to cover more of the Kafka platform used for building event-based applications. You’ll also find full coverage of ksqlDB, an event streaming database that makes it a snap to create applications that respond immediately to events, such as real-time push and pull updates.

About the Technology

Enterprise applications need to handle thousands, even millions, of data events every day. With an intuitive API and flawless reliability, the lightweight Kafka Streams library has earned a spot at the center of these systems. Kafka Streams provides exactly the power and simplicity you need to manage real-time event processing or microservices messaging.

About the Book

Kafka Streams in Action, Second Edition teaches you how to create event streaming applications on the amazing Apache Kafka platform. This thoroughly revised new edition now covers a wider range of streaming architectures and includes data integration with Kafka Connect. As you go, you’ll explore real-world examples that introduce components and brokers, schema management, and the other essentials. Along the way, you’ll pick up practical techniques for blending Kafka with Spring, low-level control of processors and state stores, storing event data with ksqlDB, and testing streaming applications.

What's Inside
- Design efficient streaming applications
- Integrate external systems with Kafka Connect
- Enforce data compatibility with Schema Registry

About the Reader

For Java developers. No knowledge of Kafka or streaming applications required.

About the Author

Bill Bejeck is a Confluent engineer and a Kafka Streams contributor with over 15 years of software development experience. Bill is also a committer on the Apache Kafka® project.

Quotes

Comprehensive streaming data applications are only a few years away from becoming the reality, and this book is the guide the industry has been waiting for to move beyond the hype. - Adi Polak, Director, Developer Experience Engineering, Confluent

Covers all the key aspects of building applications with Kafka Streams. Whether you are getting started with stream processing or have already built Kafka Streams applications, it is an essential resource. - Mickael Maison, Principal Software Engineer, Red Hat

Serves as both a learning and a resource guide, offering a perfect blend of ‘how-to’ and ‘why-to.’ Even if you have been using Kafka Streams for many years, I highly recommend this book. - Neil Buesing, CTO & Co-founder, Kinetic Edge

XML and Related Technologies

2024-05-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Atul Kahate

C#/.NET Cyber Security XML data data-engineering storage-formats

About The Author – Atul Kahate has over 13 years of experience in Information Technology in India and abroad in various capacities. He holds a Bachelor of Science degree in Statistics and a Master of Business Administration in Computer Systems. He has authored 17 highly acclaimed books on various areas of Information Technology. Several of his books are used as course textbooks or sources of reference in a number of universities, colleges, and IT companies all over the world. Atul has been writing newspaper articles about cricket since the age of 12. He has also authored two books on cricket and has written over 2000 articles on IT and cricket. Besides technology, he has a deep interest in teaching, music, and cricket. He has conducted several training programs, in a number of educational institutions and IT organisations, on a wide range of technologies. Some of the prestigious institutions where he has conducted training programs include IIT, Symbiosis, I2IT, MET, Indira Institute of Management, Fergusson College, MIT, VIIT, and Walchand Government Engineering College, besides numerous other colleges in India.

Book Content – 1. Introduction to XML, 2. XML Syntaxes, 3. Document Type Definitions, 4. XML Schemas, 5. Cascading Style Sheets, 6. Extensible Stylesheet Language, 7. XML and Java, 8. XML and ASP.NET, 9. Web Services and AJAX, 10. XML Security, Appendix – Miscellaneous Topics

Loom and Java: context and usage in the Scala ecosystem

2024-04-18 · PSUG at Devoxx (community evening)

Birds-of-a-Feather

Scala jvm loom

BoF topic: Loom introduced its first features and virtual threads in Java 19 and 21 (LTS). A discussion of how Project Loom is perceived on the Scala side, the operational changes it brings, and its integration into projects. After a short presentation recalling the context and history of Loom-related work in Scala, an exchange on approaches, uses, and future prospects. This BoF is aimed at all experience levels and at anyone interested in Loom's impact on the JVM ecosystem.

A Java developer walks into a serverless bar

2024-04-11 · Google Cloud Next '24

session

by Mohammed Aboullaite (Spotify)

CI/CD Cloud Computing GCP Cyber Security

In this session, we'll dive into deploying Java apps using Google Cloud's serverless platform. Designed for Java developers, it offers practical insights into the considerations, challenges, tips, and tricks for deploying JVM applications on serverless platforms. We’ll also cover other best practices across different parts of the application lifecycle, such as CI/CD pipelines, security, and observability. Through interactive demos, learn to build, secure, and monitor Java applications efficiently.

Roll up your sleeves: Craft real-world generative AI Java in Cloud Run

2024-04-11 · Google Cloud Next '24

session

by Yanni Peng (Google Cloud) , Dan Dobrin (Google Cloud)

AI/ML Cloud Computing GCP GenAI Cloud Run LLM

Ready to supercharge your Java skills with cutting-edge generative AI? Dive into this immersive hands-on workshop and learn how to build and deploy powerful gen AI applications in Cloud Run using gen AI with Vertex and Gemini models. We'll explore fast Java development, leverage the scalability of Cloud Run, and tackle real-world gen AI use cases. Get ready and unleash the power of AI in your next application!

Work with a complete end-to-end sample application, guided at all times by the power of Gemini.

Cloud-powered, API-first testing with Testcontainers and Kotlin

2024-04-10 · Google Cloud Next '24

session

by Oleg Šelajev (Docker)

API BigQuery Cloud Computing GCP Cloud Run JavaScript Pub/Sub Python Rust

Adequately testing systems that use Google Cloud services can be a serious challenge. In this session we’ll show you how to shift testing to an API-first approach using Testcontainers. This approach helps us improve the feedback cycle and reliability for both our inner-dev loop and our continuous integration cycle. We’ll go through an end-to-end example that uses BigQuery, Pub/Sub, Cloud Build, and Cloud Run. Examples will use Kotlin, but the same could be accomplished with other languages including Rust, Go, JavaScript, Python, Java, and more.

Java on Google Cloud: The enterprise, the serverless, and the native

2024-04-10 · Google Cloud Next '24

session

by Rustam Mehmandarov (Computas)

Cloud Computing GCP Cloud Run

Do you want to know your options for running Java on Google Cloud? We’ll explore various options for running workloads written using the latest Java and Jakarta EE versions on serverless offerings like Google App Engine and Google Cloud Run. Furthermore, we'll look at optimizing your runtime performance using various frameworks.

Redis Stack for Application Modernization

2023-12-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mirko Ortensi , Luigi Fugaro

DataViz JavaScript JSON Luigi Python Redis Cyber Security data data-engineering nosql-databases

In "Redis Stack for Application Modernization," you will explore how the Redis Stack extends traditional Redis capabilities, allowing you to innovate in building real-time, scalable, multi-model applications. Through practical examples and hands-on sessions, this book equips you with skills to manage, implement, and optimize data flows and database features.

What this Book will help me do
- Learn how to use Redis Stack for handling real-time data with JSON, hash, and other document types.
- Discover modern techniques for performing vector similarity searches and hybrid workflows.
- Become proficient in integrating Redis Stack with programming languages like Java, Python, and Node.js.
- Gain skills to configure Redis Stack server for scalability, security, and high availability.
- Master RedisInsight for data visualization, analysis, and efficient database management.

Author(s)

Luigi Fugaro and Mirko Ortensi are experienced software professionals with deep expertise in database systems and application architecture. They bring years of experience working with Redis and developing real-world applications. Their hands-on approach to teaching and real-world examples make this book a valuable resource for professionals in the field.

Who is it for?

This book is ideal for database administrators, developers, and architects looking to leverage Redis Stack for real-time multi-model applications. It requires a basic understanding of Redis and any programming language such as Python or Java. If you wish to modernize your applications and efficiently manage databases, this book is for you.

Unleashing the Power of Graphs in Java Code Structure Analysis

2023-12-14 · Engineering Kiosk Alps Meetup Innsbruck

talk

graph algorithms graph analysis graph data science graphs machine learning node embeddings

This talk explores Java code structure analysis using Graphs. It provides an introduction to Graphs and underscores their significance in both Graph Analysis and the field of Graph Data Science. The journey begins with exploring queries to analyze code dependencies and progresses to the application of graph algorithms for tasks such as community detection, centrality, and similarity. Additionally, the talk provides an introduction to node embeddings for machine learning. By the end of this presentation, software professionals will be well-equipped to extract valuable insights from Java code bases effectively.

Demystifying "random" generation

2023-12-12 · Meetup HumanTalks Paris @leboncoin

talk

"True randomness being hazardous, let us settle for pseudo-randomness and adapt it to our needs" (Jean-Paul Delahaye, Pour la Science, 1998). A presentation of a pseudo-random generator (Java's Random), followed by an explanation of how to hack it (by recovering the seed), with a demonstration. The talk closes with a conclusion and a discussion of the concept of true randomness and the best way to generate randomness.

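The seed-recovery idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the attacker observes two consecutive `nextInt()` outputs and relies on the linear congruential formula given in the `java.util.Random` documentation; the class and method names (`RandomSeedRecovery`, `predictNext`) are hypothetical and are not taken from the talk.

```java
import java.util.Random;

public class RandomSeedRecovery {
    // Constants of the 48-bit linear congruential generator documented for java.util.Random.
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    /** Advances the 48-bit internal state by one step. */
    static long step(long state) {
        return (state * MULTIPLIER + INCREMENT) & MASK;
    }

    /**
     * Takes two consecutive nextInt() outputs, brute-forces the 16 hidden
     * low bits of the state behind the first one, and returns the predicted
     * third output.
     */
    static int predictNext(int first, int second) {
        // nextInt() exposes the top 32 bits of the 48-bit state.
        long high = ((long) first & 0xFFFFFFFFL) << 16;
        for (long low = 0; low < (1L << 16); low++) {
            long candidate = high | low;
            long next = step(candidate);
            if ((int) (next >>> 16) == second) {
                // Candidate reproduces the second output: advance once more.
                return (int) (step(next) >>> 16);
            }
        }
        throw new IllegalStateException("no matching state found");
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        int first = random.nextInt();
        int second = random.nextInt();
        int predicted = predictNext(first, second);
        // Compare the prediction with what the generator really produces next.
        System.out.println(predicted == random.nextInt());
    }
}
```

Only 65,536 candidates need to be tried because each output leaks 32 of the 48 state bits. A false match on the second output is possible in principle but unlikely; this predictability is why `java.security.SecureRandom` is the usual choice when the values must not be guessable.
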
Addressing The Challenges Of Component Integration In Data Platform Architectures

2023-11-27 · Data Engineering Podcast Listen

podcast_episode

by Tobias Macey

AI/ML Airflow Analytics AWS AWS Lambda BI Cloud Computing Data Engineering Data Lake Data Lakehouse Data Management dbt +10 more

Summary

Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it’s real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results, all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!

Developing event-driven pipelines is going to be a lot easier - Meet Functions! Memphis functions enable developers and data engineers to build an organizational toolbox of functions to process, transform, and enrich ingested events “on the fly” in a serverless manner using AWS Lambda syntax, without boilerplate, orchestration, error handling, and infrastructure in almost any language, including Go, Python, JS, .NET, Java, SQL, and more. Go to dataengineeringpodcast.com/memphis today to get started!

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

Your host is Tobias Macey and today I'll be sharing an update on my own journey of building a data platform, with a particular focus on the challenges of tool integration and maintaining a single source of truth.

Interview

- Introduction
- How did you get involved in the area of data management?
- Data sharing
- Weight of history
  - Existing integrations with dbt
  - Switching cost for e.g. SQLMesh
  - De facto standard of Airflow
- Single source of truth
  - Permissions management across application layers
  - Database engine
  - Storage layer in a lakehouse
  - Presentation/access layer (BI)
  - Data flows
  - dbt -> table level lineage
  - Orchestration engine -> pipeline flows
    - Task based vs. asset based
  - Metadata platform as the logical place for horizontal view

Contact Info

LinkedIn Website

Parting Question

Unlock innovation with AI by migrating enterprise apps to App Service | BRK207H

2023-11-16 · Microsoft Ignite 2023 Watch

video

by Ed Donahue , Stefan Schackow , Gaurav Seth (Microsoft) , Tulika Chaudharie , Michael YenChi Ho (Microsoft) , Scott Hunter (Microsoft) , Byron Tardif , Yutang Lin

AI/ML Azure Cloud Computing DevOps GitHub HTML Microsoft

Discover why Azure App Service is fast growing as the managed platform of choice for migrating on-premises .NET and Java apps to the cloud. Learn how to deploy your web applications with ease, using built-in support for containers, GitHub, and DevOps. Secure your apps with SSL, authentication, and firewall features. We'll also share the latest tools and innovations to lower the cost and time to complete your migration projects.

To learn more, please check out these resources:
* https://aka.ms/Ignite23CollectionsBRK207H
* https://info.microsoft.com/ww-landing-contact-me-for-events-m365-in-person-events.html?LCID=en-us&ls=407628-contactme-formfill
* https://aka.ms/azure-ignite2023-dataaiblog

Speakers:
* Gaurav Seth
* Scott Hunter
* Tulika Chaudharie
* Byron Tardif
* Ed Donahue
* Michael YenChi Ho
* Stefan Schackow
* Yutang Lin

Session Information: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

BRK207H | English (US) | AI & Apps

Episode 155: Don't Hurt Yourself (with C++) with Jonathan O'Connor

2023-11-10 · ADSP: Algorithms + Data Structures = Programs Listen

podcast_episode

by Conor Hoekstra , Bryce Adelstein Lelbach (NVIDIA) , Jonathan O’Connor (LADE GmbH)

GitHub

In this episode, Conor and Bryce conclude their conversation with Jonathan O’Connor and chat about a plethora of topics: multiparadigm languages, Ratfor, airport lounges, Meeting C++, code::dive and more.

Link to Episode 155 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest: Jonathan O’Connor joined Glockenspiel, a small Irish company, in 1988. C++ had no virtual destructors, but it did have a coroutine library! He spent 2 years teaching C++ and OOP. In 2000, he switched over to Java. But by 2010, he started 7 wonderful years writing in Ruby. In 2016, he returned to a completely different C++, where you never had to see a pointer if you didn’t want to. These days he is helping to make the world a better place writing C++ code for LADE GmbH, a company building electric car charging infrastructure.

Show Notes

Date Recorded: 2023-10-18
Date Released: 2023-11-10
- Jonathan O’Connor Meeting C++ Bio
- Ratfor
- Software Tools by Brian Kernighan and P.J. Plauger
- ADSP Bingo Board
- Meeting C++ Conference
- code::dive Conference

Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Episode 154: Programming Languages Galore with Jonathan O'Connor

2023-11-03 · ADSP: Algorithms + Data Structures = Programs Listen

podcast_episode

by Conor Hoekstra , Bryce Adelstein Lelbach (NVIDIA) , Jonathan O’Connor (LADE GmbH)

GitHub

In this episode, Conor and Bryce continue their conversation with Jonathan O’Connor and chat about a plethora of programming languages!

Link to Episode 154 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest: Jonathan O’Connor joined Glockenspiel, a small Irish company, in 1988. C++ had no virtual destructors, but it did have a coroutine library! He spent 2 years teaching C++ and OOP. In 2000, he switched over to Java. But by 2010, he started 7 wonderful years writing in Ruby. In 2016, he returned to a completely different C++, where you never had to see a pointer if you didn’t want to. These days he is helping to make the world a better place writing C++ code for LADE GmbH, a company building electric car charging infrastructure.

Show Notes

Date Recorded: 2023-10-18
Date Released: 2023-11-03
- Jonathan O’Connor Meeting C++ Bio
- Algorithms + Data Structures = Programs Book
- Pascal Language
- Ada Language
- Why Did C Succeed Over Pascal?
- Carbon Github
- Zig Language
- Nim Language
- Uiua Language
- Eiffel Language
- Bertrand Meyer
- Richard Feldman on Twitter
- Software Unscripted Podcast
- Why Isn’t Functional Programming the Norm? – Richard Feldman
- James Gosling Keynote “Thoughts on language evolution” - reClojure 2022
- Clojure Language
- ArrayCast Episode 41: John Earnest and Versions of k

Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Episode 153: Pascal vs C vs Ada with Jonathan O'Connor

2023-10-27 · ADSP: Algorithms + Data Structures = Programs Listen

podcast_episode

by Conor Hoekstra , Bryce Adelstein Lelbach (NVIDIA) , Jonathan O’Connor (LADE GmbH)

GitHub

In this episode, Conor and Bryce continue their conversation with Jonathan O’Connor and chat about Pascal, C, Ada and more!

Link to Episode 153 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest: Jonathan O’Connor joined Glockenspiel, a small Irish company, in 1988. C++ had no virtual destructors, but it did have a coroutine library! He spent 2 years teaching C++ and OOP. In 2000, he switched over to Java. But by 2010, he started 7 wonderful years writing in Ruby. In 2016, he returned to a completely different C++, where you never had to see a pointer if you didn’t want to. These days he is helping to make the world a better place writing C++ code for LADE GmbH, a company building electric car charging infrastructure.

Show Notes

Date Recorded: 2023-10-18
Date Released: 2023-10-27
- Jonathan O’Connor Meeting C++ Bio
- Progtools on Twitter
- Spicy - aespa エスパ [Music Bank] | KBS WORLD TV 230519
- Oxide and Friends Episode 93 - Settling Beef
- Algorithms + Data Structures = Programs Book
- Structure and Interpretation of Computer Programming
- Pascal Language
- Ada Language
- Why Did C Succeed Over Pascal?
- Alan Turing as a Runner

Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Episode 152: Ruby in Rwanda with Jonathan O'Connor

2023-10-20 · ADSP: Algorithms + Data Structures = Programs Listen

podcast_episode

by Conor Hoekstra , Bryce Adelstein Lelbach (NVIDIA) , Jonathan O’Connor (LADE GmbH)

GitHub

In this episode, Conor and Bryce chat with Jonathan O’Connor about his career path from C++ to Java to Ruby and back to C++, as well as his work in Rwanda and a discussion about quines!

Link to Episode 152 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest: Jonathan O’Connor joined Glockenspiel, a small Irish company, in 1988. C++ had no virtual destructors, but it did have a coroutine library! He spent 2 years teaching C++ and OOP. In 2000, he switched over to Java. But by 2010, he started 7 wonderful years writing in Ruby. In 2016, he returned to a completely different C++, where you never had to see a pointer if you didn’t want to. These days he is helping to make the world a better place writing C++ code for LADE GmbH, a company building electric car charging infrastructure.

Show Notes

Date Recorded: 2023-10-18
Date Released: 2023-10-20
- Jonathan O’Connor Meeting C++ Bio
- Meeting C++ Conference
- Alices adventures in Template Land - Jonathan O’Connor - Meeting C++ 2018
- Ruby String to_i
- Ruby Integer to_s
- Ruby Slices ..
- Number of Automated Teller Machines (ATMs), Country Wide for Rwanda
- Python Index Slicing
- M-Pesa app
- Common Lisp
- Franz Lisp
- Franz Liszt (composer)
- Dylan
- PicoLisp
- History of Lisps YouTube Video (Structure and Interpretation of Computer Programs - Chapter 1.1)
- Rosetta Code: Quine
- Lightning Talk: How to Write a Quine? - Dmitry Kandalov [ ACCU 2021 ]
- Quine-Relay (Uroboros)
- ACL2 Language

Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

2023-10-15 · Data Engineering Podcast Listen

podcast_episode

by Eric Sammer (Decodable) , Tobias Macey

AI/ML Airbyte Analytics Flink API Kinesis BI CI/CD Cloud Computing Data Engineering Data Management Data Quality +21 more

Summary

Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams. In this episode Eric Sammer discusses why more companies are including real-time capabilities in their products and the ways that Decodable makes it faster and easier.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it’s real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results, all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!

As more people start using AI for projects, two things are clear: It’s a rapidly advancing field, but it’s tough to navigate. How can you get the best results for your use case? Instead of being subjected to a bunch of buzzword bingo, hear directly from pioneers in the developer and data science space on how they use graph tech to build AI-powered apps. Attend the dev and ML talks at NODES 2023, a free online conference on October 26 featuring some of the brightest minds in tech. Check out the agenda and register today at Neo4j.com/NODES.

Your host is Tobias Macey and today I'm interviewing Eric Sammer about starting your stream processing journey with Decodable.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you describe what Decodable is and the story behind it?
  - What are the notable changes to the Decodable platform since we last spoke? (October 2021)
  - What are the industry shifts that have influenced the product direction?
- What are the problems that customers are trying to solve when they come to Decodable?
- When you launched your focus was on SQL transformations of streaming data. What was the process for adding full Java support in addition to SQL?
- What are the developer experience challenges that are particular to working with streaming data?
  - How have you worked to address that in the Decodable platform and interfaces?
- As you evolve the technical and product direction, what is your heuristic for balancing the unification of interfaces and system integration against the ability to swap different components or interfaces as new technologies are introduced?
- What are the most interesting, innovative, or unexpected ways that you have seen Decodable used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Decodable?
- When is Decodable the wrong choice?
- What do you have planned for the future of Decodable?

Contact Info

esammer on GitHub LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

- Decodable (Podcast Episode)
- Understanding the Apache Flink Journey
- Flink (Podcast Episode)
- Debezium (Podcast Episode)
- Kafka
- Redpanda (Podcast Episode)
- Kinesis
- PostgreSQL (Podcast Episode)
- Snowflake (Podcast Episode)
- Databricks
- Startree
- Pinot (Podcast Episode)
- Rockset (Podcast Episode)
- Druid
- InfluxDB
- Samza
- Storm
- Pulsar (Podcast Episode)
- ksqlDB (Podcast Episode)
- dbt
- GitHub Actions
- Airbyte
- Singer
- Splunk
- Outbox Pattern

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Sponsored By: Neo4J: NODES Conference Logo

NODES 2023 is a free online conference focused on graph-driven innovations with content for all skill levels. Its 24 hours are packed with 90 interactive technical sessions from top developers and data scientists across the world covering a broad range of topics and use cases. The event tracks: - Intelligent Applications: APIs, Libraries, and Frameworks – Tools and best practices for creating graph-powered applications and APIs with any software stack and programming language, including Java, Python, and JavaScript - Machine Learning and AI – How graph technology provides context for your data and enhances the accuracy of your AI and ML projects (e.g.: graph neural networks, responsible AI) - Visualization: Tools, Techniques, and Best Practices – Techniques and tools for exploring hidden and unknown patterns in your data and presenting complex relationships (knowledge graphs, ethical data practices, and data representation)

Don’t miss your chance to hear about the latest graph-powered implementations and best practices for free on October 26 at NODES 2023. Go to Neo4j.com/NODES today to see the full agenda and register!

RudderStack:

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

Materialize:

You shouldn't have to throw away the database to build with fast-changing data. Keep the familiar SQL, keep the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date.

That is Materialize, the only true SQL streaming database built from the ground up to meet the needs of modern data products: Fresh, Correct, Scalable — all in a familiar SQL UI. Built on Timely Dataflow and Differential Dataflow, open source frameworks created by cofounder Frank McSherry at Microsoft Research, Materialize is trusted by data and engineering teams at Ramp, Pluralsight, Onward and more to build real-time data products without the cost, complexity, and development time of stream processing.

Go to materialize.com today and get 2 weeks free!

Datafold:

This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare…

Episode 144: 🇸🇮 SRT23 - Nigeria, Here We Come! (and How Bryce Almost Died)

2023-08-25 · ADSP: Algorithms + Data Structures = Programs

podcast_episode

by Conor Hoekstra, Bryce Adelstein Lelbach (NVIDIA)

GitHub

In this episode, Conor and Bryce record live from Slovenia, Croatia and Italy while driving and chat about next year’s 2024 Nigeria Road Trip as well as Bryce’s near-death experience. This episode is very light on the technical content (so feel free to skip).

Link to Episode 144 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)

Twitter
ADSP: The Podcast
Conor Hoekstra
Bryce Adelstein Lelbach

Show Notes
Date Recorded: 2023-06-21
Date Released: 2023-08-25
Piran
Fireship Java YouTube Video (Java is mounting a huge comeback)
Run for the Fun of It Podcast
Haskell Programming Language
Clojure Programming Language

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Unlocking the Power of Databricks SDKs: The Power to Integrate, Streamline, and Automate

2023-07-26 · Databricks DATA + AI Summit 2023

video

by Serge Smertin (Databricks)

Data Lakehouse Databricks Python Terraform

In today's data-driven landscape, the demands placed upon data engineers are diverse and multifaceted. With the integration of Java, Python, or Go microservices, Databricks SDKs provide a powerful bridge between the established ecosystems and Databricks. They allow data engineers to unlock new levels of integration and collaboration, as well as integrate Unity Catalog into processes to create advanced workflows straight from notebooks.

In this session, learn best practices for when and how to use the SDK, the command-line interface, or the Terraform integration to work seamlessly with Databricks and revolutionize how you integrate with the Databricks Lakehouse. The session covers using shell scripts to automate complex tasks and streamline operations that improve scalability.

Talk by: Serge Smertin

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Photon for Dummies: How Does this New Execution Engine Actually Work?

2023-07-25 · Databricks DATA + AI Summit 2023

video

by Holly Smith (Databricks)

Computer Science Databricks Spark Virtual Machine

Did you finish the Photon whitepaper and think, wait, what? I know I did; it’s my job to understand it, explain it, and then use it. If your role involves using Apache Spark™ on Databricks, then you need to know about Photon and where to use it. Join me, chief dummy, nay "supreme" dummy, as I break down this whitepaper into easy-to-understand explanations that don’t require a computer science degree. Together we will unravel mysteries such as:

Why is a Java Virtual Machine the current bottleneck for Spark enhancements?
What does vectorized even mean? And how was it done before?
Why is the relationship status between Spark and Photon "complicated"?

In this session, we’ll start with the basics of Apache Spark, the details we pretend to know, and where those performance cracks are starting to show through. Only then will we start to look at Photon, how it’s different, where the clever design choices are and how you can make the most of this in your own workloads. I’ve spent over 50 hours going over the paper in excruciating detail; every reference, and in some instances, the references of the references so that you don’t have to.

Talk by: Holly Smith

