talk-data.com

Topic: Kafka (Apache Kafka)

Tags: distributed_streaming, message_queue, event_streaming

240 activities tagged

Activity Trend

[Chart: activities per quarter, 2020-Q1 to 2026-Q1, peaking at 20 per quarter]

Activities

240 activities · Newest first

We’ve never had more data, and we’ve never had more immediate access to data thanks to the success of the Kafka protocol. But what use is data if you can’t process it quickly enough to make a difference? Or can’t handle the scale with which it’s being generated? Or can’t make robust ACID-grade decisions if the situation requires it? 

Volt Active Data is a real-time decisioning platform that's been around for a decade and plays a silent role in the lives of over a billion people every day. It's used to run the prepaid mobile phone system for over 700 million end users, control millions of energy meters in Europe, and provide real-time user analytics for over 500 million video game users worldwide.

In this session we'll show how Volt's new Stream Processing component allows you to connect the most powerful information pipeline (Kafka) to the most powerful real-time decision engine (Volt).

Why does this lead to value? The Kafka-verse is very good at telling us about disjoint events in the recent past, but is short of tools to turn all that raw data into a clear understanding of what's happening right now. Volt excels at exactly the kind of tasks you need to solve to get to 'right now': complex aggregations, joining streams of imperfectly related data, enrichment, filtering, routing, and other aspects of 'data plumbing' are easy with Volt. Once you have a clear and robust view of 'right now', you can have Volt make real-time decisions, at scale and with 100% ACID consistency, that let you make the most of your newfound understanding.
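To make that 'data plumbing' concrete, here is a minimal, generic sketch of filtering, enriching, and routing one Kafka topic into another, written with the stock Kafka Streams API rather than Volt's actual interface; the topic names and the enrichment step are illustrative assumptions.

```java
// Generic stream-plumbing sketch: filter raw events, enrich them, route downstream.
// Topic names and the enrichment logic are hypothetical; this is not Volt's API.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class PlumbingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "plumbing-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> raw = builder.stream("raw-events");   // hypothetical input topic
        raw.filter((key, value) -> value != null && !value.isEmpty()) // drop malformed events
           .mapValues(value -> value + "|enriched")                   // stand-in for real enrichment
           .to("clean-events");                                       // route to downstream consumers

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```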

Volt is more than just a stream processing platform. It started out as a performant, scalable, 100% ACID in-memory database before evolving into a real-time decision engine. Now, thanks to our new Stream Processing module, you can have Volt operate at the scale of your business, with the response times needed for success and the reliability your customers and stockholders expect.

Some of the world's leading financial institutions use ClickHouse for real-time financial analytics use cases, including fraud detection, risk modeling, and stock price reporting. To demonstrate how ClickHouse delivers unparalleled performance and ease of use for financial analytics, we'll walk through ingesting live stock ticker data from a Kafka stream into ClickHouse to power a real-time web application. 
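As a taste of the ingestion side, here is a hedged sketch of a producer publishing stock-tick JSON to a Kafka topic; on the ClickHouse side such a topic is typically read with the Kafka table engine plus a materialized view, but the topic name and JSON shape below are illustrative assumptions, not the session's actual setup.

```java
// Hedged sketch: publish stock ticks as JSON to a Kafka topic that a
// ClickHouse Kafka table engine could consume. Topic and payload are assumed.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class TickProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String tick = "{\"symbol\":\"ACME\",\"price\":101.25,\"ts\":\"2022-11-22T10:15:00Z\"}";
            // Key by symbol so all ticks for one instrument land on the same partition,
            // preserving per-symbol ordering for downstream analytics.
            producer.send(new ProducerRecord<>("stock-ticks", "ACME", tick));
            producer.flush();
        }
    }
}
```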

In this talk, I'll introduce Jikkou: an open-source framework that lets developers and DevOps teams easily manage, automate, and provision all the resources their Apache Kafka platform needs, all with a Resource-as-Code approach! NB: if you have to open a Jira ticket or send an email to your Kafka support team to create a topic, this talk is for you!
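For contrast, the sketch below shows the imperative route that a Resource-as-Code tool like Jikkou replaces: creating a topic programmatically with Kafka's AdminClient. The topic name and settings are hypothetical; Jikkou itself describes such resources declaratively instead.

```java
// The manual alternative to declarative provisioning: create one topic
// imperatively with Kafka's AdminClient. Topic name and sizing are assumed.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 6 partitions, replication factor 3: the kind of request that would
            // otherwise travel through a Jira ticket to the Kafka support team.
            NewTopic topic = new NewTopic("orders", 6, (short) 3);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```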

A growing number of applications rely on Kafka for its reliability and the high throughput of its event-stream processing, in use cases ranging from clickstream analysis to application integration and real-time analytics on data in motion. As enterprise adoption of Kafka continues to grow, 90% of implementations require Kafka to work alongside other messaging technologies such as IBM MQ, RabbitMQ, ActiveMQ, and others. How do you get end-to-end visibility and monitoring for transactions that span such diverse technologies? In this talk, we will discuss ways to provide observability and management for complex messaging and streaming infrastructures in order to eliminate downtime, deliver application message-flow topologies, and provide data lineage.
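One common building block behind this kind of end-to-end visibility is propagating a correlation ID in Kafka record headers, so a single transaction can be traced as it crosses Kafka and neighbouring messaging systems. The sketch below is a generic illustration; the header name, topic, and payload are assumptions, not any particular vendor's convention.

```java
// Attach a correlation ID as a Kafka record header so downstream consumers
// (or an MQ bridge) can forward it, giving monitoring tools a stable ID
// across messaging technologies. Header name, topic, and payload are assumed.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.UUID;

public class TracedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("payments", "order-42", "{\"amount\":19.99}");
            // Each hop copies this header forward, so the transaction can be
            // reconstructed end to end even when it leaves Kafka.
            record.headers().add("correlation-id",
                UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
            producer.send(record);
        }
    }
}
```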

Big Data on Kubernetes

Big Data on Kubernetes is your comprehensive guide to leveraging Kubernetes for scalable and efficient big data solutions. You will learn key concepts of Kubernetes architecture and explore tools like Apache Spark, Airflow, and Kafka. Gain hands-on experience building complete data pipelines to tackle real-world data challenges.

What this book will help me do:
- Understand Kubernetes architecture and learn to deploy and manage clusters.
- Build and orchestrate big data pipelines using Spark, Airflow, and Kafka.
- Develop scalable and resilient data solutions with Docker and Kubernetes.
- Integrate and optimize data tools for real-time ingestion and processing.
- Apply concepts to hands-on projects addressing actual big data scenarios.

Author(s): Neylson Crepalde is an experienced data specialist with extensive knowledge of Kubernetes and big data solutions. With deep practical experience, Neylson brings real-world insights to his writing. His approach emphasizes actionable guidance and relatable problem-solving with a strong foundation in scalable architecture.

Who is it for? This book is ideal for data engineers, BI analysts, data team leaders, and tech managers familiar with Python, SQL, and YAML. Targeted at professionals seeking to develop or expand their expertise in scalable big data solutions, it provides practical insights into Docker, Kubernetes, and prominent big data tools.

Hidden from our eyes, aircraft in our skies are constantly transmitting data. Join us as we use some simple tech and the power of open source to fly through this data set. In this talk, see a Raspberry Pi, Apache Kafka, Apache Druid, and Grafana coming together for real-time data production, transport, OLAP, and interactive visualisation.
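As a rough idea of the ingest step in such a pipeline, the sketch below reads decoded ADS-B messages from a receiver feed on the Raspberry Pi (a dump1090-style CSV feed on TCP port 30003 is assumed here) and produces each line to a Kafka topic that Druid could ingest; the host, port, and topic name are assumptions, not the talk's actual setup.

```java
// Read decoded aircraft messages line by line from a receiver feed on the Pi
// and publish each one to Kafka. Host, port, and topic name are assumed.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;
import java.util.Properties;

public class AdsbIngest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (Socket feed = new Socket("raspberrypi.local", 30003);
             BufferedReader in = new BufferedReader(new InputStreamReader(feed.getInputStream()));
             KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String line;
            while ((line = in.readLine()) != null) {
                // One Kafka message per aircraft report; Druid consumes the topic.
                producer.send(new ProducerRecord<>("adsb-messages", line));
            }
        }
    }
}
```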

Kafka Streams in Action, Second Edition

Everything you need to implement stream processing on Apache KafkaⓇ using Kafka Streams and the ksqlDB event streaming database. Kafka Streams in Action, Second Edition guides you through setting up and maintaining your stream processing with Kafka. Inside, you'll find comprehensive coverage of not only Kafka Streams but the entire toolbox you'll need for effective streaming: the components of the Kafka ecosystem, Producer and Consumer clients, Connect, and Schema Registry.

In Kafka Streams in Action, Second Edition you'll learn how to:
- Design streaming applications in Kafka Streams with the KStream and the Processor API
- Integrate external systems with Kafka Connect
- Enforce data compatibility with Schema Registry
- Build applications that respond immediately to events in either Kafka Streams or ksqlDB
- Craft materialized views over streams with ksqlDB

This totally revised new edition of Kafka Streams in Action has been expanded to cover more of the Kafka platform used for building event-based applications. You'll also find full coverage of ksqlDB, an event streaming database that makes it a snap to create applications that respond immediately to events, such as real-time push and pull updates.

About the Technology: Enterprise applications need to handle thousands, even millions, of data events every day. With an intuitive API and flawless reliability, the lightweight Kafka Streams library has earned a spot at the center of these systems. Kafka Streams provides exactly the power and simplicity you need to manage real-time event processing or microservices messaging.

About the Book: Kafka Streams in Action, Second Edition teaches you how to create event streaming applications on the amazing Apache Kafka platform. This thoroughly revised new edition now covers a wider range of streaming architectures and includes data integration with Kafka Connect. As you go, you'll explore real-world examples that introduce components and brokers, schema management, and the other essentials. Along the way, you'll pick up practical techniques for blending Kafka with Spring, low-level control of processors and state stores, storing event data with ksqlDB, and testing streaming applications.

What's Inside:
- Design efficient streaming applications
- Integrate external systems with Kafka Connect
- Enforce data compatibility with Schema Registry

About the Reader: For Java developers. No knowledge of Kafka or streaming applications required.

About the Author: Bill Bejeck is a Confluent engineer and a Kafka Streams contributor with over 15 years of software development experience. Bill is also a committer on the Apache KafkaⓇ project.

Quotes:
"Comprehensive streaming data applications are only a few years away from becoming the reality, and this book is the guide the industry has been waiting for to move beyond the hype." - Adi Polak, Director, Developer Experience Engineering, Confluent
"Covers all the key aspects of building applications with Kafka Streams. Whether you are getting started with stream processing or have already built Kafka Streams applications, it is an essential resource." - Mickael Maison, Principal Software Engineer, Red Hat
"Serves as both a learning and a resource guide, offering a perfect blend of 'how-to' and 'why-to.' Even if you have been using Kafka Streams for many years, I highly recommend this book." - Neil Buesing, CTO & Co-founder, Kinetic Edge
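As a small taste of the "materialized views over streams" idea the book covers, here is a minimal sketch in plain Kafka Streams (rather than ksqlDB) that counts events per key into a continuously updated, queryable state store; the topic and store names are illustrative assumptions.

```java
// Count events per key into a named state store: a continuously maintained,
// queryable materialized view. Topic and store names are illustrative.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

public class PageViewCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pageview-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> views = builder.stream("page-views"); // hypothetical input topic
        views.groupByKey()                               // group view events by page key
             .count(Materialized.as("view-counts"))      // the materialized view itself
             .toStream()                                 // changelog of count updates
             .to("page-view-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```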

Unifying storage for your data analytics workloads doesn‘t have to be hard. See how Google Cloud Storage brings your data closer to compute and meets your applications where they are, all while achieving exabyte scale, strong consistency, and lower costs. You'll get new product announcements and see enterprise customers present real-world solutions using Cloud Storage with BigQuery, Hadoop, Spark, Kafka, and more.

Choosing the Right Abstraction Level for Your Kafka Project by Carlos Manuel Duclos-Vergara

Big Data Europe, onsite and online, 22-25 November 2022. Learn more about the conference: https://bit.ly/3BlUk9q

Join our next Big Data Europe conference on 22-25 November 2022, where you will be able to learn from global experts giving technical talks and hands-on workshops in the fields of Big Data, High Load, Data Science, Machine Learning, and AI. This time, the conference will be held in a hybrid setting, allowing you to attend workshops and listen to expert talks on-site or online.

Real Time Streaming Data from AWS MSK Kafka to Cloudera by Lidor Gerstel

Development of a Kafka-Powered Advanced Stream Commerce Platform by Andrea Spina

Explore the power of real-time streaming with GenAI using Apache NiFi. Learn how NiFi simplifies data engineering workflows, allowing you to focus on creativity over technical complexities. Tim Spann will guide you through practical examples, showcasing NiFi's automation impact from ingestion to delivery. Whether you're a seasoned data engineer or new to GenAI, this talk offers valuable insights into optimizing workflows. Join us to unlock the potential of real-time streaming and witness how NiFi makes data engineering a breeze for GenAI applications!

We will share Criteo's journey of integrating authentication and authorisation into our Kafka infrastructure, including how we incorporated OAuth and JWT authentication systems into Kafka to enhance the security of our data streams. The talk covers the obstacles we faced and the lessons learned while transforming an open Kafka infrastructure into a safeguarded platform.
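For orientation, the sketch below shows what the client side of such a setup can look like: a Kafka consumer configured for SASL/OAUTHBEARER so it authenticates with a JWT fetched from an OAuth token endpoint. This is generic Kafka configuration with placeholder endpoints and credentials, not Criteo's actual setup, and the login callback handler class name varies across Kafka versions.

```java
// Generic SASL/OAUTHBEARER client configuration (Kafka 3.1+, KIP-768 style).
// Endpoint URL, client credentials, and topic are placeholders; the callback
// handler's package has moved between Kafka versions, so check your release.
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OAuthConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka.example.com:9093");
        props.put("group.id", "secured-app");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        // Authenticate over TLS using the OAUTHBEARER SASL mechanism.
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "OAUTHBEARER");
        props.put("sasl.oauthbearer.token.endpoint.url", "https://auth.example.com/oauth/token");
        props.put("sasl.login.callback.handler.class",
            "org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required "
            + "clientId=\"my-client\" clientSecret=\"my-secret\";");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("secured-topic"));
            consumer.poll(Duration.ofSeconds(1)); // triggers the OAuth handshake
        }
    }
}
```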