talk-data.com talk-data.com

Topic

Flink

Apache Flink

stream_processing batch_processing big_data

28

tagged

Activity Trend

7 peak/qtr
2020-Q1 2026-Q1

Activities

28 activities · Newest first

AWS re:Invent 2025 - Autonomous agents powered by streaming data and Retrieval Augmented Generation

Unlock the potential of intelligent autonomous agents that combine real-time streaming data with Retrieval Augmented Generation (RAG) for dynamic decision-making. You will learn how to use streaming technologies like Amazon Kinesis, Amazon MSK, and Managed Service for Apache Flink create a robust pipeline to transform raw events into actionable insights. This session will show you how autonomous agents leverage these real-time insights with RAG architecture powered by OpenSearch, enabling immediate, context-aware responses to changing conditions. This practical architecture drives real-world value in critical scenarios like predictive maintenance, automated incident response, and intelligent customer service automation, with improved accuracy and reduced latency.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Powering your Agentic AI experience with AWS Streaming and Messaging (ANT310)

Organizations are accelerating innovation with generative AI and agentic AI use cases. This session explores how AWS streaming and messaging services such as Amazon Managed Streaming for Apache Kafka, Kinesis Data Streams, Amazon Managed Service for Apache Flink, and Amazon SQS build intelligent, responsive applications. Discover how streaming supports real-time data ingestion and processing, while messaging ensures reliable coordination between AI agents, orchestrates workflows, and delivers critical information at scale. Learn architectural patterns that highlight how a unified approach acts on data as fast as needed, providing the reliability and scale to grow for your next generation of AI.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Operating Apache Kafka and Apache Flink at scale (ANT307)

Enterprises use Apache Kafka and Apache Flink for an increasing number of mission-critical use-cases, real-time analytics, application messaging, and machine learning. As this usage grows in size and scale, so does the criticality, scale, and cost of managing the Kafka and Flink clusters. Learn how customers can achieve the same or higher availability and durability of their growing clusters, both at lower unit costs and with operational simplicity with Amazon MSK (Managed Streaming for Apache Kafka), and Amazon MSF (Managed Streaming for Apache Flink).

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

Sponsored by: Confluent | Turn SAP Data into AI-Powered Insights with Databricks

Learn how Confluent simplifies real-time streaming of your SAP data into AI-ready Delta tables on Databricks. In this session, you'll see how Confluent’s fully managed data streaming platform—with unified Apache Kafka® and Apache Flink®—connects data from SAP S/4HANA, ECC, and 120+ other sources to enable easy development of trusted, real-time data products that fuel highly contextualized AI and analytics. With Tableflow, you can represent Kafka topics as Delta tables in just a few clicks—eliminating brittle batch jobs and custom pipelines. You’ll see a product demo showcasing how Confluent unites your SAP and Databricks environments to unlock ERP-fueled AI, all while reducing the total cost of ownership (TCO) for data streaming by up to 60%.

Extending the Lakehouse: Power Interoperable Compute With Unity Catalog Open APIs

The lakehouse is built for storage flexibility, but what about compute? In this session, we’ll explore how Unity Catalog enables you to connect and govern multiple compute engines across your data ecosystem. With open APIs and support for the Iceberg REST Catalog, UC lets you extend access to engines like Trino, DuckDB, and Flink while maintaining centralized security, lineage, and interoperability. We will show how you can get started today working with engines like Apache Spark and Starburst to read and write to UC managed tables with some exciting demos. Learn how to bring flexibility to your compute layer—without compromising control.

Scaling Identity Graph Ingestion to 1M Events/Sec with Spark Streaming & Delta Lake

Adobe’s Real-Time Customer Data Platform relies on the identity graph to connect over 70 billion identities and deliver personalized experiences. This session will showcase how the platform leverages Databricks, Spark Streaming and Delta Lake, along with 25+ Databricks deployments across multiple regions and clouds — Azure & AWS — to process terabytes of data daily and handle over a million records per second. The talk will highlight the platform’s ability to scale, demonstrating a 10x increase in ingestion pipeline capacity to accommodate peak traffic during events like the Super Bowl. Attendees will learn about the technical strategies employed, including migrating from Flink to Spark Streaming, optimizing data deduplication, and implementing robust monitoring and anomaly detection. Discover how these optimizations enable Adobe to deliver real-time identity resolution at scale while ensuring compliance and privacy.

Apache Iceberg with Unity Catalog at HelloFresh

Table formats like Delta Lake and Iceberg have been game changers for pushing lakehouse architecture into modern Enterprises. The acquisition of Tabular added Iceberg to the Databricks ecosystem, an open format that was already well supported by processing engines across the industry. At HelloFresh we are building a lakehouse architecture that integrates many touchpoints and technologies all across the organization. As such we chose Iceberg as the table format to bridge the gaps in our decentralized managed tech landscape. We are leveraging Unity Catalog as the Iceberg REST catalog of choice for storing metadata and managing tables. In this talk we will outline our architectural setup between Databricks, Spark, Flink and Snowflake and will explain the native Unity Iceberg REST catalog, as well as catalog federation towards connected engines. We will highlight the impact on our business and discuss the advantages and lessons learned from our early adopter experience.

How an Open, Scalable and Secure Data Platform is Powering Quick Commerce Swiggy's AI

Swiggy, India's leading quick commerce platform, serves ~13 million users across 653 cities, with 196,000 restaurant partners and 17,000 SKUs. To handle this scale, Swiggy developed a secure, scalable AI platform processing millions of predictions per second. The tech stack includes Apache Kafka for real-time streaming, Apache Spark on Databricks for analytics and ML, and Apache Flink for stream processing. The Lakehouse architecture on Delta ensures data reliability, while Unity Catalog enables centralized access control and auditing. These technologies power critical AI applications like demand forecasting, route optimization, personalized recommendations, predictive delivery SLAs, and generative AI use cases.Key Takeaway:This session explores building a data platform at scale, focusing on cost efficiency, simplicity, and speed, empowering Swiggy to seamlessly support millions of users and AI use cases.

Alexey Novakov: Streamhouse Architecture with Flink and Paimon

🌟 Session Overview 🌟

Session Name: Speaker: Alexey Novakov Session Description: Today, many data teams choose lakehouse architecture for their data platforms. But what if they process all data in streaming mode? Then they end up building a streaming lakehouse, or 'streamhouse' for short! This means they use stream processing engines to ingest, transform, and analyze business data in near real-time. However, they still want to use inexpensive storage infrastructure. How can they achieve that?

This talk introduces data teams to tools like Apache Paimon in combination with Flink. Paimon has been built with a strong focus on streaming workflows, serving as a table format in a lakehouse. It takes the stream processing approach in lakehouse architecture to the next level compared to other table formats that are more oriented towards batch data. After this talk, data teams will know how to use Paimon and Flink to build a cost-efficient and fast data layer for different data processing scenarios.

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of QA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀 🔗 Find Other Years' Videos: 2023 Big Data Conference Europe https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g 2022 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT 2021 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/ 👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/ 🐦 Twitter: @BigDataConfEU, @europe_rpa 🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/ 🎥 YouTube: http://www.youtube.com/@DATAMINERLT

Gunnar Morling: Data Contracts In Practice With Debezium and Apache Flink

🌟 Session Overview 🌟

Session Name: Data Contracts In Practice With Debezium and Apache Flink Speaker: Gunnar Morling Session Description: Log-based change data capture (CDC) is an invaluable part of the data engineering toolbox: it enables a variety of use cases such as real-time analytics, full-text search, or cache invalidation by publishing data change events from your database. But when publishing change event streams across context or team boundaries, aren’t you tying external consumers to your application’s data model, thus limiting yourself in evolving the same?

Enter data contracts—consciously designed abstractions between your internal data model and the outside world. Come and join us for this session to learn about:

Challenges you may encounter when exposing table-level change event streams and how data contracts can mitigate them Implementation strategies for data contracts, such as the outbox pattern and stream processing Evolving your data model and the corresponding data contracts without breaking any existing consumers We’ll also touch on some advanced topics at the intersection of CDC and stream processing, such as hydrating partial change events, using the popular change stream processing duo of Debezium and Apache Flink.

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of QA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀 🔗 Find Other Years' Videos: 2023 Big Data Conference Europe https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g 2022 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT 2021 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/ 👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/ 🐦 Twitter: @BigDataConfEU, @europe_rpa 🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/ 🎥 YouTube: http://www.youtube.com/@DATAMINERLT

Olena Kutsenko: Sentiment Analysis in Action: Building Your Real-time Pipeline

🌟 Session Overview 🌟

Session Name: Sentiment Analysis in Action: Building Your Real-time Pipeline Speaker: Olena Kutsenko Session Description: Monitoring and interpreting the sentiment of data records is important for a variety of use cases. However, traditional human-based methods fall short in handling huge volumes of information with the required speed and efficiency. AI, however, can address this challenge.

AI is only part of the solution. We need to build a data pipeline that ingests data from various channels, processes it using AI-driven sentiment analysis models to classify the sentiment of each individual record, and prepares it to be consumed by applications for aggregation and analysis.

In this session, we'll build a system using open-source technologies Apache Kafka and Apache Flink with AI models to obtain real-time sentiment from social media data. Apache Kafka's scalability ensures that no record is left behind, making it a reliable foundation for sentiment analysis. Apache Flink, with its adaptability to fluctuations in data volume and velocity, will enable the analysis of a continuous data stream using an AI model.

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of QA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀 🔗 Find Other Years' Videos: 2023 Big Data Conference Europe https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g 2022 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT 2021 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/ 👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/ 🐦 Twitter: @BigDataConfEU, @europe_rpa 🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/ 🎥 YouTube: http://www.youtube.com/@DATAMINERLT

AWS re:Invent 2024 - Operate and scale managed Apache Kafka and Apache Flink clusters (ANT342)

Enterprises use Apache Kafka and Apache Flink for an increasing number of mission-critical use cases, real-time analytics, application messaging, and machine learning. As this usage grows in size and scale, so does the criticality, scale, and cost of managing the Kafka and Flink clusters. Learn how AWS customers can achieve the same or higher availability and durability of their growing clusters, both at lower unit costs and with operational simplicity, with Amazon MSK and Amazon Managed Service for Apache Flink.

Learn more: AWS re:Invent: https://go.aws/reinvent. More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

About AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2024

Volker Janz: Real-time Customer Engagement in Gaming Using Kafka and Flink

Join Volker Janz to unlock the magic of Real-Time Customer Engagement in gaming with Kafka and Flink, processing 1.5 billion daily events for immersive player experiences. 🎮✨ #Gaming #RealTimeEngagement

✨ H I G H L I G H T S ✨

🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍

Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️

Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear

Timothy Spann: Building Real-time Travel Alerts

Join Timothy Spann as he takes you on a journey of 'Building Real-time Travel Alerts' 🌍🚀. Learn how to construct a dynamic streaming application using Apache NiFi, Apache Kafka, and Apache Flink, ensuring optimal performance, productivity, and development simplicity in delivering timely travel advisories. 🌐🛫 #RealTimeAlerts #Streaming #ApacheStack

✨ H I G H L I G H T S ✨

🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍

Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️

Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear

Real-Time analytics with open-source connectors in MS Fabric | OD46

This session focuses on the use of open-source connectors to enable real-time analytics in Microsoft Fabric and will cover the use of connectors such as Apache Kafka, Apache Flink, Apache Spark, Open Telemetry, Logstash etc. to ingest and process data in real-time. Attendees will learn how to analyze data ingested via open-source connectors to generate insights.

𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Akshay Dixit

𝗦𝗲𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

OD46 | English (US) | Data

MSIgnite

Cross-Platform Data Lineage with OpenLineage

There are more data tools available than ever before, and it is easier to build a pipeline than it has ever been. These tools and advancements have created an explosion of innovation, resulting in data within today's organizations becoming increasingly distributed and can't be contained within a single brain, a single team, or a single platform. Data lineage can help by tracing the relationships between datasets and providing a map of your entire data universe.

OpenLineage provides a standard for lineage collection that spans multiple platforms, including Apache Airflow, Apache Spark™, Flink®, and dbt. This empowers teams to diagnose and address widespread data quality and efficiency issues in real time. In this session, we will show how to trace data lineage across Apache Spark and Apache Airflow. There will be a walk-through of the OpenLineage architecture and a live demo of a running pipeline with real-time data lineage.

Talk by: Julien Le Dem,Willy Lulciuc

Here’s more to explore: Data, Analytics, and AI Governance: https://dbricks.co/44gu3YU

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Delta Kernel: Simplifying Building Connectors for Delta

Since the release of Delta 2.0, the project has been growing at a breakneck speed. In this session, we will cover all the latest capabilities that makes Delta Lake the best format for the lakehouse. Based on lessons learned from this past year, we will introduce Project Aqueduct and how we will simplify building Delta Lake APIs from Rust and Go to Trino, Flink, and PySpark.

Talk by: Tathagata Das and Denny Lee

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc