Search – talk-data.com


AI agents with Apache Pinot 2024-09-12 · 17:00

Join us for a deep dive into building agentic applications using Apache Pinot, a powerful real-time OLAP database. This meetup will delve into how AI-driven agents are reshaping real-time analytics, highlighting their potential to revolutionize data retrieval, analysis, and decision-making processes. Attendees will explore practical examples of agentic AI in action, such as retrieving top purchasers, identifying active users, and accessing detailed user profiles in real time. The discussion will also cover the importance of hybrid search capabilities that blend keyword-based and vector search methods for more accurate results. By leveraging the power of AI and LLMs, organizations can future-proof their analytics systems, ensuring they remain adaptable and efficient in the face of ever-changing data landscapes. This session is ideal for professionals seeking to stay ahead of the curve in real-time analytics and AI innovation.
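One way to picture the hybrid search mentioned above is as a weighted blend of a keyword score and a vector-similarity score. The following is only a minimal sketch under stated assumptions: the candidate rows, the precomputed vector similarities, and the `alpha` weight are all hypothetical, and nothing here uses Apache Pinot's actual API.

```python
def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def hybrid_rank(query, candidates, alpha=0.5):
    """Rank candidates by alpha * keyword score + (1 - alpha) * vector score.

    `candidates` is a list of (id, text, vector_similarity) tuples; the
    similarity values are assumed to come from an embedding lookup elsewhere.
    """
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * vec_sim, cid)
        for cid, text, vec_sim in candidates
    ]
    return [cid for _, cid in sorted(scored, reverse=True)]


# Hypothetical rows: (id, text, precomputed vector similarity to the query)
CANDIDATES = [
    ("row-1", "top purchasers this week", 0.20),
    ("row-2", "weekly best buyers", 0.90),
    ("row-3", "inactive accounts", 0.10),
]
```

With `alpha=0.5` the exact keyword match (`row-1`) ranks first; with `alpha=0.0` the ranking falls back to vector similarity alone and `row-2` wins, which is the trade-off a blended score is meant to balance.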

Hubert Dulay, Developer Advocate, StarTree | @hkdulay

For more content and chats, join our community on Slack.


Streaming Databases 2024-08-15

Ralph Matthias Debusmann – author, Hubert Dulay – author

Real-time applications are becoming the norm today. But building a model that works properly requires real-time data from the source, in-flight stream processing, and low-latency serving of its analytics. With this practical book, data engineers, data architects, and data analysts will learn how to use streaming databases to build real-time solutions. Authors Hubert Dulay and Ralph M. Debusmann take you through streaming database fundamentals, including how these databases reduce infrastructure for real-time solutions. You'll learn the difference between streaming databases, stream processing, and real-time online analytical processing (OLAP) databases. And you'll discover when to use push queries versus pull queries, and how to serve synchronous and asynchronous data emanating from streaming databases.

This guide helps you:

- Explore stream processing and streaming databases
- Learn how to build a real-time solution with a streaming database
- Understand how to construct materialized views from any number of streams
- Learn how to serve synchronous and asynchronous data
- Get started building low-complexity streaming solutions with minimal setup

data data-engineering streaming-messaging streaming-architecture Analytics Data Streaming

O'Reilly Data Engineering Books

Using MinIO as a Deep Store meetup 2024-07-30 · 18:00

Note: the event will be streamed live on LinkedIn! Please complete your registration here.

In this meetup, we'll dive into Apache Pinot's Deep Store using MinIO and its critical role in disaster recovery. We'll explore how Deep Store offloads older segments to a cost-effective object store, enhancing system efficiency and ensuring quick data retrieval in case of failures. We'll demonstrate the setup and configuration using Docker, providing practical insights into managing and retrieving data with Apache Pinot and MinIO.

Hubert Dulay, Developer Advocate, StarTree | @hkdulay


Real-Time Gen AI x Pinot 2024-05-22 · 16:00

We're excited to invite you to join us for a Gen AI x Pinot session! 9 AM PDT | 12 PM EDT. Join the event online here.

RAG stands for Retrieval-Augmented Generation. It is a technique that makes the responses returned by a large language model (LLM) more accurate. LLMs alone can answer many of the questions put to them. But to respond with accurate answers and cite sources, the LLM needs help to do some research. In this post, we'll show how to do this in real time using Apache Pinot.
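The retrieve-then-generate flow described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the in-memory `DOCS` dictionary stands in for whatever store holds the documents, the bag-of-words `embed` function stands in for a real embedding model, and the prompt wording is hypothetical. No LLM or Apache Pinot API is called.

```python
import math
from collections import Counter

# Hypothetical document store; in practice these rows would live in a database.
DOCS = {
    "doc-1": "Apache Pinot is a real-time OLAP database",
    "doc-2": "Gatling is a load-testing tool",
    "doc-3": "MinIO is an object store",
}


def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, k=1):
    """Return the ids of the k documents most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(DOCS, key=lambda d: cosine(q_vec, embed(DOCS[d])), reverse=True)
    return ranked[:k]


def build_prompt(question, k=1):
    """Augment the question with retrieved sources so the model can cite them."""
    context = "\n".join(f"[{doc_id}] {DOCS[doc_id]}" for doc_id in retrieve(question, k))
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"{context}\n"
        f"Question: {question}"
    )
```

The string returned by `build_prompt` is what would be sent to the LLM; the bracketed ids are what let the model cite its sources.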

Hubert Dulay is the author of the O'Reilly book "Streaming Data Mesh" and a Developer Advocate at StarTree. He is currently writing his second book, "Streaming Databases." Hubert is a veteran engineer with over 20 years of experience in big and fast data and ML.

Join our community on Slack.


Optimizing Real-Time Analytics: Unraveling JOINs in Flink and Pinot 2024-01-30 · 23:00

In this meetup, we will talk about when to perform JOINs. Should you preprocess joins in a stream processor or do them in Pinot? We will go over how this relates to "one big table" (OBT) and UPSERTs. By the end, you will better understand how to model your real-time analytics.

Hubert Dulay, Developer Advocate - StarTree

Hubert Dulay is the O'Reilly author of "Streaming Data Mesh" and "Streaming Databases" (early access). He is a veteran engineer with over 20 years of experience in big and fast data. Hubert draws on his experience consulting for many financial institutions, healthcare organizations, and telecommunications companies, where he provided simple solutions to many data problems.


Analyze Wikipedia Changes with Pinot and Jupyter Notebook 2023-12-06 · 23:00

In this meetup, we will capture Wikipedia page change events and send them to Apache Pinot. We will leverage two important features in Apache Pinot: UPSERT and Ingestion Transformation. At the end, we'll visualize the data in a Jupyter Notebook.

Hubert Dulay, Developer Advocate - StarTree

Hubert Dulay is the O'Reilly author of "Streaming Data Mesh" and "Streaming Databases" (early access). He is a veteran engineer with over 20 years of experience in big and fast data. Hubert draws on his experience consulting for many financial institutions, healthcare organizations, and telecommunications companies, where he provided simple solutions to many data problems.


How to Performance Test Apache Pinot Using Gatling 2023-10-11 · 22:00

Apache Pinot, an open-source distributed data store, has emerged as a powerful tool for handling high-velocity, low-latency data at scale. To ensure that Pinot meets the demands of real-time analytics, it's essential to conduct comprehensive performance testing.

This presentation aims to guide you through the process of performance testing Apache Pinot using Gatling, a widely used load-testing tool. We'll explore the key concepts, methodologies, and best practices for assessing the performance and scalability of your Pinot cluster.

Hubert Dulay, Developer Advocate @ StarTree, is the author of "Streaming Data Mesh." He is a veteran engineer with over 20 years of experience in big and fast data and MLOps. Hubert draws on his experience consulting for many financial institutions, healthcare organizations, and telecommunications companies, where he provided simple solutions to many data problems.

For more discussions and content updates, join the community on Slack.
Join the event online here.


Streaming Data Mesh 2023-05-11

Stephen Mooney – author, Hubert Dulay – author

Data lakes and warehouses have become increasingly fragile, costly, and difficult to maintain as data gets bigger and moves faster. Data meshes can help your organization decentralize data, giving ownership back to the engineers who produced it. This book provides a concise yet comprehensive overview of data mesh patterns for streaming and real-time data services. Authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you'll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.

With this book, you will:

- Design a streaming data mesh using Kafka
- Learn how to identify a domain
- Build your first data product using self-service tools
- Apply data governance to the data products you create
- Learn the differences between synchronous and asynchronous data services
- Implement self-services that support decentralized data

data data-engineering streaming-messaging streaming-architecture Data Governance DevOps Kafka MLOps Data Streaming

O'Reilly Data Engineering Books
