talk-data.com

Topic

Docker

containerization devops virtualization

Activities

109

tagged

Activity Trend

14 peak/qtr

2020-Q1 2026-Q2

Top Events

O'Reilly Data Engineering Books 25
Data Engineering Podcast 15
Google Cloud Next '25 5
O'Reilly Data Science Books 4
DataTalks.Club 4
Live Workshop: Automate Image Builds using Pulumi and Docker Build Cloud 3
Microsoft Ignite 2025 3
Airflow Summit 2023 3
Airflow Summit 2020 3
The Pragmatic Engineer 2
Build & Learn: Data Science with Coffee 2
Docker Fundamentals & Optimisations Workshop (FLINTA only event) 2

Top Speakers

Tobias Macey 15
Lindsey 6
Diana Esteves (Pulumi) 5
Michael Irwin (Docker) 5
Michael YenChi Ho (Microsoft) 3
Brian Redmond (Microsoft) 3
Ramcharan Kakarla 2
Jarek Potiuk (Apache Software Foundation) 2
Eben Hewitt 2
Kaxil Naik 2
Marinka (Spiced Academy) 2
Gergely Orosz 2

Activities

109 activities · Newest first


Cassandra: The Definitive Guide, 2nd Edition

2016-07-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Eben Hewitt , Jeff Carpenter

Cassandra Cloud Computing Data Modelling ELK Hadoop Java JavaScript Python Spark data data-engineering nosql-databases

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you'll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition, updated for Cassandra 3.0, provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra's non-relational design, with special attention to data modeling. If you're a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra's speed and flexibility.

- Understand Cassandra's distributed and decentralized structure
- Use the Cassandra Query Language (CQL) and cqlsh, the CQL shell
- Create a working data model and compare it with an equivalent relational model
- Develop sample applications using client drivers for languages including Java, Python, and Node.js
- Explore cluster topology and learn how nodes exchange data
- Maintain a high level of performance in your cluster
- Deploy Cassandra on site, in the Cloud, or with Docker
- Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene

Learning RabbitMQ

2015-12-28 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Martin Toshev

data data-engineering rabbitmq streaming-messaging

Learning RabbitMQ offers developers and system administrators a clear and practical guide to mastering RabbitMQ, the popular message-broker solution. By going through concrete scenarios and examples, this book equips you with the skills to configure, manage, and optimize RabbitMQ instances effectively.

What this book will help you do

- Understand and apply common messaging patterns using RabbitMQ.
- Set up and manage RabbitMQ clusters with high availability.
- Integrate RabbitMQ with popular tools like Spring, MuleESB, and Docker.
- Optimize RabbitMQ performance and ensure secure messaging.
- Troubleshoot and extend RabbitMQ for different use cases.

Author(s)

Martin Toshev, the author of Learning RabbitMQ, is an expert in middleware and distributed systems, with extensive hands-on experience working with message brokers. Toshev's practical approach and clear explanations make complex topics easy to understand, helping readers effectively apply best practices when using RabbitMQ.

Who is it for?

This book is for developers and system administrators who aim to incorporate RabbitMQ as part of their applications. Beginners with a basic understanding of messaging will find it foundational, while experienced users will appreciate the advanced insights on integration, performance tuning, and troubleshooting.

Field Guide to Hadoop

2015-03-02 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marshall Presser , Kevin Sitto

Avro Big Data Cassandra Chef Cloud Computing Data Management Hadoop Apache HBase HDFS Hive JSON MongoDB +5 more

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You'll quickly understand how Hadoop's projects, subprojects, and related technologies work together. Each chapter introduces a different topic (such as core technologies or data transfer) and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you'll have a good grasp of the playing field.

Topics include:

- Core technologies: Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
- Database and data management: Cassandra, HBase, MongoDB, and Hive
- Serialization: Avro, JSON, and Parquet
- Management and monitoring: Puppet, Chef, ZooKeeper, and Oozie
- Analytic helpers: Pig, Mahout, and MLlib
- Data transfer: Sqoop, Flume, distcp, and Storm
- Security, access control, auditing: Sentry, Kerberos, and Knox
- Cloud computing and virtualization: Serengeti, Docker, and Whirr

Build Bigger With Small Ai: Running Small Models Locally

· Small Data SF 2024 Watch

video

AI/ML Cloud Computing Data Engineering DuckDB Linux LLM RAG SQL

It's finally possible to bring the awesome power of Large Language Models (LLMs) to your laptop. This talk will explore how to run and leverage small, openly available LLMs to power common tasks involving data, including selecting the right models, practical use cases for running small models, and best practices for deploying small models effectively alongside databases.

Bio: Jeffrey Morgan is the founder of Ollama, an open-source tool for getting up and running with large language models. Prior to founding Ollama, Jeffrey founded Kitematic, which was acquired by Docker and evolved into Docker Desktop. He has previously worked at companies including Docker, Twitter, and Google.


Discover how to run large language models (LLMs) locally using Ollama, the easiest way to get started with small AI models on your Mac, Windows, or Linux machine. Unlike massive cloud-based systems, small open source models are only a few gigabytes, allowing them to run incredibly fast on consumer hardware without network latency. This video explains why these local LLMs are not just scaled-down versions of larger models but powerful tools for developers, offering significant advantages in speed, data privacy, and cost-effectiveness by eliminating hidden cloud provider fees and risks.

Learn the most common use case for small models: combining them with your existing factual data to prevent hallucinations. We dive into retrieval augmented generation (RAG), a powerful technique where you augment a model's prompt with information from a local data source. See a practical demo of how to build a vector store from simple text files and connect it to a model like Gemma 2B, enabling you to query your own data using natural language for fast, accurate, and context-aware responses.
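The retrieval step described above can be illustrated with a toy sketch. It is not the talk's demo: it does not call Ollama or Gemma, and it swaps real embeddings for a bag-of-words vector so that it runs with the Python standard library alone. The function names (`embed`, `build_store`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(docs):
    # The "vector store" is just a list of (text, vector) pairs.
    return [(doc, embed(doc)) for doc in docs]


def retrieve(store, question, k=1):
    # Return the k documents most similar to the question.
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(store, question, k=1):
    # Augment the prompt with retrieved context before it goes to a model.
    context = "\n".join(retrieve(store, question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical sample data; in practice these would be read from text files.
docs = [
    "The office wifi password is rotated every Monday.",
    "Expense reports are due on the fifth day of each month.",
    "The cafeteria serves lunch from noon until two.",
]
store = build_store(docs)
prompt = build_prompt(store, "When are expense reports due?")
print(prompt)
```

The resulting prompt string is what would be passed to a local model. A real setup would likely replace `embed` with an embedding model and the list with a proper vector index.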

Explore the next frontier of local AI with small agents and tool calling, a new feature that empowers models to interact with external tools. This guide demonstrates how an LLM can autonomously decide to query a DuckDB database, write the correct SQL, and use the retrieved data to answer your questions. This advanced tutorial shows you how to connect small models directly to your data engineering workflows, moving beyond simple chat to create intelligent, data-driven applications.
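As a rough sketch of the tool-calling loop described above: the model emits a structured request naming a tool and its arguments, the application runs the tool, and the result is used to form the answer. Everything here is an assumption made for illustration. `fake_model` is a hard-coded stand-in for an LLM, the `run_sql` tool name and `orders` table are invented, and the standard library's `sqlite3` is used in place of DuckDB so the example runs without extra packages.

```python
import json
import sqlite3


def fake_model(question):
    # Hypothetical stand-in for an LLM: it always asks for the SQL tool.
    # A real model would choose the tool and write the query itself.
    return json.dumps({
        "tool": "run_sql",
        "arguments": {
            "query": "SELECT SUM(amount) FROM orders WHERE region = 'west'"
        },
    })


def run_sql(conn, query):
    # The tool: execute a query and return all rows.
    return conn.execute(query).fetchall()


# Registry mapping tool names to callables (only one tool in this sketch).
TOOLS = {"run_sql": run_sql}


def answer(conn, question):
    # 1. Ask the model; it replies with a JSON tool call.
    call = json.loads(fake_model(question))
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise ValueError(f"unknown tool: {call['tool']}")
    # 2. Run the requested tool with the model-supplied arguments.
    rows = tool(conn, **call["arguments"])
    # 3. A real loop would send the rows back to the model to phrase
    #    the final answer; here the answer is formatted directly.
    return f"Total for west: {rows[0][0]}"


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("west", 10), ("east", 7), ("west", 5)],
)
print(answer(conn, "What were total sales in the west region?"))
```

Executing model-written SQL against real data carries obvious risks, so a read-only connection or an allow-list of queries is a sensible precaution in anything beyond a demo.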

Get started with practical applications for small models today, from building internal help desks to streamlining engineering tasks like code review. This video highlights how small and large models can work together effectively and shows that open source models are rapidly catching up to their cloud-scale counterparts. It's never been a better time for developers and data analysts to harness the power of local AI.

Build NGINX Dockerfile in Pulumi and DBC

· Live Workshop: Automate Image Builds using Pulumi and Docker Build Cloud

talk

by Michael Irwin (Docker) , Diana Esteves (Pulumi)

Pulumi TypeScript docker build cloud

Demonstrate building an NGINX Dockerfile using Pulumi and Docker Build Cloud to leverage external caching.

Create and configure a Docker Build Cloud (DBC) builder

· Live Workshop: Automate Image Builds using Pulumi and Docker Build Cloud

talk

by Michael Irwin (Docker) , Diana Esteves (Pulumi)

Pulumi TypeScript docker build cloud

Learn how to create and configure a Docker Build Cloud (DBC) builder.

Create a Pulumi program in TypeScript to define IaC

· Live Workshop: Automate Image Builds using Pulumi and Docker Build Cloud

talk

by Michael Irwin (Docker) , Diana Esteves (Pulumi)

Pulumi TypeScript iac

Use TypeScript to define infrastructure as code (IaC) for Docker builds via Pulumi.

Kubernetes and Pulumi: Modern cloud-native deployment and management

· Workshop: Pulumi and Kubernetes - Better Together

talk

by Josh Kodroff (Pulumi)

Kubernetes Pulumi gitops helm

Explore how to use Pulumi with Kubernetes to deploy and manage containerized workloads, integrate Pulumi with existing Kubernetes resources (manifests or Helm charts), and run Pulumi IaC programs in a GitOps fashion.

Orchestrating the Cloud with Kubernetes

· Google Cloud Next '25

session

Cloud Computing GCP Kubernetes

In this hands-on lab, you'll explore the power of Kubernetes and learn how to orchestrate cloud applications with ease. Using Google Kubernetes Engine, you’ll provision a fully managed Kubernetes cluster and deploy Docker containers using kubectl. Break down a monolithic application into microservices using Kubernetes Deployments and Services, and gain insights into the latest innovations in resource efficiency, developer productivity, and automated operations. By the end, you'll be ready to streamline application management in any environment.

If you register for a Learning Center lab, please ensure that you sign up for a Google Cloud Skills Boost account for both your work domain and personal email address. You will need to authenticate your account as well (be sure to check your spam folder!). This will ensure you can arrive and access your labs quickly onsite.
