talk-data.com

Topic

Talend

data_integration etl big_data

Activity Trend: peak of 4 per quarter, 2020-Q1 to 2026-Q2

Top Events

O'Reilly Data Engineering Books: 4
Big Data & AI Paris 2025: 4
Data Engineering Podcast: 2
O'Reilly Data Science Books: 1
Data + AI Summit 2025: 1
O'Reilly Business Intelligence Books: 1

Top Speakers

Tobias Macey: 2
Thomas Bie (Qlik): 1
Kent Graziano (SnowflakeDB): 1
Prabhjot Kaur: 1
Niranjanamurthy M.: 1
Jonathan Bowen: 1
Geetika Dhand: 1
Gleb Mezhanskiy (Datafold): 1
Christopher Bergh (DataKitchen): 1
Cyril Monti (Qlik): 1
Romain BOMMELAER (QLIK): 1
Juan Valladares: 1

Activities

Filtering by: O'Reilly Data Engineering Books

Next-Generation Big Data: A Practical Guide to Apache Kudu, Impala, and Spark

2018-06-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Butch Quinto

Alteryx Analytics BI Big Data Cloud Computing Data Governance DataViz DWH Apache HBase HDFS Kafka MySQL +7 more

Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies. Next-Generation Big Data takes a holistic approach, covering the most important aspects of modern enterprise big data. The book covers not only the main technology stack but also the next-generation tools and applications used for big data warehousing, data warehouse optimization, real-time and batch data ingestion and processing, real-time data visualization, big data governance, data wrangling, big data cloud deployments, and distributed in-memory big data computing. Finally, the book has extensive and detailed coverage of big data case studies from Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard.

What You’ll Learn

Install Apache Kudu, Impala, and Spark to modernize enterprise data warehouse and business intelligence environments, complete with real-world, easy-to-follow examples and practical advice
Integrate HBase, Solr, Oracle, SQL Server, MySQL, Flume, Kafka, HDFS, and Amazon S3 with Apache Kudu, Impala, and Spark
Use StreamSets, Talend, Pentaho, and CDAP for real-time and batch data ingestion and processing
Utilize Trifacta, Alteryx, and Datameer for data wrangling and interactive data processing
Turbocharge Spark with Alluxio, a distributed in-memory storage platform
Deploy big data in the cloud using Cloudera Director
Perform real-time data visualization and time series analysis using Zoomdata, Apache Kudu, Impala, and Spark
Understand enterprise big data topics such as big data governance, metadata management, data lineage, impact analysis, and policy enforcement, and how to use Cloudera Navigator to perform common data governance tasks
Implement big data use cases such as big data warehousing, data warehouse optimization, Internet of Things, real-time data ingestion and analytics, complex event processing, and scalable predictive modeling
Study real-world big data case studies from innovative companies, including Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard

Who This Book Is For

BI and big data warehouse professionals interested in gaining practical and real-world insight into next-generation big data processing and analytics using Apache Kudu, Impala, and Spark; and those who want to learn more about other advanced enterprise topics.

Self-Service Analytics

2016-03-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sandra Swanson

Analytics Data Governance Cyber Security data data-engineering data-lake storage-repositories

Organizations today are swimming in data, but most of them manage to analyze only a fraction of what they collect. To help build a stronger data-driven culture, many organizations are adopting a new approach called self-service analytics. This O’Reilly report examines how this approach provides data access to more people across a company, allowing business users to work with data themselves and create their own customized analyses. The result? More eyes looking at more data in more ways. Along with the perceived benefits, author Sandra Swanson also delves into the potential pitfalls of self-service analytics: balancing greater data access with concerns about security, data governance, and siloed data stores. Read this report and gain insights from enterprise tech (Yahoo), government (the City of Chicago), and disruptive retail (Warby Parker and Talend). Learn how these organizations are handling self-service analytics in practice. Sandra Swanson is a Chicago-based writer who’s covered technology, science, and business for dozens of publications, including ScientificAmerican.com. Connect with her on Twitter (@saswanson) or at www.saswanson.com.

Talend Open Studio Cookbook

2013-10-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rick Barton

data data-engineering integration-solutions

Talend Open Studio Cookbook is a comprehensive guide for both beginners and intermediate users of Talend Open Studio, the leading open-source data integration software. Through practical recipes, this book covers all aspects of Talend development, from schemas and data mapping to advanced debugging and deployment techniques.

What this Book will help me do

Master the use of schemas for forming solid data structures.
Effectively utilize tMap for data transformation and integration.
Develop skills to manage and manipulate various file formats.
Understand how to test and debug Talend jobs to ensure robust solutions.
Learn to deploy, schedule, and manage Talend integrations in production environments.

Author(s)

Rick Barton is an experienced developer and a passionate advocate for open-source data tools. With years of hands-on experience in data integration and Talend development, they bring a practical and results-driven perspective to their writing, aiming to empower developers with actionable insights and real-world expertise.

Who is it for?

Ideal readers for this book are beginner and intermediate developers seeking to enhance their understanding of Talend Open Studio. Whether you've used the software for basic tasks or are completely new to it, this cookbook format is structured to guide you through practical challenges and deeper concepts. If your goal is to build confidence and efficiency in data integration tasks, this book is designed for you.

Getting Started with Talend Open Studio for Data Integration

2012-11-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jonathan Bowen

Data Management SQL data data-engineering integration-solutions

Discover how to leverage Talend Open Studio for Data Integration to manage and optimize your data workflow. This book provides a hands-on introduction to creating integration jobs and automating data processes using Talend's drag-and-drop interface. Explore practical examples, and realize how powerful and approachable data integration can be.

What this Book will help me do

Develop and deploy scalable data integration pipelines using Talend Open Studio.
Master common data operations like filtering, sorting, transforming, and aggregating.
Gain expertise in connecting various data sources, both relational and non-relational.
Implement complex flow logic, including conditional processing and dependencies.
Learn to package and manage production-ready integration jobs for real-world scenarios.

Author(s)

Jonathan Bowen is an experienced technologist and author specializing in data integration and software tools. With years of hands-on experience, Jonathan has guided many organizations in adopting efficient data workflows. He conveys technical concepts with clarity and provides practical, actionable content to help readers succeed.

Who is it for?

This book is perfect for developers, business analysts, and IT professionals tasked with integration projects. Whether you're a novice to data integration or looking to deepen your hands-on experience with Talend, this guide will support your journey. Some prior familiarity with SQL and a data management background are advantageous. Choose this book if you aim to become a proficient data integrator.