talk-data.com

Speaker

Andy Oram

6 talks · author

Talks & appearances

6 activities · Newest first

The Evolving Role of the Data Engineer

Companies working to become data driven often view data scientists as heroes, but that overlooks the vital role that data engineers play in the process. While data scientists focus on finding new insights from datasets, data engineers deal with preparation: obtaining, cleaning, and creating enhanced versions of the data an organization needs. In this report, Andy Oram examines how the role of data engineer has quickly evolved. DBAs, software engineers, developers, and students will explore the responsibilities of modern data engineers and the skills and tools necessary to do the job. You’ll learn how to deal with software engineering concepts such as rapid and continuous development, automation and orchestration, modularity, and traceability. Decision makers considering a move to the cloud will also benefit from the in-depth discussion this report provides.

This report covers:
- Major tasks of data engineers today
- The different levels of structure in data and ways to maximize its value
- Capabilities of third-party cloud options
- Tools for ingestion, transfer, and enrichment
- Using containers and VMs to run the tools
- Software engineering development
- Automation and orchestration of data engineering

Streaming Data

Managers and staff responsible for planning, hiring, and allocating resources need to understand how streaming data can fundamentally change their organizations. Companies everywhere are disrupting business, government, and society by using data and analytics to shape their business. Even if you don’t have deep knowledge of programming or digital technology, this high-level introduction brings data streaming into focus. You won’t find math or programming details here, or recommendations for particular tools in this rapidly evolving space. But you will explore the decision-making technologies and practices that organizations need to process streaming data and respond to fast-changing events. By describing the principles and activities behind this new phenomenon, author Andy Oram shows you how streaming data provides hidden gems of information that can transform the way your business works.

- Learn where streaming data comes from and how companies put it to work
- Follow a simple data processing project from ingesting and analyzing data to presenting results
- Explore how (and why) big data processing tools have evolved from MapReduce to Kubernetes
- Understand why streaming data is particularly useful for machine learning projects
- Learn how containers, microservices, and cloud computing led to continuous integration and DevOps

Data Lake Maturity Model

Data is changing everything. Many industries today are being fundamentally transformed through the accumulation and analysis of large quantities of data, stored in diversified but flexible repositories known as data lakes. Whether your company has just begun to think about big data or has already initiated a strategy for handling it, this practical ebook shows you how to plan a successful data lake migration. You’ll learn the value of data lakes, their structure, and the problems they attempt to solve. Using Zaloni’s data lake maturity model, you’ll then explore your organization’s readiness for putting a data lake into action. Do you have the tools and data architectures to support big data analysis? Are your people and processes prepared? The data lake maturity model will help you rate your organization’s readiness.

This report includes:
- The structure and purpose of a data lake
- Descriptive, predictive, and prescriptive analytics
- Data lake curation, self-service, and the use of data lake zones
- How to rate your organization using the data lake maturity model
- A complete checklist to help you determine your strategic path forward

Delivering Embedded Analytics in Modern Applications

Organizations are rapidly consuming more data than ever before, and to drive their competitive advantage, they’re demanding interactive visualizations and interactive analyses of that data be embedded in their applications and business processes. This will enable them to make faster and more effective decisions based on data, not guesses. This practical book examines the considerations that software developers, product managers, and vendors need to take into account when making visualization and analytics a seamlessly integrated part of the applications they deliver, as well as the impact of migrating their applications to modern data platforms. Authors Federico Castanedo (Vodafone Group) and Andy Oram (O’Reilly Media) explore the basic requirements for embedding domain expertise with fast, powerful, and interactive visual analytics that will delight and inform customers more than spreadsheets and custom-generated charts. Particular focus is placed on the characteristics of effective visual analytics for big and fast data.

- Learn the impact of trends driving embedded analytics
- Review examples of big data applications and their analytics requirements in retail, direct service, cybersecurity, the Internet of Things, and logistics
- Explore requirements for embedding visual analytics in modern data environments, including collection, storage, retrieval, data models, speed, microservices, parallelism, and interactivity
- Take a deep dive into the characteristics of effective visual analytics and criteria for evaluating modern embedded analytics tools
- Use a self-assessment rating chart to determine the value of your organization’s BI in the modern data setting

Managing the Data Lake

Organizations across many industries have recently created fast-growing repositories to deal with an influx of new data from many sources and often in multiple formats. To manage these data lakes, companies have begun to leave the familiar confines of relational databases and data warehouses for Hadoop and various big data solutions. But adopting new technology alone won’t solve the problem. Based on interviews with several experts in data management, author Andy Oram provides an in-depth look at common issues you’re likely to encounter as you consider how to manage business data.

You’ll explore five key topic areas, including:
- Acquisition and ingestion: how to solve these problems with a degree of automation.
- Metadata: how to keep track of when data came in and how it was formatted, and how to make it available at later stages of processing.
- Data preparation and cleaning: what you need to know before you prepare and clean your data, and what needs to be cleaned up and how.
- Organizing workflows: what you should do to combine your tasks (ingestion, cataloging, and data preparation) into an end-to-end workflow.
- Access control: how to address security and access controls at all stages of data handling.

Andy Oram, an editor at O’Reilly Media since 1992, currently specializes in programming. His work for O’Reilly includes the first books on Linux ever published commercially in the United States.

Search-Driven Business Analytics

Compared to the speed and convenience of major web search engines, most business intelligence (BI) products are slow, stiff, and unresponsive. Business leaders today often wait days or weeks to get BI reports on inquiries about customers, products, or markets. But the latest BI products show that a significant change is taking place: a change led by search. This O’Reilly report examines three recent products with intelligent search capabilities: the ThoughtSpot Analytical Search Appliance, Microsoft’s Power BI service, and an offering from Adatao. You’ll learn how these products can provide you with answers and visualizations as quickly as questions come to mind.

You’ll investigate:
- The convergence of BI and search
- What a search-driven user experience looks like
- The intelligence required for analytical search
- Data sources and their associated data modeling requirements
- Turning on-the-fly calculations into visualizations
- Applying enterprise scale and security to search