talk-data.com

Topic

RDBMS

Relational Database Management System (RDBMS)

databases sql data_storage

14

tagged

Activities

Showing filtered results

Filtering by: O'Reilly Data Science Books

Google Cloud Platform for Data Science: A Crash Course on Big Data, Machine Learning, and Data Analytics Services

This book is your practical and comprehensive guide to learning Google Cloud Platform (GCP) for data science, using only the free tier services offered by the platform. Data science and machine learning are increasingly becoming critical to businesses of all sizes, and the cloud provides a powerful platform for these applications. GCP offers a range of data science services that can be used to store, process, and analyze large datasets, and train and deploy machine learning models. The book is organized into seven chapters covering various topics such as GCP account setup, Google Colaboratory, Big Data and Machine Learning, Data Visualization and Business Intelligence, Data Processing and Transformation, Data Analytics and Storage, and Advanced Topics. Each chapter provides step-by-step instructions and examples illustrating how to use GCP services for data science and big data projects. Readers will learn how to set up a Google Colaboratory account and run Jupyter notebooks, access GCP services and data from Colaboratory, use BigQuery for data analytics, and deploy machine learning models using Vertex AI. The book also covers how to visualize data using Looker Data Studio, run data processing pipelines using Google Cloud Dataflow and Dataprep, and store data using Google Cloud Storage and SQL.

What You Will Learn
- Set up a GCP account and project
- Explore BigQuery and its use cases, including machine learning
- Understand Google Cloud AI Platform and its capabilities
- Use Vertex AI for training and deploying machine learning models
- Explore Google Cloud Dataproc and its use cases for big data processing
- Create and share data visualizations and reports with Looker Data Studio
- Explore Google Cloud Dataflow and its use cases for batch and stream data processing
- Run data processing pipelines on Cloud Dataflow
- Explore Google Cloud Storage and its use cases for data storage
- Get an introduction to Google Cloud SQL and its use cases for relational databases
- Get an introduction to Google Cloud Pub/Sub and its use cases for real-time data streaming

Who This Book Is For
Data scientists, machine learning engineers, and analysts who want to learn how to use Google Cloud Platform (GCP) for their data science and big data projects

Python for Data Science For Dummies, 3rd Edition

Let Python do the heavy lifting for you as you analyze large datasets

Python for Data Science For Dummies lets you get your hands dirty with data using one of the top programming languages. This beginner’s guide takes you step by step through getting started, performing data analysis, understanding datasets and example code, working with Google Colab, sampling data, and beyond. Coding your data analysis tasks will make your life easier, make you more in-demand as an employee, and open the door to valuable knowledge and insights. This new edition is updated for the latest version of Python and includes current, relevant data examples.

- Get a firm background in the basics of Python coding for data analysis
- Learn about data science careers you can pursue with Python coding skills
- Integrate data analysis with multimedia and graphics
- Manage and organize data with cloud-based relational databases

Python careers are on the rise. Grab this user-friendly Dummies guide and gain the programming skills you need to become a data pro.

Beginning Power BI for Business Users

Discover the utility of your organization’s data with Microsoft Power BI

In Beginning Power BI for Business Users: Learning to Turn Data into Insights, accomplished data professional and business intelligence expert Paul Fuller delivers an intuitive and accessible handbook for professionals seeking to use Microsoft’s Power BI to access, analyze, understand, report, and act on the data available to their organizations. In the book, you’ll discover Power BI’s robust feature set, learn to ingest and model data, visualize and report on that data, and even use the DAX scripting language to unlock still more utility from Microsoft’s popular program. Beginning with general principles geared to readers with little or no experience with reporting or data analytics tools, the author walks you through how to manipulate common, publicly available data sources, including Excel files and relational databases.

You’ll also learn to:
- Use the included and tested sample code to work through the helpful examples included by the author
- Conduct data orchestration and visualization to better understand and gain insights from your data

An essential resource for business analysts and Excel power users reaching the limits of that program’s capabilities, Beginning Power BI for Business Users will also benefit data analysts who seek to prepare reports for their organizations using Microsoft’s flexible and intuitive software.

The Data Wrangling Workshop - Second Edition

The Data Wrangling Workshop is your beginner's guide to the essential techniques and practices of data manipulation using Python. Throughout the book, you will progressively build your skills, learning key concepts such as extracting, cleaning, and transforming data into actionable insights. By the end, you'll be confident in handling various data wrangling tasks efficiently.

What this Book will help me do
- Understand and apply the fundamentals of data wrangling using Python.
- Combine and aggregate data from diverse sources like web data, SQL databases, and spreadsheets.
- Use descriptive statistics and plotting to examine dataset properties.
- Handle missing or incorrect data effectively to maintain data quality.
- Gain hands-on experience with Python's powerful data science libraries like Pandas, NumPy, and Matplotlib.

Author(s)
Brian Lipp, Roychowdhury, and Dr. Tirthajyoti Sarkar are experienced educators and professionals in the fields of data science and engineering. Their collective expertise spans years of teaching and working with data technologies. They aim to make data wrangling accessible and comprehensible, focusing on practical examples to equip learners with real-world skills.

Who is it for?
The Data Wrangling Workshop is ideal for developers, data analysts, and business analysts aiming to become data scientists or analytics experts. If you're just getting started with Python, you will find this book guiding you step-by-step. A basic understanding of Python programming, as well as relational databases and SQL, is recommended for smooth learning.

Learning Apache Drill

Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you’ll learn how Drill helps you analyze data more effectively to drive down time to insight.

- Use Drill to clean, prepare, and summarize delimited data for further analysis
- Query file types including logfiles, Parquet, JSON, and other complex formats
- Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL
- Connect to Drill programmatically using a variety of languages
- Use Drill even with challenging or ambiguous file formats
- Perform sophisticated analysis by extending Drill’s functionality with user-defined functions
- Facilitate data analysis for network security, image metadata, and machine learning

Learning Pentaho Data Integration 8 CE - Third Edition

"Learning Pentaho Data Integration 8 CE" is your comprehensive guide to mastering data manipulation and integration using Pentaho Data Integration (PDI) 8 Community Edition. Through step-by-step instructions and practical examples, you'll learn to explore, transform, validate, and integrate data from multiple sources, equipping you to handle real-world data challenges efficiently.

What this Book will help me do
- Effectively install and understand the foundational concepts of Pentaho Data Integration 8 Community Edition.
- Efficiently organize, clean, and transform raw data from various sources into useful formats.
- Perform advanced data operations like metadata injection, managing relational databases, and implementing ETL solutions.
- Design, create, and deploy comprehensive data warehouse solutions using modern best practices.
- Streamline daily data processing tasks with flexibility and accuracy while handling errors gracefully.

Author(s)
The author, Carina Roldán, is an experienced professional in the field of data science and ETL (Extract, Transform, Load) development. Her expertise in leveraging tools like Pentaho Data Integration has allowed her to contribute significantly to BI and data management projects. Her approach in writing this book reflects her commitment to simplifying complex topics for aspiring professionals.

Who is it for?
This book is ideal for software developers, data analysts, business intelligence professionals, and IT students aiming to enhance their skills in ETL processes using Pentaho Data Integration. Beginners who wish to learn PDI comprehensively and professionals looking to deepen their expertise will both find value in this resource. It's also suitable for individuals involved in data warehouse design and implementation. This book will equip you with the skills to handle diverse data transformation tasks effectively.

Pro Tableau: A Step-by-Step Guide

Leverage the power of visualization in business intelligence and data science to make quicker and better decisions. Use statistics and data mining to make compelling and interactive dashboards. This book will help those familiar with Tableau software chart their journey to becoming a visualization expert.

Pro Tableau demonstrates the power of visual analytics and teaches you how to:
- Connect to various data sources such as spreadsheets, text files, relational databases (Microsoft SQL Server, MySQL, etc.), non-relational databases (NoSQL such as MongoDB, Cassandra), R data files, etc.
- Write your own custom SQL, etc.
- Perform statistical analysis in Tableau using R
- Use a multitude of charts (pie, bar, stacked bar, line, scatter plots, dual axis, histograms, heat maps, tree maps, highlight tables, box and whisker, etc.)

What you'll learn
- Connect to various data sources such as relational databases (Microsoft SQL Server, MySQL), non-relational databases (NoSQL such as MongoDB, Cassandra), write your own custom SQL, join and blend data sources, etc.
- Leverage table calculations (moving average, year over year growth, LOD (Level of Detail), etc.)
- Integrate Tableau with R
- Tell a compelling story with data by creating highly interactive dashboards

Who this book is for
All levels of IT professionals, from executives responsible for determining IT strategies to systems administrators, to data analysts, to decision makers responsible for driving strategic initiatives, etc. The book will help those familiar with Tableau software chart their journey to becoming a visualization expert.

Excel Power Pivot and Power Query For Dummies

A guide to PowerPivot and Power Query no data cruncher should be without!

Want to familiarize yourself with the rich set of Microsoft Excel tools and reporting capabilities available from PowerPivot and Power Query? Look no further! Excel PowerPivot & Power Query For Dummies shows you how this powerful new set of tools can be leveraged to more effectively source and incorporate 'big data' Business Intelligence and Dashboard reports. You'll discover how PowerPivot and Power Query not only allow you to save time and simplify your processes, but also enable you to substantially enhance your data analysis and reporting capabilities. Gone are the days of relatively small amounts of data; today's data environment demands more from business analysts than ever before. Now, with the help of this friendly, hands-on guide, you'll learn to use PowerPivot and Power Query to expand your skill-set from the one-dimensional spreadsheet to new territories, like relational databases, data integration, and multi-dimensional reporting.

- Demonstrates how Power Query is used to discover, connect to, and import your data
- Shows you how to use PowerPivot to model data once it's been imported
- Offers guidance on using these tools to make analyzing data easier
- Written by a Microsoft MVP in the lighthearted, fun style you've come to expect from the For Dummies brand

If you spend your days analyzing data, Excel PowerPivot & Power Query For Dummies will get you up and running with the rich set of Excel tools and reporting capabilities that will make your life, and work, easier.

Data Science For Dummies

Discover how data science can help you gain in-depth insight into your business - the easy way!

Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles in organizations. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you'll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization.

- Provides a background in data science fundamentals before moving on to working with relational databases and unstructured data and preparing your data for analysis
- Details different data visualization techniques that can be used to showcase and summarize your data
- Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques
- Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark

It's a big, big data world out there - let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

Data Mining: Concepts and Techniques, 3rd Edition

Data Mining: Concepts and Techniques provides the concepts and techniques for processing gathered data or information, which will be used in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data. This process is also referred to as knowledge discovery from data (KDD). The book focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining.

- Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects
- Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields
- Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data

Tapping into Unstructured Data: Integrating Unstructured Data and Textual Analytics into Business Intelligence

“The authors, the best minds on the topic, are breaking new ground. They show how every organization can realize the benefits of a system that can search and present complex ideas or data from what has been a mostly untapped source of raw data.” --Randy Chalfant, CTO, Sun Microsystems

The Definitive Guide to Unstructured Data Management and Analysis--From the World’s Leading Information Management Expert

A wealth of invaluable information exists in unstructured textual form, but organizations have found it difficult or impossible to access and utilize it. This is changing rapidly: new approaches finally make it possible to glean useful knowledge from virtually any collection of unstructured data. William H. Inmon--the father of data warehousing--and Anthony Nesavich introduce the next data revolution: unstructured data management. Inmon and Nesavich cover all you need to know to make unstructured data work for your organization. You’ll learn how to bring it into your existing structured data environment, leverage existing analytical infrastructure, and implement textual analytic processing technologies to solve new problems and uncover new opportunities. Inmon and Nesavich introduce breakthrough techniques covered in no other book--including the powerful role of textual integration, new ways to integrate textual data into data warehouses, and new SQL techniques for reading and analyzing text. They also present five chapter-length, real-world case studies--demonstrating unstructured data at work in medical research, insurance, chemical manufacturing, contracting, and beyond. This book will be indispensable to every business and technical professional trying to make sense of a large body of unstructured text: managers, database designers, data modelers, DBAs, researchers, and end users alike.

Coverage includes
- What unstructured data is, and how it differs from structured data
- First generation technology for handling unstructured data, from search engines to ECM--and its limitations
- Integrating text so it can be analyzed with a common, colloquial vocabulary: integration engines, ontologies, glossaries, and taxonomies
- Processing semistructured data: uncovering patterns, words, identifiers, and conflicts
- Novel processing opportunities that arise when text is freed from context
- Architecture and unstructured data: Data Warehousing 2.0
- Building unstructured relational databases and linking them to structured data
- Visualizations and Self-Organizing Maps (SOMs), including Compudigm and Raptor solutions
- Capturing knowledge from spreadsheet data and email
- Implementing and managing metadata: data models, data quality, and more

William H. Inmon is founder, president, and CTO of Inmon Data Systems. He is the father of the data warehouse concept, the corporate information factory, and the government information factory. Inmon has written 47 books on data warehouse, database, and information technology management, as well as more than 750 articles for trade journals such as Data Management Review, Byte, Datamation, and ComputerWorld. His b-eye-network.com newsletter currently reaches 55,000 people. Anthony Nesavich worked at Inmon Data Systems, where he developed multiple reports that successfully query unstructured data.

Preface
1 Unstructured Textual Data in the Organization
2 The Environments of Structured Data and Unstructured Data
3 First Generation Textual Analytics
4 Integrating Unstructured Text into the Structured Environment
5 Semistructured Data
6 Architecture and Textual Analytics
7 The Unstructured Database
8 Analyzing a Combination of Unstructured Data and Structured Data
9 Analyzing Text Through Visualization
10 Spreadsheets and Email
11 Metadata in Unstructured Data
12 A Methodology for Textual Analytics
13 Merging Unstructured Databases into the Data Warehouse
14 Using SQL to Analyze Text
15 Case Study--Textual Analytics in Medical Research
16 Case Study--A Database for Harmful Chemicals
17 Case Study--Managing Contracts Through an Unstructured Database
18 Case Study--Creating a Corporate Taxonomy (Glossary)
19 Case Study--Insurance Claims
Glossary
Index

Data Analysis Using SQL and Excel

One of the leading experts on business data mining shows managers how to leverage SQL and Excel to perform sophisticated types of business analysis without the expense of data mining tools and consultants

- Explains how to use the relatively simple tools of SQL and Excel to extract useful business information from relational databases
- Each chapter discusses why and when to perform a particular type of business analysis to obtain a useful business result, how to design and perform the analysis using SQL and Excel, and what the results look like in SQL and Excel
- Presents hints, warnings, and technical asides about Excel, SQL, and data analysis/mining

Up and Running with DB2 UDB ESE: Partitioning for Performance in an e-Business Intelligence World

Data warehouses in the 1990s were for the privileged few business analysts. Business Intelligence is now being democratized: it is shared with rank-and-file employees and delivered through Web portals, which demands higher levels of RDBMS scalability and ease of use. To support this emerging e-Business Intelligence world, the challenges that enterprises face for their centralized data warehouse RDBMS technology are scalability, performance, availability, and smart manageability. This IBM Redbooks publication focuses on the innovative technical functionalities of DB2 UDB ESE V8.1 and discusses: This book positions the new functionalities, so you can understand and evaluate their applicability in your own enterprise data warehouse environment, and get started prioritizing and implementing them. Please note that the additional material referenced in the text is not available from IBM.

Enhance Your Business Applications: Simple Integration of Advanced Data Mining Functions

Today data mining is no longer thought of as a set of stand-alone techniques, far from the business applications, and used only by data mining specialists or statisticians. Integrating data mining with mainstream applications is becoming an important issue for e-business applications. To support this move to applications, data mining is now an extension of the relational databases that database administrators or IT developers use. They use data mining as they would use any other standard relational function that they manipulate.

This IBM Redbooks publication positions the new DB2 data mining functions:
- Part 1 helps business analysts and implementers to understand and position these new DB2 data mining functions.
- Part 2 provides examples for implementers on how to easily and quickly integrate the data mining functions in business applications to enhance them.
- Part 3 helps database administrators and IT developers to configure these functions once to prepare them for use and integration in any application.

Please note that the additional material referenced in the text is not available from IBM.