talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

528

Collection of O'Reilly books on Data Science.

Filtering by: Analytics

Sessions & talks

Showing 26–50 of 528 · Newest first

Time Series Analysis with Spark

Time Series Analysis with Spark provides a practical introduction to leveraging Apache Spark and Databricks for time series analysis. You'll learn to prepare, model, and deploy robust and scalable time series solutions for real-world applications. From data preparation to advanced generative AI techniques, this guide prepares you to excel in big data analytics.
What this Book will help me do:
- Understand the core concepts and architectures of Apache Spark for time series analysis.
- Learn to clean, organize, and prepare time series data for big data environments.
- Gain expertise in choosing, building, and training various time series models tailored to specific projects.
- Master techniques to scale your models in production using Spark and Databricks.
- Explore the integration of advanced technologies such as generative AI to enhance predictions and derive insights.
Author(s):
Yoni Ramaswami, a Senior Solutions Architect at Databricks, has extensive experience in data engineering and AI solutions. With a focus on creating innovative big data and AI strategies across industries, Yoni authored this book to empower professionals to efficiently handle time series data. Yoni's approachable style ensures that both foundational concepts and advanced techniques are accessible to readers.
Who is it for?
This book is ideal for data engineers, machine learning engineers, data scientists, and analysts interested in enhancing their expertise in time series analysis using Apache Spark and Databricks. Whether you're new to time series or looking to refine your skills, you'll find both foundational insights and advanced practices explained clearly. A basic understanding of Spark is helpful but not required.

Effective Data Analysis

Learn the technical and soft skills you need to succeed in your career as a data analyst. You’ve learned how to use Python, R, SQL, and the statistical skills needed to get started as a data analyst. So, what’s next? Effective Data Analysis bridges the gap between foundational skills and real-world application. This book provides clear, actionable guidance on transforming business questions into impactful data projects, ensuring you’re tracking the right metrics, and equipping you with a modern data analyst’s essential toolbox.
In Effective Data Analysis, you’ll gain the skills needed to excel as a data analyst, including:
- Maximizing the impact of your analytics projects and deliverables
- Identifying and leveraging data sources to enhance organizational insights
- Mastering statistical tests, understanding their strengths, limitations, and when to use them
- Overcoming the challenges and caveats at every stage of an analytics project
- Applying your expertise across a variety of domains with confidence
Effective Data Analysis is full of sage advice on how to be an effective data analyst in a real production environment. Inside, you’ll find methods that enhance the value of your work, from choosing the right analysis approach to developing a data-informed organizational culture.
About the Technology:
Data analysts need top-notch knowledge of statistics and programming. They also need to manage clueless stakeholders, navigate messy problems, and advocate for resources. This unique book covers the essential technical topics and soft skills you need to be effective in the real world.
About the Book:
Effective Data Analysis helps you lock down those skills along with unfiltered insight into what the job really looks like. You’ll build out your technical toolbox with tips for defining metrics, testing code, automation, sourcing data, and more. Along the way, you’ll learn to handle the human side of data analysis, including how to turn vague requirements into efficient data pipelines. And you’re sure to love author Mona Khalil’s illustrations, industry examples, and friendly writing style.
What's Inside:
- Identify and incorporate external data
- Communicate with non-technical stakeholders
- Apply and interpret statistical tests
- Techniques to approach any business problem
About the Reader:
Written for early-career data analysts, but useful for all.
About the Author:
Mona Khalil is the Senior Manager of Analytics Engineering at Justworks.
Quotes:
- Your roadmap to becoming a standout data analyst! An intriguing blend of technical expertise and practical wisdom. - Chester Ismay, MATE Seminars
- A thoughtful guide to delivering real-world data analysis. It will be an eye-opening read for all data professionals! - David Lee, Justworks Inc.
- Compelling insights into the relationship between organizations and data. The real-life examples will help you excel in your data career. - Jeremy Moulton, Greenhouse
- Mona’s wide range of experience shines in her thoughtful, relevant examples. - Jessica Cherny, Fivetran

Implementing Analytics Solutions Using Microsoft Fabric—DP-600 Exam Study Guide

Master the art of designing and implementing analytics solutions using Microsoft Fabric with this comprehensive guide. Whether you're preparing for the DP-600 certification exam or want to advance your career, this book offers expert insights into data analytics in Microsoft environments.
What this Book will help me do:
- Confidently pass the DP-600 certification exam by mastering exam-tested skills.
- Acquire practical expertise in deploying data analytics solutions with Microsoft Fabric.
- Understand and optimize data integration, security, and performance in analytics systems.
- Learn advanced techniques including semantic model optimization and advanced SQL querying.
- Prepare for real-world challenges through mock exams and hands-on exercises.
Author(s):
Jagjeet Singh Makhija and Charles Odunukwe, authors of this guide, are seasoned Microsoft specialists with decades of experience in data analytics, certification training, and technology consulting. Their clear and methodical approach ensures learners at all levels can grow their expertise.
Who is it for?
If you're a data analyst or IT professional looking to enhance your skills in analytics and Microsoft's technologies, this book is for you. It's ideal for those pursuing the DP-600 certification or aiming to improve their data integration and analysis capabilities.

The Impact of Algorithmic Technologies on Healthcare

The book explores the fundamental principles and transformative advancements in cutting-edge algorithmic technologies, detailing their application and impact on revolutionizing healthcare. This book provides an in-depth account of how technologies such as artificial intelligence (AI), machine learning (ML), and the Internet of Things (IoT) are reshaping healthcare, transitioning from traditional diagnostic and treatment approaches to data-driven solutions that improve predictive accuracy and patient outcomes. The text also addresses the challenges and considerations associated with adopting these technologies, including ethical implications, data security concerns, and the need for human-centered approaches in algorithmic medicine. After introducing digital twin technology and its potential to enhance healthcare delivery, the book examines the broader effects of digital technology on the healthcare system. Subsequent chapters explore topics such as innovations in medical imaging, predictive analytics for improved patient outcomes, and deep learning algorithms for brain tumor detection. Other topics include generative adversarial networks (GANs), convolutional neural networks (CNNs), smart wearables for remote patient monitoring, effective IoT solutions, telemedicine advancements, and blockchain security for healthcare systems. The integration of biometric systems driven by AI, securing cyber-physical systems in healthcare, and digitizing wellness through electronic health records (EHRs) and electronic medical records (EMRs) are also discussed. The book concludes with an extensive case study comparing the impacts of various healthcare applications, offering insights and encouraging further research and innovation in this dynamic field. 
Audience: This book is suitable for academicians and professionals in health informatics, bioinformatics, biomedical science and engineering, artificial intelligence, as well as clinicians, IT specialists, and policymakers in healthcare.

Predictive Analytics with SAS and R: Core Concepts, Tools, and Implementation

Gain practical knowledge of application implementation using various programming approaches in predictive analytics. This book serves as a comprehensive guide for both beginners and professionals in the field of predictive analytics, offering core principles and practical insights without requiring an extensive mathematics or statistics background. The book starts with an introduction to analytics in decision making, predictive analytics basics, and implementation in various industries. The book then takes you through types of regression, and simple linear regression in detail, followed by a demonstration of R Studio and SAS. Multiple Linear Regression is discussed next along with MLR model diagnostics. The book covers Multivariate Analysis and teaches you how to work with Principal Components Analysis, Factor Analysis, and much more. You also learn Time Series Analysis with an understanding of Autoregressive Moving Average (ARMA) Models. After reading the book, you will be able to put predictive analytics principles into practice.
What You Will Learn:
- Understand modeling, estimating, and evaluating models for forecasting
- Implement Partial F-Test and Variable Selection Method
- Demonstrate each analysis model in R Studio and SAS
- Understand SLR and MLR Analysis models
Who This Book Is For:
Students and professionals in the field of data analysis and intelligence applications

Analytics the Right Way

CLEAR AND CONCISE TECHNIQUES FOR USING ANALYTICS TO DELIVER BUSINESS IMPACT AT ANY ORGANIZATION
Organizations have more data at their fingertips than ever, and their ability to put that data to productive use should be a key source of sustainable competitive advantage. Yet, business leaders looking to tap into a steady and manageable stream of “actionable insights” often, instead, get blasted with a deluge of dashboards, chart-filled slide decks, and opaque machine learning jargon that leaves them asking, “So what?” Analytics the Right Way is a guide for these leaders. It provides a clear and practical approach to putting analytics to productive use with a three-part framework that brings together the realities of the modern business environment with the deep truths underpinning statistics, computer science, machine learning, and artificial intelligence. The result: a pragmatic and actionable guide for delivering clarity, order, and business impact to an organization’s use of data and analytics. The book uses a combination of real-world examples from the authors’ direct experiences (working inside organizations, as external consultants, and as educators) mixed with vivid hypotheticals and illustrations (little green aliens, petty criminals with an affinity for ice cream, skydiving without parachutes, and more) to empower the reader to put foundational analytical and statistical concepts to effective use in a business context.

Learning AI Tools in Tableau

As businesses increasingly rely on data to drive decisions, the role of advanced analytics and AI in enhancing data interpretation is becoming crucial. For professionals tasked with optimizing data analytics platforms like Tableau, staying ahead of the curve with the latest tools isn't just beneficial; it's essential. This insightful guide takes you through the integration of Tableau Pulse and Einstein Copilot, explaining their roles within the broader Tableau and Salesforce ecosystems. Author Ann Jackson, an esteemed analytics professional with deep expertise in Tableau, offers a step-by-step exploration of these tools, backed by real-world use cases that demonstrate their impact across various industries.
By the end of this book, you will:
- Understand the functionalities of Tableau Pulse and Einstein Copilot and how to use them
- Learn to deploy Tableau Pulse effectively, ensuring it aligns with your business objectives
- Navigate discussions on AI's role within Tableau, enhancing your strategic conversations
- Visualize how Tableau Pulse operates through detailed images and scenarios
- Utilize Einstein Copilot in Tableau Desktop/Prep to streamline and enhance data analysis

Essential Data Analytics, Data Science, and AI: A Practical Guide for a Data-Driven World

In today’s world, understanding data analytics, data science, and artificial intelligence is not just an advantage but a necessity. This book is your thorough guide to learning these innovative fields, designed to make the learning practical and engaging. The book starts by introducing data analytics, data science, and artificial intelligence. It illustrates real-world applications and addresses the ethical considerations tied to AI. It also explores ways to gain data for practice and real-world scenarios, including the concept of synthetic data. Next, it uncovers Extract, Transform, Load (ETL) processes and explains how to implement them using Python. Further, it covers artificial intelligence and the pivotal role played by machine learning models. It explains feature engineering, the distinction between algorithms and models, and how to harness their power to make predictions. Moving forward, it discusses how to assess machine learning models after their creation, with insights into various evaluation techniques. It emphasizes the crucial aspects of model deployment, including the pros and cons of on-device versus cloud-based solutions. It concludes with real-world examples and encourages embracing AI while dispelling fears and fostering an appreciation for the transformative potential of these technologies. Whether you’re a beginner or an experienced professional, this book offers valuable insights that will expand your horizons in the world of data and AI.
What you will learn:
- What synthetic data and telemetry data are
- How to analyze data using tools like Python and Tableau
- What feature engineering is
- What the practical implications of artificial intelligence are
Who this book is for:
- Data analysts, scientists, and engineers seeking to enhance their skills, explore advanced concepts, and stay up-to-date with ethics.
- Business leaders and decision-makers across industries interested in understanding the transformative potential and ethical implications of data analytics and AI in their organizations.

Modern Business Analytics

Deriving business value from analytics is a challenging process. Turning data into information requires a business analyst who is adept at multiple technologies including databases, programming tools, and commercial analytics tools. This practical guide shows programmers who understand analysis concepts how to build the skills necessary to achieve business value. Author Deanne Larson, data science practitioner and academic, helps you bridge the technical and business worlds to meet these requirements. You'll focus on developing these skills with R and Python using real-world examples. You'll also learn how to leverage methodologies for successful delivery. Learning methodology combined with open source tools is key to delivering successful business analytics and value.
This book shows you how to:
- Apply business analytics methodologies to achieve successful results
- Cleanse and transform data using R and Python
- Use R and Python to complete exploratory data analysis
- Create predictive models to solve business problems in R and Python
- Use Python, R, and business analytics tools to handle large volumes of data
- Commit code to GitHub to collaborate with data engineers and data scientists
- Measure success in business analytics

DuckDB: Up and Running

DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool. Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
- Understand the purpose of DuckDB and its main functions
- Conduct data analytics tasks using DuckDB
- Integrate DuckDB with pandas, Polars, and JupySQL
- Use DuckDB to query your data
- Perform spatial analytics using DuckDB's spatial extension
- Work with a diverse range of data including Parquet, CSV, and JSON

The Data Science Handbook, 2nd Edition

Practical, accessible guide to becoming a data scientist, updated to include the latest advances in data science and related fields. Becoming a data scientist is hard. The job focuses on mathematical tools, but also demands fluency with software engineering, understanding of a business situation, and deep understanding of the data itself. This book provides a crash course in data science, combining all the necessary skills into a unified discipline. The focus of The Data Science Handbook is on practical applications and the ability to solve real problems, rather than theoretical formalisms that are rarely needed in practice. Among its key points are: An emphasis on software engineering and coding skills, which play a significant role in most real data science problems. Extensive sample code, detailed discussions of important libraries, and a solid grounding in core concepts from computer science (computer architecture, runtime complexity, and programming paradigms). A broad overview of important mathematical tools, including classical techniques in statistics, stochastic modeling, regression, numerical optimization, and more. Extensive tips about the practical realities of working as a data scientist, including understanding related jobs functions, project life cycles, and the varying roles of data science in an organization. Exactly the right amount of theory. A solid conceptual foundation is required for fitting the right model to a business problem, understanding a tool’s limitations, and reasoning about discoveries. Data science is a quickly evolving field, and this 2nd edition has been updated to reflect the latest developments, including the revolution in AI that has come from Large Language Models and the growth of ML Engineering as its own discipline. 
Much of data science has become a skillset that anybody can have, making this book not only for aspiring data scientists, but also for professionals in other fields who want to use analytics as a force multiplier in their organization.

Probabilistic Forecasts and Optimal Decisions

Account for uncertainties and optimize decision-making with this thorough exposition
Decision theory is a body of thought and research seeking to apply a mathematical-logical framework to assessing probability and optimizing decision-making. It has developed robust tools for addressing all major challenges to decision making. Yet the number of variables and uncertainties affecting each decision outcome, many of them beyond the decider’s control, means that decision-making is far from a ‘solved problem’. The tools created by decision theory remain to be refined and applied to decisions in which uncertainties are prominent. Probabilistic Forecasts and Optimal Decisions introduces a theoretically-grounded methodology for optimizing decision-making under conditions of uncertainty. Beginning with an overview of the basic elements of probability theory and methods for modeling continuous variates, it proceeds to survey the mathematics of both continuous and discrete models, supporting each with key examples. The result is a crucial window into the complex but enormously rewarding world of decision theory.
Readers of Probabilistic Forecasts and Optimal Decisions will also find:
- Extended case studies supported with real-world data
- Mini-projects running through multiple chapters to illustrate different stages of the decision-making process
- End of chapter exercises designed to facilitate student learning
Probabilistic Forecasts and Optimal Decisions is ideal for advanced undergraduate and graduate students in the sciences and engineering, as well as predictive analytics and decision analytics professionals.

Collect, Combine, and Transform Data Using Power Query in Power BI and Excel, 2nd Edition

Transform your data analysis experience with Power Query, the ultimate tool for importing, reshaping, and cleansing data through a user-friendly interface. Whether you're using Power BI, Excel, or other Microsoft products, Power Query's capabilities are at your fingertips. Renowned Power Query experts Daniil Maslyuk and Gil Raviv guide you through mastering this indispensable tool, helping you eliminate tedious manual data preparation, tackle common issues, and avoid potential pitfalls. In this updated edition, you'll delve into comprehensive analytics challenges, seamlessly integrating your skills into a realistic, final project. By the end, you'll possess the expertise to handle any data and convert it into actionable insights.
You will learn how to:
- Effortlessly prepare data by utilizing Power Query in Power BI and Excel to transform your data quickly and efficiently
- Overcome common data preparation problems with intuitive mouse clicks and straightforward formula edits
- Combine data from various sources, multiple queries, and mismatched tables with ease
- Reshape tables to suit your analysis needs
- Use the Power Query M formula language to create flexible data mashups and tailor transformations to your requirements
- Address and overcome collaboration challenges by using Power Query's powerful features
- Gain crucial insights from text feeds by enhancing your data analysis capabilities
- Profile data, diagnose queries, improve query performance, and more!
About This Book:
- For everyone who wants to get more done with Power Query in less time
- For business and financial professionals, developers, entrepreneurs, students, and others who need to efficiently manage and analyze data

Intelligent Data Analytics for Bioinformatics and Biomedical Systems

The book analyzes the combination of intelligent data analytics with the intricacies of biological data that has become a crucial factor for innovation and growth in the fast-changing field of bioinformatics and biomedical systems. Intelligent Data Analytics for Bioinformatics and Biomedical Systems delves into the transformative nature of data analytics for bioinformatics and biomedical research. It offers a thorough examination of advanced techniques, methodologies, and applications that utilize intelligence to improve results in the healthcare sector. With the exponential growth of data in these domains, the book explores how computational intelligence and advanced analytic techniques can be harnessed to extract insights, drive informed decisions, and unlock hidden patterns from vast datasets. From genomic analysis to disease diagnostics and personalized medicine, the book aims to showcase intelligent approaches that enable researchers, clinicians, and data scientists to unravel complex biological processes and make significant strides in understanding human health and diseases. This book is divided into three sections, each focusing on computational intelligence and data sets in biomedical systems. The first section discusses the fundamental concepts of computational intelligence and big data in the context of bioinformatics. This section emphasizes data mining, pattern recognition, and knowledge discovery for bioinformatics applications. The second part talks about computational intelligence and big data in biomedical systems. Based on how these advanced techniques are utilized in the system, this section discusses how personalized medicine and precision healthcare enable treatment based on individual data and genetic profiles. The last section investigates the challenges and future directions of computational intelligence and big data in bioinformatics and biomedical systems. 
This section concludes with discussions on the potential impact of computational intelligence on addressing global healthcare challenges.
Audience: Intelligent Data Analytics for Bioinformatics and Biomedical Systems is primarily targeted to professionals and researchers in bioinformatics, genetics, molecular biology, biomedical engineering, and healthcare. The book will also suit academicians, students, and professionals working in pharmaceuticals and interpreting biomedical data.

Hands-On Prescriptive Analytics

Business decisions in any context (operational, tactical, or strategic) can have considerable consequences. Whether the outcome is positive and rewarding or negative and damaging to the business, its employees, and stakeholders is unknown when action is approved. These decisions are usually made under the proverbial cloud of uncertainty. With this practical guide, data analysts, data scientists, and business analysts will learn why and how maximizing positive consequences and minimizing negative ones requires three forms of rich information: Descriptive analytics explores the results from an action, that is, what has already happened. Predictive analytics focuses on what could happen. The third, prescriptive analytics, informs us what should happen in the future. While all three are important for decision-makers, the primary focus of this book is on the third: prescriptive analytics.
Author Walter R. Paczkowski, Ph.D. shows you:
- The distinction among descriptive, predictive, and prescriptive analytics
- How predictive analytics produces a menu of action options
- How prescriptive analytics narrows the menu of action options
- The forms of prescriptive analytics: eight prescriptive methods
- Two broad classes of these methods: non-stochastic and stochastic
- How to develop prescriptive analyses for action recommendations
- Ways to use an appropriate tool-set in Python
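
The distinction among the three forms of analytics described above can be illustrated with a minimal Python sketch. This is not taken from the book: the demand figures, prices, candidate stock levels, and the `profit` helper are all hypothetical, and the prescriptive step shown is a simple deterministic (non-stochastic) search over a small menu of options.

```python
from statistics import mean

# Hypothetical weekly demand history (illustrative numbers, not from the book).
demand = [100, 110, 120, 130, 140]

# Descriptive: summarize what has already happened.
average_demand = mean(demand)

# Predictive: estimate what could happen next with a least-squares linear trend.
n = len(demand)
weeks = list(range(n))
week_bar = mean(weeks)
slope = sum((w - week_bar) * (d - average_demand) for w, d in zip(weeks, demand)) / sum(
    (w - week_bar) ** 2 for w in weeks
)
intercept = average_demand - slope * week_bar
forecast = intercept + slope * n  # expected demand for the next week


# Prescriptive: pick the action (stock level) with the best expected outcome.
def profit(stock, expected_demand, price=5.0, unit_cost=3.0):
    """Hypothetical profit: revenue on units sold minus the cost of all units stocked."""
    sold = min(stock, expected_demand)
    return price * sold - unit_cost * stock


candidate_stock_levels = [130, 140, 150, 160, 170]  # the "menu" of action options
best_stock = max(candidate_stock_levels, key=lambda s: profit(s, forecast))

print(f"average so far: {average_demand}, forecast: {forecast}, recommended stock: {best_stock}")
```

Here the predictive step produces a single forecast, and the prescriptive step narrows the menu of candidate stock levels to the one with the highest profit under that forecast. A stochastic variant would instead weigh each option against a distribution of possible demands.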

Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib

Learn how to leverage the scientific computing and data analysis capabilities of Python, its standard library, and popular open-source numerical Python packages like NumPy, SymPy, SciPy, matplotlib, and more. This book demonstrates how to work with mathematical modeling and solve problems with numerical, symbolic, and visualization techniques. It explores applications in science, engineering, data analytics, and more. Numerical Python, Third Edition, presents many case study examples of applications in fundamental scientific computing disciplines, as well as in data science and statistics. This fully revised edition, updated for each library's latest version, demonstrates Python's power for rapid development and exploratory computing due to its simple and high-level syntax and many powerful libraries and tools for computation and data analysis. After reading this book, readers will be familiar with many computing techniques, including array-based and symbolic computing, visualization and numerical file I/O, equation solving, optimization, interpolation and integration, and domain-specific computational problems, such as differential equation solving, data analysis, statistical modeling, and machine learning.
What You'll Learn:
- Work with vectors and matrices using NumPy
- Review symbolic computing with SymPy
- Plot and visualize data with Matplotlib
- Perform data analysis tasks with Pandas and SciPy
- Understand statistical modeling and machine learning with statsmodels and scikit-learn
- Optimize Python code using Numba and Cython
Who This Book Is For:
Developers who want to understand how to use Python and its ecosystem of libraries for scientific computing and data analysis.

Statistics for Data Science and Analytics

Introductory statistics textbook with a focus on data science topics such as prediction, correlation, and data exploration
Statistics for Data Science and Analytics is a comprehensive guide to statistical analysis using Python, presenting important topics useful for data science such as prediction, correlation, and data exploration. The authors provide an introduction to statistical science and big data, as well as an overview of Python data structures and operations. A range of statistical techniques are presented with their implementation in Python, including hypothesis testing, probability, exploratory data analysis, categorical variables, surveys and sampling, A/B testing, and correlation. The text introduces binary classification, a foundational element of machine learning, validation of statistical models by applying them to holdout data, and probability and inference via the easy-to-understand method of resampling and the bootstrap instead of using a myriad of “kitchen sink” formulas. Regression is taught both as a tool for explanation and for prediction. This book is informed by the authors’ experience designing and teaching both introductory statistics and machine learning at Statistics.com. Each chapter includes practical examples, explanations of the underlying concepts, and Python code snippets to help readers apply the techniques themselves.
Statistics for Data Science and Analytics includes information on sample topics such as:
- Int, float, and string data types, numerical operations, manipulating strings, converting data types, and advanced data structures like lists, dictionaries, and sets
- Experiment design via randomizing, blinding, and before-after pairing, as well as proportions and percents when handling binary data
- Specialized Python packages like numpy, scipy, pandas, scikit-learn and statsmodels (the workhorses of data science) and how to get the most value from them
- Statistical versus practical significance, random number generators, functions for code reuse, and binomial and normal probability distributions
Written by and for data science instructors, Statistics for Data Science and Analytics is an excellent learning resource for data science instructors prescribing a required intro stats course for their programs, as well as other students and professionals seeking to transition to the data science field.
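
As a rough illustration of the resampling approach to inference mentioned above, the following sketch computes a percentile bootstrap confidence interval for a mean using only the Python standard library. It is not code from the book: the sample values, the `bootstrap_ci` helper, and its parameters are assumptions made for this example.

```python
import random
from statistics import mean


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper).

    Draws n_resamples samples with replacement, each the size of the
    original, and reads the interval off the sorted resampled means.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    means = sorted(
        mean(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Illustrative measurements (made up for this example).
sample = [12.1, 9.8, 11.4, 10.2, 13.5, 8.9, 10.7, 11.9, 9.5, 12.3]
low, high = bootstrap_ci(sample)
print(f"sample mean: {mean(sample):.2f}, approx. 95% interval: ({low:.2f}, {high:.2f})")
```

The interval comes directly from the spread of the resampled means, so no closed-form standard error formula is needed.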

LLMs and Generative AI for Healthcare

Large language models (LLMs) and generative AI are rapidly changing the healthcare industry. These technologies have the potential to revolutionize healthcare by improving the efficiency, accuracy, and personalization of care. This practical book shows healthcare leaders, researchers, data scientists, and AI engineers the potential of LLMs and generative AI today and in the future, using storytelling and illustrative use cases in healthcare. Authors Kerrie Holley and Manish Mathur, former Google healthcare professionals, guide you through the transformative potential of large language models (LLMs) and generative AI in healthcare. From personalized patient care and clinical decision support to drug discovery and public health applications, this comprehensive exploration covers real-world uses and future possibilities of LLMs and generative AI in healthcare.
With this book, you will:
- Understand the promise and challenges of LLMs in healthcare
- Learn the inner workings of LLMs and generative AI
- Explore automation of healthcare use cases for improved operations and patient care using LLMs
- Dive into patient experiences and clinical decision-making using generative AI
- Review future applications in pharmaceutical R&D, public health, and genomics
- Understand ethical considerations and responsible development of LLMs in healthcare
"The authors illustrate generative's impact on drug development, presenting real-world examples of its ability to accelerate processes and improve outcomes across the pharmaceutical industry." --Harsh Pandey, VP, Data Analytics & Business Insights, Medidata-Dassault
Kerrie Holley is a retired Google tech executive, IBM Fellow, and VP/CTO at Cisco. Holley's extensive experience includes serving as the first Technology Fellow at United Health Group (UHG), Optum, where he focused on advancing and applying AI, deep learning, and natural language processing in healthcare.
Manish Mathur brings over two decades of expertise at the crossroads of healthcare and technology. A former executive at Google and Johnson & Johnson, he now serves as an independent consultant and advisor. He guides payers, providers, and life sciences companies in crafting cutting-edge healthcare solutions.

Polars Cookbook

Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows. What this Book will help me do Master the basics of Python Polars including installation and setup. Perform complex data manipulation like pivoting, grouping, and joining. Handle large-scale time series data for accurate analysis. Understand data integration with libraries like pandas and numpy. Optimize workflows for both on-premise and cloud environments. Author(s) Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter. Who is it for? This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed.

DuckDB in Action

Dive into DuckDB and start processing gigabytes of data with ease, all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB, all through hands-on examples. Open up DuckDB in Action and learn how to: Read and process data from CSV, JSON and Parquet sources, both local and remote Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables Use DuckDB from Python, both with SQL and through its Relational API, interacting with databases as well as data frames Prepare, ingest and query large datasets Build cloud data pipelines Extend DuckDB with custom functionality Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation; you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines. About the Technology DuckDB makes data analytics fast and fun! You don’t need to set up Spark or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite and Postgres. 
About the Book DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action. What's Inside Prepare, ingest and query large datasets Build cloud data pipelines Extend DuckDB with custom functionality Fast-paced SQL recap: From simple queries to advanced analytics About the Reader For data pros comfortable with Python and CLI tools. About the Authors Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j. Quotes I use DuckDB every day, and I still learned a lot about how DuckDB makes things that are hard in most databases easy! - Jordan Tigani, Founder, MotherDuck An excellent resource! Unlocks possibilities for storing, processing, analyzing, and summarizing data at the edge using DuckDB. - Pramod Sadalage, Director, Thoughtworks Clear and accessible. A comprehensive resource for harnessing the power of DuckDB for both novices and experienced professionals. - Qiusheng Wu, Associate Professor, University of Tennessee Excellent! The book all we ducklings have been waiting for! - Gunnar Morling, Decodable

Microsoft Power BI Cookbook - Third Edition

Discover how to harness the full potential of Microsoft Power BI in "Microsoft Power BI Cookbook". Through its recipe-based structure, this book offers step-by-step guidance on mastering data integration, crafting impactful visualizations, and utilizing Power BI's latest features like Hybrid tables and enhanced scorecards. This edition equips you with the skills to transform raw data into actionable insights for your organization. What this Book will help me do Turn business data into actionable insights by utilizing Microsoft Data Fabric effectively. Create engaging and clear visualizations through Hybrid tables and advanced reporting techniques. Gain competence in managing real-time data accuracy and implementing dynamic analytics in Power BI. Ensure robust data compliance and governance integrated seamlessly into business reporting workflows. Leverage cutting-edge Power BI features to prepare for emerging trends in data intelligence. Author(s) Greg Deckler and Brett Powell, both esteemed professionals in the Power BI and data analytics domain, co-author this comprehensive guide. With decades of experience, they bring vast knowledge and practical skills to this work, presenting it in a structured and approachable manner. Both are dedicated to empowering learners of all levels to excel with Power BI. Who is it for? This book is ideal for professionals like data analysts, business intelligence developers, and IT specialists focused on reporting. It suits readers with a basic familiarity with Power BI, looking to deepen their understanding. If you aim to stay current with Power BI's most modern practices and features, this book will help you achieve that. Additionally, it supports those aiming to enhance business decision-making through better visualizations and advanced analysis.

Tableau Certified Data Analyst Certification Guide

The 'Tableau Certified Data Analyst Certification Guide' is your essential roadmap to mastering Tableau and excelling in the Tableau Data Analyst certification exam. From fundamentals to advanced techniques, you'll solidify your Tableau skills with clear explanations, practical exercises, and realistic mock exams. After reading, you'll be ready to take the next step in your data analytics career. What this Book will help me do Gain the ability to connect, clean, and transform data effectively using Tableau. Master Tableau's diverse calculation types for data analysis, ranging from basic to advanced. Develop skills to create visually impactful dashboards and data stories. Learn to publish and manage insights on Tableau Cloud for broader collaboration. Acquire the necessary competencies to confidently pass the Tableau Data Analyst certification exam. Author(s) Authors Harry Cooney and Daisy Jones bring a wealth of Tableau and data analytics experience. Harry is a certified Tableau expert with years of teaching and consulting, while Daisy applies her data analysis expertise across industries. Together, they combine practical insights and a supportive approach to guide you through Tableau mastery and certification. Who is it for? This book is ideal for aspiring and practicing data analysts eager to master Tableau. Beginners will appreciate the accessible approach to foundational concepts, while experienced users can deepen their expertise. If you're preparing for the Tableau Certified Data Analyst exam or looking to enhance your visual analytics capabilities, this book is for you.