talk-data.com

Topic: data (5765 tagged)

Activity Trend: 3 peak/qtr, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

Data Exploration and Preparation with BigQuery

In "Data Exploration and Preparation with BigQuery," Michael Kahn provides a hands-on guide to understanding and utilizing Google's powerful data warehouse solution, BigQuery. This comprehensive book equips you with the skills needed to clean, transform, and analyze large datasets for actionable business insights. What this Book will help me do Master the process of exploring and assessing the quality of datasets. Learn SQL for performing efficient and advanced data transformations in BigQuery. Optimize the performance of BigQuery queries for speed and cost-effectiveness. Discover best practices for setting up and managing BigQuery resources. Apply real-world case studies to analyze data and derive meaningful insights. Author(s) Michael Kahn is an experienced data engineer and author specializing in big data solutions and technologies. With years of hands-on experience working with Google Cloud Platform and BigQuery, he has assisted organizations in optimizing their data pipelines for effective decision-making. His accessible writing style ensures complex topics become approachable, enabling readers of various skill levels to succeed. Who is it for? This book is tailored for data analysts, data engineers, and data scientists who want to learn how to effectively use BigQuery for data exploration and preparation. Whether you're new to BigQuery or looking to deepen your expertise in working with large datasets, this book provides clear guidance and practical examples to achieve your goals.

Kafka Troubleshooting in Production: Stabilizing Kafka Clusters in the Cloud and On-premises

This book provides Kafka administrators, site reliability engineers, and DataOps and DevOps practitioners with a list of real production issues that can occur in Kafka clusters and how to solve them. The production issues covered are assembled into a comprehensive troubleshooting guide for those engineers who are responsible for the stability and performance of Kafka clusters in production, whether those clusters are deployed in the cloud or on-premises. This book teaches you how to detect and troubleshoot the issues, and eventually how to prevent them. Kafka stability is hard to achieve, especially in high throughput environments, and the purpose of this book is not only to make troubleshooting easier, but also to prevent production issues from occurring in the first place. The guidance in this book is drawn from the author's years of experience in helping clients and internal customers diagnose and resolve knotty production problems and stabilize their Kafka environments. The book is organized into recipe-style troubleshooting checklists that field engineers can easily follow when under pressure to fix an unstable cluster. This is the book you will want by your side when the stakes are high, and your job is on the line. 
What You Will Learn Monitor and resolve production issues in your Kafka clusters Provision Kafka clusters with the lowest costs and still handle the required loads Perform root cause analyses of issues affecting your Kafka clusters Know the ways in which your Kafka cluster can affect its consumers and producers Prevent or minimize data loss and delays in data streaming Forestall production issues through an understanding of common failure points Create checklists for troubleshooting your Kafka clusters when problems occur Who This Book Is For Site reliability engineers tasked with maintaining stability of Kafka clusters, Kafka administrators who troubleshoot production issues around Kafka, DevOps and DataOps experts who are involved with provisioning Kafka (whether on-premises or in the cloud), developers of Kafka consumers and producers who wish to learn more about Kafka

Distributed Machine Learning with PySpark: Migrating Effortlessly from Pandas and Scikit-Learn

Migrate from pandas and scikit-learn to PySpark to handle vast amounts of data and achieve faster data processing time. This book will show you how to make this transition by adapting your skills and leveraging the similarities in syntax, functionality, and interoperability between these tools. Distributed Machine Learning with PySpark offers a roadmap to data scientists considering transitioning from small data libraries (pandas/scikit-learn) to big data processing and machine learning with PySpark. You will learn to translate Python code from pandas/scikit-learn to PySpark to preprocess large volumes of data and build, train, test, and evaluate popular machine learning algorithms such as linear and logistic regression, decision trees, random forests, support vector machines, Naïve Bayes, and neural networks. After completing this book, you will understand the foundational concepts of data preparation and machine learning and will have the skills necessary to apply these methods using PySpark, the industry standard for building scalable ML data pipelines. What You Will Learn Master the fundamentals of supervised learning, unsupervised learning, NLP, and recommender systems Understand the differences between PySpark, scikit-learn, and pandas Perform linear regression, logistic regression, and decision tree regression with pandas, scikit-learn, and PySpark Distinguish between the pipelines of PySpark and scikit-learn Who This Book Is For Data scientists, data engineers, and machine learning practitioners who have some familiarity with Python, but who are new to distributed machine learning and the PySpark framework.

Alteryx Designer: The Definitive Guide

Analytics projects are frequently long, drawn-out affairs, requiring multiple teams and skills to clean, join, and eventually turn data into analysis for timely decision-making. Alteryx Designer changes all of that. With this low-code, self-service, drag-and-drop workflow platform, new and experienced data and business analysts can deliver results in hours instead of weeks. This practical book shows you how to master all areas of Alteryx Designer quickly. Author and Alteryx ACE Joshua Burkhow starts with the basics of building a workflow, then introduces more than 200 tools for working with intermediate and advanced analytics functionality. With Alteryx Designer's all-in-one toolkit, you'll migrate from legacy analytics software or Excel with ease. Ready to work with data quickly and efficiently? This guide gets you started. Learn the fundamentals of cleaning, prepping, and analyzing data with Alteryx Designer Install, navigate, and quickly become competent with the Alteryx Designer layout and functionality Construct accurate, performant, reliable, and well-documented workflows that automate business processes Learn intermediate techniques using spatial analytics, reporting, and in-database tools Dive into advanced Alteryx capabilities, including predictive and machine learning tools Get introduced to the entire Alteryx Analytic Process Automation (APA) Platform

Artificial Intelligence for Business

This book is a valuable resource for academics, researchers, professionals, and policymakers who are interested in understanding the potential of AI in the business world. The contributions from leading experts and researchers provide a comprehensive overview of AI in business applications, and how it is transforming different sectors.

Near Extensions and Alignment of Data in R^n

Near Extensions and Alignment of Data in R^n Comprehensive resource illustrating the mathematical richness of Whitney Extension Problems, enabling readers to develop new insights, tools, and mathematical techniques Near Extensions and Alignment of Data in R^n demonstrates a range of hitherto unknown connections between current research problems in engineering, mathematics, and data science, exploring the mathematical richness of near Whitney Extension Problems, and presenting a new nexus of applied, pure and computational harmonic analysis, approximation theory, data science, and real algebraic geometry. For example, the book uncovers connections between near Whitney Extension Problems and the problem of alignment of data in Euclidean space, an area of considerable interest in computer vision. Written by a highly qualified author, Near Extensions and Alignment of Data in R^n includes information on: Areas of mathematics and statistics, such as harmonic analysis, functional analysis, and approximation theory, that have driven significant advances in the field Development of algorithms to enable the processing and analysis of huge amounts of data and data sets Why and how the mathematical underpinning of many current data science tools needs to be better developed to be useful New insights, potential tools, and mathematical techniques to solve problems in Whitney extensions, signal processing, shortest paths, clustering, computer vision, optimal transport, manifold learning, minimal energy, and equidistribution Providing comprehensive coverage of several subjects, Near Extensions and Alignment of Data in R^n is an essential resource for mathematicians, applied mathematicians, and engineers working on problems related to data science, signal processing, computer vision, manifold learning, and optimal transport.

Fundamentals of Data Science

Fundamentals of Data Science: Theory and Practice presents basic and advanced concepts in data science along with real-life applications. The book provides students, researchers and professionals at different levels a good understanding of the concepts of data science, machine learning, data mining and analytics. Users will find the authors’ research experiences and achievements in data science applications, along with in-depth discussions on topics that are essential for data science projects, including pre-processing, that is carried out before applying predictive and descriptive data analysis tasks and proximity measures for numeric, categorical and mixed-type data. The book's authors include a systematic presentation of many predictive and descriptive learning algorithms, including recent developments that have successfully handled large datasets with high accuracy. In addition, a number of descriptive learning tasks are included. Presents the foundational concepts of data science along with advanced concepts and real-life applications for applied learning Includes coverage of a number of key topics such as data quality and pre-processing, proximity and validation, predictive data science, descriptive data science, ensemble learning, association rule mining, Big Data analytics, as well as incremental and distributed learning Provides updates on key applications of data science techniques in areas such as Computational Biology, Network Intrusion Detection, Natural Language Processing, Software Clone Detection, Financial Data Analysis, and Scientific Time Series Data Analysis Covers computer program code for implementing descriptive and predictive algorithms

MySQL Crash Course, 2nd Edition

MySQL is one of the most popular database management systems available, powering everything from Internet powerhouses to individual corporate databases to simple end-user applications, and everything in between. This book will teach you all you need to know to be immediately productive with the latest version of MySQL. By working through 30 highly focused hands-on lessons, your MySQL Crash Course will be both easier and more effective than you'd have thought possible. Learn How To Retrieve and Sort Data Filter Data Using Comparisons, Regular Expressions, Full Text Search, and Much More Join Relational Data Create and Alter Tables Insert, Update, and Delete Data Leverage the Power of Stored Procedures and Triggers Use Views and Cursors Manage Transactional Processing Create User Accounts and Manage Security via Access Control ...

A Power BI Compendium: Answers to 65 Commonly Asked Questions on Power BI

Are you a reasonably competent Power BI user but still struggling to generate reports that truly tell the story of your data? Or do you simply want to extend your knowledge of Power BI by exploring more complex areas of visualizations, data modelling, DAX, and Power Query? If so, this book is for you. This book serves as a comprehensive resource for users to implement more challenging visuals, build better data models, use DAX with more confidence, and execute more complex queries so they can find and share important insights into their data. The contents of the chapters are in a question-and-answer format that explore everyday data analysis scenarios in Power BI. These questions have been generated from the author’s own client base and from commonly sought-for information from the Power BI community. They cover a wide and diverse range of topics that many Power BI users often struggle to get to grips with or don’t fully understand. Examples of such questions are: How can I generate dynamic titles for visuals? How can I control subtotals in a Matrix visual? Why do I need a date dimension? How can I show the previous N months’ sales in a column chart? Why do I need a Star Schema? Why aren't my totals correct? How can I bin measures into numeric ranges? Can I import a Word document? Can I dynamically append data from different source files? Solutions to these questions and many more are presented in non-technical and easy-to-follow explanations negating the requirement to perform tiresome and fruitless “google” searches. There are also companion Power BI Desktop files that set out the answers to each question so you can follow along with the examples given in the book. After working through this book, you will have extended your knowledge of Power BI to an expert level, alleviating your existing frustrations and so enabling you to design Power BI reports where you are no longer limited by your lack of knowledge or experience.
Who is This Book For: Power BI users who can build reports and now want to extend their knowledge of Power BI.

Generative AI on AWS

Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology. You'll learn the generative AI project life cycle including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. And you'll explore different types of models including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images. Apply generative AI to your business use cases Determine which generative AI models are best suited to your task Perform prompt engineering and in-context learning Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA) Align generative AI models to human values with reinforcement learning from human feedback (RLHF) Augment your model with retrieval-augmented generation (RAG) Explore libraries such as LangChain and ReAct to develop agents and actions Build generative AI applications with Amazon Bedrock

SAP S/4HANA Asset Management: Configure, Equip, and Manage your Enterprise

S/4HANA empowers enterprises to take big steps towards digitalization, innovation, and being mobile-friendly. This book is a concise guide to SAP S/4HANA Asset Management and will help you begin leveraging the platform’s capabilities quickly and efficiently. SAP S/4HANA Asset Management begins with an overview of the platform and its structure. You will learn how it can help with data storage and analysis, business processes, and reporting and analytics. As the book progresses, you will gain insight into single, time-based, performance-based, and multiple counter-based strategy plans. Any project is incomplete without a budget, and this book will help you understand how to use SAP S/4HANA to create and manage yours. The book’s real-life examples of asset management from contemporary industries reinforce each concept you learn, and its coverage of newer technologies and offerings in S/4HANA Asset Management will give you a sense of the immense potential offered by the platform. When you have finished this book, you will be ready to begin using SAP S/4HANA Asset Management to improve operational planning, maintenance, and scheduling activities in your own business. What You Will Learn Position S/4HANA Asset Management within the overall Business Applications suite Explore essential functionalities for enterprise asset hierarchy mapping Efficiently map both unplanned and planned maintenance activities Seamlessly integrate asset management, finance, controlling, and budgeting Unleash reporting and analytics in Asset Management Configure Asset Management to meet your S/4HANA requirements Who This Book Is For Consultants, project managers, and SAP users who are looking for a complete reference guide on S/4HANA Asset Management.

Cracking the Data Engineering Interview

"Cracking the Data Engineering Interview" is your essential guide to mastering the data engineering interview process. This book offers practical insights and techniques to build your resume, refine your skills in Python, SQL, data modeling, and ETL, and confidently tackle over 100 mock interview questions. Gain the knowledge and confidence to land your dream role in data engineering. What this Book will help me do Craft a compelling data engineering portfolio to stand out to employers. Refresh and deepen understanding of essential topics like Python, SQL, and ETL. Master over 100 interview questions that cover both technical and behavioral aspects. Understand data engineering concepts such as data modeling, security, and CI/CD. Develop negotiation, networking, and personal branding skills crucial for job applications. Author(s) Bryan and Ransome are seasoned authors with a wealth of experience in data engineering and professional development. Drawing from their extensive industry backgrounds, they provide actionable strategies for aspiring data engineers. Their approachable writing style and real-world insights make complex topics accessible to readers. Who is it for? This book is ideal for aspiring data engineers looking to navigate the job application process effectively. Readers should be familiar with data engineering fundamentals, including Python, SQL, cloud data platforms, and ETL processes. It's tailored for professionals aiming to enhance their portfolios, tackle challenging interviews, and boost their chances of landing a data engineering role.

Data Smart, 2nd Edition
book
by Jordan Goldmeier (Booz Allen Hamilton; The Perduco Group; EY; Excel TV; Wake Forest University; Anarchy Data)

Want to jump into data science but don't know where to start? Let's be real, data science is presented as something mystical and unattainable without the most powerful software, hardware, and data expertise. Real data science isn't about technology. It's about how you approach the problem. In this updated edition of Data Smart: Using Data Science to Transform Information into Insight, award-winning data scientist and bestselling author Jordan Goldmeier shows you how to implement data science problems using Excel while exposing how things work behind the scenes. Data Smart is your field guide to building statistics, machine learning, and powerful artificial intelligence concepts right inside your spreadsheet. Inside you'll find: Four-color data visualizations that highlight and illustrate the concepts discussed in the book Tutorials explaining complicated data science using just Microsoft Excel How to take what you’ve learned and apply it to everyday problems at work and life Advice for using formulas, Power Query, and some of Excel's latest features to solve tough data problems Smart data science solutions for common business challenges Explanations of what algorithms do, how they work, and what you can tweak to take your Excel skills to the next level Data Smart is a must-read for students, analysts, and managers ready to become data science savvy and share their findings with the world.

IBM TS7700 R5 DS8000 Object Store User's Guide

The IBM® TS7700 features a functional enhancement that allows the TS7700 to act as an object store for transparent cloud tiering with IBM DS8000®, DFSMShsm (HSM), and native DFSMSdss (DSS). This function can be used to move data sets directly from DS8000 to TS7700. This IBM Redpaper publication provides a functional overview of the features, provides client value information, and walks through DFSMS, DS8000, and TS7700 setup steps.

Leading in Analytics

A step-by-step guide for business leaders who need to manage successful big data projects Leading in Analytics: The Critical Tasks for Executives to Master in the Age of Big Data takes you through the entire process of guiding an analytics initiative from inception to execution. You’ll learn which aspects of the project to pay attention to, the right questions to ask, and how to keep the project team focused on its mission to produce relevant and valuable results. As an executive, you can’t control every aspect of the process. But if you focus on high-impact factors that you can control, you can ensure an effective outcome. This book describes those factors and offers practical insight on how to get them right. Drawn from best-practice research in the field of analytics, the Manageable Tasks described in this book are specific to the goal of implementing big data tools at an enterprise level. A dream team of analytics and business experts have contributed their knowledge to show you how to choose the right business problem to address, put together the right team, gather the right data, select the right tools, and execute your strategic plan to produce an actionable result. Become an analytics-savvy executive with this valuable book. Ensure the success of analytics initiatives, maximize ROI, and draw value from big data Learn to define success and failure in analytics and big data projects Set your organization up for analytics success by identifying problems that have big data solutions Bring together the people, the tools, and the strategies that are right for the job By learning to pay attention to critical tasks in every analytics project, non-technical executives and strategic planners can guide their organizations to measurable results.

Data Science: The Hard Parts

This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline—machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one. Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries. With this book, you will: Understand how data science creates value Deliver compelling narratives to sell your data science project Build a business case using unit economics principles Create new features for a ML model using storytelling Learn how to decompose KPIs Perform growth decompositions to find root causes for changes in a metric Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).

Alteryx Designer Cookbook

This book, Alteryx Designer Cookbook, provides over 60 practical and detailed recipes that guide you in conquering data accessibility, preparation, and insights generation through Alteryx Designer. You will learn how to manipulate, blend, and analyze data sources effectively, improving your analytical productivity. What this Book will help me do Master efficient methods for cleaning, preparing, and shaping data accurately. Combine multiple data sources seamlessly using Alteryx Designer's blending tools. Implement essential data transformations such as pivoting and restructuring for analyses. Create reusable, automated solutions for repeated tasks using Alteryx macros. Generate rich, data-driven reports to enhance business intelligence efficiently. Author(s) Guisande is an experienced data analytics professional with years of hands-on expertise in implementing data workflows using Alteryx Designer. Passionate about simplifying complex operations, the author brings a practical approach to teaching, ensuring that readers can apply their skills immediately. Who is it for? This book is ideal for data analysts, professionals in business intelligence, and anyone proficient in Alteryx Designer's basics looking to deepen their understanding. If you aim to enhance your productivity and turn manual data tasks into efficient automated workflows, this book is a perfect fit.

Automate Testing for Power Apps

Are you looking to step up your Power Apps development game? "Automate Testing for Power Apps" is your comprehensive guide to leveraging low-code automation testing tools and techniques. Learn practical steps to integrate these methods into your workflow, ensuring your Power Apps are efficient, effective, and top-notch in quality. What this Book will help me do Master automation testing principles tailored specifically for Power Apps applications. Leverage tools like Test Studio and Test Engine to efficiently test Canvas apps. Learn advanced automation testing techniques for PCF components and model-driven apps. Incorporate robust testing procedures into software deployment for improved workflows. Enhance Power Apps quality and efficiency, reducing emergency fixes and improving user satisfaction. Author(s) César Calvo and Carlos de Huerta have deep expertise in Power Apps development and testing. With years of industry experience, they have honed their skills in creating robust apps and ensuring quality through advanced testing techniques. Their approachable teaching style ensures learners grasp complex concepts effectively. Who is it for? This book is for Power Apps developers and IT professionals aiming to enhance their testing knowledge. Whether you're a beginner looking to grasp the basics or an advanced user exploring new automation possibilities, you'll find this guide invaluable. A basic understanding of Power Apps and Power Platform concepts will be beneficial.