talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516

tagged

Activity Trend: peak of 68 activities per quarter, 2020-Q1 to 2026-Q1

Activities

1516 activities · Newest first

The Data Analysis Workshop

The Data Analysis Workshop teaches you how to analyze and interpret data to solve real-world business problems effectively. By working through practical examples and datasets, you'll gain actionable insights into modern analytic techniques and build your confidence as a data analyst.
What this Book will help me do
Understand and apply fundamental data analysis concepts and techniques to tackle diverse datasets.
Perform rigorous hypothesis testing and analyze group differences within datasets.
Create informative data visualizations using Python libraries like Matplotlib and Seaborn.
Understand and use correlation metrics to identify relationships between variables.
Leverage advanced data manipulation techniques to uncover hidden patterns in complex datasets.
Author(s)
The authors, Gururajan Govindan, Shubhangi Hora, and Konstantin Palagachev, are experts in data science and analytics with years of experience in industry and academia. Their background includes performing business-critical analysis for companies and teaching students how to approach data-driven decision-making. They bring their depth of knowledge and engaging teaching styles together in this approachable guide.
Who is it for?
This book is intended for programmers with proficiency in Python who want to apply their skills to the field of data analysis. Readers who have a foundational understanding of coding and are eager to implement hands-on data science techniques will gain the most value. The content is also suitable for anyone pursuing a data-driven problem-solving mindset. This is an excellent resource to help transition from basic coding proficiency to applying Python in real-world data science.

The Data Wrangling Workshop - Second Edition

The Data Wrangling Workshop is your beginner's guide to the essential techniques and practices of data manipulation using Python. Throughout the book, you will progressively build your skills, learning key concepts such as extracting, cleaning, and transforming data into actionable insights. By the end, you'll be confident in handling various data wrangling tasks efficiently.
What this Book will help me do
Understand and apply the fundamentals of data wrangling using Python.
Combine and aggregate data from diverse sources like web data, SQL databases, and spreadsheets.
Use descriptive statistics and plotting to examine dataset properties.
Handle missing or incorrect data effectively to maintain data quality.
Gain hands-on experience with Python's powerful data science libraries like pandas, NumPy, and Matplotlib.
Author(s)
Brian Lipp, Shubhadeep Roychowdhury, and Dr. Tirthajyoti Sarkar are experienced educators and professionals in the fields of data science and engineering. Their collective expertise spans years of teaching and working with data technologies. They aim to make data wrangling accessible and comprehensible, focusing on practical examples to equip learners with real-world skills.
Who is it for?
The Data Wrangling Workshop is ideal for developers, data analysts, and business analysts aiming to become data scientists or analytics experts. If you're just getting started with Python, you will find this book guiding you step-by-step. A basic understanding of Python programming, as well as relational databases and SQL, is recommended for smooth learning.

The Data Visualization Workshop

In "The Data Visualization Workshop," you will explore the fascinating world of data visualization and learn how to turn raw data into compelling visualizations that clearly communicate your insights. This book provides practical guidance and hands-on exercises to familiarize you with essential topics such as plotting techniques and interactive visualizations using Python.
What this Book will help me do
Prepare and clean raw data for visualization using NumPy and pandas.
Create effective and visually appealing charts using libraries like Matplotlib and Seaborn.
Generate geospatial visualizations utilizing tools like geoplotlib.
Develop interactive visualizations for web integration with the Bokeh library.
Apply visualization techniques to real-world data analysis scenarios, including stock data and Airbnb datasets.
Author(s)
Mario Döbler and Tim Großmann are experienced authors and professionals in the field of Python programming and data science. They bring a wealth of knowledge and practical insights to data visualization. Through their collaborative efforts, they aim to empower readers with the skills to create compelling data visualizations and uncover meaningful data narratives.
Who is it for?
This book is ideal for beginners new to data visualization, as well as developers and data scientists seeking to enhance their practical skills. It is approachable for readers without prior visualization experience but assumes familiarity with Python programming and basic mathematics. If you're eager to bring your data to life in insightful and engaging ways, this book is for you.

The Applied Data Science Workshop - Second Edition

Embark on an interactive journey into the world of data science with 'The Applied Data Science Workshop'. By following real-world scenarios and hands-on exercises, you will explore the fundamentals of data analysis and machine learning modeling within Jupyter Notebooks, leveraging Python libraries like pandas and scikit-learn to draw meaningful insights from data.
What this Book will help me do
Master the process of setting up and using Jupyter Notebooks effectively for data science tasks.
Learn to preprocess, analyze, and visualize data using Python libraries such as pandas, Matplotlib, and Seaborn.
Discover methods to train and evaluate machine learning models using real-world data scenarios.
Apply techniques to assess model performance and optimize models with advanced validation.
Gain the skills to communicate insights through well-documented analyses and stakeholder-ready reports.
Author(s)
Alex Galea, an accomplished author in the data science domain, focuses on making technical concepts understandable and relatable. With this book, Galea leverages years of experience to introduce readers to practical applications of data science using Python. The author's approach ensures that readers not only learn the concepts but also apply them hands-on.
Who is it for?
This book caters to aspiring data scientists and developers interested in data analysis and practical applications of data science techniques. Beginners will find the step-by-step methodology approachable, while those with a basic understanding of Python programming or machine learning can quickly extend their skills. It suits anyone eager to add data science to their professional toolbox.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next.

Abstract Hosted by Al Martin, VP, Data and AI Expert Services and Learning at IBM, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Hadley Wickham, Chief Scientist at RStudio and an Adjunct Professor of Statistics at the University of Auckland, Stanford University, and Rice University. He builds tools that make data science easier and faster, including the famous tidyverse packages for the R programming language. He was named a Fellow by the American Statistical Association for "pivotal contributions to statistical practice through innovative and pioneering research in statistical graphics and computing".

Show Notes
2:39 - Hadley talks about his journey
5:22 - Hadley talks about being named a Fellow by the American Statistical Association for "pivotal contributions to statistical practice"
8:00 - Tidy data concept
9:02 - How Hadley became interested in big data and R
10:12 - Python and R
12:30 - What Hadley is doing now
13:47 - Top 3 packages that help data scientists
17:47 - Hadley discusses his book
22:48 - Writing a book vs. code
29:40 - What language is going to take over
31:01 - What’s next for data
31:54 - What’s cool for Hadley
36:26 - Hadley’s role model
Hadley Wickham’s books
ggplot2
R for Data Science
Advanced R
R Packages
Hadl
The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

As the field of data science grows in popularity, companies find themselves in need of a single common language that can connect their data science teams and data infrastructure teams. Data scientists want rapid iteration, infrastructure engineers want monitoring and security controls, and product owners want their solutions deployed in time for quarterly reports. This talk will discuss how to build an Airflow-based data platform that can take advantage of popular ML tools (Jupyter, TensorFlow, Spark) while creating an easy-to-manage/monitor ecosystem for data infrastructure and support teams. In this talk, we will take an idea from a single-machine Jupyter Notebook to a cross-service Spark + TensorFlow pipeline, to a canary-tested, production-ready model served on Google Cloud Functions. We will show how Apache Airflow can connect all layers of a data team to deliver rapid results.
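
The notebook-to-production path described in this abstract can be pictured as a chain of dependent stages. The sketch below is not Airflow code and is not taken from the talk; it uses only Python's standard-library `graphlib` to show the dependency-ordering idea that an orchestrator such as Airflow applies to its task graphs. All stage names are hypothetical placeholders.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names (assumptions, not from the talk).
# Each stage maps to the set of stages that must finish before it starts.
PIPELINE = {
    "explore_in_notebook": set(),
    "spark_feature_job": {"explore_in_notebook"},
    "train_tensorflow_model": {"spark_feature_job"},
    "canary_test": {"train_tensorflow_model"},
    "deploy_to_cloud_function": {"canary_test"},
}


def run_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


def run(pipeline, actions):
    """Call each stage's action in dependency order and collect the results."""
    results = {}
    for stage in run_order(pipeline):
        results[stage] = actions[stage]()
    return results


if __name__ == "__main__":
    # Stand-in actions that only report the stage name.
    actions = {name: (lambda n=name: f"ran {n}") for name in PIPELINE}
    for stage, outcome in run(PIPELINE, actions).items():
        print(stage, "->", outcome)
```

In a real deployment each stand-in action would be replaced by an orchestrator task, and the scheduler, rather than a loop, would decide when each stage runs.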

End-to-End Data Science with SAS

Learn data science concepts with real-world examples in SAS! End-to-End Data Science with SAS: A Hands-On Programming Guide provides clear and practical explanations of the data science environment, machine learning techniques, and the SAS programming knowledge necessary to develop machine learning models in any industry. The book covers concepts including understanding the business need, creating a modeling data set, linear regression, parametric classification models, and non-parametric classification models. Real-world business examples and example code are used to demonstrate each process step-by-step. Although a significant amount of background information and supporting mathematics are presented, the book is not structured as a textbook, but rather it is a user’s guide for the application of data science and machine learning in a business environment. Readers will learn how to think like a data scientist, wrangle messy data, choose a model, and evaluate the model’s effectiveness. New data scientists or professionals who want more experience with SAS will find this book to be an invaluable reference. Take your data science career to the next level by mastering SAS programming for machine learning models.

Free Data Storytelling Training: Register before it sells out again! New dates for our BI Data Storytelling Mastery Accelerator 3-Day Live Workshop are finally available. Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of the workshop, you'll leave with a clear BI delivery action plan. Register today!
In this episode, you'll learn:
[00:40] Transformational Stories and Lessons Learned: 3 ways to use data storytelling in data science
[19:50] Storytelling as a means of visualization.
[26:52] How to think about answering the questions with data storytelling.
For full show notes and the links mentioned, visit: https://bibrainz.com/podcast/56

Enjoyed the Show?  Please leave us a review on iTunes.

Abstract This week on Making Data Simple, we have Peter Wang, co-founder and CEO of Anaconda, and Shadi Copty, VP of Offering Management. Al, Peter, and Shadi discuss data science and IBM's partnership with Anaconda.
Show Notes
6:11 - Corporate Mission
8:00 - Use Case
9:20 - IBM and Anaconda partnership
14:04 - Cloud Pak for Data: what is it?
15:43 - Python vs R
17:15 - Anaconda’s Future
23:25 - Shadi takes over from Al
25:05 - Data Science Community
33:40 - Center for Humane Technology
Anaconda - https://www.linkedin.com/company/anacondainc/

Connect with the Team Producer Kate Brown - LinkedIn. Producer Meighann Helene - LinkedIn. Producer Michael Sestak - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Mathematical Foundations of Data Science Using R

To best exploit the incredible quantities of data being generated across the most diverse disciplines, the data sciences are gaining worldwide importance. The book provides the mathematical foundations needed to handle data properly. It introduces the basics and functionality of the R programming language, which has become an indispensable tool for data science. It thus gives readers the skills needed to build their own toolkits as modern data scientists.

Modern Data Mining Algorithms in C++ and CUDA C: Recent Developments in Feature Extraction and Selection Algorithms for Data Science

Discover a variety of data-mining algorithms that are useful for selecting small sets of important features from among unwieldy masses of candidates, or extracting useful features from measured variables. As a serious data miner you will often be faced with thousands of candidate features for your prediction or classification application, with most of the features being of little or no value. You’ll know that many of these features may be useful only in combination with certain other features while being practically worthless alone or in combination with most others. Some features may have enormous predictive power, but only within a small, specialized area of the feature space. The problems that plague modern data miners are endless. This book helps you solve these problems by presenting modern feature selection techniques and the code to implement them. Some of these techniques are:
Forward selection component analysis
Local feature selection
Linking features and a target with a hidden Markov model
Improvements on traditional stepwise selection
Nominal-to-ordinal conversion
All algorithms are intuitively justified and supported by the relevant equations and explanatory material. The author also presents and explains complete, highly commented source code. The example code is in C++ and CUDA C but Python or other code can be substituted; the algorithm is important, not the code that's used to write it.
What You Will Learn
Combine principal component analysis with forward and backward stepwise selection to identify a compact subset of a large collection of variables that captures the maximum possible variation within the entire set.
Identify features that may have predictive power over only a small subset of the feature domain. Such features can be profitably used by modern predictive models but may be missed by other feature selection methods.
Find an underlying hidden Markov model that controls the distributions of feature variables and the target simultaneously. The memory inherent in this method is especially valuable in high-noise applications such as prediction of financial markets.
Improve traditional stepwise selection in three ways: examine a collection of 'best-so-far' feature sets; test candidate features for inclusion with cross validation to automatically and effectively limit model complexity; and at each step estimate the probability that our results so far could be just the product of random good luck. We also estimate the probability that the improvement obtained by adding a new variable could have been just good luck.
Take a potentially valuable nominal variable (a category or class membership) that is unsuitable for input to a prediction model, and assign to each category a sensible numeric value that can be used as a model input.
Who This Book Is For
Intermediate to advanced data science programmers and analysts.
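
To make the last learning goal concrete, here is a minimal Python sketch of one common way to give each category of a nominal variable a numeric value: replace the category with the mean of the target observed for it (often called mean target encoding). This is an illustrative stand-in and an assumption on our part; the blurb does not say which conversion the book uses, and the function names are hypothetical.

```python
from collections import defaultdict


def mean_target_encoding(categories, targets):
    """Map each category to the mean target value observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def encode(categories, mapping, default):
    """Replace each category with its numeric value; unseen ones get `default`."""
    return [mapping.get(category, default) for category in categories]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    sector = ["tech", "energy", "tech", "retail"]
    returns = [1.0, 2.0, 3.0, 4.0]
    mapping = mean_target_encoding(sector, returns)
    print(mapping)
    print(encode(["tech", "utilities"], mapping, default=0.0))
```

Because the mapping is learned from the target, in practice it would be fitted on training data only and then applied to held-out data, so that the encoded feature does not leak the outcome it is meant to predict.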

Smarter Data Science

Organizations can make data science a repeatable, predictable tool, which business professionals use to get more value from their data.
Enterprise data and AI projects are often scattershot, underbaked, siloed, and not adaptable to predictable business changes. As a result, the vast majority fail. These expensive quagmires can be avoided, and this book explains precisely how. Data science is emerging as a hands-on tool for not just data scientists, but business professionals as well. Managers, directors, IT leaders, and analysts must expand their use of data science capabilities for the organization to stay competitive. Smarter Data Science helps them achieve their enterprise-grade data projects and AI goals. It serves as a guide to building a robust and comprehensive information architecture program that enables sustainable and scalable AI deployments. When an organization manages its data effectively, its data science program becomes a fully scalable function that’s both prescriptive and repeatable. With an understanding of data science principles, practitioners are also empowered to lead their organizations in establishing and deploying viable AI. They employ the tools of machine learning, deep learning, and AI to extract greater value from data for the benefit of the enterprise. By following a ladder framework that promotes prescriptive capabilities, organizations can make data science accessible to a range of team members, democratizing data science throughout the organization. Companies that collect, organize, and analyze data can move forward to additional data science achievements:
Improving time-to-value with infused AI models for common use cases
Optimizing knowledge work and business processes
Utilizing AI-based business intelligence and data visualization
Establishing a data topology to support general or highly specialized needs
Successfully completing AI projects in a predictable manner
Coordinating the use of AI from any compute node. From inner edges to outer edges: cloud, fog, and mist computing
When they climb the ladder presented in this book, businesspeople and data scientists alike will be able to improve and foster repeatable capabilities. They will have the knowledge to maximize their AI and data assets for the benefit of their organizations.

Analytical Skills for AI and Data Science

While several market-leading companies have successfully transformed their business models by following data- and AI-driven paths, the vast majority have yet to reap the benefits. How can your business and analytics units gain a competitive advantage by capturing the full potential of this predictive revolution? This practical guide presents a battle-tested end-to-end method to help you translate business decisions into tractable prescriptive solutions using data and AI as fundamental inputs. Author Daniel Vaughan shows data scientists, analytics practitioners, and others interested in using AI to transform their businesses not only how to ask the right questions but also how to generate value using modern AI technologies and decision-making principles. You’ll explore several use cases common to many enterprises, complete with examples you can apply when working to solve your own issues.
Break business decisions into stages that can be tackled using different skills from the analytical toolbox
Identify and embrace uncertainty in decision making and protect against common human biases
Customize optimal decisions to different customers using predictive and prescriptive methods and technologies
Ask business questions that create high value through AI- and data-driven technologies

Abstract This week on Making Data Simple, we have Deborah Leff, Global Leader and Industry CTO, Data Science and AI Elite Team. Deborah is an Industry specialist for consumer and travel. In this week’s podcast we talk about demystifying AI and supporting customers around the world with Data Science and AI solutions.

Show Notes
1:10 Deborah explains the mission.
2:46 Deborah talks about transformational technology
14:36 American Airlines reference on YouTube - https://www.youtube.com/watch?v=t1PgNr8VMLc
13:24 Deborah's Medium paper "AI Demands a New Perspective" - https://medium.com/@deborah.leff
22:39 Deborah's Instagram account - https://www.instagram.com/deborah.leff
23:50 and 26:04 Deborah's LinkedIn page - https://www.linkedin.com/in/deborahleff/

Connect with the Team Producer Kate Brown - LinkedIn. Producer Michael Sestak - LinkedIn. Producer Meighann Helene - LinkedIn.

Host Al Martin - LinkedIn and Twitter.

Before the COVID-19 crisis, we were already acutely aware of the need for a broader conversation around data privacy: look no further than the Snowden revelations, Cambridge Analytica, the New York Times Privacy Project, the General Data Protection Regulation (GDPR) in Europe, and the California Consumer Privacy Act (CCPA). In the age of COVID-19, these issues are far more acute. We also know that governments and businesses exploit crises to consolidate and rearrange power, claiming that citizens need to give up privacy for the sake of security. But is this tradeoff a false dichotomy? And what type of tools are being developed to help us through this crisis? In this episode, Katharine Jarmul, Head of Product at Cape Privacy, a company building systems to leverage secure, privacy-preserving machine learning and collaborative data science, will discuss all this and more, in conversation with Dr. Hugo Bowne-Anderson, data scientist and educator at DataCamp.
Links from the show

FROM THE INTERVIEW

Katharine on Twitter
Katharine on LinkedIn
Contact Tracing in the Real World (By Ross Anderson)
The Price of the Coronavirus Pandemic (By Nick Paumgarten)
Do We Need to Give Up Privacy to Fight the Coronavirus? (By Julia Angwin)
Introducing the Principles of Equitable Disaster Response (By Greg Bloom)
Cybersecurity During COVID-19 (By Bruce Schneier)

Practical Statistics for Data Scientists, 2nd Edition

Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what’s important and what’s not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you’ll learn:
Why exploratory data analysis is a key preliminary step in data science
How random sampling can reduce bias and yield a higher-quality dataset, even with big data
How the principles of experimental design yield definitive answers to questions
How to use regression to estimate outcomes and detect anomalies
Key classification techniques for predicting which categories a record belongs to
Statistical machine learning methods that "learn" from data
Unsupervised learning methods for extracting meaning from unlabeled data

ML Ops: Operationalizing Data Science

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren’t truly operational, these models can’t possibly do what you’ve trained them to do. This report introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach (Build, Manage, Deploy and Integrate, and Monitor) for creating ML-infused applications within your organization.
You’ll learn how to:
Fulfill data science value by reducing friction throughout ML pipelines and workflows
Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
Design the ML Ops lifecycle to ensure that people-facing models are unbiased, fair, and explainable
Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action

The Value of AI-Powered Business Intelligence

Artificial intelligence can yield powerful results when applied to business intelligence. Whether it’s pattern recognition in words, numbers, and big datasets or optimizing processes and expediting outcomes, AI is becoming a critical business component. In this report, Michael Norris from IBM explains how to drive AI adoption in your company. What does it mean to infuse AI into BI? It means business users can discover actionable, easy-to-understand insights on their own, independently from IT, even while remaining within the organization’s secure and governed IT architecture. Explore how AI in BI helps you to "get to the why" when analyzing and optimizing the insights you discover.
Learn how AI-infused business intelligence:
Enables line-of-business users to easily discover data-driven insights without requiring specialized data science expertise
Allows users to ask questions in plain language with intuitive exploration tools to gain deeper insight into their data
Provides recommended visualizations and dashboards to present compelling, concise, and explainable data
Prepares datasets for analysis to free up IT analysts and line-of-business users

Abstract This week on Making Data Simple, we are joined by Wennie Allen, Director of Data Science Elite Business, and Brittany Boggle, Senior Data Scientist at IBM. Together, they provide an update on the new initiatives the DSE team is embarking on during the COVID-19 pandemic. This includes employing data optimization and AI decision-making procedures to assist ICU facilities and estimate ventilator demand. Tune in to find out more.
Connect with Wennie: LinkedIn, Twitter
Connect with Brittany: LinkedIn, Twitter
Show Notes
05:52 - Why you should consider taking up a new hobby while physical distancing.
17:52 - More about the Data Science Elite team.
33:49 - Getting up to speed with agile methodology.
39:02 - The Data and AI portfolio.
Connect with the Team Producer Liam Seston - LinkedIn. Producer Lana Cosic - LinkedIn. Producer Meighann Helene - LinkedIn. Producer Kate Brown - LinkedIn. Producer Allison Proctor - LinkedIn. Producer Mark Simmonds - LinkedIn. Producer Michael Sestak - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Interactive Data Visualization with Python - Second Edition

With Interactive Data Visualization with Python, you will learn to turn raw data into compelling, interactive visual stories. This book guides you through the practical uses of Python libraries such as Bokeh and Plotly, teaching you skills to create visualizations that captivate and inform.
What this Book will help me do
Understand and apply different principles and techniques of interactive data visualization to bring your data to life.
Master the use of libraries like Matplotlib, Seaborn, Altair, and Bokeh for creating a variety of data visualizations.
Learn how to customize data visualizations effectively to meet the needs of different audiences and use cases.
Gain proficiency in using advanced tools like Plotly for creating dynamic and engaging visual presentations.
Acquire the ability to identify common pitfalls in visualization and learn strategies to avoid them, ensuring clarity and impact.
Author(s)
Abha Belorkar, Sharath Chandra Guntuku, Shubhangi Hora, and Anshu Kumar are experts in Python programming and data visualization with years of experience in data science and software development. They have collaborated to blend their knowledge into this book, a clear and practical guide to mastering interactive visualization with Python.
Who is it for?
This book is perfect for Python developers, data analysts, and data scientists who want to enhance their skills in data presentation. If you are ready to transform complex data into digestible and interactive visuals, this book is for you. A basic familiarity with Python programming and libraries like pandas is recommended. By the end of the book, you'll feel confident in creating professional-grade data visualizations.