talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516 tagged

Activity Trend (2020-Q1 to 2026-Q1): 68 peak/qtr

Activities

1516 activities · Newest first

Machine Learning with R - Second Edition

Machine Learning with R (Second Edition) provides a thorough introduction to machine learning techniques and their application using the R programming language. You'll gain hands-on experience implementing various algorithms and solving real-world data challenges, making it an invaluable resource for aspiring data scientists and analysts.

What this Book will help me do
- Understand the fundamentals of machine learning and its applications in data analysis.
- Master the use of R for cleaning, exploring, and visualizing data to prepare it for modeling.
- Build and apply machine learning models for classification, prediction, and clustering tasks.
- Evaluate and fine-tune model performance to ensure accurate predictions.
- Explore advanced topics like text mining, handling social network data, and big data analytics.

Author(s)
Brett Lantz is a data scientist with significant experience as both a practitioner and communicator in the machine learning field. With a focus on accessibility, he aims to demystify complex concepts for readers interested in data science. His blend of hands-on methods and theoretical insight has made his work a favorite for both beginners and experienced professionals.

Who is it for?
Ideal for data analysts and aspiring data scientists who have intermediate programming skills and are exploring machine learning. Perfect for R users ready to expand their skill set to include predictive modeling techniques. Also fits those with some experience in machine learning but new to the R environment. Provides insightful guidance for anyone looking to apply machine learning in practical, real-world scenarios.

podcast_episode
by Kyle Polich, Benjamin Uminsky (Los Angeles County Registrar-Recorder/County Clerk)

In this episode, Benjamin Uminsky enlightens us about some of the ways the Los Angeles County Registrar-Recorder/County Clerk leverages data science and analysis to deliver its services to citizens more effectively and efficiently. Our topics range from forecasting to predicting the likelihood that people will volunteer to be poll workers. Benjamin recently spoke at Big Data Day LA. Videos have not yet been posted, but you can see the slides from his talk Data Mining Forecasting and BI at the RRCC if this episode has left you hungry to learn more. During the show, Benjamin encouraged any Los Angeles residents who have some time to serve their community to consider becoming a poll worker.

Bioinformatics with Python Cookbook

Dive into the intersection of biology and data science with 'Bioinformatics with Python Cookbook.' This book equips you to leverage Python and its ecosystem of libraries to tackle complex challenges in computational biology, covering topics like genomics, phylogenetics, and big data bioinformatics.

What this Book will help me do
- Understand the Python ecosystem specifically tailored for computational biology applications.
- Analyze and visualize next-generation sequencing data effectively.
- Explore and simulate population genetics for robust biological research.
- Utilize the Protein Data Bank to extract critical insights about proteins.
- Handle big genomics datasets with Python tools for large-scale bioinformatics studies.

Author(s)
Tiago Antao is an established bioinformatician with expertise in Python programming. With years of practical experience in computational biology, he has tailored this cookbook with detailed and actionable examples. Tiago's mission is to make bioinformatic techniques using Python accessible to researchers of varying skill levels.

Who is it for?
This book is ideal for researchers, biologists, and data scientists with intermediate Python skills looking to expand their expertise in bioinformatics. It caters to professionals wanting to utilize computational tools for solving biological problems. If you're involved in work or study related to genomics, phylogenetics, or large-scale biology datasets, this guide offers practical solutions. Make the most out of Python in your research journey.

Mastering Predictive Analytics with R

Dive into the realm of predictive analytics with this R-focused guide. Whether you're building your first model or refining complex analytics strategies, this book equips you with fundamental techniques and in-depth understanding of predictive modeling using R.

What this Book will help me do
- Master the end-to-end predictive modeling process.
- Classify and select suitable predictive models for specific use cases.
- Understand the mechanics and assumptions of various predictive models.
- Evaluate predictive model performance with appropriate metrics.
- Enhance your R programming skills for analytical tasks.

Author(s)
The authors of this book combine strong technical expertise in data science and predictive analytics with extensive hands-on experience in applying them to real-world challenges. They excel at distilling complex topics into approachable, actionable steps for readers at varying levels of familiarity with R and data analysis. Their commitment to empowering learners defines their work.

Who is it for?
This book is perfect for budding data scientists and quantitative analysts with basic R knowledge who aspire to master predictive analytics. Even experienced professionals will find valuable model-specific insights. If you're familiar with basic statistics and eager to bridge the gap to robust machine learning applications, this book is for you.

The Last Mile of Analytics: Making the Leap from Platforms to Tools

Here's the net takeaway: Businesses want insights from data they can translate into meaningful actions and real results. Software vendors are beginning to deliver a new generation of advanced analytics packages that address business issues directly. In this O'Reilly report, Mike Barlow reveals how this new user-friendly software is helping businesses go beyond data analysis and straight to decision-making—without requiring data science expertise or truckloads of cash. How has advanced analytics progressed from lab project to commercial product so quickly? Through interviews with data analysts, you'll understand the role that machine learning plays in specialized analytics packages, and how this software alone can make decisions based on what's likely to happen next. When you have these capabilities, you’ve reached "the last mile of analytics."

A recent episode of the Skeptics' Guide to the Universe included a slight rant by Dr. Novella and the rogues about a shortcoming in operating systems. This episode explores why such a (seemingly obvious) flaw might make sense from an engineering perspective, and how data science might be the solution. In this solo episode, Kyle proposes the concept of "annoyance mining": the idea that, with proper logging and enough feedback, data scientists could be given the right dataset from which to automatically detect bugs, flaws, annoyances, and potential improvements in software and other systems. As system complexity grows, it seems that an abstraction like this might be required in order to maintain an effective development cycle. This episode is a bit of a soapbox for Kyle as he explores why and how we might track an appropriate amount of data to make software and systems better suited to their users.

Marketing Data Science: Modeling Techniques in Predictive Analytics with R and Python

Now, a leader of Northwestern University's prestigious analytics program presents a fully-integrated treatment of both the business and academic elements of marketing applications in predictive analytics. Writing for both managers and students, Thomas W. Miller explains essential concepts, principles, and theory in the context of real-world applications. Building on Miller's pioneering program, Marketing Data Science thoroughly addresses segmentation, target marketing, brand and product positioning, new product development, choice modeling, recommender systems, pricing research, retail site selection, demand estimation, sales forecasting, customer retention, and lifetime value analysis. Starting where Miller's widely-praised Modeling Techniques in Predictive Analytics left off, he integrates crucial information and insights that were previously segregated in texts on web analytics, network science, information technology, and programming.

Coverage includes:
- The role of analytics in delivering effective messages on the web
- Understanding the web by understanding its hidden structures
- Being recognized on the web – and watching your own competitors
- Visualizing networks and understanding communities within them
- Measuring sentiment and making recommendations
- Leveraging key data science methods: databases/data preparation, classical/Bayesian statistics, regression/classification, machine learning, and text analytics

Six complete case studies address exceptionally relevant issues such as: separating legitimate email from spam; identifying legally-relevant information for lawsuit discovery; gleaning insights from anonymous web surfing data, and more. This text's extensive set of web and network problems draws on rich public-domain data sources; many are accompanied by solutions in Python and/or R. Marketing Data Science will be an invaluable resource for all students, faculty, and professional marketers who want to use business analytics to improve marketing performance.

Data Science in R

This book explains the details involved in solving real computational problems encountered in data analysis. It reveals the dynamic and iterative process by which data analysts approach a problem and reason about different ways of implementing solutions. The book's collection of projects, exercises, and sample solutions encompass practical topics pertaining to data processing and analysis. The book can be used for self-study or as supplementary reading in a statistical computing course, allowing students to gain valuable data science skills.

Data Science from Scratch

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.

- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
- Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Learning Pandas

"Learning Pandas" is your comprehensive guide to mastering pandas, the powerful Python library for data manipulation and analysis. In this book, you'll explore pandas' capabilities and learn to apply them to real-world data challenges. With clear explanations and hands-on examples, you'll enhance your ability to analyze, clean, and visualize data effectively.

What this Book will help me do
- Understand the core concepts of pandas and how it integrates with Python.
- Learn to efficiently manipulate and transform datasets using pandas.
- Gain skills in analyzing and cleaning data to prepare for insights.
- Explore techniques for working with time-series data and financial datasets.
- Discover how to create compelling visualizations with pandas to communicate findings.

Author(s)
Michael Heydt is an experienced Python developer and data scientist with expertise in teaching technical concepts to others. With a deep understanding of the pandas library, Michael has authored several guides on data analysis and is passionate about making complex information accessible. His practical approach ensures readers can directly apply lessons to their own projects.

Who is it for?
This book is ideal for Python programmers who want to harness the power of pandas for data analysis. Whether you're a beginner in data science or looking to refine your skills, you'll find clear, actionable guidance here. Basic programming knowledge is assumed, but no prior pandas experience is necessary. If you're eager to turn data into impactful insights, this book is for you.

Nicole Goebel joins us this week to share her experiences in oceanography studying phytoplankton and other aspects of the ocean and how data plays a role in that science. We also discuss Thinkful where Nicole and I are both mentors for the Introduction to Data Science course. Last but not least, check out Nicole's blog Data Science Girl and the videos Kyle mentioned on her YouTube channel, including one on the diversity of phytoplankton and how that changes in time and space.

Data Science For Dummies

Discover how data science can help you gain in-depth insight into your business - the easy way! Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles in organizations. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you'll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization.

- Provides a background in data science fundamentals before moving on to working with relational databases and unstructured data and preparing your data for analysis
- Details different data visualization techniques that can be used to showcase and summarize your data
- Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques
- Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark

It's a big, big data world out there - let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

podcast_episode
by Kyle Polich, Tim Schmeier (NYC Data Science Academy)

New York State approved the use of automated speed cameras within a specific range of schools. Tim Schmeier did an analysis of publicly available data related to these cameras as part of a project at the NYC Data Science Academy. Tim's work leverages several open data sets to ask the question: are the speed cameras succeeding in their intended purpose of increasing public safety near schools? What he found using open data may surprise you. You can read Tim's write-up titled Speed Cameras: Revenue or Public Safety? on the NYC Data Science Academy blog. His original write-up, reproducible analysis, and figures are a great complement to this episode. For his benevolent recommendation, Tim suggests listeners visit Maddie's Fund - a data-driven charity devoted to helping achieve and sustain a no-kill pet nation. And for his self-serving recommendation, Tim Schmeier will very shortly be on the job market. If you, your employer, or someone you know is looking for data science talent, you can reach Tim at his gmail account, which is timothy.schmeier at gmail dot com.

TIBCO Spotfire: A Comprehensive Primer

TIBCO Spotfire: A Comprehensive Primer is the go-to guide for mastering TIBCO Spotfire, a leading data visualization and analytics tool. Whether you are new to Spotfire or data visualization in general, this book will provide you with a solid foundation to create impactful and actionable visual insights.

What this Book will help me do
- Understand the fundamentals of TIBCO Spotfire and its application in data analytics.
- Learn how to design compelling visualizations and dashboards that convey meaningful insights.
- Master advanced data transformations and analysis techniques in TIBCO Spotfire.
- Integrate Spotfire with external data sources and scripting languages, enhancing its functionality.
- Optimize Spotfire's performance and usability for enterprise-level implementations.

Author(s)
Phillips, an experienced analytics professional and educator, specializes in creating accessible learning materials for data science tools. With a decade of experience in the field, Phillips has helped many organizations unlock their data potential through tools like TIBCO Spotfire. Their approach emphasizes practical understanding, making complex concepts approachable for learners of all levels.

Who is it for?
The book is perfect for business analysts, data scientists, and other professionals involved in data-driven decision making who want to master TIBCO Spotfire. It's designed for beginners without prior exposure to data visualization or TIBCO Spotfire, offering an accessible entry into the field. Individuals aiming to gain hands-on experience and create enterprise-grade solutions will find this book invaluable. Additionally, it serves as a reference for experienced Spotfire users looking to refine their skills.

Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data

Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities and methods and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software.

This book will help you:
- Become a contributor on a data science team
- Deploy a structured lifecycle approach to data analytics problems
- Apply appropriate analytic techniques and tools to analyzing big data
- Learn how to tell a compelling story with data to drive business action
- Prepare for EMC Proven Professional Data Science Certification

Corresponding data sets are available at www.wiley.com/go/9781118876138. Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!

Introductory Statistics and Analytics: A Resampling Perspective

Concise, thoroughly class-tested primer that features basic statistical concepts in the context of analytics, resampling, and the bootstrap.

A uniquely developed presentation of key statistical topics, Introductory Statistics and Analytics: A Resampling Perspective provides an accessible approach to statistical analytics, resampling, and the bootstrap for readers with various levels of exposure to basic probability and statistics. Originally class-tested at one of the first online learning companies in the discipline, www.statistics.com, the book primarily focuses on applications of statistical concepts developed via resampling, with a background discussion of mathematical theory. This feature stresses statistical literacy and understanding, which demonstrates the fundamental basis for statistical inference and demystifies traditional formulas. The book begins with illustrations that have the essential statistical topics interwoven throughout before moving on to demonstrate the proper design of studies.

Meeting all of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) requirements for an introductory statistics course, Introductory Statistics and Analytics: A Resampling Perspective also includes:
- Over 300 "Try It Yourself" exercises and intermittent practice questions, which challenge readers at multiple levels to investigate and explore key statistical concepts
- Numerous interactive links designed to provide solutions to exercises and further information on crucial concepts
- Linkages that connect statistics to the rapidly growing field of data science
- Multiple discussions of various software systems, such as Microsoft Office Excel®, StatCrunch, and R, to develop and analyze data
- Areas of concern and/or contrasting points-of-view indicated through the use of "Caution" icons

Introductory Statistics and Analytics: A Resampling Perspective is an excellent primary textbook for courses in preliminary statistics as well as a supplement for courses in upper-level statistics and related fields, such as biostatistics and econometrics. The book is also a general reference for readers interested in revisiting the value of statistics.

Learning R for Geospatial Analysis

Learn how to leverage the power of R for geospatial analysis in this comprehensive guide. Whether you're processing spatial datasets, creating publication-quality maps, or performing GIS operations, this book covers the necessary tools and techniques for effective analysis, without requiring prior programming knowledge.

What this Book will help me do
- Discover how to manipulate and analyze geospatial data effectively using R.
- Gain proficiency in loading, reshaping, and visualizing spatial data.
- Master key concepts like spatial queries and overlays for GIS tasks.
- Learn to automate spatial data workflows using reproducible R scripts.
- Create high-quality visualizations and maps tailored to your datasets.

Author(s)
Dorman, the author of this book, is an experienced data science educator and practitioner with a particular focus on geospatial data analysis in R. With years of teaching and applied geospatial research, Dorman brings expertise in making advanced topics approachable. Their practical approach ensures readers can immediately put concepts into practice.

Who is it for?
This book is ideal for GIS analysts, geospatial researchers, educators, and students looking to enhance their skillset with R programming. It's particularly suited for those familiar with geographic concepts like coordinates but new to programming or R. If you aim to efficiently analyze spatial data and produce professional-grade visualizations and GIS analyses, this book is for you.

R Recipes: A Problem-Solution Approach

R Recipes is your handy problem-solution reference for learning and using the popular R programming language for statistics and other numerical analysis. Packed with hundreds of code and visual recipes, this book helps you to quickly learn the fundamentals and explore the frontiers of programming, analyzing and using R. R Recipes provides textual and visual recipes for easy and productive templates for use and re-use in your day-to-day R programming and data analysis practice. Whether you're in finance, cloud computing, big or small data analytics, or other applied computational and data science - R Recipes should be a staple for your code reference library.

Sharing Data and Models in Software Engineering

Data Science for Software Engineering: Sharing Data and Models presents guidance and procedures for reusing data and models between projects to produce results that are useful and relevant. Starting with a background section of practical lessons and warnings for beginner data scientists for software engineering, this edited volume proceeds to identify critical questions of contemporary software engineering related to data and models. Learn how to adapt data from other organizations to local problems, mine privatized data, prune spurious information, simplify complex results, update models for new platforms, and more. Chapters share widely applicable experimental results, discussed with a blend of practitioner-focused domain expertise and commentary. Each chapter is written by a prominent expert and offers a state-of-the-art solution to an identified problem facing data scientists in software engineering. Throughout, the editors share best practices collected from their experience training software engineering students and practitioners to master data science, and highlight the methods that are most useful and applicable to the widest range of projects.

- Shares the specific experience of leading researchers and techniques developed to handle data problems in the realm of software engineering
- Explains how to start a project of data science for software engineering as well as how to identify and avoid likely pitfalls
- Provides a wide range of useful qualitative and quantitative principles ranging from very simple to cutting edge research
- Addresses current challenges with software engineering data such as lack of local data, access issues due to data privacy, increasing data quality via cleaning of spurious chunks in data

Web and Network Data Science: Modeling Techniques in Predictive Analytics

Master modern web and network data modeling: both theory and applications. In Web and Network Data Science, a top faculty member of Northwestern University’s prestigious analytics program presents the first fully-integrated treatment of both the business and academic elements of web and network modeling for predictive analytics. Some books in this field focus either entirely on business issues (e.g., Google Analytics and SEO); others are strictly academic (covering topics such as sociology, complexity theory, ecology, applied physics, and economics). This text gives today's managers and students what they really need: integrated coverage of concepts, principles, and theory in the context of real-world applications. Building on his pioneering Web Analytics course at Northwestern University, Thomas W. Miller covers usability testing, Web site performance, usage analysis, social media platforms, search engine optimization (SEO), and many other topics. He balances this practical coverage with accessible and up-to-date introductions to both social network analysis and network science, demonstrating how these disciplines can be used to solve real business problems.