talk-data.com

Topic

SPSS

statistical_software data_analysis predictive_analytics

35

tagged

Activity Trend: peak of 1 per quarter, 2020-Q1 to 2026-Q1

Activities

35 activities · Newest first

The explosion of content in market research has created a paradox: more information but less time to consume it. Companies are now turning to AI chatbots to solve this problem, transforming how professionals interact with research data. Instead of expecting teams to read everything, these tools allow users to extract precisely what they need when they need it. This approach is proving not just more efficient; it also increases engagement with the underlying content. How might your organization benefit from more targeted access to insights? What valuable information might be buried in your existing research that AI could help surface?

With over 30 years of experience in marketing, media, and technology, Dan Coates is the President and co-founder of YPulse, the leading authority on Gen Z and Millennials. YPulse helps brands like Apple, Netflix, and Xbox understand and communicate with consumers aged 13–39, using data and insights from over 400,000 interviews conducted annually across seven countries. Prior to founding YPulse, Dan co-founded SurveyU, an online community and insights platform targeting youth, which merged with YPulse in 2009. He also led the introduction of Globalpark’s SaaS platform into the North American market until its acquisition by QuestBack in 2011. In addition, Dan has held senior roles at Polimetrix, SPSS, PlanetFeedback, and Burke, where he developed cutting-edge practices and products for online marketing insights and transitioned several ventures from early stages to high-value acquisitions.

In the episode, Richie and Dan explore the creation of an AI chatbot for market research, customer engagement challenges, the integration of AI in content consumption, the impact of AI on business strategies, the future of AI in market research, and much more.
Links Mentioned in the Show:
YPulse
Connect with Dan
Haystack by Deepset
Unmanaged: Master the Magic of Creating Empowered and Happy Organizations by Jack Skeels
Skill Track: AI Fundamentals
Related Episode: Can You Use AI-Driven Pricing Ethically? with Jose Mendoza, Academic Director & Clinical Associate Professor at NYU
Rewatch sessions from RADAR: Skills Edition

New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

SPSS Statistics Workbook For Dummies

Practice making sense of data with IBM’s SPSS Statistics software

SPSS Statistics Workbook For Dummies gives you the practice you need to navigate the leading statistical software suite. Data management and analysis, advanced analytics, business intelligence: SPSS is a powerhouse of a research platform, and this book helps you master the fundamentals and analyze data more effectively. You’ll work through practice problems that help you understand the calculations you need to perform, complete predictive analyses, and produce informative graphs. This workbook gives you hands-on exercises to hone your statistical analysis skills with SPSS Statistics 28. Plus, explanations and insider tips help you navigate the software with ease. Practical and easy-to-understand, in classic Dummies style.

• Practice organizing, analyzing, and graphing data
• Learn to write, edit, and format SPSS syntax
• Explore the upgrades and features new to SPSS 28
• Try your hand at advanced data analysis procedures

For academics using SPSS for research, business analysts and market researchers looking to extract valuable insights from data, and anyone with a hankering for more stats practice.

Summary

Every data project, whether it’s analytics, machine learning, or AI, starts with the work of data cleaning. This is a critical step and benefits from being accessible to the domain experts. Trifacta is a platform for managing your data engineering workflow to make curating, cleaning, and preparing your information more approachable for everyone in the business. In this episode CEO Adam Wilson shares the story behind the business, discusses the myriad ways that data wrangling is performed across the business, and how the platform is architected to adapt to the ever-changing landscape of data management tools. This is a great conversation about how deliberate user experience and platform design can make a drastic difference in the amount of value that a business can provide to their customers.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
You listen to this show to learn about all of the latest tools, patterns, and practices that power data engineering projects across every domain. Now there’s a book that captures the foundational lessons and principles that underlie everything that you hear about here. I’m happy to announce I collected wisdom from the community to help you in your journey as a data engineer and worked with O’Reilly to publish it as 97 Things Every Data Engineer Should Know. Go to dataengineeringpodcast.com/97things today to get your copy!
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at dataengineeringpodcast.com/hightouch.
Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.
Your host is Tobias Macey and today I’m interviewing Adam Wilson about Trifacta, a platform for modern data workers to assess quality, transform, and automate data pipelines.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Trifacta is and the story behind it?
Across your site and material you focus on using the term "data wrangling". What is your personal definition of that term, and in what ways do you differentiate from ETL/ELT?

How does the deliberate use of that terminology influence the way that you think about the design and features of the Trifacta platform?

What is Trifacta’s role in the overall data platform/data lifecycle for an organization?

What are some examples of tools that Trifacta might replace?
What tools or systems does Trifacta integrate with?

Who are the target end-users of the Trifacta platform and how do those personas direct the design and functionality?
Can you describe how Trifacta is architected?

How have the goals and design of the system changed or evolved since you first began working on it?

Can you talk through the workflow and lifecycle of data as it traverses your platform, and the user interactions that drive it?
How can data engineers share and encourage proper patterns for working with data assets with end-users across the organization?
What are the limits of scale for volume and complexity of data assets that users are able to manage through Trifacta’s visual tools?

What are some strategies that you and your customers have found useful for pre-processing the information that enters your platform to increase the accessibility for end-users to self-serve?

What are the most interesting, innovative, or unexpected ways that you have seen Trifacta used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Trifacta?
When is Trifacta the wrong choice?
What do you have planned for the future of Trifacta?

Contact Info

LinkedIn
@a_adam_wilson on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don’t forget to check out our other show, Podcast.init, to learn about the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat

Links

Trifacta
Informatica
UC Berkeley
Stanford University
Citadel (Podcast Episode)
Stanford Data Wrangler
DBT (Podcast Episode)
Pig
Databricks
Sqoop
Flume
SPSS
Tableau
SDLC == Software Delivery Life-Cycle

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA


IBM SPSS Essentials, 2nd Edition

Master the fundamentals of SPSS with this newly updated and instructive resource

The newly and thoroughly revised Second Edition of SPSS Essentials delivers a comprehensive guide for students in the social sciences who wish to learn how to use the Statistical Package for the Social Sciences (SPSS) for the effective collection, management, and analysis of data. The accomplished researchers and authors provide readers with the practical nuts and bolts of SPSS usage and data entry, with a particular emphasis on managing and manipulating data. The book offers an introduction to SPSS, how to navigate it, and a discussion of how to understand the data the reader is working with. It also covers inferential statistics, including topics like hypothesis testing, one-sample Z-testing, T-testing, ANOVAs, correlations, and regression. Five unique appendices round out the text, providing readers with discussions of dealing with real-world data, troubleshooting, advanced data manipulations, and new workbook activities.

SPSS Essentials offers a wide variety of features, including:
• A revised chapter order, designed to match the pacing and content of typical undergraduate statistics classes
• An explanation of when particular inferential statistics are appropriate for use, given the nature of the data being worked with
• Additional material on understanding your data sample, including discussions of SPSS output and how to find the most relevant information
• A companion website offering additional problem sets, complete with answers

Perfect for undergraduate students of the social sciences who are just getting started with SPSS, SPSS Essentials also belongs on the bookshelves of advanced placement high school students and practitioners in social science who want to brush up on the fundamentals of this powerful and flexible software package.

SPSS Statistics For Dummies, 4th Edition

The fun and friendly guide to mastering IBM’s Statistical Package for the Social Sciences

Written by an author team with a combined 55 years of experience using SPSS, this updated guide takes the guesswork out of the subject and helps you get the most out of using the leader in predictive analysis. Covering the latest release and updates to SPSS 27.0, and including more than 150 pages of basic statistical theory, it helps you understand the mechanics behind the calculations, perform predictive analysis, produce informative graphs, and more. You’ll even dabble in programming as you expand SPSS functionality to suit your specific needs.

• Master the fundamental mechanics of SPSS
• Learn how to get data into and out of the program
• Graph and analyze your data more accurately and efficiently
• Program SPSS with Command Syntax

Get ready to start handling data like a pro, with step-by-step instruction and expert advice!

This week's guest is Jorge Castanon, a Sr. Data Scientist for Watson Studio at IBM. Host Al Martin and Jorge discuss some typical data problems currently plaguing the industry, and how Watson Studio makes dealing with those problems that much easier. Get ready for an in-depth, technical conversation with two industry experts.

Show Notes
00:10 - Connect with Producer Steve Moore on LinkedIn and Twitter.
00:15 - Connect with Producer Liam Seston on LinkedIn and Twitter.
00:20 - Connect with Producer Rachit Sharma on LinkedIn.
00:25 - Connect with Host Al Martin on LinkedIn and Twitter.
00:41 - Connect with Jorge Castanon on LinkedIn and Twitter.
05:42 - Check out the machine learning hub here.
09:53 - Unsure what customer churn is? Find out in this article.
20:00 - AI is not magic. Read an article discussing the topic here.
24:34 - Learn about SPSS Modeler here.
35:46 - Check out Coursera here.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next.
The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Learn RStudio IDE: Quick, Effective, and Productive Data Science

Discover how to use the popular RStudio IDE as a professional tool that includes code refactoring support, debugging, and Git version control integration. This book gives you a tour of RStudio and shows you how it helps you do exploratory data analysis; build data visualizations with ggplot; and create custom R packages and web-based interactive visualizations with Shiny. In addition, you will cover common data analysis tasks including importing data from diverse sources such as SAS files, CSV files, and JSON. You will map out the features in RStudio so that you will be able to customize RStudio to fit your own style of coding. Finally, you will see how to save a ton of time by adopting best practices and using packages to extend RStudio. Learn RStudio IDE is a quick, no-nonsense tutorial of RStudio that will give you a head start to develop the insights you need in your data science projects.

What You Will Learn
• Quickly, effectively, and productively use RStudio IDE for building data science applications
• Install RStudio and program your first Hello World application
• Adopt the RStudio workflow
• Make your code reusable using RStudio
• Use RStudio and Shiny for data visualization projects
• Debug your code with RStudio
• Import CSV, SPSS, SAS, JSON, and other data

Who This Book Is For
Programmers who want to start doing data science, but don’t know what tools to focus on to get up to speed quickly.

Data Science for Business and Decision Making

Data Science for Business and Decision Making covers both statistics and operations research while most competing textbooks focus on one or the other. As a result, the book more clearly defines the principles of business analytics for those who want to apply quantitative methods in their work. Its emphasis reflects the importance of regression, optimization and simulation for practitioners of business analytics. Each chapter uses a didactic format that is followed by exercises and answers. Freely-accessible datasets enable students and professionals to work with Excel, Stata Statistical Software®, and IBM SPSS Statistics Software®.

• Combines statistics and operations research modeling to teach the principles of business analytics
• Written for students who want to apply statistics, optimization and multivariate modeling to gain competitive advantages in business
• Shows how powerful software packages, such as SPSS and Stata, can create graphical and numerical outputs

Testing Statistical Assumptions in Research

Comprehensively teaches the basics of testing statistical assumptions in research and the importance of doing so

This book helps researchers check the assumptions of the statistical tests used in their research by explaining why checking assumptions matters, showing how to check them, and explaining what to do if assumptions are not met. Testing Statistical Assumptions in Research discusses the concepts of hypothesis testing and statistical errors in detail, as well as the concepts of power, sample size, and effect size. It introduces SPSS functionality and shows how to segregate data, draw random samples, split files, and create variables automatically. It then goes on to cover the different assumptions required in survey studies and the importance of survey design in reporting efficient findings. The book presents various parametric tests and their related assumptions and shows the procedures for testing these assumptions using SPSS software. To motivate readers to check assumptions, it includes many situations where violation of assumptions affects the findings. Assumptions required for different non-parametric tests such as Chi-square, Mann-Whitney, Kruskal-Wallis, and Wilcoxon signed-rank test are also discussed. Finally, it looks at assumptions in non-parametric correlations, such as bi-serial correlation, tetrachoric correlation, and phi coefficient.

• An excellent reference for graduate students and research scholars of any discipline in testing assumptions of statistical tests before using them in their research study
• Shows readers the adverse effect of violating the assumptions on findings by means of various illustrations
• Describes different assumptions associated with different statistical tests commonly used by research scholars
• Contains examples using SPSS, which help readers understand the procedure involved in testing assumptions
• Looks at commonly used assumptions in statistical tests, such as z, t and F tests, ANOVA, correlation, and regression analysis

Testing Statistical Assumptions in Research is a valuable resource for graduate students of any discipline who write a thesis or dissertation for empirical studies in their coursework, as well as for data analysts.

Practical Web Scraping for Data Science: Best Practices and Examples with Python

This book provides a complete and modern guide to web scraping, using Python as the programming language, without glossing over important details or best practices. Written with a data science audience in mind, the book explores both scraping and the larger context of web technologies in which it operates, to ensure full understanding. The authors recommend web scraping as a powerful tool for any data scientist’s arsenal, as many data science projects start by obtaining an appropriate data set. Starting with a brief overview on scraping and real-life use cases, the authors explore the core concepts of HTTP, HTML, and CSS to provide a solid foundation. Along with a quick Python primer, they cover Selenium for JavaScript-heavy sites, and web crawling in detail. The book finishes with a recap of best practices and a collection of examples that bring together everything you've learned and illustrate various data science use cases.

What You'll Learn
• Leverage well-established best practices and commonly-used Python packages
• Handle today's web, including JavaScript, cookies, and common web scraping mitigation techniques
• Understand the managerial and legal concerns regarding web scraping

Who This Book is For
A data science oriented audience that is probably already familiar with Python or another programming language or analytical toolkit (R, SAS, SPSS, etc). Students or instructors in university courses may also benefit. Readers unfamiliar with Python will appreciate a quick Python primer in chapter 1 to catch up with the basics and provide pointers to other guides as well.

Mathematical Statistics

Explores mathematical statistics in its entirety, from the fundamentals to modern methods

This book introduces readers to point estimation, confidence intervals, and statistical tests. Based on the general theory of linear models, it provides an in-depth overview of the following: analysis of variance (ANOVA) for models with fixed, random, and mixed effects; regression analysis, first presented for linear models with fixed, random, and mixed effects before being expanded to nonlinear models; statistical multi-decision problems like statistical selection procedures (Bechhofer and Gupta) and sequential tests; and design of experiments from a mathematical-statistical point of view. Most analysis methods have been supplemented by formulae for minimal sample sizes. The chapters also contain exercises with hints for solutions. Translated from the successful German text, Mathematical Statistics requires knowledge of probability theory (combinatorics, probability distributions, functions and sequences of random variables), which is typically taught in the earlier semesters of scientific and mathematical study courses. It teaches readers all about statistical analysis and covers the design of experiments. The book also describes optimal allocation in the chapters on regression analysis. Additionally, it features a chapter devoted solely to experimental designs.

• Classroom-tested with exercises included
• Practice-oriented (taken from day-to-day statistical work of the authors)
• Includes further studies including design of experiments and sample sizing
• Presents and uses IBM SPSS Statistics 24 for practical calculations of data

Mathematical Statistics is a recommended text for advanced students and practitioners of math, probability, and statistics.

IBM SPSS Modeler Essentials

Learn how to leverage IBM SPSS Modeler for your data mining and predictive analytics needs in this comprehensive guide. With step-by-step instructions, you'll acquire the skills to import, clean, analyze, and model your data using this robust platform. By the end, you'll be equipped to uncover patterns and trends, enabling confident data-driven decision-making.

What this Book will help me do
• Understand the fundamentals of data mining and the visual programming interface of IBM SPSS Modeler.
• Prepare, clean, and preprocess data effectively for analysis and modeling.
• Build robust predictive models such as decision trees using best practices.
• Evaluate the performance of your analytical models to ensure accuracy and reliability.
• Export resulting analyses to apply insights to real-world data projects.

Author(s)
Keith McCormick and Jesus Salcedo are accomplished professionals in data analytics and statistical modeling. With extensive experience in consulting and teaching, they have guided many in mastering IBM SPSS Modeler through both hands-on workshops and written material. Their approachable teaching style and commitment to clarity ensure accessibility for learners.

Who is it for?
This book is designed for beginner users of IBM SPSS Modeler who wish to gain practical and actionable skills in data analytics. If you're a data enthusiast looking to explore predictive analytics or a professional eager to discover the insights hidden in your organizational data, this book is for you. A basic understanding of data mining concepts is advantageous but not required. This resource will set any novice on the path toward expert-level comprehension and application.

Data Analysis with IBM SPSS Statistics

"Data Analysis with IBM SPSS Statistics" is a comprehensive guide designed to help you master IBM SPSS Statistics for performing robust statistical analyses. Through a practical approach, the book delves into critical techniques like data visualization, regression analysis, and hypothesis testing, enabling you to uncover patterns, make informed decisions, and enhance data interpretation.

What this Book will help me do
• Set up and configure IBM SPSS Statistics for effective data analysis workflows.
• Perform data cleaning and preparation, including addressing missing data and restructuring datasets.
• Master statistical techniques such as ANOVA, regression analysis, and clustering to draw insights from data.
• Generate intuitive visualizations like charts and graphs to communicate findings effectively.
• Build predictive models and evaluate their effectiveness for decision-making purposes.

Author(s)
Ken Stehlik-Barry and Anthony Babinec are seasoned data analysts and IBM SPSS experts with extensive experience in statistical methodologies and data science. They have a knack for translating complex concepts into accessible lessons, making this book an ideal resource for learners aiming to build their SPSS aptitude. Their expertise ensures a well-rounded learning journey.

Who is it for?
This book is tailored for data analysts and researchers who need to analyze and interpret data effectively using IBM SPSS Statistics. Readers should have basic familiarity with statistical concepts, making it ideal for those with a foundational understanding of statistics. If you aim to grasp practical applications of SPSS for real-world data challenges, this book is for you.

SPSS Statistics for Data Analysis and Visualization

Dive deeper into SPSS Statistics for more efficient, accurate, and sophisticated data analysis and visualization

SPSS Statistics for Data Analysis and Visualization goes beyond the basics of SPSS Statistics to show you advanced techniques that exploit the full capabilities of SPSS. The authors explain when and why to use each technique, and then walk you through the execution with a pragmatic, nuts and bolts example. Coverage includes extensive, in-depth discussion of advanced statistical techniques, data visualization, predictive analytics, and SPSS programming, including automation and integration with other languages like R and Python. You'll learn the best methods to power through an analysis, with more efficient, elegant, and accurate code. IBM SPSS Statistics is complex: true mastery requires a deep understanding of statistical theory, the user interface, and programming. Most users don't encounter all of the methods SPSS offers, leaving many little-known modules undiscovered. This book walks you through tools you may have never noticed, and shows you how they can be used to streamline your workflow and enable you to produce more accurate results.

• Conduct a more efficient and accurate analysis
• Display complex relationships and create better visualizations
• Model complex interactions and master predictive analytics
• Integrate R and Python with SPSS Statistics for more efficient, more powerful code

These "hidden tools" can help you produce charts that simply wouldn't be possible any other way, and the support for other programming languages gives you better options for solving complex problems. If you're ready to take advantage of everything this powerful software package has to offer, SPSS Statistics for Data Analysis and Visualization is the expert-led training you need.

Enabling Real-time Analytics on IBM z Systems Platform

Regarding online transaction processing (OLTP) workloads, the IBM® z Systems™ platform, with IBM DB2®, data sharing, Workload Manager (WLM), geoplex, and other high-end features, is the widely acknowledged leader. Most customers now integrate business analytics with OLTP by running, for example, scoring functions from transactional context for real-time analytics or by applying machine-learning algorithms on enterprise data that is kept on the mainframe. As a result, IBM adds investment so clients can keep the complete lifecycle for data analysis, modeling, and scoring under z Systems control in a cost-efficient way, keeping the qualities of service in availability, security, and reliability that z Systems solutions offer. Because of the changed architecture and tighter integration, IBM has shown, in a customer proof-of-concept, that a particular client was able to achieve an orders-of-magnitude improvement in performance, allowing that client’s data scientist to investigate the data in a more interactive process. Open technologies, such as Predictive Model Markup Language (PMML), can help customers update single components instead of being forced to replace everything at once. As a result, you have the possibility to combine your preferred tool for model generation (such as SAS Enterprise Miner or IBM SPSS® Modeler) with a different technology for model scoring (such as Zementis, a company focused on PMML scoring). IBM SPSS Modeler is a leading data mining workbench that can apply various algorithms in data preparation, cleansing, statistics, visualization, machine learning, and predictive analytics. It has over 20 years of experience and continued development, and is integrated with z Systems. With IBM DB2 Analytics Accelerator 5.1 and SPSS Modeler 17.1, the possibility exists to do the complete predictive model creation including data transformation within DB2 Analytics Accelerator.
So, instead of moving the data to a distributed environment, algorithms can be pushed to the data, using cost-efficient DB2 Accelerator for the required resource-intensive operations.

This IBM Redbooks® publication explains the overall z Systems architecture, how the components can be installed and customized, how the new IBM DB2 Analytics Accelerator loader can help efficient data loading for z Systems data and external data, how in-database transformation, in-database modeling, and in-transactional real-time scoring can be used, and what other related technologies are available. This book is intended for technical specialists and architects, and data scientists who want to use the technology on the z Systems platform. Most of the technologies described in this book require IBM DB2 for z/OS®. For acceleration of the data investigation, data transformation, and data modeling process, DB2 Analytics Accelerator is required. Most value can be achieved if most of the data already resides on z Systems platforms, although adding external data (like from social sources) poses no problem at all.
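
As a rough illustration of the PMML point in the description above (a model generated in one tool and scored by a different component), the sketch below parses a tiny PMML regression model with Python's standard library and applies it to one record. The model document, its field names (`tenure`, `calls`, `score`), and its coefficients are hypothetical and are not taken from the Redbooks publication; the helper functions `load_linear_model` and `score` are likewise made up for this example and handle only a plain linear `RegressionTable`, which is a small fraction of what a real PMML scoring engine supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.2 document, standing in for a model exported by a modeling tool.
PMML_DOC = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="hypothetical example model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure" optype="continuous" dataType="double"/>
    <DataField name="calls" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="tenure"/>
      <MiningField name="calls"/>
      <MiningField name="score" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="tenure" coefficient="-0.25"/>
      <NumericPredictor name="calls" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# PMML elements live in a versioned XML namespace, so lookups need a prefix map.
NS = {"p": "http://www.dmg.org/PMML-4_2"}


def load_linear_model(pmml_text):
    """Read the intercept and numeric coefficients from a linear RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one record (a dict of field name -> value)."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
print(score(model, {"tenure": 4.0, "calls": 3.0}))  # 1.5 - 1.0 + 1.5 = 2.0
```

Because the scorer depends only on the PMML text, the tool that produced the model could be swapped without touching the scoring code, which is the kind of component-by-component replacement the description refers to.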

What IS customer intelligence? What is a customer? Is the customer best understood by breaking the word down into its component parts: "cuss" and "tumor?" Would that be an intelligent thing to do? Will these and related questions some day be answered by self-aware machines? Will any of these questions be answered on this episode? Give it a listen and find out! The mish-mash of companies, products, and miscellany mentioned on this show include: Adobe, Oracle/ATG, SAS Customer Intelligence, Salesforce.com, Scott Brinker (Chief Martec), Domo, Data Studio 360, Tableau, iJento, Netezza, SPSS, Unfrozen Caveman Lawyer, Eight Is Enough, Legend of the Plaid Dragon (and the Slack version), Office Vibe, p-value article on fivethirtyeight.com (and the p-hacking app), and the "AI, Deep Learning, and Machine Learning" video.

Effective CRM using Predictive Analytics

A step-by-step guide to data mining applications in CRM.

Following a handbook approach, this book bridges the gap between analytics and their use in everyday marketing, providing guidance on solving real business problems using data mining techniques. The book is organized into three parts. Part one provides a methodological roadmap, covering both the business and the technical aspects. The data mining process is presented in detail along with specific guidelines for the development of optimized acquisition, cross-, deep-, and up-selling and retention campaigns, as well as effective customer segmentation schemes. In part two, some of the most useful data mining algorithms are explained in a simple and comprehensive way for business users with no technical expertise. Part three is packed with real-world case studies that use three leading data mining tools: IBM SPSS Modeler, RapidMiner and Data Mining for Excel. Case studies from industries including banking, retail and telecommunications are presented in detail so as to serve as templates for developing similar applications.

Key Features:
• Includes numerous real-world case studies which are presented step by step, demystifying the usage of data mining models and clarifying all the methodological issues.
• Topics are presented with the use of three leading data mining tools: IBM SPSS Modeler, RapidMiner and Data Mining for Excel.
• Accompanied by a website featuring material from each case study, including datasets and relevant code.

Combining data mining and business knowledge, this practical book provides all the necessary information for designing, setting up, executing and deploying data mining techniques in CRM. Effective CRM using Predictive Analytics will benefit data mining practitioners and consultants, data analysts, statisticians, and CRM officers. The book will also be useful to academics and students interested in applied data mining.

Getting Started with Data Science: Making Sense of Data with Analytics

Master Data Analytics Hands-On by Solving Fascinating Problems You’ll Actually Enjoy! Harvard Business Review recently called data science “The Sexiest Job of the 21st Century.” It’s not just sexy: For millions of managers, analysts, and students who need to solve real business problems, it’s indispensable. Unfortunately, there’s been nothing easy about learning data science, until now. Getting Started with Data Science takes its inspiration from worldwide best-sellers like Freakonomics and Malcolm Gladwell’s Outliers: It teaches through a powerful narrative packed with unforgettable stories. Murtaza Haider offers informative, jargon-free coverage of basic theory and technique, backed with plenty of vivid examples and hands-on practice opportunities. Everything’s software and platform agnostic, so you can learn data science whether you work with R, Stata, SPSS, or SAS. Best of all, Haider teaches a crucial skillset most data science books ignore: how to tell powerful stories using graphics and tables. Every chapter is built around real research challenges, so you’ll always know why you’re doing what you’re doing. You’ll master data science by answering fascinating questions, such as:
• Are religious individuals more or less likely to have extramarital affairs?
• Do attractive professors get better teaching evaluations?
• Does the higher price of cigarettes deter smoking?
• What determines housing prices more: lot size or the number of bedrooms?
• How do teenagers and older people differ in the way they use social media?
• Who is more likely to use online dating services?
• Why do some purchase iPhones and others BlackBerry devices?
• Does the presence of children influence a family’s spending on alcohol?
For each problem, you’ll walk through defining your question and the answers you’ll need; exploring how others have approached similar challenges; selecting your data and methods; generating your statistics; organizing your report; and telling your story. Throughout, the focus is squarely on what matters most: transforming data into insights that are clear, accurate, and actionable.

Accelerating Data Transformation with IBM DB2 Analytics Accelerator for z/OS: Understanding and Using Accelerator-only Tables

Transforming data from operational data models to purpose-oriented data structures has been commonplace for decades. Data transformations are heavily used in all types of industries to provide information to various users at different levels. Depending on individual needs, the transformed data is stored in various systems. Sending operational data to other systems for further processing is then required, and introduces much complexity to an existing information technology (IT) infrastructure. Although maintenance of additional hardware and software is one component, potential inconsistencies and individually managed refresh cycles are others. For decades, there was no simple and efficient way to perform data transformations on the source system of operational data. With IBM® DB2® Analytics Accelerator, DB2 for z/OS is now in a unique position to complete these transformations in an efficient and well-performing way. DB2 for z/OS completes these transformations while connecting to the same platform that is used for operational transactions, helping you to minimize your efforts to manage existing IT infrastructure. Real-time analytics on incoming operational transactions is another demand. Creating a comprehensive scoring model to detect specific patterns inside your data can easily require multiple iterations and multiple hours to complete. By enabling a first set of analytical functionality in DB2 Analytics Accelerator, those dedicated mining algorithms can now be run on an accelerator to efficiently perform these modeling tasks. Given the speed of query processing on an accelerator, these modeling tasks can now be performed much more quickly than on traditional relational database management systems. This speed enables you to keep your scoring algorithms more up-to-date, and ultimately adapt more quickly to constantly changing customer behaviors.
This IBM Redbooks® publication describes the new table type that is introduced with DB2 Analytics Accelerator V4.1 PTF5 that enables more efficient data transformations. These tables are called accelerator-only tables, and can exist on an accelerator only. The tables benefit from the accelerator performance characteristics, while maintaining access through existing DB2 for z/OS application programming interfaces (APIs). Additionally, we describe the newly introduced analytical capabilities with DB2 Analytics Accelerator V5.1, putting you in the position to efficiently perform data modeling for online analytical requirements in your DB2 for z/OS environment. This book is intended for technical decision-makers who want to get a broad understanding of the analytical capabilities and accelerator-only tables of DB2 Analytics Accelerator. In addition, you learn how these capabilities can be used to accelerate in-database transformations and in-database analytics in various environments and scenarios, including the following scenarios:
• Multi-step processing and reporting in IBM DB2 Query Management Facility™, IBM Campaign, or MicroStrategy environments
• In-database transformations using IBM InfoSphere® DataStage®
• Ad hoc data analysis for data scientists
• In-database analytics using IBM SPSS® Modeler