O'Reilly Data Science Books

Advanced Analytics with R and Tableau

2017-08-22 O'Reilly Amazon

book

Lourdes Bolaños Pérez , Ruben Oliva Ramos , Jen Stirrup

data data-science data-science-tools r Analytics Data Science

In "Advanced Analytics with R and Tableau," you will learn how to combine the statistical computing power of R with the excellent data visualization capabilities of Tableau to perform advanced analysis and present your findings effectively. This book guides you through practical examples to understand topics such as classification, clustering, and predictive analytics while creating compelling visual dashboards.

What this Book will help me do:

Integrate advanced statistical computations in R with Tableau's visual analysis for comprehensive analytics.

Master making R function calls from Tableau through practical applications such as Rserve integration.

Develop predictive and classification models in R and visualize them in Tableau dashboards.

Understand clustering and unsupervised learning concepts, applied to real-world datasets for business insights.

Leverage the combination of Tableau and R for making impactful, data-driven decisions in your organization.

Author(s): Ruben Oliva Ramos, Jen Stirrup, and Roberto Rösler are accomplished professionals with extensive experience in data science and analytics. Their combined expertise brings practical insights into combining R and Tableau for advanced analytics. Advocates for hands-on learning, they emphasize clarity and actionable knowledge in their writing.

Who is it for? "Advanced Analytics with R and Tableau" is ideal for business analysts, data scientists, and Tableau professionals eager to expand their capabilities into advanced analytics. Readers should be familiar with Tableau and have basic knowledge of R, though the book starts with accessible examples. If you're looking to enhance your analytics with R's statistical power seamlessly integrated into Tableau, this book is for you.

Mastering Predictive Analytics with R - Second Edition

2017-08-18 O'Reilly Amazon

book

Rui Miguel Forte , James D. Miller

data data-science data-science-tools r AI/ML Analytics

This comprehensive guide dives into predictive analytics with R, exploring the powerful functionality and vast ecosystem of packages available in this programming language. By studying this book, you will gain mastery over predictive modeling techniques and learn how to apply machine learning to real-world problems efficiently and effectively.

What this Book will help me do:

Develop proficiency in predictive modeling processes, from data preparation to model evaluation.

Gain hands-on experience with R's diverse packages for machine learning.

Understand the theoretical foundations and practical applications of various predictive models.

Learn advanced techniques such as deep learning implementations of word embeddings and recurrent neural networks.

Acquire the ability to handle large datasets using R for scalable predictive analytics workflows.

Author(s): James D. Miller and Rui Miguel Forte are experts in data science and predictive analytics with decades of combined experience in the field. They bring practical insights from their work in both academia and industry. Their clear and engaging writing style aims at making complex concepts accessible to readers by integrating theoretical knowledge with real-world applications.

Who is it for? This book is ideal for budding data scientists, predictive modelers, or quantitative analysts with some basic knowledge of R and statistics. Advanced learners aiming to refine their expertise in predictive analytics and those wishing to explore the functionality of R for applied machine learning will also greatly benefit from this resource. The book is suitable for professionals and enthusiasts keen to expand their understanding of predictive modeling and learn advanced techniques.

Elegant SciPy

2017-08-11 O'Reilly Amazon

book

Stéfan van der Walt , Juan Nunez-Iglesias , Harriet Dashnow

data data-science data-science-tools SciPy NumPy Pandas

Welcome to Scientific Python and its community. If you’re a scientist who programs with Python, this practical guide not only teaches you the fundamental parts of SciPy and libraries related to it, but also gives you a taste for beautiful, easy-to-read code that you can use in practice. You’ll learn how to write elegant code that’s clear, concise, and efficient at executing the task at hand. Throughout the book, you’ll work with examples from the wider scientific Python ecosystem, using code that illustrates principles outlined in the book. Using actual scientific data, you’ll work on real-world problems with SciPy, NumPy, Pandas, scikit-image, and other Python libraries.

Explore the NumPy array, the data structure that underlies numerical scientific computation

Use quantile normalization to ensure that measurements fit a specific distribution

Represent separate regions in an image with a Region Adjacency Graph

Convert temporal or spatial data into frequency domain data with the Fast Fourier Transform

Solve sparse matrix problems, including image segmentations, with SciPy’s sparse module

Perform linear algebra by using SciPy packages

Explore image alignment (registration) with SciPy’s optimize module

Process large datasets with Python data streaming primitives and the Toolz library

Learning Informatica PowerCenter 10.x - Second Edition

2017-08-10 O'Reilly Amazon

book

Rahul Malewar

data data-science analytics-platforms Informatica Data Management DWH

Dive into the world of Informatica PowerCenter 10.x, where enterprise data warehousing meets cutting-edge data management solutions. This comprehensive guide walks you through mastering ETL processes and optimizing performance, helping you become proficient in this powerful data integration tool. With step-by-step instructions, you'll build your knowledge from installation to advanced techniques.

What this Book will help me do:

Understand how to install and configure Informatica PowerCenter 10.x for enterprise-level data integration projects, ensuring readiness to start transforming data effectively.

Gain hands-on experience with PowerCenter's various developer tools, including Workflow Manager, Workflow Monitor, Designer, and Repository Manager, mastering their practical utilities.

Learn and apply essential data warehousing concepts, such as Slowly Changing Dimensions (SCDs) and Incremental Aggregations, to create robust data-handling workflows.

Leverage advanced PowerCenter features like pushdown optimization and partitioning to optimize performance for large-scale data processing jobs.

Become proficient in migrating sources, targets, and workflows between environments, enabling seamless integration of data management solutions across enterprise systems.

Author(s): Rahul Malewar, a seasoned expert in ETL and data integration, brings his extensive experience with Informatica PowerCenter to the table. With years spent working alongside global enterprises to streamline their data operations, Rahul's insights translate into a hands-on teaching style that simplifies even the most advanced concepts. Adept at bridging technical depth with accessible explanations, he has dedicated his career to empowering learners to unlock the full potential of their data warehousing tools.

Who is it for? Perfect for developers and data professionals aiming to elevate their enterprise data management skills, this book is ideally suited for those new to or experienced with Informatica. Whether you're striving to become proficient in PowerCenter or seeking to implement advanced ETL concepts in your projects, this guide will equip you with the expertise to succeed. A foundational understanding of programming and data warehousing concepts is recommended for best results.

Business Survival Analysis Using SAS

2017-07-31 O'Reilly Amazon

book

Jorge Ribeiro

data data-science analytics-platforms SAS Marketing

Solve business problems involving time-to-event and resulting probabilities by following the modeling tutorials in Business Survival Analysis Using SAS®: An Introduction to Lifetime Probabilities, the first book to be published in the field of business survival analysis! Survival analysis is a challenge. Books applying to health sciences exist, but nothing about survival applications for business has been available until now. Written for analysts, forecasters, econometricians, and modelers who work in marketing or credit risk and have little SAS modeling experience, Business Survival Analysis Using SAS® builds on a foundation of SAS code that works in any survival model and features numerous annotated graphs, coefficients, and statistics linked to real business situations and data sets. This guide also helps recent graduates who know the statistics but do not necessarily know how to apply them get up and running in their jobs. By example, it teaches the techniques while avoiding advanced theoretical underpinnings so that busy professionals can rapidly deliver a survival model to meet common business needs.

From first principles, this book teaches survival analysis by highlighting its relevance to business cases. A pragmatic introduction to survival analysis models, it leads you through business examples that contextualize and motivate the statistical methods and SAS coding. Specifically, it illustrates how to build a time-to-next-purchase survival model in SAS® Enterprise Miner, and it relates each step to the underlying statistics and to Base SAS® and SAS/STAT® software. Following the many examples, from data preparation to validation to scoring new customers, you will learn to develop and apply survival analysis techniques to scenarios faced by companies in the financial services, insurance, telecommunication, and marketing industries, including the following scenarios:

Time-to-next-purchase for marketing

Employee turnover for human resources

Small business portfolio macroeconometric stress tests for banks

International Financial Reporting Standard (IFRS 9) lifetime probability of default for banks and building societies

"Churn," or attrition, models for the telecommunications and insurance industries

Bayesian Psychometric Modeling

2017-07-28 O'Reilly Amazon

book

Roy Levy , Robert J. Mislevy

data data-science data-science-tasks statistics bayesian-statistics

This book presents a unified Bayesian approach across traditionally separate families of psychometric models. It shows that Bayesian techniques, as alternatives to conventional approaches, offer distinct and profound advantages in achieving many goals of psychometrics. The book covers foundational principles and statistical models as well as popular psychometric models. Throughout the text, procedures are illustrated using examples primarily from educational assessments. A supplementary website provides the datasets, WinBUGS code, R code, and Netica files used in the examples.

Oracle Internals

2017-07-27 O'Reilly Amazon

book

Donald K. Burleson

data data-science analytics-platforms oracle-bi Oracle

Oracle has evolved from a simple relational database into one of the most complex e-commerce platforms ever devised. This book presents a compendium of articles from Oracle Internals, Auerbach Publications' newsletter for Oracle database administrators and other Oracle professionals.

Predictive Modeling with SAS Enterprise Miner, 3rd Edition

2017-07-20 O'Reilly Amazon

book

Kattamuri S. Sarma

data data-science analytics-platforms SAS

A step-by-step guide to predictive modeling!

Kattamuri Sarma's Predictive Modeling with SAS Enterprise Miner: Practical Solutions for Business Applications, Third Edition, will show you how to develop and test predictive models quickly using SAS Enterprise Miner. Using realistic data, the book explains complex methods in a simple and practical way to readers from different backgrounds and industries. Incorporating the latest version of Enterprise Miner, this third edition also expands the section on time series.

Written for business analysts, data scientists, statisticians, students, predictive modelers, and data miners, this comprehensive text provides examples that will strengthen your understanding of the essential concepts and methods of predictive modeling. Topics covered include logistic regression, regression, decision trees, neural networks, variable clustering, observation clustering, data imputation, binning, data exploration, variable selection, variable transformation, and much more, including analysis of textual data.

Develop predictive models quickly, learn how to test numerous models and compare the results, gain an in-depth understanding of predictive models and multivariate methods, and discover how to do in-depth analysis. Do it all with Predictive Modeling with SAS Enterprise Miner!

Analysis of Clinical Trials Using SAS, 2nd Edition

2017-07-17 O'Reilly Amazon

book

Gary G. Koch , Alex Dmitrienko

data data-science analytics-platforms SAS

Analysis of Clinical Trials Using SAS®: A Practical Guide, Second Edition bridges the gap between modern statistical methodology and real-world clinical trial applications. Tutorial material and step-by-step instructions illustrated with examples from actual trials serve to define relevant statistical approaches, describe their clinical trial applications, and implement the approaches rapidly and efficiently using the power of SAS. Topics reflect the International Conference on Harmonization (ICH) guidelines for the pharmaceutical industry and address important statistical problems encountered in clinical trials. Commonly used methods are covered, including dose-escalation and dose-finding methods that are applied in Phase I and Phase II clinical trials, as well as important trial designs and analysis strategies that are employed in Phase II and Phase III clinical trials, such as multiplicity adjustment, data monitoring, and methods for handling incomplete data. This book also features recommendations from clinical trial experts and a discussion of relevant regulatory guidelines.

This new edition includes more examples and case studies, new approaches for addressing statistical problems, and the following new technological updates:

SAS procedures used in group sequential trials (PROC SEQDESIGN and PROC SEQTEST)

SAS procedures used in repeated measures analysis (PROC GLIMMIX and PROC GEE)

Macros for implementing a broad range of randomization-based methods in clinical trials, performing complex multiplicity adjustments, and investigating the design and analysis of early phase trials (Phase I dose-escalation trials and Phase II dose-finding trials)

Clinical statisticians, research scientists, and graduate students in biostatistics will greatly benefit from the decades of clinical research experience and the ready-to-use SAS macros compiled in this book.

Principles of Data Wrangling

2017-07-14 O'Reilly Amazon

book

Jeffrey Heer , Connor Carreras , Sean Kandel , Joseph M. Hellerstein , Tye Rattenbury

data data-science Agile/Scrum Trifacta

A key task that any aspiring data-driven organization needs to learn is data wrangling, the process of converting raw data into something truly useful. This practical guide provides business analysts with an overview of various data wrangling techniques and tools, and puts the practice of data wrangling into context by asking, "What are you trying to do and why?" Wrangling data consumes roughly 50-80% of an analyst’s time before any kind of analysis is possible. Written by key executives at Trifacta, this book walks you through the wrangling process by exploring several factors (time, granularity, scope, and structure) that you need to consider as you begin to work with data. You’ll learn a shared language and a comprehensive understanding of data wrangling, with an emphasis on recent agile analytic processes used by many of today’s data-driven organizations.

Appreciate the importance, and the satisfaction, of wrangling data the right way

Understand what kind of data is available

Choose which data to use and at what level of detail

Meaningfully combine multiple sources of data

Decide how to distill the results to a size and shape that can drive downstream analysis

Dynamic Documents with R and knitr, 2nd Edition

2017-07-12 O'Reilly Amazon

book

Yihui Xie

data data-science data-science-tools r

Suitable for both beginners and advanced users, this popular book makes writing statistical reports easier by integrating computing directly with reporting. Reports range from homework, projects, exams, books, blogs, and web pages to virtually any documents related to statistical graphics, computing, and data analysis. This edition includes a new chapter on R Markdown v2, changes that reflect improvements in the knitr package, and several new sections. Demos and other information about the package are available on the author’s website.

Analytics

2017-07-05 O'Reilly Amazon

book

Phil Simon

data data-science web-analytics google-analytics Agile/Scrum Analytics

For years, organizations have struggled to make sense out of their data. IT projects designed to provide employees with dashboards, KPIs, and business-intelligence tools often take a year or more to reach the finish line...if they get there at all. This has always been a problem. Today, though, it's downright unacceptable. The world changes faster than ever. Speed has never been more important. By adhering to antiquated methods, firms lose the ability to see nascent trends and to act upon them before it's too late. But what if the process of turning raw data into meaningful insights didn't have to be so painful, time-consuming, and frustrating? What if there were a better way to do analytics? Fortunately, you're in luck...

Analytics: The Agile Way is the eighth book from award-winning author and Arizona State University professor Phil Simon. Analytics: The Agile Way demonstrates how progressive organizations such as Google, Nextdoor, and others approach analytics in a fundamentally different way. They are applying the same Agile techniques that software developers have employed for years. They have replaced large batches with smaller ones...and their results will astonish you. Through a series of case studies and examples, Analytics: The Agile Way demonstrates the benefits of this new analytics mind-set: superior access to information, quicker insights, and the ability to spot trends far ahead of your competitors.

Learning pandas - Second Edition

2017-06-30 O'Reilly Amazon

book

Michael Heydt , Nicola Rainiero , Sonali Dayal

data data-science data-science-tools Pandas API Python

Take your Python skills to the next level with 'Learning pandas,' your go-to guide for mastering data manipulation and analysis. This book walks you through the powerful tools offered by the pandas library, helping you unlock key insights from data efficiently. Whether you're handling time-series data or visualizing patterns, you'll gain the proficiency needed to make sense of complex datasets.

What this Book will help me do:

Understand and effectively use pandas Series and DataFrame objects for data representation and manipulation.

Master indexing, slicing, and combining data to perform detailed exploration and analysis.

Learn to access and work with external data sources, including APIs, databases, and files, using pandas.

Develop the skills to handle and analyze time-series data, managing its unique challenges.

Create informative and professional data visualizations directly using pandas capabilities.

Author(s): Michael Heydt is a respected author and educator in the field of Python and data analysis. With years of experience utilizing pandas in practical and professional environments, Michael offers a unique perspective that combines deep technical insight with approachable examples. His teaching philosophy emphasizes clarity, applicability, and engaging instruction, ensuring learners easily acquire valuable skills.

Who is it for? This book is ideal for Python programmers looking to enhance their data analysis capabilities, as well as data analysts and scientists wanting to leverage pandas to improve their workflows. Readers are recommended to have some familiarity with Python, though prior experience with pandas is not required. If you have a keen interest in data exploration and quantitative techniques, this book is for you.

Practical Predictive Analytics

2017-06-30 O'Reilly Amazon

book

Ralph Winters

data data-science business-intelligence prescriptive-analytics Analytics Big Data

Dive into the world of predictive analytics with 'Practical Predictive Analytics.' This comprehensive guide walks you through analyzing current and historical data to predict future outcomes. Using tools like R and Spark, you will master practical skills, solve real-world challenges, and apply predictive analytics across domains like marketing, healthcare, and retail.

What this Book will help me do:

Learn the six steps for successfully implementing predictive analytics projects.

Acquire practical skills in data cleaning, input, and model deployment using tools like R and Spark.

Understand core predictive analytics algorithms and their applications in various industries.

Apply data analytics techniques to solve problems in fields such as healthcare and marketing.

Master methods for handling big data analytics using Databricks and Spark for effective prediction.

Author(s): The author, Ralph Winters, is an experienced data scientist and technical educator. With extensive background in predictive analytics, Winters specializes in applying statistical methods and techniques to real-world consultation scenarios. Winters brings a practical and accessible approach to this text, ensuring that learners can follow along and apply their newfound expertise effectively.

Who is it for? This book is ideal for statisticians and analysts with some programming background in languages like R, who want to master predictive analytics skills. It caters to intermediate learners who aim to enhance their ability to solve complex analytical problems. Whether you're looking to advance your career or improve your proficiency in data science, this book will serve as a valuable resource for learning and growth.

QlikView for Developers

2017-06-30 O'Reilly Amazon

book

Miguel Angel Garcia , Barry Harmsen

data data-science analytics-platforms qlikview BI Qlik

"QlikView for Developers" is a comprehensive guide to mastering QlikView, a powerful business intelligence tool. This book takes you on a journey from understanding the basics to building scalable and maintainable QlikView applications. Designed to provide practical methods, real-world scenarios, and valuable tips, it is ideal for anyone wanting to learn and effectively use QlikView for BI solutions. What this Book will help me do Understand the key features and architecture of QlikView and what has changed in QlikView 12. Learn to transform, model, and organize data in QlikView to effectively support business processes. Master best practices for creating interactive dashboards using charts, tables, and visualization objects. Discover techniques to optimize data architecture for scalable deployments and ensure data consistency. Implement advanced scripting and calculation methods, such as Set Analysis, to handle complex analytical requirements. Author(s) Miguel Angel Garcia and Barry Harmsen bring years of professional expertise in business intelligence and QlikView application development. Both authors have contributed significantly to the BI community and have extensive experience teaching and consulting on QlikView solutions. Their goal with this book is to provide a resource that is both informative and practical for QlikView developers. Who is it for? This book is intended for developers and analysts looking to harness the capabilities of QlikView for business intelligence purposes. It is suitable for beginners with minimal experience in QlikView, as well as for experienced practitioners wanting to deepen their knowledge and skills. The book provides a balanced approach that caters to various skill levels, ensuring accessible and actionable content for all readers.

Practical Data Science Cookbook - Second Edition

2017-06-29 O'Reilly Amazon

book

Prabhanjan Narayanachar Tattar , Ratnadip Adhikari , Anthony Ojeda , Abhinav Prakash Rai , Rajib Bhattacharya , Hashmat Rohian , Bhushan Purushottam Joshi , Sean P Murphy , Abhijit Dasgupta

data data-science Analytics Data Science Python

The Practical Data Science Cookbook, Second Edition provides hands-on, practical recipes that guide you through all aspects of the data science process using R and Python. Starting with setting up your programming environment, you'll work through a series of real-world projects to acquire, clean, analyze, and visualize data efficiently.

What this Book will help me do:

Set up R and Python environments effectively for data science tasks.

Acquire, clean, and preprocess data tailored to analysis with practical steps.

Develop robust predictive and exploratory models for actionable insights.

Generate analytic reports and share findings with impactful visualizations.

Construct tree-based models and master random forests for advanced analytics.

Author(s): Authored by a team of experienced professionals in the field of data science and analytics, this book reflects their collective expertise in tackling complex data challenges using programming. With backgrounds spanning industry and academia, the authors bring a practical, application-focused approach to teaching data science.

Who is it for? This book is ideal for aspiring data scientists who want hands-on experience with real-world projects, regardless of prior experience. Beginners will gain step-by-step understanding of data science concepts, while seasoned professionals will appreciate the structured projects and use of R and Python for advanced analytics and modeling.

Focused Genograms, 2nd Edition

2017-06-26 O'Reilly Amazon

book

Rita DeMaria , Markie C. Twist , Gerald R. Weeks

data data-science data-science-tasks graph-analytics

Focused Genograms introduces and provides a guide to constructing focused genograms through the lens of attachment theory.

Text Mining with R

2017-06-26 O'Reilly Amazon

book

David Robinson , Julia Silge

data data-science data-science-tools r NLP

Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You’ll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.

Learn how to apply the tidy text format to NLP

Use sentiment analysis to mine the emotional content of text

Identify a document’s most important terms with frequency measurements

Explore relationships and connections between words with the ggraph and widyr packages

Convert back and forth between R’s tidy and non-tidy text formats

Use topic modeling to classify document collections into natural groups

Examine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages

Advanced Object-Oriented Programming in R: Statistical Programming for Data Science, Analysis and Finance

2017-06-24 O'Reilly Amazon

book

Thomas Mailund

data data-science data-science-tools r Analytics Data Science

Learn how to write object-oriented programs in R and how to construct classes and class hierarchies in the three object-oriented systems available in R. This book gives an introduction to object-oriented programming in the R programming language and shows you how to use and apply R in an object-oriented manner. You will then be able to use this powerful programming style in your own statistical programming projects to write flexible and extendable software. After reading Advanced Object-Oriented Programming in R, you'll come away with a practical project that you can reuse in your own analytics coding endeavors. You'll then be able to visualize your data as objects that have state and then manipulate those objects with polymorphic or generic methods. Your projects will benefit from the high degree of flexibility provided by polymorphism, where the choice of concrete method to execute depends on the type of data being manipulated.

What You'll Learn:

Define and use classes and generic functions using R

Work with the R class hierarchies

Benefit from implementation reuse

Handle operator overloading

Apply the S4 and R6 classes

Who This Book Is For: Experienced programmers and those with at least some prior experience with the R programming language.

Introduction to Google Analytics: A Guide for Absolute Beginners

2017-06-19 O'Reilly Amazon

book

Todd Kelsey

data data-science web-analytics google-analytics Analytics Google Analytics

Develop your digital/online marketing skills and learn web analytics to understand the performance of websites and ad campaigns. Approaches covered will be immediately useful for business or nonprofit organizations. If you are completely new to Google Analytics and you want to learn the basics, this guide will introduce you to the content quickly. Web analytics is critical to online marketers as they seek to track return on investment and optimize their websites. Introduction to Google Analytics covers the basics of Google Analytics, starting with creating a blog, and monitoring the number of people who see the blog posts and where they come from.

What You'll Learn:

Understand basic techniques to generate traffic for a blog or website

Review the performance of a website or campaign

Set up a Shopify account to track ROI

Create and maximize AdWords to track conversion

Discover opportunities offered by Google, including the Google Individual Qualification

Who This Book Is For: Those who need to get up to speed on Google Analytics tools and techniques for business or personal use. This book is also suitable as a student reference.

R: Mining Spatial, Text, Web, and Social Media Data

2017-06-19 O'Reilly Amazon

book

Pradeepta Mishra , Richard Heimann , Nathan Danneman , Bater Makhabel

data data-science data-science-tools r Data Management Hadoop

Create data mining algorithms About This Book Develop a strong strategy to solve predictive modeling problems using the most popular data mining algorithms Real-world case studies will take you from novice to intermediate to apply data mining techniques Deploy cutting-edge sentiment analysis techniques to real-world social media data using R Who This Book Is For This Learning Path is for R developers who are looking to making a career in data analysis or data mining. Those who come across data mining problems of different complexities from web, text, numerical, political, and social media domains will find all information in this single learning path. What You Will Learn Discover how to manipulate data in R Get to know top classification algorithms written in R Explore solutions written in R based on R Hadoop projects Apply data management skills in handling large data sets Acquire knowledge about neural network concepts and their applications in data mining Create predictive models for classification, prediction, and recommendation Use various libraries on R CRAN for data mining Discover more about data potential, the pitfalls, and inferencial gotchas Gain an insight into the concepts of supervised and unsupervised learning Delve into exploratory data analysis Understand the minute details of sentiment analysis In Detail Data mining is the first step to understanding data and making sense of heaps of data. Properly mined data forms the basis of all data analysis and computing performed on it. This learning path will take you from the very basics of data mining to advanced data mining techniques, and will end up with a specialized branch of data mining—social media mining. You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, association, and correlation while working with R programs. You will discover how to write code for various predication models, stream data, and time-series data. 
You will also be introduced to solutions written in R based on R Hadoop projects. Now that you are comfortable with data mining with R, you will move on to implementing your knowledge with the help of end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle any issues you might encounter during projects. After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially-generated data while providing a theoretical background to accurately interpret your findings. You will be shown R code and examples of data that can be used as a springboard as you get the chance to undertake your own analyses of business, social, or political data. This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products: Learning Data Mining with R by Bater Makhabel R Data Mining Blueprints by Pradeepta Mishra Social Media Mining with R by Nathan Danneman and Richard Heimann Style and approach A complete package that will take you from the basics of data mining to advanced data mining techniques, and will end with a specialized branch of data mining: social media mining. Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code file.

Delivering Embedded Analytics in Modern Applications

2017-06-15 O'Reilly Amazon

book

Federico Castanedo , Andy Oram

data data-science analytics-platforms Analytics BI Big Data

Organizations are rapidly consuming more data than ever before, and to drive their competitive advantage, they’re demanding interactive visualizations and interactive analyses of that data be embedded in their applications and business processes. This will enable them to make faster and more effective decisions based on data, not guesses. This practical book examines the considerations that software developers, product managers, and vendors need to take into account when making visualization and analytics a seamlessly integrated part of the applications they deliver, as well as the impact of migrating their applications to modern data platforms. Authors Federico Castanedo (Vodafone Group) and Andy Oram (O’Reilly Media) explore the basic requirements for embedding domain expertise with fast, powerful, and interactive visual analytics that will delight and inform customers more than spreadsheets and custom-generated charts. Particular focus is placed on the characteristics of effective visual analytics for big and fast data. Learn the impact of trends driving embedded analytics Review examples of big data applications and their analytics requirements in retail, direct service, cybersecurity, the Internet of Things, and logistics Explore requirements for embedding visual analytics in modern data environments, including collection, storage, retrieval, data models, speed, microservices, parallelism, and interactivity Take a deep dive into the characteristics of effective visual analytics and criteria for evaluating modern embedded analytics tools Use a self-assessment rating chart to determine the value of your organization’s BI in the modern data setting

MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence

2017-06-15 O'Reilly Amazon

book

Phil Kim

data data-science data-science-tools MATLAB AI/ML Big Data

Get started with MATLAB for deep learning and AI with this in-depth primer. In this book, you start with machine learning fundamentals, then move on to neural networks, deep learning, and then convolutional neural networks. In a blend of fundamentals and applications, MATLAB Deep Learning employs MATLAB as the underlying programming language and tool for the examples and case studies in this book. With this book, you'll be able to tackle some of today's real-world big data, smart bots, and other complex data problems. You'll see how deep learning is a complex and more intelligent aspect of machine learning for modern smart data analysis and usage. What You'll Learn Use MATLAB for deep learning Discover neural networks and multi-layer neural networks Work with convolution and pooling layers Build an MNIST example with these layers Who This Book Is For Those who want to learn deep learning using MATLAB. Some MATLAB experience may be useful.

R for Everyone: Advanced Analytics and Graphics, 2nd Edition

2017-06-14 O'Reilly Amazon

book

Jared P. Lander

data data-science data-science-tools r AI/ML Analytics

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone, Second Edition, is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, manipulation, and visualization; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. After all this you'll make your code reproducible with LaTeX, RMarkdown, and Shiny. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. 
Coverage includes Explore R, RStudio, and R packages Use R for math: variable types, vectors, calling functions, and more Exploit data structures, including data.frames, matrices, and lists Read many different types of data Create attractive, intuitive statistical graphics Write user-defined functions Control program flow with if, ifelse, and complex checks Improve program efficiency with group manipulations Combine and reshape multiple datasets Manipulate strings using R's facilities and regular expressions Create normal, binomial, and Poisson probability distributions Build linear, generalized linear, and nonlinear models Program basic statistics: mean, standard deviation, and t-tests Train machine learning models Assess the quality of models and variable selection Prevent overfitting and perform variable selection, using the Elastic Net and Bayesian methods Analyze univariate and multivariate time series data Group data via K-means and hierarchical clustering Prepare reports, slideshows, and web pages with knitr Display interactive data with RMarkdown and htmlwidgets Implement dashboards with Shiny Build reusable R packages with devtools and Rcpp

Agile Data Science 2.0

2017-06-13 O'Reilly Amazon

book

Russell Jurney

data data-science Agile/Scrum Airflow Analytics Data Science

Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools. Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You’ll learn an iterative approach that lets you quickly change the kind of analysis you’re doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization. Build value from your data in a series of agile sprints, using the data-value pyramid Extract features for statistical models from a single dataset Visualize data with charts, and expose different aspects through interactive reports Use historical data to predict the future via classification and regression Translate predictions into actions Get feedback from users after each sprint to keep your project on track
