O'Reilly Data Science Books

Practical MATLAB Modeling with Simulink: Programming and Simulating Ordinary and Partial Differential Equations

2020-04-07 O'Reilly Amazon

book

Sulaymon L. Eshkabilov

data data-science data-science-tools MATLAB Data Science

Employ the essential and hands-on tools and functions of MATLAB's ordinary differential equation (ODE) and partial differential equation (PDE) packages, which are explained and demonstrated via interactive examples and case studies. This book contains dozens of simulations and solved problems via m-files/scripts and Simulink models which help you to learn programming and modeling of more difficult, complex problems that involve the use of ODEs and PDEs. You’ll become efficient with many of the built-in tools and functions of MATLAB/Simulink while solving more complex engineering and scientific computing problems that require and use differential equations. Practical MATLAB Modeling with Simulink explains various practical issues of programming and modelling. After reading and using this book, you'll be proficient at using MATLAB and applying the source code from the book's examples as templates for your own projects in data science or engineering. What You Will Learn Model complex problems using MATLAB and Simulink Gain the programming and modeling essentials of MATLAB using ODEs and PDEs Use numerical methods to solve 1st and 2nd order ODEs Solve stiff, higher order, coupled, and implicit ODEs Employ numerical methods to solve 1st and 2nd order linear PDEs Solve stiff, higher order, coupled, and implicit PDEs Who This Book Is For Engineers, programmers, data scientists, and students majoring in engineering, applied/industrial math, data science, and scientific computing. This book continues where Apress' Beginning MATLAB and Simulink leaves off.

Build a Career in Data Science

2020-03-26 O'Reilly Amazon

book

Emily Robinson , Jacqueline Nolis

data data-science Data Science

You are going to need more than technical knowledge to succeed as a data scientist. Build a Career in Data Science teaches you what school leaves out, from how to land your first job to the lifecycle of a data science project, and even how to become a manager. About the Technology What are the keys to a data scientist’s long-term success? Blending your technical know-how with the right “soft skills” turns out to be a central ingredient of a rewarding career. About the Book Build a Career in Data Science is your guide to landing your first data science job and developing into a valued senior employee. By following clear and simple instructions, you’ll learn to craft an amazing resume and ace your interviews. In this demanding, rapidly changing field, it can be challenging to keep projects on track, adapt to company needs, and manage tricky stakeholders. You’ll love the insights on how to handle expectations, deal with failures, and plan your career path in the stories from seasoned data scientists included in the book. What's Inside Creating a portfolio of data science projects Assessing and negotiating an offer Leaving gracefully and moving up the ladder Interviews with professional data scientists About the Reader For readers who want to begin or advance a data science career. About the Authors Emily Robinson is a data scientist at Warby Parker. Jacqueline Nolis is a data science consultant and mentor. Quotes Full of useful advice, real-case scenarios, and contributions from professionals in the industry. - Sebastián Palma Mardones, ArchDaily The perfect companion for someone who wants to be a successful data scientist! - Gustavo Gomes, Brightcove Insightful overview of all aspects of a data science career. - Krzysztof Jędrzejewski, Pearson Highly recommended. - Hagai Luger, Clarizen

Pandas 1.x Cookbook - Second Edition

2020-02-27 O'Reilly Amazon

book

Theodore Petrou , Matthew Harrison

data data-science data-science-tools Pandas AI/ML Data Science

The 'Pandas 1.x Cookbook' offers a recipe-based guide for mastering the powerful Python library, pandas. You will gain practical knowledge for handling and manipulating data efficiently, from the fundamentals to advanced techniques. The book is an essential resource for exploring and analyzing datasets with pandas. What this Book will help me do Understand and apply data exploration techniques in pandas. Use pandas to manipulate, aggregate, and clean datasets to extract meaningful insights. Combine pandas with Matplotlib and Seaborn to create effective visualizations. Perform time series analysis and transform datasets for machine learning. Implement workflows for handling large-scale data that exceeds your computer's memory. Author(s) Matthew Harrison and Theodore Petrou are highly experienced educators and practitioners in data science and Python programming. With their extensive expertise in using pandas, they provide insights through practical exercises and approachable narratives. Their aim is to make complex concepts accessible to learners of varying skill levels. Who is it for? This book is ideal for Python programmers, analysts, and data scientists seeking to expand their data handling and analysis capabilities. It caters to both beginners who are new to pandas and those looking to deepen their understanding of its advanced features. If your goal is to explore, clean, and analyze complex datasets efficiently, this book is tailored for you.

Advances in Data Science

2020-02-05 O'Reilly Amazon

book

Huiwen Wang , Gilbert Saporta , Rong Guan , Edwin Diday

data data-science AI/ML Data Science

Data science unifies statistics, data analysis and machine learning to achieve a better understanding of the masses of data which are produced today, and to improve prediction. Special kinds of data (symbolic, network, complex, compositional) are increasingly frequent in data science. These data require specific methodologies, but there is a lack of reference work in this field. Advances in Data Science fills this gap. It presents a collection of up-to-date contributions by eminent scholars following two international workshops held in Beijing and Paris. The 10 chapters are organized into four parts: Symbolic Data, Complex Data, Network Data and Clustering. They include fundamental contributions, as well as applications to several domains, including business and the social sciences.

Principles of Managerial Statistics and Data Science

2020-02-05 O'Reilly Amazon

book

Roberto Rivera

data data-science data-science-tasks statistics Analytics Big Data

Introduces readers to the principles of managerial statistics and data science, with an emphasis on the statistical literacy of business students. Through a statistical perspective, this book introduces readers to the topic of data science, including Big Data, data analytics, and data wrangling. Chapters include multiple examples showing the application of the theoretical aspects presented. It features practice problems designed to ensure that readers understand the concepts and can apply them using real data. Over 100 open data sets used for examples and problems come from regions throughout the world, allowing the instructor to adapt the application to local data with which students can identify. Applications with these data sets include: assessing whether searches during a police stop in San Diego depend on the driver’s race; visualizing the association between fat percentage and moisture percentage in Canadian cheese; modeling taxi fares in Chicago using data from millions of rides; and analyzing mean sales per unit of legal marijuana products in Washington state. Topics covered in Principles of Managerial Statistics and Data Science include: data visualization; descriptive measures; probability; probability distributions; mathematical expectation; confidence intervals; and hypothesis testing. Analysis of variance; simple linear regression; and multiple linear regression are also included. In addition, the book offers contingency tables, Chi-square tests, non-parametric methods, and time series methods.
The textbook includes academic material usually covered in introductory Statistics courses, but with a data science twist and less emphasis on theory. It relies on Minitab to present how to perform tasks with a computer, presents and motivates the use of data that comes from open portals, focuses on developing an intuition for how the procedures work, and exposes readers to the potential in Big Data and current failures of its use. Supplementary material includes: a companion website that houses PowerPoint slides; an Instructor's Manual with tips, a syllabus model, and project ideas; R code to reproduce examples and case studies; and information about the open portal data. The book also features an appendix with solutions to some practice problems. Principles of Managerial Statistics and Data Science is a textbook for undergraduate and graduate students taking managerial Statistics courses, and a reference book for working business professionals.

Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP, 2nd Edition

2020-02-05 O'Reilly Amazon

book

Bhisham C. Gupta , Kalanka P. Jayalath , Irwin Guttman

data data-science data-science-tasks statistics AI/ML Big Data

Introduces basic concepts in probability and statistics to data science students, as well as engineers and scientists Aimed at undergraduate/graduate-level engineering and natural science students, this timely, fully updated edition of a popular book on statistics and probability shows how real-world problems can be solved using statistical concepts. It removes Excel exhibits and replaces them with R software throughout, and updates both MINITAB and JMP software instructions and content. A new chapter discussing data mining—including big data, classification, machine learning, and visualization—is featured. Another new chapter covers cluster analysis methodologies in hierarchical, nonhierarchical, and model based clustering. The book also offers a chapter on Response Surfaces that previously appeared on the book’s companion website. Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition is broken into two parts. Part I covers topics such as: describing data graphically and numerically, elements of probability, discrete and continuous random variables and their probability distributions, distribution functions of random variables, sampling distributions, estimation of population parameters and hypothesis testing. Part II covers: elements of reliability theory, data mining, cluster analysis, analysis of categorical data, nonparametric tests, simple and multiple linear regression analysis, analysis of variance, factorial designs, response surfaces, and statistical quality control (SQC) including phase I and phase II control charts. The appendices contain statistical tables and charts and answers to selected problems. 
Features two new chapters—one on Data Mining and another on Cluster Analysis Now contains R exhibits including code, graphical display, and some results MINITAB and JMP have been updated to their latest versions Emphasizes the p-value approach and includes related practical interpretations Offers a more applied statistical focus, and features modified examples to better exhibit statistical concepts Supplemented with an Instructor's-only solutions manual on a book’s companion website Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP is an excellent text for graduate level data science students, and engineers and scientists. It is also an ideal introduction to applied statistics and probability for undergraduate students in engineering and the natural sciences.

The Data Science Workshop

2020-01-29 O'Reilly Amazon

book

Tiffany Ford , Anthony So , Pritesh Tiwari , Thomas Joseph , Andrew Worsley , Ivan Liu , Dr. Samuel Asare , Robert Thas John , Barbora Stetinova

data data-science AI/ML Data Science Pandas Python

The Data Science Workshop is designed for beginners looking to step into the rigorous yet rewarding world of data science. By leveraging a hands-on approach, this book demystifies key concepts and guides you gently into creating practical machine learning models with Python. What this Book will help me do Understand supervised and unsupervised learning and their applications. Gain hands-on experience with Python libraries like scikit-learn and pandas for data manipulation. Learn practical use cases of machine learning techniques such as regression and clustering. Discover techniques to ensure robustness in machine learning with hyperparameter tuning and ensembling. Develop efficiency in feature engineering with automated tools to accelerate workflows. Author(s) Anthony So, Thomas Joseph, Robert Thas John, and Andrew Worsley are seasoned experts in data science and Python programming. Along with Dr. Samuel Asare, they bring decades of experience and practical knowledge to this book, delivering an engaging and approachable learning experience. Who is it for? This book is targeted toward individuals who are beginners in data science and are eager to acquire foundational knowledge and practical skills. It appeals to those who prefer a structured, hands-on approach to learning, possibly having some prior programming experience or interest in Python. Professionals aspiring to pivot into data-oriented roles or students aiming to strengthen their understanding of data science concepts will find this book particularly valuable. If you're looking to gain confidence in implementing data science projects and solving real-world problems, this text is for you.

Data Science Programming All-in-One For Dummies

2020-01-09 O'Reilly Amazon

book

John Paul Mueller , Luca Massaron

data data-science AI/ML Data Science Python

Your logical, linear guide to the fundamentals of data science programming Data science is exploding, in a good way, with a forecast of 1.7 megabytes of new information created every second for each human being on the planet by 2020 and 11.5 million job openings by 2026. It clearly pays dividends to be in the know. This friendly guide charts a path through the fundamentals of data science and then delves into the actual work: linear regression, logistic regression, machine learning, neural networks, recommender engines, and cross-validation of models. Data Science Programming All-In-One For Dummies is a compilation of the key data science, machine learning, and deep learning programming languages: Python and R. It helps you decide which programming languages are best for specific data science needs. It also gives you the guidelines to build your own projects to solve problems in real time. Get grounded: the ideal start for new data professionals What lies ahead: learn about specific areas that data is transforming Be meaningful: find out how to tell your data story See clearly: pick up the art of visualization Whether you’re a beginning student or already mid-career, get your copy now and add even more meaning to your life, and everyone else’s!

Practical Data Science with R, Second Edition

2019-12-23 O'Reilly Amazon

book

John Mount , Nina Zumel

data data-science AI/ML BI Computer Science Data Science

Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You’ll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support. About the Technology Evidence-based decisions are crucial to success. Applying the right data analysis techniques to your carefully curated business data helps you make accurate predictions, identify trends, and spot trouble in advance. The R data analysis platform provides the tools you need to tackle day-to-day data analysis and machine learning tasks efficiently and effectively. About the Book Practical Data Science with R, Second Edition is a task-based tutorial that leads readers through dozens of useful, data analysis practices using the R language. By concentrating on the most important tasks you’ll face on the job, this friendly guide is comfortable both for business analysts and data scientists. Because data is only useful if it can be understood, you’ll also find fantastic tips for organizing and presenting data in tables, as well as snappy visualizations. What's Inside Statistical analysis for business pros Effective data presentation The most useful R tools Interpreting complicated predictive models About the Reader You’ll need to be comfortable with basic statistics and have an introductory knowledge of R or another high-level programming language. About the Authors Nina Zumel and John Mount founded a San Francisco–based data science consulting firm. Both hold PhDs from Carnegie Mellon University and blog on statistics, probability, and computer science. Quotes Full of useful shared experience and practical advice. Highly recommended. - From the Foreword by Jeremy Howard and Rachel Thomas Great examples and an informative walk-through of the data science process. - David Meza, NASA Offers interesting perspectives that cover many aspects of practical data science; a good reference. - Pascal Barbedor, BL SET R you ready to get data science done the right way? - Taylor Dolezal, Disney Studios

Practical DataOps: Delivering Agile Data Science at Scale

2019-12-09 O'Reilly Amazon

book

Harvinder Atwal

data data-science DataOps Agile/Scrum AI/ML Analytics

Gain a practical introduction to DataOps, a new discipline for delivering data science at scale inspired by practices at companies such as Facebook, Uber, LinkedIn, Twitter, and eBay. Organizations need more than the latest AI algorithms, hottest tools, and best people to turn data into insight-driven action and useful analytical data products. Processes and thinking employed to manage and use data in the 20th century are a bottleneck for working effectively with the variety of data and advanced analytical use cases that organizations have today. This book provides the approach and methods to ensure continuous rapid use of data to create analytical data products and steer decision making. Practical DataOps shows you how to optimize the data supply chain from diverse raw data sources to the final data product, whether the goal is a machine learning model or other data-orientated output. The book provides an approach to eliminate wasted effort and improve collaboration between data producers, data consumers, and the rest of the organization through the adoption of lean thinking and agile software development principles. This book helps you to improve the speed and accuracy of analytical application development through data management and DevOps practices that securely expand data access, and rapidly increase the number of reproducible data products through automation, testing, and integration. The book also shows how to collect feedback and monitor performance to manage and continuously improve your processes and output. 
What You Will Learn Develop a data strategy for your organization to help it reach its long-term goals Recognize and eliminate barriers to delivering data to users at scale Work on the right things for the right stakeholders through agile collaboration Create trust in data via rigorous testing and effective data management Build a culture of learning and continuous improvement through monitoring deployments and measuring outcomes Create cross-functional self-organizing teams focused on goals not reporting lines Build robust, trustworthy, data pipelines in support of AI, machine learning, and other analytical data products Who This Book Is For Data science and advanced analytics experts, CIOs, CDOs (chief data officers), chief analytics officers, business analysts, business team leaders, and IT professionals (data engineers, developers, architects, and DBAs) supporting data teams who want to dramatically increase the value their organization derives from data. The book is ideal for data professionals who want to overcome challenges of long delivery time, poor data quality, high maintenance costs, and scaling difficulties in getting data science output and machine learning into customer-facing production.

Beginning MATLAB and Simulink: From Novice to Professional

2019-11-28 O'Reilly Amazon

book

Sulaymon Eshkabilov

data data-science data-science-tools MATLAB Data Science DataViz

Employ essential and hands-on tools and functions of the MATLAB and Simulink packages, which are explained and demonstrated via interactive examples and case studies. This book contains dozens of simulation models and solved problems via m-files/scripts and Simulink models which help you to learn programming and modeling essentials. You’ll become efficient with many of the built-in tools and functions of MATLAB/Simulink while solving engineering and scientific computing problems. Beginning MATLAB and Simulink explains various practical issues of programming and modelling in parallel by comparing MATLAB and Simulink. After reading and using this book, you'll be proficient at using MATLAB and applying the source code from the book's examples as templates for your own projects in data science or engineering. What You Will Learn Get started using MATLAB and Simulink Carry out data visualization with MATLAB Gain the programming and modeling essentials of MATLAB Build a GUI with MATLAB Work with integration and numerical root finding methods Apply MATLAB to differential equations-based models and simulations Use MATLAB for data science projects Who This Book Is For Engineers, programmers, data scientists, and students majoring in engineering and scientific computing.

The Decision Maker's Handbook to Data Science: A Guide for Non-Technical Executives, Managers, and Founders

2019-11-26 O'Reilly Amazon

book

Stylianos Kampakis

data data-science AI/ML Data Collection Data Science

Data science is expanding across industries at a rapid pace, and the companies first to adopt best practices will gain a significant advantage. To reap the benefits, decision makers need to have a confident understanding of data science and its application in their organization. It is easy for novices to the subject to feel paralyzed by intimidating buzzwords, but what many don’t realize is that data science is in fact quite multidisciplinary, useful in the hands of business analysts, communications strategists, designers, and more. With the second edition of The Decision Maker’s Handbook to Data Science, you will learn how to think like a veteran data scientist and approach solutions to business problems in an entirely new way. Author Stylianos Kampakis provides you with the expertise and tools required to develop a solid data strategy that is continuously effective. Ethics and legal issues surrounding data collection and algorithmic bias are some common pitfalls that Kampakis helps you avoid, while guiding you on the path to build a thriving data science culture at your organization. This updated and revised second edition includes plenty of case studies, tools for project assessment, and expanded content for hiring and managing data scientists. Data science is a language that everyone at a modern company should understand, across departments. Friction in communication arises most often when management does not connect with what a data scientist is doing or how impactful data collection and storage can be for their organization. The Decision Maker’s Handbook to Data Science bridges this gap and readies you for both the present and future of your workplace in this engaging, comprehensive guide. What You Will Learn Understand how data science can be used within your business. Recognize the differences between AI, machine learning, and statistics. Become skilled at thinking like a data scientist, without being one. Discover how to hire and manage data scientists.
Comprehend how to build the right environment in order to make your organization data-driven. Who This Book Is For Startup founders, product managers, higher level managers, and any other non-technical decision makers who are thinking of implementing data science in their organization and hiring data scientists. A secondary audience includes people looking for a soft introduction to the subject of data science.

Reporting, Predictive Analytics, and Everything in Between

2019-11-25 O'Reilly Amazon

book

Brett Stupakevich , David Sweenor , Shane Swiderek

data data-science business-intelligence AI/ML Analytics BI

Business decisions today are tactical and strategic at the same time. How do you respond to a competitor’s price change? Or to specific technology changes? What new products, markets, or businesses should you pursue? Decisions like these are based on information from only one source: data. With this practical report, technical and non-technical leaders alike will explore the fundamental elements necessary to embark on a data analytics initiative. Is your company planning or contemplating a data analytics initiative? Authors Brett Stupakevich, David Sweenor, and Shane Swiderek from TIBCO guide you through several analytics options. IT leaders, product developers, analytics leaders, data analysts, data scientists, and business professionals will learn how to deploy analytic components in streaming and embedded systems using one of five platforms. You’ll examine: Analytics platforms including embedded BI, reporting, data exploration & discovery, streaming BI, and data science & machine learning The business problems each option solves and the capabilities and requirements of each How to identify the right analytics type for your particular use case Key considerations and the level of investment for each analytics platform

Advanced Statistics with Applications in R

2019-11-12 O'Reilly Amazon

book

Eugene Demidenko

data data-science data-science-tasks statistics Data Science

Advanced Statistics with Applications in R fills the gap between several excellent theoretical statistics textbooks and many applied statistics books where teaching reduces to using existing packages. This book looks at what is under the hood. Many statistical issues, including the recent crisis over the p-value, are caused by misunderstanding of statistical concepts owing to the poor theoretical background of practitioners and applied statisticians. This book is the product of forty years of experience in teaching probability and statistics and their applications for solving real-life problems. There are more than 442 examples in the book: basically every probability or statistics concept is illustrated with an example accompanied by R code. Many examples, such as "Who said π?", "What team is better?", "The fall of the Roman empire", "James Bond chase problem", "Black Friday shopping", "Free fall equation: Aristotle or Galilei", and many others, are intriguing. These examples cover biostatistics, finance, physics and engineering, text and image analysis, epidemiology, spatial statistics, sociology, etc. Advanced Statistics with Applications in R teaches students to use theory for solving real-life problems through computations: there are about 500 pieces of R code and 100 datasets. These data can be freely downloaded from the author's website dartmouth.edu/~eugened. This book is suitable as a text for senior undergraduate students majoring in statistics or data science, or for graduate students. Many researchers who apply statistics on a regular basis will find explanations of many fundamental concepts from a theoretical perspective, illustrated by concrete real-world applications.

Managing Data Science

2019-11-12 O'Reilly Amazon

book

Kirill Dubovikov

data data-science Data Science DevOps

Discover how to successfully manage data science projects and build high-performing teams with 'Managing Data Science.' This book provides actionable insights on handling the entire data science workflow, from conception to production, and addresses common challenges with practical strategies. What this Book will help me do Understand the fundamentals of building scalable and efficient data science pipelines. Acquire techniques to manage every stage of data science projects effectively, from prototype to production. Learn proven strategies for assembling, cultivating, and sustaining a skilled data science team. Explore the latest tools, methodologies, and best practices in ModelOps and DevOps for data science. Gain insights into troubleshooting and optimizing data science workflows to achieve organizational goals. Author(s) Kirill Dubovikov is a seasoned expert in data science and project management, bringing years of hands-on experience to both domains. With a passion for leveraging data to drive business success, Dubovikov guides readers through building sustainable practices and effective teams in the growing field of data science. Who is it for? This book is perfect for data science professionals, project managers, and business leaders seeking practical guidance to reap the benefits of data-driven decision-making. Designed for readers with a foundational understanding of data science, it helps bridge the gap between technical expertise and managerial efficiency.

Business Analytics, Volume II

2019-11-08 O'Reilly Amazon

book

Dr. Amar Sahay

data data-science business-intelligence AI/ML Analytics BI

This business analytics (BA) text discusses models built on fact-based data that measure past business performance to guide an organization in visualizing and predicting future business performance and outcomes. It provides a comprehensive overview of analytics in general with an emphasis on predictive analytics. Given the booming interest in analytics and data science, this book is timely and informative. It brings many terms, tools, and methods of analytics together. The first three chapters provide an introduction to BA, the importance of analytics, and the types of BA (descriptive, predictive, and prescriptive), along with the tools and models. Business intelligence (BI) and a case on descriptive analytics are discussed. Additionally, the book discusses the most widely used predictive models, including regression analysis, forecasting, and data mining, and gives an introduction to recent applications of predictive analytics: machine learning, neural networks, and artificial intelligence. The concluding chapter discusses the current state, job outlook, and certifications in analytics.

Clustering Methodology for Symbolic Data

2019-11-04 O'Reilly Amazon

book

Edwin Diday , Lynne Billard

data data-science data-science-tasks exploratory-data-analysis Data Management Data Science

Covers everything readers need to know about clustering methodology for symbolic data—including new methods and headings—while providing a focus on multi-valued list data, interval data and histogram data This book presents all of the latest developments in the field of clustering methodology for symbolic data—paying special attention to the classification methodology for multi-valued list, interval-valued and histogram-valued data methodology, along with numerous worked examples. The book also offers an expansive discussion of data management techniques showing how to manage the large complex dataset into more manageable datasets ready for analyses. Filled with examples, tables, figures, and case studies, Clustering Methodology for Symbolic Data begins by offering chapters on data management, distance measures, general clustering techniques, partitioning, divisive clustering, and agglomerative and pyramid clustering. Provides new classification methodologies for histogram valued data reaching across many fields in data science Demonstrates how to manage a large complex dataset into manageable datasets ready for analysis Features very large contemporary datasets such as multi-valued list data, interval-valued data, and histogram-valued data Considers classification models by dynamical clustering Features a supporting website hosting relevant data sets Clustering Methodology for Symbolic Data will appeal to practitioners of symbolic data analysis, such as statisticians and economists within the public sectors. It will also be of interest to postgraduate students of, and researchers within, web mining, text mining and bioengineering.

Mastering pandas - Second Edition

2019-10-25 O'Reilly Amazon

book

Ashish Kumar

data data-science data-science-tools Pandas Analytics Data Science

Mastering pandas is the ultimate guide to harnessing the power of the pandas library for data analysis. Covering everything from installation to advanced techniques, this book provides comprehensive instructions and examples to help you perform efficient data manipulation and visualization. Explore key features of pandas, such as multi-indexing and time series analysis, and become proficient in actionable analytics. What this Book will help me do Master importing and managing datasets of various formats using pandas. Expertly handle missing data and clean datasets for robust analysis. Create powerful visualizations and reports using pandas and Jupyter notebooks. Leverage advanced indexing and grouping techniques to derive insights. Utilize pandas for time series analysis to analyze trends and patterns. Author(s) Ashish Kumar is an experienced data scientist specializing in data analysis and visualization using Python. With a deep understanding of the pandas library, Kumar has been helping professionals and enthusiasts alike to make data-driven decisions. Known for an example-driven teaching style, Kumar bridges complex theoretical concepts with practical applications in data science. Who is it for? If you're a data scientist, analyst, or Python developer seeking to enhance your data analysis capabilities, this book is for you. Prior knowledge of Python is beneficial but not mandatory, as foundational concepts are explained. This guide spans beginner to advanced topics, accommodating users looking to deepen their skills and those aiming to start with pandas.

SAS for R Users

2019-09-24 O'Reilly Amazon

book

Ajay Ohri

data data-science analytics-platforms SAS Analytics Cloud Computing

BRIDGES THE GAP BETWEEN SAS AND R, ALLOWING USERS TRAINED IN ONE LANGUAGE TO EASILY LEARN THE OTHER SAS and R are widely-used, very different software environments. Prized for its statistical and graphical tools, R is an open-source programming language that is popular with statisticians and data miners who develop statistical software and analyze data. SAS (Statistical Analysis System) is the leading corporate software in analytics thanks to its faster data handling and smaller learning curve. SAS for R Users enables entry-level data scientists to take advantage of the best aspects of both tools by providing a cross-functional framework for users who already know R but may need to work with SAS. Those with knowledge of both R and SAS are of far greater value to employers, particularly in corporate settings. Using a clear, step-by-step approach, this book presents an analytics workflow that mirrors that of the everyday data scientist. This up-to-date guide is compatible with the latest R packages as well as SAS University Edition. Useful for anyone seeking employment in data science, this book: Instructs both practitioners and students fluent in one language seeking to learn the other Provides command-by-command translations of R to SAS and SAS to R Offers examples and applications in both R and SAS Presents step-by-step guidance on workflows, color illustrations, sample code, chapter quizzes, and more Includes sections on advanced methods and applications Designed for professionals, researchers, and students, SAS for R Users is a valuable resource for those with some knowledge of coding and basic statistics who wish to enter the realm of data science and business analytics. AJAY OHRI is the founder of analytics startup Decisionstats.com. His research interests include spreading open source analytics, analyzing social media manipulation with mechanism design, simpler interfaces to cloud computing, investigating climate change, and knowledge flows. 
He currently advises startups in analytics offshoring, analytics services, and analytics. He is the author of Python for R Users: A Data Science Approach (Wiley), R for Business Analytics, and R for Cloud Computing.

Model Management and Analytics for Large Scale Systems

2019-09-14 O'Reilly Amazon

book

Mehmet Aksit , Loek Cleophas , Bedir Tekinerdogan , Mark van den Brand , Önder Babur

data data-science Analytics Big Data Data Analytics Data Management

Model Management and Analytics for Large Scale Systems covers the use of models and related artefacts (such as metamodels and model transformations) as central elements for tackling the complexity of building systems and managing data. With their increased use across diverse settings, the complexity, size, multiplicity and variety of those artefacts has increased. Originally developed for software engineering, these approaches can now be used to simplify the analytics of large-scale models and automate complex data analysis processes. Those in the field of data science will gain novel insights on the topic of model analytics that go beyond both model-based development and data analytics. This book is aimed at both researchers and practitioners who are interested in model-based development and the analytics of large-scale models, ranging from big data management and analytics, to enterprise domains. The book could also be used in graduate courses on model development, data analytics and data management. Identifies key problems and offers solution approaches and tools that have been developed or are necessary for model management and analytics Explores basic theory and background, current research topics, related challenges and the research directions for model management and analytics Provides a complete overview of model management and analytics frameworks, the different types of analytics (descriptive, diagnostics, predictive and prescriptive), the required modelling and method steps, and important future directions

Practical Data Science with Python 3: Synthesizing Actionable Insights from Data

2019-09-07 O'Reilly Amazon

book

Ervin Varga

data data-science AI/ML Big Data Cloud Computing Data Engineering

Gain insight into essential data science skills in a holistic manner using data engineering and associated scalable computational methods. This book covers the most popular Python 3 frameworks for both local and distributed (on-premises and cloud-based) processing. Along the way, you will be introduced to many popular open-source frameworks, such as SciPy, scikit-learn, Numba, and Apache Spark. The book is structured around examples, so you will grasp core concepts via case studies and Python 3 code. As data science projects get continuously larger and more complex, software engineering knowledge and experience are crucial to producing evolvable solutions. You'll see how to create maintainable software for data science and how to document data engineering practices. This book is a good starting point for people who want to gain practical skills to perform data science. All the code will be available in the form of IPython notebooks and Python 3 programs, which allow you to reproduce all analyses from the book and customize them for your own purpose. You'll also benefit from advanced topics like Machine Learning, Recommender Systems, and Security in Data Science. Practical Data Science with Python will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors. What You'll Learn Play the role of a data scientist when completing increasingly challenging exercises using Python 3 Work with proven data science techniques/technologies Review scalable software engineering practices to ramp up data analysis abilities in the realm of Big Data Apply theory of probability, statistical inference, and algebra to understand the data science practices Who This Book Is For Anyone who would like to venture into the realm of data science using Python 3.

Learn Python by Building Data Science Applications

2019-08-30 O'Reilly Amazon

book

Philipp Kats , David Katz

data data-science AI/ML CI/CD Data Science Matplotlib

Learn Python by Building Data Science Applications takes a hands-on approach to teaching Python programming by guiding you through building engaging real-world data science projects. This book introduces Python's rich ecosystem and equips you with the skills to analyze data, train models, and deploy them as efficient applications. What this Book will help me do Get proficient in Python programming by learning core topics like data structures, loops, and functions. Explore data science libraries such as NumPy, Pandas, and scikit-learn to analyze and process data. Learn to create visualizations with Matplotlib and Altair, simplifying data communication. Build and deploy machine learning models using Python and share them as web services. Understand development practices such as testing, packaging, and continuous integration for professional workflows. Author(s) Philipp Kats and David Katz are seasoned Python developers with years of experience in teaching programming and deploying data science applications. Their expertise spans providing learners with practical knowledge and versatile skills. They combine clear explanations with engaging projects to ensure a rewarding learning experience. Who is it for? This book is ideal for individuals new to programming or data science who want to learn Python through practical projects. Researchers, analysts, and ambitious students with minimal coding background but a keen interest in data analysis and application development will find this book beneficial. It's a perfect choice for anyone eager to explore and leverage Python for real-world solutions.

Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions

2019-08-23 O'Reilly Amazon

book

Matt Taddy

data ai-ml machine-learning AI/ML Analytics Big Data

Publisher's Note: Products purchased from Third Party sellers are not guaranteed by the publisher for quality, authenticity, or access to any online entitlements included with the product. Use machine learning to understand your customers, frame decisions, and drive value. The business analytics world has changed, and Data Scientists are taking over. Business Data Science takes you through the steps of using machine learning to implement best-in-class business data science. Whether you are a business leader with a desire to go deep on data, or an engineer who wants to learn how to apply Machine Learning to business problems, you’ll find the information, insight, and tools you need to flourish in today’s data-driven economy. You’ll learn how to: • Use the key building blocks of Machine Learning: sparse regularization, out-of-sample validation, and latent factor and topic modeling • Understand how to use ML tools in real world business problems, where causation matters more than correlation • Solve data science programs by scripting in the R programming language Today’s business landscape is driven by data and constantly shifting. Companies live and die on their ability to make and implement the right decisions quickly and effectively. Business Data Science is about doing data science right. It’s about the exciting things being done around Big Data to run a flourishing business. It’s about the precepts, principles, and best practices that you need to know for best-in-class business data science.

R Data Science Quick Reference: A Pocket Guide to APIs, Libraries, and Packages

2019-08-07 O'Reilly Amazon

book

Thomas Mailund

data data-science data-science-tools r Analytics API

In this handy, practical book you will cover each concept concisely, with many illustrative examples. You'll be introduced to several R data science packages, with examples of how to use each of them. In this book, you’ll learn about the following APIs and packages that deal specifically with data science applications: readr, tibble, forcats, lubridate, stringr, tidyr, magrittr, dplyr, purrr, ggplot2, modelr, and more. After using this handy quick reference guide, you'll have the code, APIs, and insights to write data science-based applications in the R programming language. You'll also be able to carry out data analysis. What You Will Learn Import data with readr Work with categories using forcats, time and dates with lubridate, and strings with stringr Format data using tidyr and then transform that data using magrittr and dplyr Write functions with R for data science, data mining, and analytics-based applications Visualize data with ggplot2 and fit data to models using modelr Who This Book Is For Programmers new to R's data science, data mining, and analytics packages. Some prior coding experience with R in general is recommended.

Hands-On Data Analysis with Pandas

2019-07-26 O'Reilly Amazon

book

Stefanie Molin

data data-science data-science-tools Pandas AI/ML Analytics

Hands-On Data Analysis with Pandas provides an intensive dive into mastering the pandas library for data science and analysis using Python. Through a combination of conceptual explanations and practical demonstrations, readers will learn how to manipulate, visualize, and analyze data efficiently. What this Book will help me do Understand and apply the pandas library for efficient data manipulation. Learn to perform data wrangling tasks such as cleaning and reshaping datasets. Create effective visualizations using pandas and libraries like matplotlib and seaborn. Grasp the basics of machine learning and implement solutions with scikit-learn. Develop reusable data analysis scripts and modules in Python. Author(s) Stefanie Molin is a seasoned data scientist and software engineer with extensive experience in Python and data analytics. She specializes in leveraging the latest data science techniques to solve real-world problems. Her engaging and detailed writing draws from her practical expertise, aiming to make complex concepts accessible to all. Who is it for? This book is ideal for data analysts and aspiring data scientists who are at the beginning stages of their careers or looking to enhance their toolset with pandas and Python. It caters to Python developers eager to delve into data analysis workflows. Readers should have some programming knowledge to fully benefit from the examples and exercises.
