talk-data.com

Topic: data (2093 tagged)


Activities

Showing results filtered by: O'Reilly Data Science Books

Complex Network Analysis in Python

Construct, analyze, and visualize networks with networkx, a Python language module. Network analysis is a powerful tool you can apply to a multitude of datasets and situations. Discover how to work with all kinds of networks, including social, product, temporal, spatial, and semantic networks. Convert almost any real-world data into a complex network--such as recommendations on co-using cosmetic products, muddy hedge fund connections, and online friendships. Analyze and visualize the network, and make business decisions based on your analysis. If you're a curious Python programmer, a data scientist, or a CNA specialist interested in mechanizing mundane tasks, you'll increase your productivity exponentially. Complex network analysis used to be done by hand or with non-programmable network analysis tools, but not anymore! You can now automate and program these tasks in Python. Complex networks are collections of connected items, words, concepts, or people. By exploring their structure and individual elements, we can learn about their meaning, evolution, and resilience. Starting with simple networks, convert real-life and synthetic network graphs into networkx data structures. Look at more sophisticated networks and learn more powerful machinery to handle centrality calculation, blockmodeling, and clique and community detection. Get familiar with presentation-quality network visualization tools, both programmable and interactive--such as Gephi, a CNA explorer. Adapt the patterns from the case studies to your problems. Explore big networks with NetworKit, a high-performance networkx substitute. Each part in the book gives you an overview of a class of networks, includes a practical study of networkx functions and techniques, and concludes with case studies from various fields, including social networking, anthropology, marketing, and sports analytics. 
Combine your CNA and Python programming skills to become a better network analyst, a more accomplished data scientist, and a more versatile programmer. What You Need: You will need a Python 3.x installation with the following additional modules: Pandas (>=0.18), NumPy (>=1.10), matplotlib (>=1.5), networkx (>=1.11), python-louvain (>=0.5), NetworKit (>=3.6), and generalizedsimilarity. We recommend using the Anaconda distribution that comes with all these modules, except for python-louvain, NetworKit, and generalizedsimilarity, and works on all major modern operating systems.

Analyzing Baseball Data with R

With its flexible capabilities and open-source platform, R has become a major tool for analyzing detailed, high-quality baseball data. Analyzing Baseball Data with R provides an introduction to R for sabermetricians, baseball enthusiasts, and students interested in exploring the rich sources of baseball data. It equips readers with the necessary skills and software tools to perform all of the analysis steps, from gathering the datasets and entering them in a convenient format to visualizing the data via graphs to performing a statistical analysis. The authors first present an overview of publicly available baseball datasets and a gentle introduction to the type of data structures and exploratory and data management capabilities of R. They also cover the traditional graphics functions in the base package and introduce more sophisticated graphical displays available through the lattice and ggplot2 packages. Much of the book illustrates the use of R through popular sabermetrics topics, including the Pythagorean formula, runs expectancy, career trajectories, simulation of games and seasons, patterns of streaky behavior of players, and fielding measures. Each chapter contains exercises that encourage readers to perform their own analyses using R. All of the datasets and R code used in the text are available online. This book helps readers answer questions about baseball teams, players, and strategy using large, publicly available datasets. It offers detailed instructions on downloading the datasets and putting them into formats that simplify data exploration and analysis. Through the book’s various examples, readers will learn about modern sabermetrics and be able to conduct their own baseball analyses.
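The Pythagorean formula mentioned in the description estimates a team's winning percentage from runs scored and runs allowed. The book works in R; the snippet below is only a minimal Python sketch of the classic form with exponent 2. The function name and the sample run totals are made up for illustration and do not come from the book.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage as RS^k / (RS^k + RA^k)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical season totals, chosen only to exercise the formula.
win_pct = pythagorean_win_pct(800, 700)
expected_wins = win_pct * 162  # 162 games in a regular MLB season

print(f"expected winning percentage: {win_pct:.3f}")
print(f"expected wins over 162 games: {expected_wins:.1f}")
```

The exponent is left as a parameter because analysts sometimes fit values other than 2 to historical data.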

Practical Big Data Analytics

Practical Big Data Analytics is your ultimate guide to harnessing Big Data technologies for enterprise analytics and machine learning. By leveraging tools like Hadoop, Spark, NoSQL databases, and frameworks such as R, this book equips you with the skills to implement robust data solutions that drive impactful business insights. Gain practical expertise in handling data at scale and uncover the value behind the numbers.
What this Book will help me do: Master the fundamental concepts of Big Data storage, processing, and analytics. Gain practical skills in using tools like Hadoop, Spark, and NoSQL databases for large-scale data handling. Develop and deploy machine learning models and dashboards with R and R Shiny. Learn strategies for creating cost-efficient and scalable enterprise data analytics solutions. Understand and implement effective approaches to combining Big Data technologies for actionable insights.
Author(s): Dasgupta is an expert in Big Data analytics, statistical methodologies, and enterprise data solutions. With years of experience consulting on enterprise data platforms and working with leading industry technologies, Dasgupta brings a wealth of practical knowledge to help readers navigate and succeed in the field of Big Data. Through this book, Dasgupta shares an accessible and systematic way to learn and apply key Big Data concepts.
Who is it for? This book is ideal for professionals eager to delve into Big Data analytics, regardless of their current level of expertise. It accommodates both aspiring analysts and seasoned IT professionals looking to enhance their knowledge in data-driven decision making. Individuals with a technical inclination and a drive to build Big Data architectures will find this book particularly beneficial. No prior knowledge of Big Data is required, although familiarity with programming concepts will enhance the learning experience.

SAS Certification Prep Guide, 4th Edition

Prepare for the SAS Base Programming for SAS 9 exam with the official guide by the SAS Global Certification Program. New and experienced SAS users who want to prepare for the SAS Base Programming for SAS 9 exam will find this guide to be an invaluable, convenient, and comprehensive resource that covers all of the objectives tested on the exam. Now in its fourth edition, the guide has been extensively updated and revised to streamline explanations. Major topics include importing and exporting raw data files, creating and modifying SAS data sets, and identifying and correcting data syntax and programming logic errors. The chapter quizzes have been thoroughly updated and full solutions are included at the back of the book. In addition, links are provided to the exam objectives, practice exams, and other helpful resources, such as the updated Base SAS glossary and an expanded collection of practice data sets.

Statistical Rethinking

Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers everything from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
Web Resource: The book is accompanied by an R package (rethinking) that is available on the author’s website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.

IBM SPSS Modeler Essentials

Learn how to leverage IBM SPSS Modeler for your data mining and predictive analytics needs in this comprehensive guide. With step-by-step instructions, you'll acquire the skills to import, clean, analyze, and model your data using this robust platform. By the end, you'll be equipped to uncover patterns and trends, enabling confident data-driven decision-making.
What this Book will help me do: Understand the fundamentals of data mining and the visual programming interface of IBM SPSS Modeler. Prepare, clean, and preprocess data effectively for analysis and modeling. Build robust predictive models such as decision trees using best practices. Evaluate the performance of your analytical models to ensure accuracy and reliability. Export resulting analyses to apply insights to real-world data projects.
Author(s): Keith McCormick and Jesus Salcedo are accomplished professionals in data analytics and statistical modeling. With extensive experience in consulting and teaching, they have guided many in mastering IBM SPSS Modeler through both hands-on workshops and written material. Their approachable teaching style and commitment to clarity ensure accessibility for learners.
Who is it for? This book is designed for beginner users of IBM SPSS Modeler who wish to gain practical and actionable skills in data analytics. If you're a data enthusiast looking to explore predictive analytics or a professional eager to discover the insights hidden in your organizational data, this book is for you. A basic understanding of data mining concepts is advantageous but not required. This resource will set any novice on the path toward expert-level comprehension and application.

Learning Alteryx

Learning Alteryx introduces you to using the powerful Alteryx platform for self-service analytics, helping you master key features like data preparation and predictive analytics without needing to code. With this book, you'll gain the skills to create workflows that generate actionable insights, empowering your business to make data-driven decisions.
What this Book will help me do: Master creating and optimizing workflows in Alteryx to address complex analytical problems. Learn how to clean, prepare, and blend data from various sources efficiently. Understand advanced Alteryx expressions for processing large datasets effectively. Develop meaningful reports and visualizations to communicate insights clearly. Leverage predictive analytics capabilities in Alteryx to make informed decisions.
Author(s): The authors of Learning Alteryx collectively bring years of expertise in data analytics and business intelligence. Having worked on diverse projects across multiple industries, they understand the challenges faced by data professionals and are skilled in simplifying complex concepts. They focus on providing practical insights and step-by-step guides to empower learners.
Who is it for? Learning Alteryx is ideal for professionals aspiring to enhance their data analytics capabilities or explore self-service analytics. It caters to beginners unfamiliar with analytics platforms, as well as intermediate users seeking to deepen their Alteryx knowledge. Readers should have a basic understanding of data analysis principles.

R Programming By Example

"R Programming By Example" serves as an engaging and practical introduction to the R programming language for data analysis and visualization. Through step-by-step examples and comprehensive guides, this book builds your understanding from foundational knowledge to advanced applications in R. You will master programming practices while analyzing real-world scenarios.
What this Book will help me do: Gain proficiency in leveraging R's versatile features and package ecosystem to tackle data analysis tasks. Learn to create and customize high-quality visualizations, including 3D graphs, for enhanced data presentation. Understand statistical modeling and descriptive analysis techniques for extracting insights from data. Discover efficient programming strategies in R, including code profiling and parallelization, to optimize performance. Acquire the skills to interface R with databases and RESTful APIs for robust data integration.
Author(s): The author, Omar Trejo Navarro, brings a wealth of experience in statistical programming and data analysis. Having worked extensively with R, he focuses on practical and results-driven teaching. He has a passion for making complex topics accessible to learners.
Who is it for? This book is aimed at aspiring data scientists, statisticians, or analysts looking to learn R. It is particularly suitable for readers familiar with basic programming concepts and who wish to apply R in practical scenarios. Whether you're analyzing data, building models, or creating visualizations, this book will guide you effectively. If you're eager to advance your R skills through hands-on projects, this is for you.

SciPy Recipes

Dive into the world of scientific computing with 'SciPy Recipes', a practical guide tailored for anyone seeking hands-on experience with the SciPy stack. With over 110 detailed recipes, you'll gain expertise in handling real-world data challenges, from statistical computations to crafting intricate visualizations and beyond.
What this Book will help me do: Learn to use the SciPy Stack libraries like NumPy, pandas, and matplotlib effectively for scientific computing tasks. Master data wrangling techniques using pandas for efficient data manipulation. Understand the process of creating informative visualizations using matplotlib. Perform advanced statistical and numerical computations with simplicity. Solve real-world problems like numerical analysis and linear algebra using SciPy components.
Author(s): Martins, Ruben Oliva Ramos, and V Kishore Ayyadevara bring years of experience in scientific computing and Python programming to this book. Individually, they have contributed extensively to the implementation of computational tools and systems. Together, they've crafted this book to be both accessible to learners and insightful for practitioners, blending instruction with real-world practical applications.
Who is it for? This book is designed for Python developers, data scientists, and analysts eager to venture into scientific computing. If you have a basic understanding of Python and aspire to effectively manipulate and visualize data using the SciPy stack, this book is perfect for you. It's equally beneficial for those who seek practical solutions to complex computational challenges. Begin your journey into scientific computing with this essential guide.

Adaptive Filtering

This book covers the fundamentals of adaptive filtering, with a focus on the least mean square (LMS) adaptive filter. It discusses random variables, stochastic processes, vectors, matrices, determinants, discrete random signals, and probability distributions, while delivering a concise introduction to MATLAB®—complete with problems, computer experiments, and over 110 functions and script files. The text not only addresses the basics of the LMS adaptive filter algorithm but also explores the Wiener filter and its applications, details the steepest descent method, and develops Newton’s algorithm.
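As a rough illustration of the LMS adaptive filter the description centers on, here is a minimal pure-Python sketch (the book itself uses MATLAB). It identifies an unknown FIR system by repeatedly nudging the weights along the instantaneous error. The function names, the three-tap "unknown" system, and the step size are all hypothetical, and some texts write the update with a factor of 2 in front of the step size.

```python
import random


def fir(x, h):
    """Output of an FIR filter with taps h, assuming zero input before the start."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def lms_identify(x, d, num_taps, mu):
    """Run the LMS update over input x and desired signal d; return the final weights."""
    w = [0.0] * num_taps
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first (zeros before the start).
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, window))  # current filter output
        e = d[n] - y                                   # estimation error
        # LMS step: move each weight along the instantaneous gradient estimate.
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]
    return w


random.seed(0)
true_h = [0.5, -0.3, 0.2]                      # hypothetical system to identify
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = fir(x, true_h)                             # noise-free desired signal
w = lms_identify(x, d, num_taps=3, mu=0.05)
print([round(v, 3) for v in w])
```

With a noise-free desired signal and a small step size, the learned weights should settle close to `true_h`; a larger `mu` converges faster but risks instability.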

Biological and Medical Sensor Technologies

Edited by a pioneer in the area of advanced semiconductor materials, this book contains contributions from experts who explore the development and use of sensors in biological and medical applications. It covers advanced sensing and communications, modeling of DNA-derivative architecture, and the use of enzyme and quartz crystal microbalance-based biosensors. The book also addresses biosensors in human behavior measurement, sweat rate wearable sensors, and the future of medical imaging, including developments in spatial and spectral resolution of semiconductor detectors. Contributors discuss application of high-resolution CdTe detectors in gamma ray imaging and recent advances in positron emission tomography technology.

Electronically Scanned Arrays MATLAB® Modeling and Simulation

Electronically scanned arrays (ESAs) have become a key technology for sensor electronic systems. MATLAB® provides an excellent framework for ESA design and analysis, and this book is an invaluable resource for those who require simulation analysis tools that provide insight and understanding for ESA design. In addition to covering ESA fundamentals such as pattern synthesis, grating lobes, and instantaneous bandwidth, the text also provides insight into pattern optimization, subarray beamforming, space-based application of ESAs, and ESA reliability modeling. The book provides MATLAB code, giving readers an opportunity to model ESAs and develop an in-depth understanding that other books do not offer.

Fundamentals of Predictive Analytics with JMP, Second Edition

Written for students in undergraduate and graduate statistics courses, as well as for the practitioner who wants to make better decisions from data and models, this updated and expanded second edition of Fundamentals of Predictive Analytics with JMP(R) bridges the gap between courses on basic statistics, which focus on univariate and bivariate analysis, and courses on data mining and predictive analytics. Going beyond the theoretical foundation, this book gives you the technical knowledge and problem-solving skills that you need to perform real-world multivariate data analysis. First, this book teaches you to recognize when it is appropriate to use a tool, what variables and data are required, and what the results might be. Second, it teaches you how to interpret the results and then, step-by-step, how and where to perform and evaluate the analysis in JMP. Using JMP 13 and JMP 13 Pro, this book offers the following new and enhanced features in an example-driven format: an add-in for Microsoft Excel, Graph Builder, dirty data, visualization, regression, ANOVA, logistic regression, principal component analysis, LASSO, elastic net, cluster analysis, decision trees, k-nearest neighbors, neural networks, bootstrap forests, boosted trees, text mining, association rules, and model comparison. With today’s emphasis on business intelligence, business analytics, and predictive analytics, this second edition is invaluable to anyone who needs to expand his or her knowledge of statistics and to apply real-world, problem-solving analysis. This book is part of the SAS Press program.

A Data Scientist's Guide to Acquiring, Cleaning, and Managing Data in R

The only how-to guide offering a unified, systemic approach to acquiring, cleaning, and managing data in R.
Every experienced practitioner knows that preparing data for modeling is a painstaking, time-consuming process. Adding to the difficulty is that most modelers learn the steps involved in cleaning and managing data piecemeal, often on the fly, or they develop their own ad hoc methods. This book helps simplify their task by providing a unified, systematic approach to acquiring, modeling, manipulating, cleaning, and maintaining data in R. Starting with the very basics, data scientists Samuel E. Buttrey and Lyn R. Whitaker walk readers through the entire process. From what data looks like and what it should look like, they progress through all the steps involved in getting data ready for modeling. They describe best practices for acquiring data from numerous sources; explore key issues in data handling, including text/regular expressions, big data, parallel processing, merging, matching, and checking for duplicates; and outline highly efficient and reliable techniques for documenting data and recordkeeping, including audit trails, getting data back out of R, and more.
The only single-source guide to R data and its preparation, it describes best practices for acquiring, manipulating, cleaning, and maintaining data. Begins with the basics and walks readers through all the steps necessary to get data ready for the modeling process. Provides expert guidance on how to document the processes described so that they are reproducible. Written by seasoned professionals, it provides both introductory and advanced techniques. Features case studies with supporting data and R code, hosted on a companion website.
A Data Scientist's Guide to Acquiring, Cleaning and Managing Data in R is a valuable working resource/bench manual for practitioners who collect and analyze data, lab scientists and research associates of all levels of experience, and graduate-level data mining students.

Data Mining Algorithms in C++: Data Patterns and Algorithms for Modern Applications

Discover hidden relationships among the variables in your data, and learn how to exploit these relationships. This book presents a collection of data-mining algorithms that are effective in a wide variety of prediction and classification applications. All algorithms include an intuitive explanation of operation, essential equations, references to more rigorous theory, and commented C++ source code. Many of these techniques are recent developments, still not in widespread use. Others are standard algorithms given a fresh look. In every case, the focus is on practical applicability, with all code written in such a way that it can easily be included into any program. The Windows-based DATAMINE program lets you experiment with the techniques before incorporating them into your own work.
What You'll Learn: Use Monte-Carlo permutation tests to provide statistically sound assessments of relationships present in your data. Discover how combinatorially symmetric cross validation reveals whether your model has true power or has just learned noise by overfitting the data. Work with feature weighting as regularized energy-based learning to rank variables according to their predictive power when there is too little data for traditional methods. See how the eigenstructure of a dataset enables clustering of variables into groups that exist only within meaningful subspaces of the data. Plot regions of the variable space where there is disagreement between marginal and actual densities, or where contribution to mutual information is high.
Who This Book Is For: Anyone interested in discovering and exploiting relationships among variables. Although all code examples are written in C++, the algorithms are described in sufficient detail that they can easily be programmed in any language.
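To give a feel for the Monte-Carlo permutation tests listed in the description, here is a generic pure-Python sketch. It is not the book's C++ code: the choice of Pearson correlation as the test statistic, the function names, and the synthetic data are assumptions made for illustration. The idea is to shuffle one variable many times and count how often the shuffled relationship looks at least as strong as the observed one.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Estimate how often shuffled data matches or beats the observed |correlation|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any real pairing between x and y
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Count the unshuffled arrangement as one permutation so p is never zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic data: one variable truly depends on x, the other is pure noise.
data_rng = random.Random(42)
x = [float(i) for i in range(30)]
y_related = [2.0 * xi + data_rng.gauss(0.0, 5.0) for xi in x]
y_noise = [data_rng.gauss(0.0, 5.0) for _ in x]

p_related = permutation_p_value(x, y_related)
p_noise = permutation_p_value(x, y_noise)
print(f"p-value, related variable: {p_related:.4f}")
print(f"p-value, noise variable:   {p_noise:.4f}")
```

A small p-value for the related variable says the observed association would rarely arise from a random pairing; the noise variable carries no such guarantee either way.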

Pandas for Everyone: Python Data Analysis, First Edition

The Hands-On, Example-Rich Introduction to Pandas Data Analysis in Python.
Today, analysts must manage data characterized by extraordinary variety, velocity, and volume. Using the open source Pandas library, you can use Python to rapidly automate and perform virtually any data analysis task, no matter how large or complex. Pandas can help you ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. Pandas for Everyone brings together practical knowledge and insight for solving real problems with Pandas, even if you’re new to Python data analysis. Daniel Y. Chen introduces key concepts through simple but practical examples, incrementally building on them to solve more difficult, real-world problems. Chen gives you a jumpstart on using Pandas with a realistic dataset and covers combining datasets, handling missing data, and structuring datasets for easier analysis and visualization. He demonstrates powerful data cleaning techniques, from basic string manipulation to applying functions simultaneously across dataframes. Once your data is ready, Chen guides you through fitting models for prediction, clustering, inference, and exploration. He provides tips on performance and scalability, and introduces you to the wider Python data analysis ecosystem.
Work with DataFrames and Series, and import or export data. Create plots with matplotlib, seaborn, and pandas. Combine datasets and handle missing data. Reshape, tidy, and clean datasets so they’re easier to work with. Convert data types and manipulate text strings. Apply functions to scale data manipulations. Aggregate, transform, and filter large datasets with groupby. Leverage Pandas’ advanced date and time capabilities. Fit linear models using statsmodels and scikit-learn libraries. Use generalized linear modeling to fit models with different response variables. Compare multiple models to select the “best”. Regularize to overcome overfitting and improve performance. Use clustering in unsupervised machine learning.
Register your product at informit.com/register for convenient access to downloads, updates, and/or corrections as they become available.

The Power of Connection

A simple communication framework to begin practising today.
We all carry around the technology to stay connected 24/7, yet many of us are disengaged and challenged with our lack of communication skills. The Power of Connection provides you with practical, real-world solutions for improving your professional performance, your personal relationships and your outlook — one conversation at a time. Becoming a confident and compelling communicator might be the most important skill for leaders in the modern business landscape, parents in the modern home and individuals who use ‘self-talk’ to help shape their world. By adopting the simple strategies revealed in every chapter, you can become an unshakeable success at what you set out to do. This book is designed to help you start communicating better today, so start reading and start practicing with your very next conversation!
Understand your communication strengths and weaknesses. Become a better listener to build a deeper connection. Learn how communication sits at the heart of all relationships. Develop the skills to connect, inspire, engage and empower.
We are surrounded by noise, yet no one is actually saying anything we can connect with — or are we just not listening? Communication is a two-way street, and involves so much more than just speaking. The Power of Connection offers a quick and easy road map for your personal journey of growth and development that will make you a better parent, friend, spouse and employee. It's the right message for this time considering there's never a wrong time to level up your skills and become more effective at work, at home and in life.

Pro Power BI Desktop

Deliver eye-catching Business Intelligence with Microsoft Power BI Desktop. This new edition has been updated to cover all the latest features, including combo charts, Cartesian charts, trend lines, use of gauges, and more. Also covered are Top-N features, the ability to bin data into groupings and chart the groupings, and new techniques for detecting and handling outlier data points. You can take data from virtually any source and use it to produce stunning dashboards and compelling reports that will seize your audience’s attention. Slice and dice the data with remarkable ease and then add metrics and KPIs to project the insights that create your competitive advantage. Make raw data into clear, accurate, and interactive information with Microsoft’s free self-service business intelligence tool. Pro Power BI Desktop shows you how to choose from a wide range of built-in and third-party visualization types so that your message is always enhanced. You’ll be able to deliver those results on the PC, tablets, and smartphones, as well as share results via the cloud. This book helps you save time by preparing the underlying data correctly without needing an IT department to prepare it for you.
What You'll Learn: Deliver attention-grabbing information, turning data into insight. Mash up data from multiple sources into a cleansed and coherent data model. Create dashboards that help in monitoring key performance indicators of your business. Build interdependent charts, maps, and tables to deliver visually stunning information. Share business intelligence in the cloud without involving IT. Deliver visually stunning and interactive charts, maps, and tables. Find new insights as you chop and tweak your data as never before. Adapt delivery to mobile devices such as phones and tablets.
Who This Book Is For: Everyone from CEOs and Business Intelligence developers to power users and IT managers.

D3.js in Action, Second Edition

D3.js in Action, Second Edition is completely revised and updated for D3 v4 and ES6. It's a practical tutorial for creating interactive graphics and data-driven applications using D3.
About the Technology: Visualizing complex data is hard. Visualizing complex data on the web is darn near impossible without D3.js. D3 is a JavaScript library that provides a simple but powerful data visualization API over HTML, CSS, and SVG. Start with a structure, dataset, or algorithm; mix in D3; and you can programmatically generate static, animated, or interactive images that scale to any screen or browser. It's easy, and after a little practice, you'll be blown away by how beautiful your results can be!
About the Book: D3.js in Action, Second Edition is a completely updated revision of Manning's bestselling guide to data visualization with D3. You'll explore dozens of real-world examples in full-color, including force and network diagrams, workflow illustrations, geospatial constructions, and more! Along the way, you'll pick up best practices for building interactive graphics, animations, and live data representations. You'll also step through a fully interactive application created with D3 and React.
What's Inside: Rich full-color diagrams and illustrations. Updated for D3 v4 and ES6. Reusable layouts and components. Geospatial data visualizations. Mixed-mode rendering.
About the Reader: Suitable for web developers with HTML, CSS, and JavaScript skills. No specialized data science skills required.
About the Author: Elijah Meeks is a senior data visualization engineer at Netflix.
Quotes:
From basic to complex, this book gives you the tools to create beautiful data visualizations. - Claudio Rodriguez, Cox Media Group
The best reference for one of the most useful DataViz tools. - Jonathan Rioux, TD Insurance
From toy examples to techniques for real projects. Shows how all the pieces fit together. - Scott McKissock, USAID
A clever way to immerse yourself in the D3.js world. - Felipe Vildoso Castillo, University of Chile