talk-data.com

Topic

Data Analytics

data_analysis statistics insights

760

tagged

Activity Trend: 38 peak/qtr, 2020-Q1 to 2026-Q1

Activities

760 activities · Newest first

Python Data Analytics: Data Analysis and Science Using Pandas, matplotlib, and the Python Programming Language

Python Data Analytics will help you tackle the world of data acquisition and analysis using the power of the Python language. At the heart of this book lies the coverage of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Author Fabio Nelli expertly shows the strength of the Python programming language when applied to processing, managing and retrieving information. Inside, you will see how intuitive and flexible it is to discover and communicate meaningful patterns of data using Python scripts, reporting systems, and data export. This book examines how to go about obtaining, processing, storing, managing and analyzing data using the Python programming language. You will use Python and other open source tools to wrangle data and tease out interesting and important trends in that data that will allow you to predict future patterns. Whether you are dealing with sales data, investment data (stocks, bonds, etc.), medical data, web page usage, or any other type of data set, Python can be used to interpret, analyze, and glean information from a pile of numbers and statistics. This book is an invaluable reference with its examples of storing and accessing data in a database; it walks you through the process of report generation; it provides three real world case studies or examples that you can take with you for your everyday analysis needs.

SAS Essentials: Mastering SAS for Data Analytics, 2nd Edition

A step-by-step introduction to using SAS statistical software as a foundational approach to data analysis and interpretation. Presenting a straightforward introduction from the ground up, SAS Essentials: Mastering SAS for Data Analytics, Second Edition illustrates SAS using hands-on learning techniques and numerous real-world examples. Keeping different experience levels in mind, the highly qualified author team has developed the book over 20 years of teaching introductory SAS courses. The book is divided into two sections: the first provides an introduction to data manipulation, statistical techniques, and the SAS programming language, and the second introduces users to statistical analysis using SAS Procedures. Featuring self-contained chapters to enhance the learning process, the Second Edition also includes: programming approaches for the most up-to-date version of the SAS platform, including information on how to use the SAS University Edition; discussions that illustrate the concepts and highlight key fundamental computational skills used by business, government, and other organizations; new chapters on reporting results in tables and on factor analysis; additional information on the DATA step for data management, with an emphasis on importing data from other sources, combining data sets, and data cleaning; updated ANOVA and regression examples as well as other data analysis techniques; and a companion website with the discussed data sets, additional code, and related PowerPoint slides. SAS Essentials: Mastering SAS for Data Analytics, Second Edition is an ideal textbook for upper-undergraduate and graduate-level courses in statistics, data analytics, applied SAS programming, and statistical computer applications as well as an excellent supplement for statistical methodology courses.
The book is an appropriate reference for researchers and academicians who require a basic introduction to SAS for statistical analysis and for preparation for the Basic SAS Certification Exam.

QlikView Your Business

Unlock the meaning of your data with QlikView. The Qlik platform was designed to provide a fast and easy data analytics tool, and QlikView Your Business is your detailed, full-color, step-by-step guide to understanding QlikView's powerful features and techniques so you can quickly start unlocking your data’s potential. This expert author team brings real-world insight together with practical business analytics, so you can approach, explore, and solve business intelligence problems using the robust Qlik toolset and clearly communicate your results to stakeholders using powerful visualization features in QlikView and Qlik Sense. This book starts at the basic level and dives deep into the most advanced QlikView techniques, delivering tangible value and knowledge to new users and experienced developers alike. As an added benefit, every topic presented in the book is enhanced with tips, tricks, and insightful recommendations that the authors accumulated through years of developing QlikView analytics. This is the book for you if you are: a developer whose job is to load transactional data into the Qlik BI environment, and who needs to understand both the basics and the most advanced techniques of Qlik data modelling and scripting; a data analyst whose job is to develop actionable and insightful QlikView visualizations to share within your organization; or a project manager or business person who wants to get a better understanding of the Qlik Business Intelligence platform and its capabilities. What You Will Learn: The book covers three common business scenarios: Sales, Profitability, and Inventory Analysis. Each scenario contains four chapters, covering the four main disciplines of business analytics: Business Case, Data Modeling, Scripting, and Visualizations. The material is organized by increasing levels of complexity.
Following our comprehensive tutorial, you will learn simple and advanced QlikView and Qlik Sense concepts, including the following:
Data Modeling: transforming transactional data into dimensional models; building a Star Schema; linking multiple fact tables using Link Tables; combining multiple tables into a single fact table using Concatenated Fact models; managing slowly changing dimensions; advanced date handling, using the As of Date table; calculating running balances.
Basic and Advanced Scripting: how to use the Data Load Script language for implementing data modeling techniques; how to build and use the QVD data layer; building multi-tier data architectures; using variables, loops, subroutines, and other script control statements; advanced scripting techniques for a variety of ETL solutions.
Building Insightful Visualizations in QlikView: introduction to QlikView sheet objects (List Boxes, Text Objects, Charts, and more); designing insightful Dashboards in QlikView; using advanced calculation techniques, such as Set Analysis and Advanced Aggregation; using variables for What-If Analysis, as well as using variables for storing calculations, colors, and selection filters; advanced visualization techniques, including normalized and non-normalized Mekko charts, Waterfall charts, Whale Tail charts, and more.
Building Insightful Visualizations in Qlik Sense: introducing Qlik Sense, how it differs from QlikView and what is similar; creating Sense sheet objects; building and using the Library of Master Items; exploring Qlik Sense unique features (Storytelling, Geo Mapping, and using Extensions).
Whether you are jus

Machine Learning with R - Second Edition

Machine Learning with R (Second Edition) provides a thorough introduction to machine learning techniques and their application using the R programming language. You'll gain hands-on experience implementing various algorithms and solving real-world data challenges, making it an invaluable resource for aspiring data scientists and analysts.
What this Book will help me do: Understand the fundamentals of machine learning and its applications in data analysis. Master the use of R for cleaning, exploring, and visualizing data to prepare it for modeling. Build and apply machine learning models for classification, prediction, and clustering tasks. Evaluate and fine-tune model performance to ensure accurate predictions. Explore advanced topics like text mining, handling social network data, and big data analytics.
Author(s): Brett Lantz is a data scientist with significant experience as both a practitioner and communicator in the machine learning field. With a focus on accessibility, he aims to demystify complex concepts for readers interested in data science. His blend of hands-on methods and theoretical insight has made his work a favorite for both beginners and experienced professionals.
Who is it for? Ideal for data analysts and aspiring data scientists who have intermediate programming skills and are exploring machine learning. Perfect for R users ready to expand their skill set to include predictive modeling techniques. Also fits those with some experience in machine learning but new to the R environment. Provides insightful guidance for anyone looking to apply machine learning in practical, real-world scenarios.

IBM Software Defined Infrastructure for Big Data Analytics Workloads

This IBM® Redbooks® publication documents how IBM Platform Computing, with its IBM Platform Symphony® MapReduce framework, IBM Spectrum Scale (based upon IBM GPFS™), IBM Platform LSF®, and the Advanced Service Controller for Platform Symphony work together as an infrastructure to manage not just Hadoop-related offerings, but many popular industry offerings, such as Apache Spark, Storm, MongoDB, Cassandra, and so on. It describes the different ways to run Hadoop in a big data environment, and demonstrates how IBM Platform Computing solutions, such as Platform Symphony and Platform LSF with its MapReduce Accelerator, can improve performance and agility when running Hadoop on distributed workload managers offered by IBM. This information is for technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™ to help uncover insights in clients' data so they can optimize product development and business results.

Healthcare Data Analytics

Supplying a comprehensive overview of healthcare analytics research, Healthcare Data Analytics provides an understanding of the analytical techniques currently available to solve healthcare problems. The book details novel techniques for acquiring, handling, retrieving, and making best use of healthcare data. It analyzes recent developments in healthcare computing and discusses emerging technologies that can help improve the health and well-being of patients. Written by prominent researchers and experts working in the healthcare domain, it sheds light on the computational challenges in the field of medical informatics.

Implementation Best Practices for IBM DB2 BLU Acceleration with SAP BW on IBM Power Systems

BLU Acceleration is a new technology that has been developed by IBM® and integrated directly into the IBM DB2® engine. BLU Acceleration is a new storage engine with an integrated runtime (built directly into the core DB2 engine) to support the storage and analysis of column-organized tables. The BLU Acceleration processing is parallel to the regular, row-based table processing found in the DB2 engine. This is not a bolt-on technology, nor is it a separate analytic engine that sits outside of DB2. Much like when IBM added XML data as a first-class object within the database along with all the storage and processing enhancements that came with XML, now IBM has added column-organized tables directly into the storage and processing engine of DB2. This IBM Redbooks® publication shows examples on an IBM Power Systems™ entry server as a starter configuration for small organizations, and builds larger configurations with larger IBM Power Systems servers. This publication takes you through how to build a BLU Acceleration solution on IBM POWER® with an SAP landscape integrated into it. This publication implements SAP NetWeaver Business Warehouse systems as part of the scenario, using another DB2 feature called Near-Line Storage (NLS), on IBM POWER virtualization features, to develop and document best-practice scenarios. This publication is targeted towards technical professionals (DBAs, data architects, consultants, technical support staff, and IT specialists) responsible for delivering cost-effective data management solutions to provide the best system configuration for their clients' data analytics on Power Systems.

Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know

Features basic statistical concepts as a tool for thinking critically, wading through large quantities of information, and answering practical, everyday questions. Written in an engaging and inviting manner, Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know presents the more subjective side of statistics: the art of data analytics. Each chapter explores a different question using fun, common sense examples that illustrate the concepts, methods, and applications of statistical techniques. Without going into the specifics of theorems, propositions, or formulas, the book effectively demonstrates statistics as a useful problem-solving tool. In addition, the author demonstrates how statistics is a tool for thinking critically, wading through large volumes of information, and answering life's important questions. Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know also features: plentiful examples throughout, aimed at strengthening readers' understanding of the statistical concepts and methods; a step-by-step approach to elementary statistical topics such as sampling, hypothesis tests, outlier detection, normality tests, robust statistics, and multiple regression; a case study in each chapter that illustrates the use of the presented techniques; highlights of well-known shortcomings that can lead to false conclusions; and an introduction to advanced techniques such as validation and bootstrapping. Featuring examples that are engaging and non-application specific, the book appeals to a broad audience of students and professionals alike, specifically students of undergraduate statistics, managers, medical professionals, and anyone who has to make decisions based on raw data or compiled results.

Hadoop Essentials

In 'Hadoop Essentials,' you'll embark on an engaging journey to master the Hadoop ecosystem. This book covers fundamental to advanced topics, from HDFS and MapReduce to real-time analytics with Spark, empowering you to handle modern data challenges efficiently.
What this Book will help me do: Understand the core components of Hadoop, including HDFS, YARN, and MapReduce, for foundational knowledge. Learn to optimize Big Data architectures and improve application performance. Utilize tools like Hive and Pig for efficient data querying and processing. Master data ingestion technologies like Sqoop and Flume for seamless data management. Achieve fluency in real-time data analytics using modern tools like Apache Spark and Apache Storm.
Author(s): Achari is a seasoned expert in Big Data and distributed systems with in-depth knowledge of the Hadoop ecosystem. With years of experience in both development and teaching, they craft content that bridges practical know-how with theoretical insights in a highly accessible style.
Who is it for? This book is perfect for system and application developers aiming to learn practical applications of Hadoop. It suits professionals seeking solutions to real-world Big Data challenges as well as those familiar with distributed systems basics and looking to deepen their expertise in advanced data analysis.

The Security Data Lake

Companies of all sizes are considering data lakes as a way to deal with terabytes of security data that can help them conduct forensic investigations and serve as an early indicator to identify bad or relevant behavior. Many think about replacing their existing SIEM (security information and event management) systems with Hadoop running on commodity hardware. Before your company jumps into the deep end, you first need to weigh several critical factors. This O'Reilly report takes you through technological and design options for implementing a data lake. Each option not only supports your data analytics use cases, but is also accessible by processes, workflows, third-party tools, and teams across your organization. Within this report, you'll explore: five questions to ask before choosing architecture for your backend data store; how data lakes can overcome scalability and data duplication issues; different options for storing context and unstructured log data; data access use cases covering both search and analytical queries via SQL; processes necessary for ingesting data into a data lake, including parsing, enrichment, and aggregation; and four methods for embedding your SIEM into a data lake.

Big Data

Convert the promise of big data into real world results. There is so much buzz around big data. We all need to know what it is and how it works - that much is obvious. But is a basic understanding of the theory enough to hold your own in strategy meetings? Probably. But what will set you apart from the rest is actually knowing how to USE big data to get solid, real-world business results - and putting that in place to improve performance. Big Data will give you a clear understanding, blueprint, and step-by-step approach to building your own big data strategy. This is a much-needed practical introduction to actually putting the topic into practice. Illustrated with numerous real-world examples from a cross section of companies and organisations, Big Data will take you through the five steps of the SMART model: Start with Strategy, Measure Metrics and Data, Apply Analytics, Report Results, Transform. The book discusses how companies need to clearly define what it is they need to know; outlines how companies can collect relevant data and measure the metrics that will help them answer their most important business questions; addresses how the results of big data analytics can be visualised and communicated to ensure key decision-makers understand them; and includes many high-profile case studies from the author's work with some of the world's best known brands.

Apache Hive Essentials

Apache Hive Essentials is the perfect guide for understanding and mastering Hive, the SQL-like big data query language built on top of Hadoop. With this book, you will gain the skills to effectively use Hive to analyze and manage large data sets. Whether you're a developer, data analyst, or just curious about big data, this hands-on guide will enhance your capabilities.
What this Book will help me do: Understand the core concepts of Hive and its relation to big data and Hadoop. Learn how to set up a Hive environment and integrate it with Hadoop. Master the SQL-like query functionalities of Hive to select, manipulate, and analyze data. Develop custom functions in Hive to extend its functionality for your own specific use cases. Discover best practices for optimizing Hive performance and ensuring data security.
Author(s): Dayong Du is an expert in big data analytics with extensive experience in implementing and using tools like Hive in professional settings. Having worked on practical big data solutions, Dayong brings a wealth of knowledge and insights to his writing. His clear, approachable style makes complex topics accessible to readers.
Who is it for? This book is ideal for developers, data analysts, and data engineers looking to leverage Hive for big data analysis. If you are familiar with SQL and Hadoop basics and aim to enhance your understanding of Hive, this book is for you. Beginners with some programming background eager to dive into big data technologies will also benefit. It's tailored for learners wanting actionable knowledge to advance their data processing skills.

NoSQL For Dummies

Get up to speed on the nuances of NoSQL databases and what they mean for your organization. This easy-to-read guide to NoSQL databases provides the type of no-nonsense overview and analysis that you need to learn, including what NoSQL is and which database is right for you. Featuring specific evaluation criteria for NoSQL databases, along with a look into the pros and cons of the most popular options, NoSQL For Dummies provides the fastest and easiest way to dive into the details of this incredible technology. You'll gain an understanding of how to use NoSQL databases for mission-critical enterprise architectures and projects, and real-world examples reinforce the primary points to create an action-oriented resource for IT pros. If you're planning a big data project or platform, you probably already know you need to select a NoSQL database to complete your architecture. But with options flooding the market and updates and add-ons coming at a rapid pace, determining what you require now, and in the future, can be a tall task. This is where NoSQL For Dummies comes in! Learn the basic tenets of NoSQL databases and why they have come to the forefront as data has outpaced the capabilities of relational databases. Discover major players among NoSQL databases, including Cassandra, MongoDB, MarkLogic, Neo4J, and others. Get an in-depth look at the benefits and disadvantages of the wide variety of NoSQL database options. Explore the needs of your organization as they relate to the capabilities of specific NoSQL databases. Big data and Hadoop get all the attention, but when it comes down to it, NoSQL databases are the engines that power many big data analytics initiatives. With NoSQL For Dummies, you'll go beyond relational databases to ramp up your enterprise's data architecture in no time.

TIBCO Spotfire: A Comprehensive Primer

TIBCO Spotfire: A Comprehensive Primer is the go-to guide for mastering TIBCO Spotfire, a leading data visualization and analytics tool. Whether you are new to Spotfire or data visualization in general, this book will provide you with a solid foundation to create impactful and actionable visual insights.
What this Book will help me do: Understand the fundamentals of TIBCO Spotfire and its application in data analytics. Learn how to design compelling visualizations and dashboards that convey meaningful insights. Master advanced data transformations and analysis techniques in TIBCO Spotfire. Integrate Spotfire with external data sources and scripting languages, enhancing its functionality. Optimize Spotfire's performance and usability for enterprise-level implementations.
Author(s): Phillips, an experienced analytics professional and educator, specializes in creating accessible learning materials for data science tools. With a decade of experience in the field, Phillips has helped many organizations unlock their data potential through tools like TIBCO Spotfire. Their approach emphasizes practical understanding, making complex concepts approachable for learners of all levels.
Who is it for? The book is perfect for business analysts, data scientists, and other professionals involved in data-driven decision making who want to master TIBCO Spotfire. It's designed for beginners without prior exposure to data visualization or TIBCO Spotfire, offering an accessible entry into the field. Individuals aiming to gain hands-on experience and create enterprise-grade solutions will find this book invaluable. Additionally, it serves as a reference for experienced Spotfire users looking to refine their skills.

Learning Spark

Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.

Big Data Analytics

With this book, managers and decision makers are given the tools to make more informed decisions about big data purchasing initiatives. Big Data Analytics: A Practical Guide for Managers not only supplies descriptions of common tools, but also surveys the various products and vendors that supply the big data market. Comparing and contrasting the different types of analysis commonly conducted with big data, this accessible reference presents clear-cut explanations of the general workings of big data tools. Instead of spending time on HOW to install specific packages, it focuses on the reasons WHY readers would install a given package. The book provides authoritative guidance on a range of tools, including open source and proprietary systems. It details the strengths and weaknesses of incorporating big data analysis into decision-making and explains how to leverage the strengths while mitigating the weaknesses. It also describes the benefits of distributed computing in simple terms; includes substantial vendor/tool material, especially for open source decisions; covers prominent software packages, including Hadoop and Oracle Endeca; examines GIS and machine learning applications; and considers privacy and surveillance issues. The book further explores basic statistical concepts that, when misapplied, can be the source of errors. Time and again, big data is treated as an oracle that discovers results nobody would have imagined. While big data can serve this valuable function, all too often these results are incorrect, yet are still reported unquestioningly. The probability of having erroneous results increases as a larger number of variables are compared unless preventative measures are taken. The approach taken by the authors is to explain these concepts so managers can ask better questions of their analysts and vendors as to the appropriateness of the methods used to arrive at a conclusion.
Because the world of science and medicine has been grappling with similar issues in the publication of studies, the authors draw on their efforts and apply them to big data.
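The blurb's point about erroneous results growing with the number of comparisons can be illustrated with a small sketch. It is not taken from the book: it assumes independent tests, a conventional significance level of 0.05, and uses the Bonferroni correction as one common example of a "preventative measure". The function names are hypothetical.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    each run at significance level `alpha`.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 10, 100):
        uncorrected = familywise_error_rate(m)
        corrected = familywise_error_rate(m, bonferroni_alpha(m))
        print(f"{m:>3} comparisons: uncorrected={uncorrected:.3f}, "
              f"Bonferroni-corrected={corrected:.3f}")
```

With the per-test level left unchanged, the chance of at least one spurious finding climbs quickly as comparisons are added; dividing the level by the number of tests keeps it at or below the original threshold, at the cost of reduced statistical power.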

Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data

Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities, methods, and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software. This book will help you: become a contributor on a data science team; deploy a structured lifecycle approach to data analytics problems; apply appropriate analytic techniques and tools to analyze big data; learn how to tell a compelling story with data to drive business action; and prepare for EMC Proven Professional Data Science Certification. Corresponding data sets are available at www.wiley.com/go/9781118876138. Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!

Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset

Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and Map Reduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Big Top), and analysis (Hive). The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade, someone just like author and big data expert Mike Frampton. Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective, and it explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage. It explains, in an easily understood manner and through numerous examples, how to use each tool. The book also explains the sliding scale of tools available depending upon data size and when and how to use them.
Big Data Made Easy shows developers and architects, as well as testers and project managers, how to: store big data; configure big data; process big data; schedule processes; move data among SQL and NoSQL systems; monitor data; perform big data analytics; report on big data processes and projects; and test big data systems. Big Data Made Easy also explains the best part, which is that this toolset is free. Anyone can download it and, with the help of this book, start to use it within a day. With the skills this book will teach you under your belt, you will add value to your company or client immediately, not to mention your career.

R Recipes: A Problem-Solution Approach

R Recipes is your handy problem-solution reference for learning and using the popular R programming language for statistics and other numerical analysis. Packed with hundreds of code and visual recipes, this book helps you to quickly learn the fundamentals and explore the frontiers of programming, analyzing and using R. R Recipes provides textual and visual recipes for easy and productive templates for use and re-use in your day-to-day R programming and data analysis practice. Whether you're in finance, cloud computing, big or small data analytics, or other applied computational and data science - R Recipes should be a staple for your code reference library.

Big Data and Health Analytics

Data availability is surpassing existing paradigms for governing, managing, analyzing, and interpreting health data. Big Data and Health Analytics provides frameworks, use cases, and examples that illustrate the role of big data and analytics in modern health care, including how public health information can inform health delivery. Written for health care professionals and executives, this is not a technical book on the use of statistics and machine-learning algorithms for extracting knowledge out of data, nor a book on the intricacies of database design. Instead, this book presents the current thinking of academic and industry researchers and leaders from around the world. Using non-technical language, this book is accessible to health care professionals who might not have an IT and analytics background. It includes case studies that illustrate the business processes underlying the use of big data and health analytics to improve health care delivery. Highlighting lessons learned from the case studies, the book supplies readers with the foundation required for further specialized study in health analytics and data management. Coverage includes community health information, information visualization which offers interactive environments and analytic processes that support exploration of EHR data, the governance structure required to enable data analytics and use, federal regulations and the constraints they place on analytics, and information security. Links to websites, videos, articles, and other online content that expand and support the primary learning objectives for each major section of the book are also included to help you develop the skills you will need to achieve quality improvements in health care delivery through the effective use of data and analytics.