talk-data.com

Topic

Analytics

data_analysis insights metrics

4552

tagged

Activity Trend: 398 peak/qtr, 2020-Q1 to 2026-Q1

Activities

4552 activities · Newest first

IBM Software Defined Infrastructure for Big Data Analytics Workloads

This IBM® Redbooks® publication documents how IBM Platform Computing, with its IBM Platform Symphony® MapReduce framework, IBM Spectrum Scale (based on IBM GPFS™), IBM Platform LSF®, and the Advanced Service Controller for Platform Symphony work together as an infrastructure to manage not just Hadoop-related offerings, but many popular industry offerings, such as Apache Spark, Storm, MongoDB, Cassandra, and so on. It describes the different ways to run Hadoop in a big data environment, and demonstrates how IBM Platform Computing solutions, such as Platform Symphony and Platform LSF with its MapReduce Accelerator, can improve performance and agility when running Hadoop on distributed workload managers offered by IBM. This information is for technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™ to help uncover insights in clients' data so they can optimize product development and business results.

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Can the media analyst and the web analyst get along? Can the chasm between clicks and visits ever be crossed? Is attribution management the silver bullet that will, once and for all, accurately assign a value to banner ads? What the hell is Vizu? Can Michael use the word "whackadoo" in a coherent sentence? These questions and more get discussed and debated on this shockingly cordial episode of the Digital Analytics Power Hour that clocks in at 35 minutes.

Healthcare Data Analytics

Supplying a comprehensive overview of healthcare analytics research, Healthcare Data Analytics provides an understanding of the analytical techniques currently available to solve healthcare problems. The book details novel techniques for acquiring, handling, retrieving, and making best use of healthcare data. It analyzes recent developments in healthcare computing and discusses emerging technologies that can help improve the health and well-being of patients. Written by prominent researchers and experts working in the healthcare domain, it sheds light on the computational challenges in the field of medical informatics.

This episode discusses video game analytics with guest Anders Drachen. The way in which people get access to games and the opportunity for game designers to ask interesting questions with data have changed quite a bit in the last two decades. Anders shares his insights about the past, present, and future of game analytics. We explore not only some of the innovations and interesting ways of examining user experience in the gaming industry, but also touch on some of the exciting opportunities for innovation that are right on the horizon. You can find more from Anders online at andersdrachen.com, and follow him on Twitter at @andersdrachen.

Mastering Predictive Analytics with R

Dive into the realm of predictive analytics with this R-focused guide. Whether you're building your first model or refining complex analytics strategies, this book equips you with fundamental techniques and in-depth understanding of predictive modeling using R.

What this Book will help me do
Master the end-to-end predictive modeling process.
Classify and select suitable predictive models for specific use cases.
Understand the mechanics and assumptions of various predictive models.
Evaluate predictive model performance with appropriate metrics.
Enhance your R programming skills for analytical tasks.

Author(s)
The authors of this book combine strong technical expertise in data science and predictive analytics with extensive hands-on experience in applying them to real-world challenges. They excel at distilling complex topics into approachable, actionable steps for readers at varying levels of familiarity with R and data analysis. Their commitment to empowering learners defines their work.

Who is it for?
This book is perfect for budding data scientists and quantitative analysts with basic R knowledge who aspire to master predictive analytics. Even experienced professionals will find valuable model-specific insights. If you're familiar with basic statistics and eager to bridge the gap to robust machine learning applications, this book is for you.

Implementing an IBM InfoSphere BigInsights Cluster using Linux on Power

This IBM® Redbooks® publication demonstrates and documents how to implement and manage an IBM PowerLinux™ cluster for big data, focusing on hardware management, operating system provisioning, application provisioning, cluster readiness checks, hardware, operating system, IBM InfoSphere® BigInsights™, IBM Platform Symphony®, IBM Spectrum™ Scale (formerly IBM GPFS™), applications monitoring, and performance tuning. This publication shows that IBM PowerLinux clustering solutions (hardware and software) deliver significant value to clients that need cost-effective, highly scalable, and robust solutions for big data and analytics workloads. This book documents and addresses topics on how to use IBM Platform Cluster Manager to manage PowerLinux big data clusters through IBM InfoSphere BigInsights, Spectrum Scale, and Platform Symphony. This book documents how to set up and manage a big data cluster on PowerLinux servers to customize application and programming solutions, and to tune applications to use IBM hardware architectures. This document uses the architectural technologies and the software solutions that are available from IBM to help solve challenging technical and business problems. This book is targeted at technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for delivering cost-effective Linux on IBM Power Systems™ solutions that help uncover insights in clients' data so they can act to optimize business results, product development, and scientific discoveries.

The Last Mile of Analytics: Making the Leap from Platforms to Tools

Here's the net takeaway: Businesses want insights from data they can translate into meaningful actions and real results. Software vendors are beginning to deliver a new generation of advanced analytics packages that address business issues directly. In this O'Reilly report, Mike Barlow reveals how this new user-friendly software is helping businesses go beyond data analysis and straight to decision-making—without requiring data science expertise or truckloads of cash. How has advanced analytics progressed from lab project to commercial product so quickly? Through interviews with data analysts, you'll understand the role that machine learning plays in specialized analytics packages, and how this software alone can make decisions based on what's likely to happen next. When you have these capabilities, you’ve reached "the last mile of analytics."

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), John Lovett (Web Analytics Demystified), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Most people find the concept of governance about as interesting as an afternoon of quality control work at the beige paint factory. If you agree with this sentiment and are listening to this week's podcast, we hope to change your mind! With special guest John Lovett, Senior Partner at Web Analytics Demystified, Tim, Michael and Jim talk about what governance is for a digital analytics practice, why it's so darned important, and how anyone can get started. All of this AND a little poetry (really!) for the low, low price of 45 minutes of your time, in the Digital Analytics Power Hour.

Implementing IBM FlashSystem 900

Today's global organizations depend on being able to unlock business insights from massive volumes of data. Now, with IBM® FlashSystem™ 900, powered by IBM FlashCore™ technology, they can make faster decisions based on real-time insights and unleash the power of the most demanding applications, including online transaction processing (OLTP) and analytics databases, virtual desktop infrastructures (VDIs), technical computing applications, and cloud environments. This IBM Redbooks® publication introduces clients to the IBM FlashSystem® 900. It provides in-depth knowledge of the product architecture, software and hardware, implementation, and hints and tips. Also illustrated are use cases that show real-world solutions for tiering, flash-only, and preferred-read, and also examples of the benefits gained by integrating the FlashSystem storage into business environments. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand how to implement this new and exciting technology. This book describes the following offerings of the IBM Spectrum™ Storage family: IBM Spectrum Storage™, IBM Spectrum Control, IBM Spectrum Virtualize, IBM Spectrum Scale, and IBM Spectrum Accelerate.

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Sure, I like your theory, guys, but I want to hear some stories from the trenches! Episode 11 is all about the anecdotes, with the guys sharing stories about work they've done as analysts that had the most impact. Whether from hard work or a moment of inspiration, big wins with analysis are sometimes few and far between. Hear some great examples from Michael, Tim and Jim's personal experiences. What has six legs, three microphones and will make you a better analyst in 42 minutes? It's the Digital Analytics Power Hour.

Designing and Operating a Data Reservoir

Together, big data and analytics have tremendous potential to improve the way we use precious resources, to provide more personalized services, and to protect ourselves from unexpected and ill-intentioned activities. To fully use big data and analytics, an organization needs a system of insight. This is an ecosystem where individuals can locate and access data, and build visualizations and new analytical models that can be deployed into the IT systems to improve the operations of the organization. The data that is most valuable for analytics is also valuable in its own right and typically contains personal and private information about key people in the organization such as customers, employees, and suppliers. Although universal access to data is desirable, safeguards are necessary to protect people's privacy, prevent data leakage, and detect suspicious activity. The data reservoir is a reference architecture that balances the desire for easy access to data with information governance and security. The data reservoir reference architecture describes the technical capabilities necessary for a system of insight, while being independent of specific technologies. Being technology independent is important, because most organizations already have investments in data platforms that they want to incorporate in their solution. In addition, technology is continually improving, and the choice of technology is often dictated by the volume, variety, and velocity of the data being managed. A system of insight needs more than technology to succeed. The data reservoir reference architecture includes descriptions of governance and management processes and definitions to ensure that the human and business systems around the technology support a collaborative, self-service, and safe environment for data use.
The data reservoir reference architecture was first introduced in Governing and Managing Big Data for Analytics and Decision Makers, REDP-5120, which is available at: http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html. This IBM® Redbooks publication, Designing and Operating a Data Reservoir, builds on that material to provide more detail on the capabilities and internal workings of a data reservoir.

IBM Spectrum Scale (formerly GPFS)

This IBM® Redbooks® publication updates and complements the previous publication, Implementing the IBM General Parallel File System in a Cross Platform Environment, SG24-7844, with changes made since that version was released for IBM General Parallel File System (GPFS™). Since then, two releases have been made available, up to the latest version, IBM Spectrum™ Scale 4.1. Topics such as what is new in Spectrum Scale, Spectrum Scale licensing updates (Express/Standard/Advanced), Spectrum Scale infrastructure support/updates, storage support (IBM and OEM), operating system and platform support, Spectrum Scale global sharing - Active File Management (AFM), and considerations for the integration of Spectrum Scale in IBM Tivoli® Storage Manager (Spectrum Protect) backup solutions are discussed in this new IBM Redbooks publication. This publication provides additional topics such as planning, usability, best practices, monitoring, problem determination, and so on. The main goal of this publication is to bring you up to date with the latest features and capabilities of IBM Spectrum Scale, as the solution has become a key component of the reference architecture for clouds, analytics, mobile, social media, and much more. This publication targets technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™, helping to uncover insights in clients' data so they can take actions to optimize business results, product development, and scientific discoveries.

Mastering Pandas for Finance

"Mastering Pandas for Finance" takes a deep dive into applying Python and the pandas library to solve real-world financial data analysis problems. With a focus on financial modeling, backtesting trading strategies, and analyzing large datasets, this book equips you with the skills to leverage pandas effectively.

What this Book will help me do
Utilize pandas DataFrame for efficient financial data handling and manipulation.
Develop robust time-series models and perform statistical analysis on financial data.
Backtest algorithmic trading strategies including momentum and mean reversion.
Price complex financial options and calculate Value at Risk for portfolio management.
Optimize portfolio allocation and model financial performance using industry techniques.

Author(s)
Michael Heydt is an experienced software engineer and data scientist with a strong background in quantitative finance. He specializes in using Python for data analysis and has spent years teaching and writing about technical subjects. His detailed yet approachable writing style makes complex topics accessible to all.

Who is it for?
"Mastering Pandas for Finance" is perfect for finance professionals seeking to integrate Python into their workflows, data analysts exploring quantitative finance applications, and programmers aiming to specialize in financial analytics. Some baseline Python and pandas knowledge is recommended, but the book is structured to guide you effectively through advanced concepts too.

Marketing Data Science: Modeling Techniques in Predictive Analytics with R and Python

Now a leader of Northwestern University's prestigious analytics program presents a fully-integrated treatment of both the business and academic elements of marketing applications in predictive analytics. Writing for both managers and students, Thomas W. Miller explains essential concepts, principles, and theory in the context of real-world applications. Building on Miller's pioneering program, Marketing Data Science thoroughly addresses segmentation, target marketing, brand and product positioning, new product development, choice modeling, recommender systems, pricing research, retail site selection, demand estimation, sales forecasting, customer retention, and lifetime value analysis. Starting where Miller's widely-praised Modeling Techniques in Predictive Analytics left off, he integrates crucial information and insights that were previously segregated in texts on web analytics, network science, information technology, and programming.

Coverage includes:
The role of analytics in delivering effective messages on the web
Understanding the web by understanding its hidden structures
Being recognized on the web – and watching your own competitors
Visualizing networks and understanding communities within them
Measuring sentiment and making recommendations
Leveraging key data science methods: databases/data preparation, classical/Bayesian statistics, regression/classification, machine learning, and text analytics

Six complete case studies address exceptionally relevant issues such as: separating legitimate email from spam; identifying legally-relevant information for lawsuit discovery; gleaning insights from anonymous web surfing data, and more. This text's extensive set of web and network problems draws on rich public-domain data sources; many are accompanied by solutions in Python and/or R. Marketing Data Science will be an invaluable resource for all students, faculty, and professional marketers who want to use business analytics to improve marketing performance.

Implementation Best Practices for IBM DB2 BLU Acceleration with SAP BW on IBM Power Systems

BLU Acceleration is a new technology that has been developed by IBM® and integrated directly into the IBM DB2® engine. BLU Acceleration is a new storage engine along with integrated run time (directly into the core DB2 engine) to support the storage and analysis of column-organized tables. The BLU Acceleration processing is parallel to the regular, row-based table processing found in the DB2 engine. This is not a bolt-on technology, nor is it a separate analytic engine that sits outside of DB2. Much like when IBM added XML data as a first-class object within the database, along with all the storage and processing enhancements that came with XML, IBM has now added column-organized tables directly into the storage and processing engine of DB2. This IBM Redbooks® publication shows examples on an IBM Power Systems™ entry server as a starter configuration for small organizations, and builds larger configurations with larger IBM Power Systems servers. This publication takes you through how to build a BLU Acceleration solution on IBM POWER® with an SAP landscape integrated into it. This publication implements SAP NetWeaver Business Warehouse systems as part of the scenario, using another DB2 feature called Near-Line Storage (NLS) and IBM POWER virtualization features, to develop and document recommended best-practice scenarios. This publication is targeted towards technical professionals (DBAs, data architects, consultants, technical support staff, and IT specialists) responsible for delivering cost-effective data management solutions to provide the best system configuration for their clients' data analytics on Power Systems.

Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know

Features basic statistical concepts as a tool for thinking critically, wading through large quantities of information, and answering practical, everyday questions.

Written in an engaging and inviting manner, Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know presents the more subjective side of statistics: the art of data analytics. Each chapter explores a different question using fun, common sense examples that illustrate the concepts, methods, and applications of statistical techniques. Without going into the specifics of theorems, propositions, or formulas, the book effectively demonstrates statistics as a useful problem-solving tool. In addition, the author demonstrates how statistics is a tool for thinking critically, wading through large volumes of information, and answering life's important questions.

Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know also features:
Plentiful examples throughout aimed to strengthen readers' understanding of the statistical concepts and methods
A step-by-step approach to elementary statistical topics such as sampling, hypothesis tests, outlier detection, normality tests, robust statistics, and multiple regression
A case study in each chapter that illustrates the use of the presented techniques
Highlights of well-known shortcomings that can lead to false conclusions
An introduction to advanced techniques such as validation and bootstrapping

Featuring examples that are engaging and non-application specific, the book appeals to a broad audience of students and professionals alike, specifically students of undergraduate statistics, managers, medical professionals, and anyone who has to make decisions based on raw data or compiled results.

Google Analytics Integrations

Get a complete view of your customers and make your marketing analysis more meaningful.

How well do you really know your customers? Find out with the help of expert author Daniel Waisberg and Google Analytics Integrations. This unique guide takes you well beyond the basics of using Google Analytics to track metrics, showing you how to transform this simple data collection tool into a powerful, central marketing analysis platform for your organization. You'll learn how Google AdWords, AdSense, CRMs, and other data sources can be used together to deliver actionable insights about your customers and their behavior.

Explains proven techniques and best practices for collecting clean and accurate information from the start
Shows you how to import your organization's marketing and customer data into Google Analytics
Illustrates the importance of taking a holistic view of your customers and how this knowledge can transform your business
Provides step-by-step guidance on using the latest analytical tools and services to gain a complete understanding of your customers, their needs, and what motivates them to take action

Google Analytics Integrations is your in-depth guide to improving your data integration, behavioral analysis, and ultimately, your bottom line.

Big Data

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

About the Technology
Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size or speed. Fortunately, scale and simplicity are not mutually exclusive.

About the Book
Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.

What's Inside
Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills

About the Reader
This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

About the Authors
Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

Quotes
Transcends individual tools or platforms. Required reading for anyone working with big data systems. - Jonathan Esterhazy, Groupon
A comprehensive, example-driven tour of the Lambda Architecture with its originator as your guide. - Mark Fisher, Pivotal
Contains wisdom that can only be gathered after tackling many big data projects. A must-read. - Pere Ferrera Bertran, Datasalt
The de facto guide to streamlining your data pipeline in batch and near-real time. - Alex Holmes, Author of "Hadoop in Practice"

Hadoop Essentials

In 'Hadoop Essentials,' you'll embark on an engaging journey to master the Hadoop ecosystem. This book covers fundamental to advanced topics, from HDFS and MapReduce to real-time analytics with Spark, empowering you to handle modern data challenges efficiently.

What this Book will help me do
Understand the core components of Hadoop, including HDFS, YARN, and MapReduce, for foundational knowledge.
Learn to optimize Big Data architectures and improve application performance.
Utilize tools like Hive and Pig for efficient data querying and processing.
Master data ingestion technologies like Sqoop and Flume for seamless data management.
Achieve fluency in real-time data analytics using modern tools like Apache Spark and Apache Storm.

Author(s)
Achari is a seasoned expert in Big Data and distributed systems with in-depth knowledge of the Hadoop ecosystem. With years of experience in both development and teaching, they craft content that bridges practical know-how with theoretical insights in a highly accessible style.

Who is it for?
This book is perfect for system and application developers aiming to learn practical applications of Hadoop. It suits professionals seeking solutions to real-world Big Data challenges as well as those familiar with distributed systems basics and looking to deepen their expertise in advanced data analysis.

Apache Solr Search Patterns

Uncover advanced Apache Solr techniques in this professional guide. This book dives deeply into deploying and optimizing Solr-powered search engines and explores high-performance techniques. Learn to leverage your data with accessible, comprehensive, and practical insights.

What this Book will help me do
Learn to customize Solr's query scorer to provide tailored search results.
Understand the internals of Solr, including indexing and query facilities, for better optimization.
Implement scalable and reliable search clusters using SolrCloud.
Explore the use of Solr for spatial, e-commerce, and advertising searches.
Combine Solr with front-end technologies like AJAX and advanced tagging with FSTs.

Author(s)
Jayant Kumar, an experienced developer and search solutions architect, specializes in leveraging Apache Solr. With years of practical experience, he brings unique insights into scaling search platforms. His commitment to imparting clear, actionable knowledge is reflected in this focused resource.

Who is it for?
This book is ideal for software developers and architects working in the Solr ecosystem who are looking to enhance their expertise. If you are seeking to develop advanced and scalable solutions, master Solr's core capabilities, or improve your analytics and graph-generating skills, this book will support your goals.