talk-data.com

Topic

Analytics

data_analysis insights metrics

4552

tagged

Activity Trend: 398 peak/qtr, 2020-Q1 to 2026-Q1

Activities

4552 activities · Newest first

Real Time Analytics with SAP Hana

"Real Time Analytics with SAP HANA" offers a comprehensive, step-by-step guide to mastering analytics and data modeling in the powerful SAP HANA environment. This book covers everything from basic data modeling concepts to more advanced techniques like creating calculation views and leveraging SAP HANA artifacts. What this Book will help me do Understand and build analytics/data models in the SAP HANA environment. Create schemas, packages, and delivery units in SAP HANA Studio. Master real-time data replication using SLT and SAP HANA Studio. Learn about full-text search, fuzzy search, and other analytical capabilities in SAP HANA. Develop comprehensive use cases combining SAP HANA concepts and tools. Author(s) Vinay Singh, the author of this book, is a seasoned SAP HANA expert with extensive experience in analytics and data modeling. He has worked on multiple SAP HANA implementation and migration projects and brings this expertise into his writing. His practical examples and hands-on approach make SAP HANA concepts accessible to learners at all levels. Who is it for? This book is ideal for SAP HANA data modelers, developers, implementation or migration consultants, project managers, and architects. It is designed for individuals aiming to enhance their skill set in SAP HANA and master real-time analytics. Whether you are actively working with SAP HANA or just starting, this book will serve as a valuable guide.

Practical Graph Analytics with Apache Giraph

Practical Graph Analytics with Apache Giraph helps you build data mining and machine learning applications using the Apache Foundation’s Giraph framework for graph processing. This is the same framework as used by Facebook, Google, and other social media analytics operations to derive business value from vast amounts of interconnected data points. Graphs arise in a wealth of data scenarios and describe the connections that are naturally formed in both digital and real worlds. Examples of such connections abound in online social networks such as Facebook and Twitter, among users who rate movies from services like Netflix and Amazon Prime, and are useful even in the context of biological networks for scientific research. Whether in the context of business or science, viewing data as connected adds value by increasing the amount of information available to be drawn from that data and put to use in generating new revenue or scientific opportunities. Apache Giraph offers a simple yet flexible programming model targeted to graph algorithms and designed to scale easily to accommodate massive amounts of data. Originally developed at Yahoo!, Giraph is now a top-level project at the Apache Foundation, and it enlists contributors from companies such as Facebook, LinkedIn, and Twitter. Practical Graph Analytics with Apache Giraph brings the power of Apache Giraph to you, showing how to harness the power of graph processing for your own data by building sophisticated graph analytics applications using the very same framework that is relied upon by some of the largest players in the industry today.
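
The vertex-centric programming model mentioned above can be illustrated with a small sketch. Giraph itself is a Java framework; the plain-Python code below is not its API, only an assumed, simplified imitation of the superstep idea: in each round, every vertex reads its incoming messages, possibly updates its value, and sends messages to its neighbors. The function name `connected_components` and the sample edge list are hypothetical.

```python
from collections import defaultdict

def connected_components(edges):
    """Label each vertex with the smallest vertex id in its component."""
    # Build an undirected adjacency list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Superstep 0: every vertex starts with its own id as its label
    # and sends that label to all of its neighbors.
    labels = {v: v for v in neighbors}
    inbox = defaultdict(list)
    for v in neighbors:
        for n in neighbors[v]:
            inbox[n].append(labels[v])

    # Later supersteps: a vertex that receives a smaller label adopts it
    # and forwards it; vertices that see no improvement stay silent.
    # The computation halts when no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            best = min(messages)
            if best < labels[v]:
                labels[v] = best
                for n in neighbors[v]:
                    outbox[n].append(best)
        inbox = outbox
    return labels

if __name__ == "__main__":
    # Two components: {1, 2, 3} and {4, 5}.
    print(connected_components([(1, 2), (2, 3), (4, 5)]))
```

In this toy version all vertices live in one process; the appeal of the model is that the same per-vertex logic can be partitioned across many machines.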

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), David McBride (Intel), Moe Kiss (Canva), Michael Helbling (Search Discovery)

It's hard enough keeping up with the times when digital analytics is exclusively Desktop/Mobile/Tablet devices. Now, what if we had to work with data that came from everything? Join us this episode where we lean heavily on the wisdom and experience of Intel's David McBride, and talk about the Internet of Things, Measurement, and perhaps Millennials - all for the low low price of 50 minutes of your time.

People, places, and things referenced in this episode include:

Kickstarter wearables projects
Faraday Cage
IFTTT (If This Then That)
Maker Faire
Qualcomm
MIT Media Lab
Tom Emrich
Intel Curie
Raspberry Pi
SMS Audio

Learning to Love Data Science

Until recently, many people thought big data was a passing fad. "Data science" was an enigmatic term. Today, big data is taken seriously, and data science is considered downright sexy. With this anthology of reports from award-winning journalist Mike Barlow, you’ll appreciate how data science is fundamentally altering our world, for better and for worse. Barlow paints a picture of the emerging data space in broad strokes. From new techniques and tools to the use of data for social good, you’ll find out how far data science reaches. With this anthology, you’ll learn how: Analysts can now get results from their data queries in near real time Indie manufacturers are blurring the lines between hardware and software Companies try to balance their desire for rapid innovation with the need to tighten data security Advanced analytics and low-cost sensors are transforming equipment maintenance from a cost center to a profit center CIOs have gradually evolved from order takers to business innovators New analytics tools let businesses go beyond data analysis and straight to decision-making Mike Barlow is an award-winning journalist, author, and communications strategy consultant. Since launching his own firm, Cumulus Partners, he has represented major organizations in a number of industries.

Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem

Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem

With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models. Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it. Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more. This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.

Coverage Includes
Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce
Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses
Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters
Exploring the Hadoop Distributed File System (HDFS)
Understanding the essentials of MapReduce and YARN application programming
Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase
Observing application progress, controlling jobs, and managing workflows
Managing Hadoop efficiently with Apache Ambari, including recipes for HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration
Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark

VersaStack Solution by Cisco and IBM with SQL, Spectrum Control, and Spectrum Protect

Dynamic organizations want to accelerate growth while reducing costs. To do so, they must speed the deployment of business applications and adapt quickly to any changes in priorities. Organizations today require an IT infrastructure to be easy, efficient, and versatile. The VersaStack solution by Cisco and IBM® can help you accelerate the deployment of your data centers. It reduces costs by more efficiently managing information and resources while maintaining your ability to adapt to business change. The VersaStack solution combines the innovation of Cisco UCS Integrated Infrastructure with the efficiency of the IBM Storwize® storage system. The Cisco UCS Integrated Infrastructure includes the Cisco Unified Computing System (Cisco UCS), Cisco Nexus and Cisco MDS switches, and Cisco UCS Director. The IBM Storwize V7000 enhances virtual environments with its Data Virtualization, IBM Real-time Compression™, and IBM Easy Tier® features. These features deliver extraordinary levels of performance and efficiency. The VersaStack solution is Cisco Application Centric Infrastructure (ACI) ready. Your IT team can build, deploy, secure, and maintain applications through a more agile framework. Cisco Intercloud Fabric capabilities help enable the creation of open and highly secure solutions for the hybrid cloud. These solutions accelerate your IT transformation while delivering dramatic improvements in operational efficiency and simplicity. Cisco and IBM are global leaders in the IT industry. The VersaStack solution gives you the opportunity to take advantage of integrated infrastructure solutions that are targeted at enterprise applications, analytics, and cloud solutions. The VersaStack solution is backed by Cisco Validated Designs (CVD) to provide faster delivery of applications, greater IT efficiency, and less risk. 
This IBM Redbooks® publication is aimed at experienced storage administrators who are tasked with deploying a VersaStack solution with Microsoft SQL Server, IBM Spectrum™ Protect, and IBM Spectrum Control™.

Sams Teach Yourself: Big Data Analytics with Microsoft HDInsight in 24 Hours

In just 24 lessons of one hour or less, Sams Teach Yourself Big Data Analytics with Microsoft HDInsight in 24 Hours helps you leverage Hadoop’s power on a flexible, scalable cloud platform using Microsoft’s newest business intelligence, visualization, and productivity tools. This book’s straightforward, step-by-step approach shows you how to provision, configure, monitor, and troubleshoot HDInsight and use Hadoop cloud services to solve real analytics problems. You’ll gain more of Hadoop’s benefits, with less complexity, even if you’re completely new to Big Data analytics. Every lesson builds on what you’ve already learned, giving you a rock-solid foundation for real-world success.

Practical, hands-on examples show you how to apply what you learn
Quizzes and exercises help you test your knowledge and stretch your skills
Notes and tips point out shortcuts and solutions

Learn how to…
Master core Big Data and NoSQL concepts, value propositions, and use cases
Work with key Hadoop features, such as HDFS2 and YARN
Quickly install, configure, and monitor Hadoop (HDInsight) clusters in the cloud
Automate provisioning, customize clusters, install additional Hadoop projects, and administer clusters
Integrate, analyze, and report with Microsoft BI and Power BI
Automate workflows for data transformation, integration, and other tasks
Use Apache HBase on HDInsight
Use Sqoop or SSIS to move data to or from HDInsight
Perform R-based statistical computing on HDInsight datasets
Accelerate analytics with Apache Spark
Run real-time analytics on high-velocity data streams
Write MapReduce, Hive, and Pig programs

Register your book at informit.com/register for convenient access to downloads, updates, and corrections as they become available.

Data Preparation in the Big Data Era

Preparing and cleaning data is notoriously expensive, prone to error, and time consuming: the process accounts for roughly 80% of the total time spent on analysis. As this O’Reilly report points out, enterprises have already invested billions of dollars in big data analytics, so there’s great incentive to modernize methods for cleaning, combining, and transforming data. Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, details best practices for reducing the time it takes to convert raw data into actionable insights. With these tools and techniques in mind, your organization will be well positioned to translate big data into big decisions. Explore the problems organizations face today with traditional prep and integration Define the business questions you want to address before selecting, prepping, and analyzing data Learn new methods for preparing raw data, including date-time and string data Understand how some cleaning actions (like replacing missing values) affect your analysis Examine data curation products: modern approaches that scale Consider your business audience when choosing ways to deliver your analysis
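
The point that cleaning actions such as replacing missing values affect the analysis can be made concrete with a small sketch. The example below is not taken from the report; the readings are made-up values and mean imputation is just one assumed cleaning strategy. It compares summary statistics computed on the observed values only with those computed after filling the gaps with the observed mean.

```python
import statistics

# Hypothetical readings; None marks a missing value.
readings = [10.0, 12.0, None, 11.0, None, 30.0]

# Keep only the values that were actually observed.
observed = [x for x in readings if x is not None]
mean_observed = statistics.mean(observed)

# Mean imputation: fill each gap with the mean of the observed values.
imputed = [mean_observed if x is None else x for x in readings]

def summarize(values):
    """Return count, mean, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

before = summarize(observed)
after = summarize(imputed)

if __name__ == "__main__":
    print("observed only:", before)
    print("mean-imputed: ", after)
```

The mean is unchanged by construction, but the imputed series has more rows and a smaller standard deviation, so any downstream step that relies on the spread of the data (confidence intervals, outlier thresholds) will behave differently.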

Fast Data: Smart and at Scale

The need for fast data applications is growing rapidly, driven by the IoT, the surge in machine-to-machine (M2M) data, global mobile device proliferation, and the monetization of SaaS platforms. So how do you combine real-time, streaming analytics with real-time decisions in an architecture that’s reliable, scalable, and simple? In this O’Reilly report, Ryan Betts and John Hugg from VoltDB examine ways to develop apps for fast data, using pre-defined patterns. These patterns are general enough to suit both the do-it-yourself, hybrid batch/streaming approach, as well as the simpler, proven in-memory approach available with certain fast database offerings. Their goal is to create a collection of fast data app development recipes. We welcome your contributions, which will be tested and included in future editions of this report.

Hadoop with Python

Hadoop is mostly written in Java, but that doesn't exclude the use of other programming languages with this distributed storage and processing framework, particularly Python. With this concise book, you’ll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the Apache Spark cluster-computing framework. Authors Zachary Radtka and Donald Miner from the data science firm Miner & Kasch take you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark. Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools. Use the Python library Snakebite to access HDFS programmatically from within Python applications Write MapReduce jobs in Python with mrjob, the Python MapReduce library Extend Pig Latin with user-defined functions (UDFs) in Python Use the Spark Python API (PySpark) to write Spark programs with Python Learn how to use the Luigi Python workflow scheduler to manage MapReduce jobs and Pig scripts Zachary Radtka, a platform engineer at Miner & Kasch, has extensive experience creating custom analytics that run on petabyte-scale data sets.
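
As a rough illustration of the MapReduce model the book applies from Python, the sketch below runs a word count through explicit map, shuffle/sort, and reduce phases. It deliberately uses only the standard library and does not call Hadoop, mrjob, or PySpark; the function names `mapper`, `reducer`, and `run_job` are hypothetical and chosen for this example only.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1

def reducer(word, counts):
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)

def run_job(lines):
    """Simulate a single-process MapReduce word count."""
    # Map: apply the mapper to every input record.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Shuffle and sort: bring all pairs with the same key together.
    pairs.sort(key=itemgetter(0))
    # Reduce: call the reducer once per distinct key.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )

if __name__ == "__main__":
    print(run_job(["the quick fox", "the lazy dog"]))
```

On a real cluster the framework performs the shuffle across machines; only the mapper and reducer logic would be written by the user.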

The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI

This comprehensive and authoritative guide will teach you the DAX language for business intelligence, data modeling, and analytics. Leading Microsoft BI consultants Marco Russo and Alberto Ferrari help you master everything from table functions through advanced code and model optimization. You’ll learn exactly what happens under the hood when you run a DAX expression, how DAX behaves differently from other languages, and how to use this knowledge to write fast, robust code. If you want to leverage all of DAX’s remarkable power and flexibility, this no-compromise “deep dive” is exactly what you need. Perform powerful data analysis with DAX for Microsoft SQL Server Analysis Services, Excel, and Power BI Master core DAX concepts, including calculated columns, measures, and error handling Understand evaluation contexts and the CALCULATE and CALCULATETABLE functions Perform time-based calculations: YTD, MTD, previous year, working days, and more Work with expanded tables, complex functions, and elaborate DAX expressions Perform calculations over hierarchies, including parent/child hierarchies Use DAX to express diverse and unusual relationships Measure DAX query performance with SQL Server Profiler and DAX Studio

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), Moe Kiss (Canva), Michael Helbling (Search Discovery), Ben Gaines

In this episode, we've simulated a lobby bar at the end of Adobe Summit, with Ben Gaines dropping by and everyone temporarily tapped out on talk of eVars, s.Props, derived metrics, and classifications for a bit. The result? A conversation that quickly turns to an adjacent passion of many digital analysts: sports analytics. Baseball, basketball, football, e-sports, the CFL, and even a fairly obscure game played on ice with a stick. Surprisingly, the discussion loops back to the parallels of sports analytics to digital analytics time and time again. This Power Hour clocks in almost 10 minutes shorter than a regulation NBA game.

People, places, and things referenced in this episode include:

Moneyball
Fivethirtyeight.com
Nate Silver and The Signal and the Noise
Grantland
Nylon Calculus
stats.nba.com
baseball-reference.com
nhl-reference.com
The Ottawa Redblacks

Dashboards for Excel

The book takes a hands-on approach to developing dashboards, from instructing users on advanced Excel techniques to addressing dashboard pitfalls common in the real world. Dashboards for Excel is your key to creating informative, actionable, and interactive dashboards and decision support systems. Throughout the book, the reader is challenged to think about Excel and data analytics differently—that is, to think outside the cell. This book shows you how to create dashboards in Excel quickly and effectively. In this book, you learn how to: Apply data visualization principles for more effective dashboards Employ dynamic charts and tables to create dashboards that are constantly up-to-date and providing fresh information Use understated yet powerful formulas for Excel development Apply advanced Excel techniques mixing formulas and Visual Basic for Applications (VBA) to create interactive dashboards Create dynamic systems for decision support in your organization Avoid common problems in Excel development and dashboard creation Get started with the Excel data model, PowerPivot, and Power Query

Building a Recommendation System with R

Dive into building recommendation systems with R in this comprehensive guide. You will learn about data mining, machine learning, and how R's powerful libraries and tools can be utilized to create efficient and optimized recommendation engines. By the end of this book, you will have the expertise to develop custom solutions tailored to specific data and use cases.

What this Book will help me do
Master the foundations of recommendation systems and their applications.
Understand and implement essential data preprocessing techniques.
Learn to optimize recommendation algorithms for better efficiency.
Explore the use of the recommenderlab package in R for building models.
Gain hands-on experience through a complete case study building a recommendation engine.

Author(s)
Usuelli is a seasoned data scientist and R programming enthusiast passionate about machine learning and data analysis. They have extensive experience in developing recommendation systems for various industries, leveraging the power of R for robust solutions. Usuelli's clear teaching approach makes complex concepts accessible to learners of all levels.

Who is it for?
This book is ideal for developers who already possess a fundamental understanding of R and basic machine learning principles. If you aim to deepen your knowledge in creating advanced recommendation systems and practically apply these concepts, this book is the perfect resource for you. It is an excellent guide for professionals looking to specialize in predictive analytics and systems design.

IBM Software for SAP Solutions

SAP is a market leader in enterprise business application software. SAP solutions provide a rich set of composable application modules, and configurable functional capabilities that are expected from a comprehensive enterprise business application software suite. In most cases, companies that adopt SAP software remain heterogeneous enterprises running both SAP and non-SAP systems to support their business processes. Regardless of the specific scenario, in heterogeneous enterprises most SAP implementations must be integrated with a variety of non-SAP enterprise systems: Portals Messaging infrastructure Business process management (BPM) tools Enterprise Content Management (ECM) methods and tools Business analytics (BA) and business intelligence (BI) technologies Security Systems of record Systems of engagement The tooling included with SAP software addresses many needs for creating SAP-centric environments. However, the classic approach to implementing SAP functionality generally leaves the business with a rigid solution that is difficult and expensive to change and enhance. When SAP software is used in a large, heterogeneous enterprise environment, SAP clients face the dilemma of selecting the correct set of tools and platforms to implement SAP functionality, and to integrate the SAP solutions with non-SAP systems. This IBM® Redbooks® publication explains the value of integrating IBM software with SAP solutions. It describes how to enhance and extend pre-built capabilities in SAP software with best-in-class IBM enterprise software, enabling clients to maximize return on investment (ROI) in their SAP investment and achieve a balanced enterprise architecture approach. This book describes IBM Reference Architecture for SAP, a prescriptive blueprint for using IBM software in SAP solutions. The reference architecture is focused on defining the use of IBM software with SAP, and is not intended to address the internal aspects of SAP components. 
The chapters of this book provide a specific reference architecture for many of the architectural domains that are each important for a large enterprise to establish common strategy, efficiency, and balance. The majority of the most important architectural domain topics, such as integration, process optimization, master data management, mobile access, Enterprise Content Management, business intelligence, DevOps, security, systems monitoring, and so on, are covered in the book. However, there are several other architectural domains which are not included in the book. This is not to imply that these other architectural domains are not important or are less important, or that IBM does not offer a solution to address them. It is only reflective of time constraints, available resources, and the complexity of assembling a book on an extremely broad topic. Although more content could have been added, the authors feel confident that the scope of architectural material that has been included should provide organizations with a fantastic head start in defining their own enterprise reference architecture for many of the important architectural domains, and it is hoped that this book provides great value to those reading it. This IBM Redbooks publication is targeted to the following audiences: Client decision makers and solution architects leading enterprise transformation projects and wanting to gain further insight so that they can benefit from the integration of IBM software in large-scale SAP projects. IT architects and consultants integrating IBM technology with SAP solutions.

Data Analysis in the Cloud

Data Analysis in the Cloud introduces and discusses models, methods, techniques, and systems to analyze the large number of digital data sources available on the Internet using the computing and storage facilities of the cloud. Coverage includes scalable data mining and knowledge discovery techniques together with cloud computing concepts, models, and systems. Specific sections focus on map-reduce and NoSQL models. The book also includes techniques for conducting high-performance distributed analysis of large data on clouds. Finally, the book examines research trends such as Big Data pervasive computing, data-intensive exascale computing, and massive social network analysis. Introduces data analysis techniques and cloud computing concepts Describes cloud-based models and systems for Big Data analytics Provides examples of the state-of-the-art in cloud data analysis Explains how to develop large-scale data mining applications on clouds Outlines the main research trends in the area of scalable Big Data analysis

Managing Ever-Increasing Amounts of Data with IBM DB2 for z/OS: Using Temporal Data Management, Archive Transparency, and the DB2 Analytics Accelerator

IBM® DB2® Version 11.1 for z/OS® (DB2 11 for z/OS or just DB2 11 throughout this book) is the fifteenth release of DB2 for IBM MVS™. The DB2 11 environment is available either for new installations of DB2 or for migrations from DB2 10 for z/OS subsystems only. This IBM Redbooks® publication describes enhancements that are available with DB2 11 for z/OS. The contents help database administrators to understand the new extensions and performance enhancements, to plan for ways to use the key new capabilities, and to justify the investment in installing or migrating to DB2 11. Businesses are faced with a global and increasingly competitive business environment, and they need to collect and analyze ever increasing amounts of data (Figure 1). Governments also need to collect and analyze large amounts of data. The main focus of this book is to introduce recent DB2 capability that can be used to address challenges facing organizations with storing and analyzing exploding amounts of business or organizational data, while managing risk and trying to meet new regulatory and compliance requirements. This book describes recent extensions to DB2 for z/OS in V10 and V11 that can help organizations address these challenges.

Beginning Big Data with Power BI and Excel 2013

In Beginning Big Data with Power BI and Excel 2013, you will learn to solve business problems by tapping the power of Microsoft’s Excel and Power BI to import data from NoSQL and SQL databases and other sources, create relational data models, and analyze business problems through sophisticated dashboards and data-driven maps. While Beginning Big Data with Power BI and Excel 2013 covers prominent tools such as Hadoop and the NoSQL databases, it recognizes that most small and medium-sized businesses don’t have the Big Data processing needs of a Netflix, Target, or Facebook. Instead, it shows how to import data and use the self-service analytics available in Excel with Power BI. As you’ll see through the book’s numerous case examples, these tools, which you already know how to use, can perform many of the same functions as the higher-end Apache tools many people believe are required to carry out Big Data projects.

Through instruction, insight, advice, and case studies, Beginning Big Data with Power BI and Excel 2013 will show you how to:
Import and mash up data from web pages, SQL and NoSQL databases, the Azure Marketplace and other sources.
Tap into the analytical power of PivotTables and PivotCharts and develop relational data models to track trends and make predictions based on a wide range of data.
Understand basic statistics and use Excel with Power BI to do sophisticated statistical analysis, including identifying trends and correlations.
Use SQL within Excel to do sophisticated queries across multiple tables, including NoSQL databases.
Create complex formulas to solve real-world business problems using Data Analysis Expressions (DAX).

Data Analytics in Sports

As any child with a baseball card intuitively knows, sports and statistics go hand-in-hand. Yet, the general media disdain the flood of sports statistics available today: sports are pure and analytic tools are not. Well, if the so-called purists find tools like baseball’s sabermetrics upsetting, then they’d better brace themselves for the new wave of data analytics. In this O’Reilly report, Janine Barlow examines how advanced predictive analytics are impacting the world of sports—from the rise of tools such as Major League Baseball’s Statcast, which collects data on the movement of balls and players, to SportVU, which the National Basketball Association uses to collect spatial analysis data. You’ll also learn: How "Dance Card" makes accurate predictions about NCAA’s "March Madness" tournament Why data is crumbling long-standing myths about performance in soccer How the National Football League is using wearable devices to collect vital health data about its players It’s a new world in sports, where data analytics and related information technologies are changing the experience for teams, players, fans, and investors.

Getting Data Right

Over the last 20 years, companies have invested roughly $3-4 trillion in enterprise software. These investments have been primarily focused on the development and deployment of single systems, applications, functions, and geographies targeted at the automation and optimization of key business processes. Companies are now investing heavily in big data analytics ($44 billion in 2014 alone) in an effort to begin analyzing all of the data being generated from their process automation systems. But companies are quickly realizing that one of their key bottlenecks is Data Variety: the siloed nature of the data that is a natural result of internal and external source proliferation. The problem of big data variety has crept up from the bottom, and the cost of variety is only appreciated when companies attempt to ask simple questions across many business silos (divisions, geographies, functions, etc.). Current top-down, deterministic data unification approaches (such as ETL, ELT, and MDM) were simply not designed to scale to the variety of hundreds or thousands or even tens of thousands of data silos. Download this free eBook to learn about the fundamental challenges that Data Variety poses to enterprises looking to maximize the value of their existing investments, and how new approaches promise to help organizations embrace and leverage the fundamental diversity of data. Readers will also find best practices for designing bottom-up and probabilistic methods for finding and managing data; principles for doing data science at scale in the big data era; preparing and unifying data in ways that complement existing systems; optimizing data warehousing; and how to use “data ops” to automate large-scale integration.