Data Analytics

Handbook of Statistics

2013-05-16 · O'Reilly Data Science Books O'Reilly Amazon

book

by C.R. Rao , Venu Govindaraju

AI/ML Analytics Cyber Security data data-science data-science-tasks statistics

Statistical learning and analysis techniques have become extremely important today, given the tremendous growth in the size of heterogeneous data collections and the ability to process them even from physically distant locations. Recent advances in machine learning provide a strong framework for robust learning from diverse corpora and continue to impact a variety of research problems across multiple scientific disciplines. The aim of this handbook is to familiarize beginners as well as experts with some of the recent techniques in this field. The Handbook is divided into two sections, Theory and Applications, covering machine learning, data analytics, biometrics, document recognition and security. Key features: very relevant to current research challenges faced in various fields; a self-contained reference to machine learning; an emphasis on applications-oriented techniques.

Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses

2013-01-22 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Ambiga Dhiraj , Michael Minelli , Michele Chambers

Analytics BI Big Data Cloud Computing DataViz Marketing business-intelligence data data-science

A unique perspective on the big data analytics phenomenon for both business and IT professionals. The availability of Big Data, low-cost commodity hardware and new information management and analytics software has produced a unique moment in the history of business. The convergence of these trends means that we have the capabilities required to analyze astonishing data sets quickly and cost-effectively for the first time in history. These capabilities are neither theoretical nor trivial. They represent a genuine leap forward and a clear opportunity to realize enormous gains in terms of efficiency, productivity, revenue and profitability. The Age of Big Data is here, and these are truly revolutionary times. This timely book looks at cutting-edge companies supporting an exciting new generation of business analytics. Learn more about the trends in big data and how they are impacting the business world (Risk, Marketing, Healthcare, Financial Services, etc.). The book explains this new technology and how companies can use it effectively to gather the data that they need and glean critical insights. It also explores relevant topics such as data privacy, data visualization, unstructured data, crowdsourcing data scientists, cloud computing for big data, and much more.

Big Data Analytics: Turning Big Data into Big Money

2012-11-28 · O'Reilly Data Science Books O'Reilly Amazon

book

by Frank J. Ohlhorst

Analytics Big Data business-intelligence data data-science

Unique insights to implement big data analytics and reap big returns to your bottom line. Focusing on the business and financial value of big data analytics, respected technology journalist Frank J. Ohlhorst shares his insights on the newly emerging field of big data analytics in Big Data Analytics. This breakthrough book demonstrates the importance of analytics, defines the processes, highlights the tangible and intangible values and discusses how you can turn a business liability into actionable material that can be used to redefine markets, improve profits and identify new business opportunities. It reveals big data analytics as the next wave for businesses looking for competitive advantage, takes an in-depth look at the financial value of big data analytics, and offers tools and best practices for working with big data. Once the domain of large on-line retailers such as eBay and Amazon, big data is now accessible by businesses of all sizes and across industries. From how to mine the data your company collects, to the data that is available on the outside, Big Data Analytics shows how you can leverage big data into a key component in your business's growth strategy.

Business Intelligence Applied: Implementing an Effective Information and Communications Technology Infrastructure

2012-10-30 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Michael S. Gendron

Analytics BI Big Data Cloud Computing DWH business-intelligence data data-science

Expert guidance for building an information and communication technology infrastructure that provides the best in business intelligence. Enterprise performance management (EPM) technology has been rapidly advancing, especially in the areas of predictive analysis and cloud-based solutions. Business intelligence caught on as a concept in the business world as the business strategy application of data warehousing in the early 2000s. With the recent surge in interest in data analytics and big data, it has seen a renewed level of interest as the ability of a business to find the valuable data in a timely and competitive fashion. Business Intelligence Applied reveals essential information for building an optimal and effective information and communication technology (ICT) infrastructure. It defines ICT infrastructure; examines best practices for documenting business change and for documenting technology recommendations; includes examples and cases from Europe and Asia; and is written for business intelligence staff, CIOs, CTOs, and technology managers. With examples and cases from Europe and Asia, Business Intelligence Applied expertly covers business intelligence, a hot topic in business today and a key element of business and data analytics.

Service-Oriented Distributed Knowledge Discovery

2012-10-05 · O'Reilly Data Science Books O'Reilly Amazon

book

by Paolo Trunfio , Domenico Talia

AI/ML Analytics data data-science data-science-tasks exploratory-data-analysis

A new approach to distributed large-scale data mining, service-oriented knowledge discovery extracts useful knowledge from often unmanageable volumes of data by exploiting data mining and machine learning distributed models and techniques in service-oriented infrastructures. Service-Oriented Distributed Knowledge Discovery presents techniques, algorithms, and systems based on the service-oriented paradigm. It explains how to design services for data analytics, describes real systems for implementing distributed knowledge discovery applications, and explores mobile data mining models.

Principles of Data Integration

2012-06-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Zachary Ives , AnHai Doan , Alon Halevy

Analytics Cloud Computing DWH data data-engineering

Principles of Data Integration is the first comprehensive textbook of data integration, covering theoretical principles and implementation issues as well as current challenges raised by the semantic web and cloud computing. The book offers a range of data integration solutions enabling you to focus on what is most relevant to the problem at hand. Readers will also learn how to build their own algorithms and implement their own data integration applications. Written by three of the most respected experts in the field, this book provides an extensive introduction to the theory and concepts underlying today's data integration techniques, with detailed instruction for their application and concrete examples throughout to explain the concepts. This text is an ideal resource for database practitioners in industry, including data warehouse engineers, database system designers, data architects/enterprise architects, database researchers, statisticians, and data analysts; students in data analytics and knowledge discovery; and other data professionals working at the R&D and implementation levels.

Taming The Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics

2012-04-24 · O'Reilly Data Science Books O'Reilly Amazon

book

by Bill Franks (The International Institute For Analytics (IIA))

Analytics Big Data data data-science

You receive an e-mail. It contains an offer for a complete personal computer system. It seems like the retailer read your mind since you were exploring computers on their web site just a few hours prior.... As you drive to the store to buy the computer bundle, you get an offer for a discounted coffee from the coffee shop you are getting ready to drive past. It says that since you're in the area, you can get 10% off if you stop by in the next 20 minutes.... As you drink your coffee, you receive an apology from the manufacturer of a product that you complained about yesterday on your Facebook page, as well as on the company's web site.... Finally, once you get back home, you receive notice of a special armor upgrade available for purchase in your favorite online video game. It is just what is needed to get past some spots you've been struggling with.... Sound crazy? Are these things that can only happen in the distant future? No. All of these scenarios are possible today! Big data. Advanced analytics. Big data analytics. It seems you can't escape such terms today. Everywhere you turn people are discussing, writing about, and promoting big data and advanced analytics. Well, you can now add this book to the discussion. What is real and what is hype? Such attention can lead one to the suspicion that perhaps the analysis of big data is something that is more hype than substance. While there has been a lot of hype over the past few years, the reality is that we are in a transformative era in terms of analytic capabilities and the leveraging of massive amounts of data. If you take the time to cut through the sometimes-over-zealous hype present in the media, you'll find something very real and very powerful underneath it. With big data, the hype is driven by genuine excitement and anticipation of the business and consumer benefits that analyzing it will yield over time. 
Big data is the next wave of new data sources that will drive the next wave of analytic innovation in business, government, and academia. These innovations have the potential to radically change how organizations view their business. The analysis that big data enables will lead to decisions that are more informed and, in some cases, different from what they are today. It will yield insights that many can only dream about today. As you'll see, there are many consistencies with the requirements to tame big data and what has always been needed to tame new data sources. However, the additional scale of big data necessitates utilizing the newest tools, technologies, methods, and processes. The old way of approaching analysis just won't work. It is time to evolve the world of advanced analytics to the next level. That's what this book is about. Taming the Big Data Tidal Wave isn't just the title of this book, but rather an activity that will determine which businesses win and which lose in the next decade. By preparing and taking the initiative, organizations can ride the big data tidal wave to success rather than being pummeled underneath the crushing surf. What do you need to know and how do you prepare in order to start taming big data and generating exciting new analytics from it? Sit back, get comfortable, and prepare to find out!

IBM InfoSphere Streams: Assembling Continuous Insight in the Information Revolution

2011-10-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vitali N. Zoubov , Roger Rea , Deepak Rajan , Bugra Gedik , Michael P. Koranda , Kevin Foster , Chuck Ballard , Mike Spicer , Senthil Nathan , Andy Frenkiel , Brian Williams

Analytics Big Data HTML IBM Data Streaming data data-engineering infosphere

In this IBM® Redbooks® publication, we discuss and describe the positioning, functions, capabilities, and advanced programming techniques for IBM InfoSphere™ Streams (V2), a new paradigm and key component of IBM Big Data platform. Data has traditionally been stored in files or databases, and then analyzed by queries and applications. With stream computing, analysis is performed moment by moment as the data is in motion. In fact, the data might never be stored (perhaps only the analytic results). The ability to analyze data in motion is called real-time analytic processing (RTAP). IBM InfoSphere Streams takes a fundamentally different approach to Big Data analytics and differentiates itself with its distributed runtime platform, programming model, and tools for developing and debugging analytic applications that have a high volume and variety of data types. Using in-memory techniques and analyzing record by record enables high velocity. Volume, variety and velocity are the key attributes of Big Data. The data streams that are consumable by IBM InfoSphere Streams can originate from sensors, cameras, news feeds, stock tickers, and a variety of other sources, including traditional databases. It provides an execution platform and services for applications that ingest, filter, analyze, and correlate potentially massive volumes of continuous data streams. This book is intended for professionals that require an understanding of how to process high volumes of streaming data or need information about how to implement systems to satisfy those requirements. See: http://www.redbooks.ibm.com/abstracts/sg247865.html for the IBM InfoSphere Streams (V1) release.
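The idea of analyzing data "in motion", record by record and in memory, can be illustrated with a minimal sketch. This is plain Python, not InfoSphere Streams or its programming model; the function name `rolling_mean`, the window size, and the sample sensor values are all assumptions made up for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_mean(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` readings as each record arrives."""
    recent = deque(maxlen=window)  # only the current window is held in memory
    for value in readings:
        recent.append(value)  # the oldest reading drops out automatically
        yield sum(recent) / len(recent)


# Hypothetical sensor feed; a generator stands in for an unbounded stream.
feed = (v for v in [10.0, 12.0, 14.0, 16.0])
results = list(rolling_mean(feed))
print(results)
```

Only the analytic result is kept here; the raw readings are never stored beyond the window, which mirrors the point that streamed data might never be persisted.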

PowerPivot for Business Intelligence Using Excel and SharePoint

2011-03-15 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Barry Ralston

Analytics BI DAX Microsoft analytics-platforms data data-science powerpivot

PowerPivot comprises a set of technologies for easy access to data mining and business intelligence analysis from Microsoft Excel and SharePoint. Power users and developers alike can create sophisticated, online analytic processing (OLAP) solutions using PowerPivot for Excel, and then share those solutions with other users via PowerPivot for SharePoint. Data can be pulled in from any of the leading database platforms, as well as from spreadsheets and flat files. PowerPivot for Business Intelligence Using Excel and SharePoint is your key to mastering PowerPivot. The book takes a scenario-based approach to showing you how to collect data, to mine that data through insightful analysis, and to draw conclusions that drive business performance. Each chapter in the book is focused on a specific challenge that you'll encounter when using PowerPivot. Each chapter takes you through a solution technique that's been proven in the real world. The book covers the leading technology for bringing data analytics to the desktop, presents real-world solutions to real-world scenarios, and is written by a Microsoft Virtual Technical Specialist (VTS) for Business Intelligence. What you'll learn: Install and verify the PowerPivot software. Integrate existing, available data to deliver business intelligence. Leverage Time Intelligence to report change over time. Write Data Analysis Expressions (DAX) to create custom measures. Identify and implement solutions for role-playing dimensions. Recognize and work around PowerPivot's missing features. Who this book is for: PowerPivot Solutions for Excel and SharePoint is aimed at information workers and data analysts who typically use Excel to drive business decisions. The book shows how you can apply PowerPivot to problems typically addressed through complicated and arcane spreadsheet techniques.
Business people without the time and interest in learning Excel arcana will especially appreciate how PowerPivot enables them to easily create models and perform analysis far in advance of anything they could do using Excel alone.

Business Intelligence

2011-03-06 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Jerzy Surma

Analytics BI Data Modelling DWH business-intelligence data data-science

This book is about using business intelligence as a management information system for supporting managerial decision making. It concentrates primarily on practical business issues and demonstrates how to apply data warehousing and data analytics to support business decision making. This book progresses through a logical sequence, starting with data model infrastructure, then data preparation, followed by data analysis, integration, knowledge discovery, and finally the actual use of discovered knowledge. All examples are based on the most recent achievements in business intelligence. Finally, this book outlines a methodology that takes into account the complexity of developing applications in an integrated business intelligence environment. This book is written for managers, business consultants, and undergraduate and postgraduate students in business administration.

Practical Applications of Data Mining

2011-01-20 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sang C. Suh

Analytics data data-science data-science-tasks exploratory-data-analysis

Practical Applications of Data Mining emphasizes both theory and applications of data mining algorithms. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each of these techniques is explored with a theoretical introduction and its effectiveness is demonstrated with various chapter examples. This book will help any database and IT professional understand how to apply data mining techniques to real-world problems.

Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe the generation of rules as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information from data. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks, an important branch of computational intelligence, are introduced and explored in the text to investigate the role of neural network algorithms in data analytics.
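As a rough illustration of what an association rule measures, the sketch below computes support and confidence for one rule over a toy set of transactions. It is not taken from the book; the basket contents and the helper names `support` and `confidence` are assumptions chosen for the example.

```python
# Hypothetical market-basket transactions (illustrative data only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset: set, transactions: list) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: set, consequent: set, transactions: list) -> float:
    """Estimate of P(consequent | antecedent) from the transactions."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)


# Rule under test: {bread} -> {milk}
rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(rule_support, rule_confidence)
```

Rule-generation algorithms typically keep only rules whose support and confidence exceed chosen thresholds; those thresholds are a modeling decision, not something this sketch prescribes.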

IBM Cognos 8 Report Studio Cookbook

2010-06-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by Abhishek Sanghani

Analytics BI Cognos Git IBM XML analytics-platforms data data-science

The "IBM Cognos 8 Report Studio Cookbook" by Abhishek Sanghani provides over 80 hands-on recipes to enhance your proficiency in creating business reports using Cognos 8 Report Studio. From mastering basic techniques to leveraging advanced features, this book is your guide to developing reports that meet real-world business demands. What this book will help me do: Understand and utilize advanced techniques for sorting, filtering, and aggregating data in reports. Implement features like conditional formatting, cascaded prompts, and master-detail queries to enhance report functionality. Create dynamic, user-friendly business reports tailored to specific requirements. Make use of XML specifications to customize reports beyond the capabilities of the default tools. Adopt best practices in report development such as version control and regression testing. Author(s): Abhishek Sanghani is an experienced Business Intelligence professional specializing in IBM Cognos and data analytics. With practical knowledge from implementing solutions for various industries, he brings a wealth of insight into creating powerful business reports. Abhishek's approachable writing makes advanced Report Studio concepts accessible to readers. Who is it for? This book is ideally suited for Business Intelligence or MIS developers working with Cognos Report Studio, seeking advanced guidance for creating reports. Business analysts and power users wanting to extend beyond basic report authoring will also benefit greatly. The book assumes a functional understanding of Cognos Studio and familiarity with its ecosystem.

R in a Nutshell

2010-01-04 · O'Reilly Data Science Books O'Reilly Amazon

book

by Joseph Adler

Analytics DataViz R data data-science data-science-tools r

Why learn R? Because it's rapidly becoming the standard for developing statistical software. R in a Nutshell provides a quick and practical way to learn this increasingly popular open source language and environment. You'll not only learn how to program in R, but also how to find the right user-contributed R packages for statistical modeling, visualization, and bioinformatics. The author introduces you to the R environment, including the R graphical user interface and console, and takes you through the fundamentals of the object-oriented R language. Then, through a variety of practical examples from medicine, business, and sports, you'll learn how you can use this remarkable tool to solve your own data analysis problems. Understand the basics of the language, including the nature of R objects. Learn how to write R functions and build your own packages. Work with data through visualization, statistical analysis, and other methods. Explore the wealth of packages contributed by the R community. Become familiar with the lattice graphics package for high-level data visualization. Learn about bioinformatics packages provided by Bioconductor. "I am excited about this book. R in a Nutshell is a great introduction to R, as well as a comprehensive reference for using R in data analytics and visualization. Adler provides 'real world' examples, practical advice, and scripts, making it accessible to anyone working with data, not just professional statisticians."

Automate data workflows with Gemini in BigQuery

· Google Cloud Next '25

demo

AI/ML Analytics BigQuery LLM product-bigquery product-dataplex product-gemini-in-bigquery

Streamline data analytics workflows with AI assistance. Enhance data team productivity and reduce costs.

Big Data is Dead: Long Live Hot Data 🔥

· Small Data SF 2024 Watch

video

Analytics Big Data BigQuery Cloud Computing Data Engineering DuckDB DWH Hadoop Motherduck Snowflake

Over the last decade, Big Data was everywhere. Let's set the record straight on what is and isn't Big Data. We have been consumed by a conversation about data volumes when we should focus more on the immediate task at hand: Simplifying our work.

Some of us may have Big Data, but our quest to derive insights from it is measured in small slices of work that fit on your laptop or in your hand. Easy data is here: let's make the most of it.

📓 Resources Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/ Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/ Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us LinkedIn: https://linkedin.com/company/motherduck X/Twitter : https://twitter.com/motherduck Blog: https://motherduck.com/blog/

Explore the "Small Data" movement, a counter-narrative to the prevailing big data conference hype. This talk challenges the assumption that data scale is the most important feature of every workload, defining big data as any dataset too large for a single machine. We'll unpack why this distinction is crucial for modern data engineering and analytics, setting the stage for a new perspective on data architecture.

Delve into the history of big data systems, starting with the non-linear hardware costs that plagued early data practitioners. Discover how Google's foundational papers on GFS, MapReduce, and Bigtable led to the creation of Hadoop, fundamentally changing how we scale data processing. We'll break down the "big data tax"—the inherent latency and system complexity overhead required for distributed systems to function, a critical concept for anyone evaluating data platforms.

Learn about the architectural cornerstone of the modern cloud data warehouse: the separation of storage and compute. This design, popularized by systems like Snowflake and Google BigQuery, allows storage to scale almost infinitely while compute resources are provisioned on-demand. Understand how this model paved the way for massive data lakes but also introduced new complexities and cost considerations that are often overlooked.

We examine the cracks appearing in the big data paradigm, especially for OLAP workloads. While systems like Snowflake are still dominant, the rise of powerful alternatives like DuckDB signals a shift. We reveal the hidden costs of big data analytics, exemplified by a petabyte-scale query costing nearly $6,000, and argue that for most use cases, it's too expensive to run computations over massive datasets.
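The scale of that figure can be sanity-checked with simple arithmetic under a per-byte-scanned pricing model. The sketch below is a back-of-the-envelope estimate only: the rate `ASSUMED_USD_PER_TIB` is an assumption chosen for illustration, not a number stated in the talk, and real pricing varies by vendor, region, and plan.

```python
TIB = 2**40  # bytes in one tebibyte


def scan_cost_usd(bytes_scanned: int, usd_per_tib: float) -> float:
    """Cost of a query billed purely by the number of bytes it scans."""
    return bytes_scanned / TIB * usd_per_tib


ASSUMED_USD_PER_TIB = 6.25  # hypothetical on-demand rate, an assumption

full_scan = scan_cost_usd(10**15, ASSUMED_USD_PER_TIB)  # one decimal petabyte
hot_scan = scan_cost_usd(10**10, ASSUMED_USD_PER_TIB)   # a 10 GB working set

print(f"full petabyte scan: ${full_scan:,.2f}")
print(f"10 GB hot-data scan: ${hot_scan:,.4f}")
```

Under this assumed rate a single full scan of a petabyte lands in the thousands of dollars, while a query over a small working set costs a few cents, which is the contrast the talk draws.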

The key to efficient data processing isn't your total data size, but the size of your "hot data" or working set. This talk argues that the revenge of the single node is here, as modern hardware can often handle the actual data queried without the overhead of the big data tax. This is a crucial optimization technique for reducing cost and improving performance in any data warehouse.

Discover the core principles for designing systems in a post-big data world. We'll show that since only 1 in 500 users run true big data queries, prioritizing simplicity over premature scaling is key. For low latency, process data close to the user with tools like DuckDB and SQLite. This local-first approach offers a compelling alternative to cloud-centric models, enabling faster, more cost-effective, and innovative data architectures.
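A local-first query can be as small as the sketch below, which uses Python's standard-library `sqlite3` module with an in-memory database. The `events` table, its columns, and the sample rows are hypothetical; the talk names SQLite and DuckDB as tools but does not prescribe this schema or query.

```python
import sqlite3

# An in-memory database: the data lives in the same process as the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2024-09-01", 5.0),
        (2, "2024-09-02", 7.5),
        (1, "2024-09-02", 2.5),
        (3, "2024-09-03", 4.0),
    ],
)

# Query only the recent "hot" slice rather than the full history.
rows = conn.execute(
    "SELECT day, SUM(amount) FROM events "
    "WHERE day >= ? GROUP BY day ORDER BY day",
    ("2024-09-02",),
).fetchall()
conn.close()

print(rows)
```

There is no cluster, network hop, or warehouse to provision here, which is the simplicity argument in miniature; whether it suits a given workload depends on how large the working set actually is.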

Demo: Cloud Security Foundations

· Google Cloud Next '25

expo-experience

AI/ML Analytics Cloud Computing GCP Cyber Security

Explore how Google Cloud provides built-in secure controls to help you maintain a strong cloud security posture. See in action how Google Cloud’s Security Foundation recommended products help address most common cloud adoption use cases, including infrastructure modernization, securing AI workloads, data analytics, and application modernization.

Is BI Too Big for Small Data?

· Small Data SF 2024 Watch

video

Analytics BI Big Data Modern Data Stack Motherduck Power BI Tableau

This is a talk about how we thought we had Big Data, and we built everything planning for Big Data, but then it turns out we didn't have Big Data, and while that's nice and fun and seems more chill, it's actually ruining everything, and I am here asking you to please help us figure out what we are supposed to do now.

📓 Resources Big Data is Dead: https://motherduck.com/blog/big-data-... Small Data Manifesto: https://motherduck.com/blog/small-dat... Is Excel Immortal?: https://benn.substack.com/p/is-excel-immortal Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us LinkedIn: / motherduck
X/Twitter : / motherduck
Blog: https://motherduck.com/blog/

Mode founder Benn Stancil challenges the data industry's obsession with "big data," arguing that most companies are actually working with "small data," and our tools are failing us. This talk deconstructs the common sales narrative for BI tools, exposing why the promise of finding game-changing insights through data exploration often falls flat. If you've ever built dashboards nobody uses or wondered why your analytics platform doesn't deliver on its promises, this is a must-watch reality check on the modern data stack.

We explore the standard BI demo, where an analyst uncovers a critical insight by drilling into event data. This story sells tools like Tableau and Power BI, but it rarely reflects reality, leading to a "revolving door of BI" as companies swap tools every few years. Discover why the narrative of the intrepid analyst finding a needle in the haystack only works in movies and how this disconnect creates a cycle of failed data initiatives and unused "trashboards."

The presentation traces our belief that "data is the new oil" back to the early 2010s, with examples from Target's predictive analytics and Facebook's growth hacking. However, these successes were built on truly massive datasets. For most businesses, analyzing small data results in noisy charts that offer vague "directional vibes" rather than clear, actionable insights. We contrast the promise of big data analytics with the practical challenges of small data interpretation.

Finally, learn actionable strategies for extracting real value from the data you actually have. We argue that BI tools should shift focus from data exploration to data interpretation, helping users understand what their charts actually mean. Learn why "doing things that don't scale," like manually analyzing individual customer journeys, can be more effective than complex models for small datasets. This talk offers a new perspective for data scientists, analysts, and developers looking for better data analysis techniques beyond the big data hype.

Reimagine marketing with data and gen AI

· Google Cloud Next '25

demo

AI/ML Analytics GenAI Marketing product-bigquery product-looker product-managed-kafka product-spanner product-vertex-ai

Unleash fresh avenues for efficient marketing with data analytics and generative AI. Combine creativity with modern analytics to transform marketing workflows from real-time insights to AI-powered recommendations, segmentation, and image and video generation.

Retro Tech Revolution: Level up customer experiences with real-time data and AI insights

· Google Cloud Next '25

demo

AI/ML Analytics LLM product-agones product-bi-agent product-bigquery product-bigquery-continuous-query product-dataflow product-enterprise-document-agent product-gemini product-google-kubernetes-engine product-imagen3 +3 more

A playful take on retro gaming meets real-time data analytics and AI. Dive into an end-to-end architecture that ingests user interactions in various shapes and forms, and provides a tailor-made game experience to the end user using Gemini.

What’s next for data analytics in the AI era in under 7 minutes

· Google Cloud Next '24

session

AI/ML Analytics BI BigQuery Cloud Computing GCP GenAI Looker

With the surge of new generative AI capabilities, companies and their customers can now interact with systems and data in new ways. To activate AI, organizations require a data foundation with the scale and efficiency to bring business data together with AI models and ground them in customer reality. Join this session to learn the latest innovations for data analytics and BI, and why tens of thousands of organizations are fueling their journey with BigQuery and Looker.

talk-data.com
