Topic

Confluence

collaboration documentation knowledge_management

Activities

3

tagged

Activity Trend

Peak: 1 per quarter · Range: 2020-Q1 to 2026-Q2

Top Events

O'Reilly Data Engineering Books 2
ACE Paris 2025#3 - Piloter la suite Atlassian au service de l’efficacité collect 1
Google Cloud Next '24 1
O'Reilly Data Science Books 1
Big Data LDN 2024 1
Data Council 2023 1
ACE Berlin : September Edition 1
Databricks DATA + AI Summit 2023 1
Google I/O Extended 2023 North America 1
DataFramed 1

Top Speakers

Peter Norvig (Google) 1
J.J. Allaire (RStudio) 1
Zoriana Bogutska (Kolekti) 1
Dominik Ebeling 1
Glenn A. Fink 1
Houbing Song 1
Abi Brown (Kolekti) 1
Robert Ilijason 1
Serg Masis (Syngenta) 1
Joey Jablonski (Pythian) 1
Hacène SAADOUNI (GRDF) 1
Sabina Jeschke 1

Activities

3 activities · Newest first


Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud

2020-06-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Robert Ilijason

AI/ML Analytics AWS Azure Big Data Cloud Computing Data Analytics Databricks Hadoop Hive Microsoft Python +5 more

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster. This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to a large number of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark, without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data. This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.
What You Will Learn

Discover the value of big data analytics that leverage the power of the cloud
Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
Understand the underlying technology, and how the cloud and Apache Spark fit into the bigger picture
See how these tools are used in the real world
Run basic analytics, including machine learning, on billions of rows at a fraction of the cost or for free

Who This Book Is For

Data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.

Security and Privacy in Cyber-Physical Systems

2017-11-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Glenn A. Fink , Houbing Song , Sabina Jeschke

Cyber Security data data-engineering data-security-privacy data security & privacy

Written by a team of experts at the forefront of the cyber-physical systems (CPS) revolution, this book provides an in-depth look at security and privacy, two of the most critical challenges facing both the CPS research and development community and ICT professionals. It explores, in depth, the key technical, social, and legal issues at stake, and it provides readers with the information they need to advance research and development in this exciting area. Cyber-physical systems (CPS) are engineered systems that are built from, and depend upon, the seamless integration of computational algorithms and physical components. Advances in CPS will enable capability, adaptability, scalability, resiliency, safety, security, and usability far in excess of what today’s simple embedded systems can provide. Just as the Internet revolutionized the way we interact with information, CPS technology has already begun to transform the way people interact with engineered systems. In the years ahead, smart CPS will drive innovation and competition across industry sectors, from agriculture, energy, and transportation, to architecture, healthcare, and manufacturing. A priceless source of practical information and inspiration, Security and Privacy in Cyber-Physical Systems: Foundations, Principles and Applications is certain to have a profound impact on ongoing R&D and education at the confluence of security, privacy, and CPS.

Discovering Knowledge in Data: An Introduction to Data Mining, 2nd Edition

2014-07-08 · O'Reilly Data Science Books O'Reilly Amazon

book

by Daniel T. Larose

Analytics BI Big Data Computer Science Data Modelling Data Science data data-science data-science-tasks exploratory-data-analysis

The field of data mining lies at the confluence of predictive analytics, statistical analysis, and business intelligence. Due to the ever-increasing complexity and size of data sets and the wide range of applications in computer science, business, and health care, the process of discovering knowledge in data is more relevant than ever before. This book provides the tools needed to thrive in today's big data world. The author demonstrates how to leverage a company's existing databases to increase profits and market share, and carefully explains the most current data science methods and techniques. The reader will "learn data mining by doing data mining". By adding chapters on data modelling preparation, imputation of missing data, and multivariate statistical analysis, Discovering Knowledge in Data, Second Edition remains the eminent reference on data mining.

The second edition of a highly praised, successful reference on data mining, with thorough coverage of big data applications, predictive analytics, and statistical analysis
Includes new chapters on Multivariate Statistics, Preparing to Model the Data, and Imputation of Missing Data, and an Appendix on Data Summarization and Visualization
Offers extensive coverage of the R statistical programming language
Contains 280 end-of-chapter exercises
Includes a companion website with further resources for all readers, and PowerPoint slides, a solutions manual, and suggested projects for instructors who adopt the book