data

Implementing CDISC Using SAS

2016-11-28 · O'Reilly Data Science Books O'Reilly Amazon

book

by Chris Holland, Jack Shostak

Data Modelling SAS XML analytics-platforms data-science

For decades researchers and programmers have used SAS to analyze, summarize, and report clinical trial data. Now Chris Holland and Jack Shostak have updated their popular Implementing CDISC Using SAS, the first comprehensive book on applying clinical research data and metadata to the Clinical Data Interchange Standards Consortium (CDISC) standards.

Implementing CDISC Using SAS: An End-to-End Guide, Second Edition, is an all-inclusive guide on how to implement and analyze the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM) data and prepare clinical trial data for regulatory submission. Updated to reflect the 2017 FDA mandate for adherence to CDISC standards, this new edition covers creating and using metadata, developing conversion specifications, implementing and validating SDTM and ADaM data, determining solutions for legacy data conversions, and preparing data for regulatory submission. The book covers products such as Base SAS, SAS Clinical Data Integration, and the SAS Clinical Standards Toolkit, as well as JMP Clinical. Topics in this new edition include an implementation of the Define-XML 2.0 standard, new SDTM domains, validation with Pinnacle 21 software, event narratives in JMP Clinical, and of course new versions of SAS and JMP software.

Any manager or user of clinical trial data in this day and age is likely to benefit from knowing how to either put data into a CDISC standard or analyze and find data once it is in a CDISC format. If you are one such person--a data manager, clinical and/or statistical programmer, biostatistician, or even a clinician--then this book is for you.

Visualizing Graph Data

2016-11-23 · O'Reilly Data Visualization Books O'Reilly Amazon

book

by Corey Lanum

Big Data DataViz data-science data-science-tasks data-visualization

Visualizing Graph Data teaches you not only how to build graph data structures, but also how to create your own dynamic and interactive visualizations using a variety of tools. This book is loaded with fascinating examples and case studies to show you the real-world value of graph visualizations.

About the Technology

Assume you are doing a great job collecting data about your customers and products. Are you able to turn your rich data into important insight? Complex relationships in large data sets can be difficult to recognize. Visualizing these connections as graphs makes it possible to see the patterns, so you can find meaning in an otherwise overwhelming sea of facts.

About the Book

Visualizing Graph Data teaches you how to understand graph data, build graph data structures, and create meaningful visualizations. This engaging book gently introduces graph data visualization through fascinating examples and compelling case studies. You'll discover simple, but effective, techniques to model your data, handle big data, and depict temporal and spatial data. By the end, you'll have a conceptual foundation as well as the practical skills to explore your own data with confidence.

What's Inside

• Techniques for creating effective visualizations
• Examples using the Gephi and KeyLines visualization packages
• Real-world case studies

About the Reader

No prior experience with graph data is required.

About the Author

Corey Lanum has decades of experience building visualization and analysis applications for companies and government agencies around the globe.

Quotes

Shows you how to solve visualization problems and explore complex data sets. A pragmatic introduction. - John D. Lewis, DDN

Excellent! Hands-on! Shows you how to kick-start your graph data visualization. - Rocio Chongtay, University of Southern Denmark

A clear and concise guide to both graph theory and visualization. - Jonathan Suever, PhD, Georgia Institute of Technology

Great coverage, with real-life business use cases. - Sumit Pal, Big Data consultant

High Performance SQL Server: The Go Faster Book

2016-11-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Benjamin Nevarez

SQL data-engineering

Design and configure SQL Server instances and databases in support of high-throughput applications that are mission-critical and provide consistent response times in the face of variations in user numbers and query volumes. Learn to configure SQL Server and design your databases to support a given instance and workload. You'll learn advanced configuration options, in-memory technologies, storage and disk configuration, and more, all toward enabling your desired application performance and throughput.

Configuration doesn't stop with implementation. Workloads change over time, and other impediments can arise to thwart desired performance. High Performance SQL Server covers monitoring and troubleshooting to aid in detecting and fixing production performance problems and minimizing application outages. You'll learn a variety of tools, ranging from the traditional wait analysis methodology to the new query store, and you'll learn how improving performance is really an iterative process.

High Performance SQL Server is based on SQL Server 2016, although most of its content can be applied to prior versions of the product. This book is an excellent complement to performance tuning books focusing on SQL queries, and provides the other half of what you need to know by focusing on configuring the instances on which mission-critical queries are executed.

• Covers SQL Server instance configuration for optimal performance
• Helps in implementing SQL Server in-memory technologies
• Provides guidance toward monitoring and ongoing diagnostics

What You Will Learn

• Understand SQL Server's database engine and how it processes queries
• Configure instances in support of high-throughput applications
• Provide consistent response times to varying user numbers and query volumes
• Design databases for high-throughput applications with focus on performance
• Record performance baselines and monitor SQL Server instances against them
• Troubleshoot and fix performance problems

Who This Book Is For

SQL Server database administrators, developers, and data architects. The book is also of use to system administrators who manage and are responsible for the physical servers on which SQL Server instances are run.

R Data Structures and Algorithms

2016-11-21 · O'Reilly Data Science Books O'Reilly Amazon

book

by PKS Prakash, Achyutuni Sri Krishna Rao

Computer Science data-science data-science-tools r

"R Data Structures and Algorithms" serves as a comprehensive guide to understanding data structures and algorithms for R developers. You will explore key data structures like stacks, queues, and trees, learn sorting and searching techniques, and apply these concepts to enhance the speed and efficiency of your R programs.

What this Book will help me do

• Analyze algorithm efficiency using Big-O notation.
• Implement key data structures such as arrays, linked lists, and trees in R.
• Explore advanced techniques like dynamic programming and graph algorithms.
• Master sorting and searching algorithms for optimizing data processes.
• Utilize R-specific structures like vectors and data frames effectively.

Author(s)

The authors, PKS Prakash and Sri Krishna Rao, bring extensive experience in software development and data analysis, and a passion for making computer science concepts accessible. Their combined expertise ensures readers gain practical knowledge along with a deep theoretical understanding.

Who is it for?

This book is perfect for R developers aiming to deepen their understanding of data structures and algorithms. Whether you're a beginner with basic R proficiency or an advanced user seeking to boost application performance, this book provides the knowledge you need to succeed.

Implementing IBM FlashSystem 900

2016-11-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Karen Orlando, Jon Herd, Detlef Helmbrecht, Carsten Larsen, Ingo Dimmer, Matt Levan

Analytics Cloud Computing IBM data-engineering

Today’s global organizations depend on being able to unlock business insights from massive volumes of data. Now, with IBM® FlashSystem 900, powered by IBM FlashCore™ technology, they can make faster decisions based on real-time insights and unleash the power of the most demanding applications, including online transaction processing (OLTP) and analytics databases, virtual desktop infrastructures (VDIs), technical computing applications, and cloud environments. This IBM Redbooks® publication introduces clients to the IBM FlashSystem® 900. It provides in-depth knowledge of the product architecture, software and hardware, implementation, and hints and tips. Also illustrated are use cases that show real-world solutions for tiering, flash-only, and preferred-read, and also examples of the benefits gained by integrating the FlashSystem storage into business environments. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand how to implement this new and exciting technology. This book describes the following offerings of the IBM Spectrum™ Storage family: IBM Spectrum Storage™, IBM Spectrum Control™, IBM Spectrum Virtualize™, IBM Spectrum Scale™, and IBM Spectrum Accelerate™.

Advanced R: Data Programming and the Cloud

2016-11-17 · O'Reilly Data Science Books O'Reilly Amazon

book

by Joshua F. Wiley, Matt Wiley

Cloud Computing Data Management GitHub MongoDB data-science data-science-tools r

Program for data analysis using R and learn practical skills to make your work more efficient. This book covers how to automate running code and the creation of reports to share your results, as well as writing functions and packages. Advanced R is not designed to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is designed to be a practical guide moving beyond merely using R to programming in R to automate tasks. This book will show you how to manipulate data in modern R structures and includes connecting R to databases such as SQLite, PostgreSQL, and MongoDB. The book closes with a hands-on section to get R running in the cloud. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.

What You Will Learn

• Write and document R functions
• Make an R package and share it via GitHub or privately
• Add tests to R code to ensure it works as intended
• Build packages automatically with GitHub
• Use R to talk directly to databases and do complex data management
• Run R in the Amazon cloud
• Generate presentation-ready tables and reports using R

Who This Book Is For

Working professionals, researchers, or students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to take their R coding and programming to the next level.

Apache HBase Primer

2016-11-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Deepak Vohra

API Data Modelling Hadoop Apache HBase NoSQL data-engineering nosql-databases

Learn the fundamentals and concepts of the Apache HBase (NoSQL) open source database. This book covers the HBase data model, architecture, schema design, API, and administration. Apache HBase is the database for the Apache Hadoop framework. HBase is a column family based NoSQL database that provides a flexible schema model.

What You'll Learn

• Work with the core concepts of HBase
• Discover the HBase data model, schema design, and architecture
• Use the HBase API and administration

Who This Book Is For

Apache HBase (NoSQL) database users, designers, developers, and admins.

How to design with data

2016-11-15 · O'Reilly Data Science Books O'Reilly Amazon

book

by Joel Marsh

Analytics data-science

Data is a key part of analyzing your designs and the way your users use your designs. Analytics can seem intimidating if you are not familiar with them, but the basics are pretty simple once you know what the numbers and graphs mean.

What you’ll learn, and how you can apply it

You will learn basic tips about how to interpret a graph of user behavior to find the problems in your designs (so you can fix them!), and what the fundamental numbers mean. You will also start to have an intuition about how to compare those numbers to understand the “health” of your site/app and see insights that no one else can see.

This lesson is for you because

You can start using the information from these lessons today, and you will feel more comfortable learning more about user data and analytics after reading them.

Prerequisites:

• No experience with data is necessary
• General familiarity with the idea of designing digital things is helpful

Materials or downloads needed: None

This lesson is taken from UX for Beginners by Joel Marsh.

R for Microsoft® Excel Users: Making the Transition for Statistical Analysis

2016-11-15 · O'Reilly Data Science Books O'Reilly Amazon

book

by Conrad Carlberg

Microsoft data-science data-science-tools r

Microsoft Excel can perform many statistical analyses, but thousands of business users and analysts are now reaching its limits. R, in contrast, can perform virtually any imaginable analysis, if you can get over its learning curve. In R for Microsoft® Excel Users, Conrad Carlberg shows exactly how to get the most from both programs. Drawing on his immense experience helping organizations apply statistical methods, Carlberg reviews how to perform key tasks in Excel, and then guides you through reaching the same outcome in R, including which packages to install and how to access them. Carlberg offers expert advice on when and how to use Excel, when and how to use R instead, and the strengths and weaknesses of each tool. Writing in clear, understandable English, Carlberg combines essential statistical theory with hands-on examples reflecting real-world challenges. By the time you’ve finished, you’ll be comfortable using R to solve a wide spectrum of problems, including many you just couldn’t handle with Excel.

• Smoothly transition to R and its radically different user interface
• Leverage the R community’s immense library of packages
• Efficiently move data between Excel and R
• Use R’s DescTools for descriptive statistics, including bivariate analyses
• Perform regression analysis and statistical inference in R and Excel
• Analyze variance and covariance, including single-factor and factorial ANOVA
• Use R’s mlogit package and glm function for Solver-style logistic regression
• Analyze time series and principal components with R and Excel

The Big Data Transformation

2016-11-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ashish Thusoo

Analytics Big Data Data Analytics DWH Hadoop Marketing Vertica data-engineering

Business executives today are well aware of the power of data, especially for gaining actionable insight into products and services. But how do you jump into the big data analytics game without spending millions on data warehouse solutions you don’t need? This 40-page report focuses on massively parallel processing (MPP) analytical databases that enable you to run queries and dashboards on a variety of business metrics at extreme speed and exabyte scale. Because they leverage the full computational power of a cluster, MPP analytical databases can analyze massive volumes of data, both structured and semi-structured, at unprecedented speeds. This report presents five real-world case studies from Etsy, Cerner Corporation, Criteo and other global enterprises to focus on one big data analytics platform in particular, HPE Vertica.

You’ll discover:

• How one prominent data storage company convinced both business and tech stakeholders to adopt an MPP analytical database
• Why performance marketing technology company Criteo used a Center of Excellence (CoE) model to ensure the success of its big data analytics endeavors
• How YPSM uses Vertica to speed up its Hadoop-based data processing environment
• Why Cerner adopted an analytical database to scale its highly successful health information technology platform
• How Etsy drives success with the company’s big data initiative by avoiding common technical and organizational mistakes

Forecasting Fundamentals

2016-11-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Nada Sanders

data-science data-science-tasks forecasting statistics time-series

This book is for everyone who wants to make better forecasts. It is not about mathematics and statistics. It is about following a well-established forecasting process to create and implement good forecasts. This is true whether you are forecasting global markets, sales of SKUs, competitive strategy, or market disruptions. Today, most forecasts are generated using software. However, no amount of technology and statistics can compensate for a poor forecasting process. Forecasting is not just about generating a number. Forecasters need to understand the problems they are trying to solve. They also need to follow a process that is justifiable to other parties and can be implemented in practice. This is what the book is about. Accurate forecasts are essential for predicting demand, identifying new market opportunities, and forecasting risks, disruptions, innovation, competition, market growth, and trends. Companies can navigate this daunting landscape and improve their forecasts by following some well-established principles. This book is written to provide the fundamentals business leaders need in order to make good forecasts. These fundamentals hold true regardless of what is being forecast and what technology is being used. It provides the basic foundational principles all companies need to achieve competitive forecast accuracy.

Beginning Hibernate: For Hibernate 5

2016-11-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dave Minter, Joseph Ottinger, Jeff Linwood

Big Data Java MongoDB NoSQL data-engineering database-management-tools hibernate object-relational-mapping

Get started with the Hibernate 5 persistence layer and gain a clear introduction to the current standard for object-relational persistence in Java. This updated edition includes the new Hibernate 5.0 framework as well as coverage of NoSQL, MongoDB, and other related technologies, ranging from applications to big data. Beginning Hibernate is ideal if you're experienced in Java with databases (the traditional, or connected, approach), but new to open-source, lightweight Hibernate. The book keeps its focus on Hibernate without wasting time on nonessential third-party tools, so you'll be able to immediately start building transaction-based engines and applications. Experienced authors Joseph Ottinger with Dave Minter and Jeff Linwood provide more in-depth examples than any other book for Hibernate beginners. They present their material in a lively, example-based manner, not a dry, theoretical, hard-to-read fashion.

What You'll Learn

• Build enterprise Java-based transaction-type applications that access complex data with Hibernate
• Work with Hibernate 5 using a present-day build process
• Use Java 8 features with Hibernate
• Integrate into the persistence life cycle
• Map using Java's annotations
• Search and query with the new version of Hibernate
• Integrate with MongoDB using NoSQL
• Keep track of versioned data with Hibernate Envers

Who This Book Is For

Experienced Java developers interested in learning how to use and apply object-relational persistence in Java and who are new to the Hibernate persistence framework.

Stepping Away from the Silos

2016-11-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Margaret Coutts

data-engineering search

For over twenty years, digitisation has been a core element of the modern information landscape. The digital lifecycle is now well defined, and standards and good practice have been developed for most of its key stages. There remains, however, a widespread lack of coordination of digitisation initiatives, both within and across different sectors, and there are disparate approaches to selection criteria. The result is ‘silos’ of digitised content. Stepping away from the Silos examines the strategic context in the UK since the 1990s and its effect on collaboration and coordination of exemplar digitisation initiatives in higher education and related sectors. It identifies the principal criteria for content selection that are common to the international literature in this field. The outputs of the exemplar projects are examined in relation to these criteria. A range of common practices and patterns in content selection appears to have developed over time, forming a de facto strategy from which several areas of critical mass have emerged. The book discusses the potential to improve strategic collaboration and coordinated selection by building on such a platform, and considers planning options in the context of work on national digitisation strategies in the UK and internationally.

• Summarises the rise of publicly funded digitisation in the UK from the 1990s to date and identifies the need to improve coordination and content selection criteria
• Reviews the role of digitisation in government and organisational strategies from the 1990s to the present day
• Examines the strategic position of collaboration within and across different organisations
• Identifies common selection criteria and outlines the coverage of exemplar projects
• Discusses the apparent emergence of a de facto selection strategy and the potential for national strategic planning of digitised content based on existing outputs and improved collaboration

Programming Pig, 2nd Edition

2016-11-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Alan Gates, Daniel Dai

Data Modelling Hadoop HDFS Python data-engineering pig

For many organizations, Hadoop is the first step for dealing with massive amounts of data. The next step? Processing and analyzing datasets with the Apache Pig scripting platform. With Pig, you can batch-process data without having to create a full-fledged application, making it easy to experiment with new datasets. Updated with use cases and programming examples, this second edition is the ideal learning tool for new and experienced users alike. You’ll find comprehensive coverage on key features such as the Pig Latin scripting language and the Grunt shell. When you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.

• Delve into Pig’s data model, including scalar and complex data types
• Write Pig Latin scripts to sort, group, join, project, and filter your data
• Use Grunt to work with the Hadoop Distributed File System (HDFS)
• Build complex data processing pipelines with Pig’s macros and modularity features
• Embed Pig Latin in Python for iterative processing and other advanced tasks
• Use Pig with Apache Tez to build high-performance batch and interactive data processing applications
• Create your own load and store functions to handle data formats and storage mechanisms

Pro Power BI Desktop

2016-11-08 · O'Reilly Data Science Books O'Reilly Amazon

book

by Adam Aspin

BI Cloud Computing KPI Microsoft Power BI business-intelligence data-science microsoft-power-platform power-bi

This book shows how to deliver eye-catching Business Intelligence with Microsoft Power BI Desktop. You can now take data from virtually any source and use it to produce stunning dashboards and compelling reports that will seize your audience's attention. Slice and dice the data with remarkable ease then add metrics and KPIs to project the insights that create your competitive advantage. Make raw data into clear, accurate, and interactive information with Microsoft's free self-service business intelligence tool.

Pro Power BI Desktop will help you to push your BI delivery to the next level. You'll learn to create great-looking visualizations and let your audience have fun by interacting with the elegant and visually arresting output that you can now deliver. You can choose from a wide range of built-in and third-party visualization types so that your message is always enhanced. You'll be able to deliver those results on the PC, on tablets, on smartphones, as well as share results via the cloud. Finally, this book helps you save time by preparing the underlying data correctly without needing an IT department to prepare it for you. Power BI Desktop will let your analyses speak for themselves.

• Simple techniques to make data into insight.
• Polished interactive dashboards to deliver attention-grabbing information.
• Visually arresting output on multiple devices to grab the reader's attention.

What You Will Learn

• Produce designer output to astound your bosses and peers.
• Share business intelligence in the cloud.
• Deliver visually stunning charts, maps, and tables. Make them interactive too!
• Find new insights as you chop and tweak your data as never before.
• Adapt delivery to mobile devices such as phones and tablets.

Audience

Pro Power BI Desktop is written for any user who is comfortable in Microsoft Office. Everyone from CEOs and Business Intelligence developers through to power users and IT managers can use this book to outshine the competition by producing 21st Century business intelligence visualizations and reporting on a variety of devices from a variety of data sources. All of this is possible through leveraging your existing skill set with the same, common Microsoft tools you already use in your daily work.

Business Analytics for Managers, 2nd Edition

2016-11-07 · O'Reilly Data Science Books O'Reilly Amazon

book

by Gert H.N. Laursen, Jesper Thorlund

Analytics Big Data Cloud Computing DWH Hadoop Cyber Security business-intelligence data-science

The intensified use of data based on analytical models to control digitalized operational business processes in an intelligent way is a game changer that continuously disrupts more and more markets. This book exemplifies this development and shows the latest tools and advances in this field.

Business Analytics for Managers offers real-world guidance for organizations looking to leverage their data into a competitive advantage. This new second edition covers the advances that have revolutionized the field since the first edition's release; big data and real-time digitalized decision making have become major components of any analytics strategy, and new technologies are allowing businesses to gain even more insight from the ever-increasing influx of data. New terms, theories, and technologies are explained and discussed in terms of practical benefit, and the emphasis on forward thinking over historical data describes how analytics can drive better business planning. Coverage includes data warehousing, big data, social media, security, cloud technologies, and future trends, with expert insight on the practical aspects of the current state of the field.

Analytics helps businesses move forward. Extensive use of statistical and quantitative analysis alongside explanatory and predictive modeling facilitates fact-based decision making, and evolving technologies continue to streamline every step of the process. This book provides an essential update, and describes how today's tools make business analytics more valuable than ever.

• Learn how Hadoop can upgrade your data processing and storage
• Discover the many uses for social media data in analysis and communication
• Get up to speed on the latest in cloud technologies, data security, and more
• Prepare for emerging technologies and the future of business analytics

Most businesses are caught in a massive, non-stop stream of data. It can become one of your most valuable assets, or a never-ending flood of missed opportunity. Technology moves fast, and keeping up with the cutting edge is crucial for wringing even more value from your data. Business Analytics for Managers brings you up to date, and shows you what analytics can do for you now.

Delivering Business Intelligence with Microsoft SQL Server 2016, Fourth Edition

2016-11-04 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Brian Larson

Analytics BI C#/.NET Data Modelling DAX KPI Microsoft Power BI SQL SQL Server data-engineering microsoft-sql-server

Distribute Actionable, Timely BI with Microsoft® SQL Server® 2016 and Power BI

Drive better, faster, more informed decision making across your organization using the expert tips and best practices featured in this hands-on guide. Delivering Business Intelligence with Microsoft SQL Server 2016, Fourth Edition, shows, step-by-step, how to distribute high-performance, custom analytics to users enterprise-wide. Discover how to build BI Semantic Models, create data marts and OLAP cubes, write MDX and DAX scripts, and share insights using Microsoft client tools. The book includes coverage of self-service business intelligence with Power BI.

• Understand the goals and components of successful BI
• Build data marts, OLAP cubes, and Tabular models
• Load and cleanse data with SQL Server Integration Services
• Manipulate and analyze data using MDX and DAX scripts and queries
• Work with SQL Server Analysis Services and the BI Semantic Model
• Author interactive reports using SQL Server Data Tools
• Create KPIs and digital dashboards
• Implement time-based analytics
• Embed data model content in custom applications using ADOMD.NET
• Use Power BI to gather, model, and visualize data in a self-service environment

Oracle R Enterprise: Harnessing the Power of R in Oracle Database

2016-11-04 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Brendan Tierney

Analytics API Big Data Hadoop Oracle R SQL data-engineering oracle-database-solutions

Master the Big Data Capabilities of Oracle R Enterprise

Effectively manage your enterprise’s big data and keep complex processes running smoothly using the hands-on information contained in this Oracle Press guide. Oracle R Enterprise: Harnessing the Power of R in Oracle Database shows, step-by-step, how to create and execute large-scale predictive analytics and maintain superior performance. Discover how to explore and prepare your data, accurately model business processes, generate sophisticated graphics, and write and deploy powerful scripts. You will also find out how to effectively incorporate Oracle R Enterprise features in APEX applications, OBIEE dashboards, and Apache Hadoop systems.

Learn to:

• Install, configure, and administer Oracle R Enterprise
• Establish connections and move data to the database
• Create Oracle R Enterprise packages and functions
• Use the R language to work with data in Oracle Database
• Build models using ODM, ORE, and other algorithms
• Develop and deploy R scripts and use the R script repository
• Execute embedded R scripts and employ ORE SQL API functions
• Map and manipulate data using Oracle R Advanced Analytics for Hadoop
• Use ORE in Oracle Data Miner, OBIEE, and other applications

EU General Data Protection Regulation (GDPR): An Implementation and Compliance Guide

2016-11-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by IT Governance Privacy Team

Cloud Computing GDPR/CCPA Cyber Security data-engineering data-security-privacy eu-general-data-protection-regulation-gdpr

An in-depth guide to the changes your organization needs to make to comply with the EU GDPR.

The EU General Data Protection Regulation (GDPR) will supersede the 1995 EU Data Protection Directive (DPD) and all EU member states’ national laws based on it – including the UK Data Protection Act 1998 – in May 2018.

All organizations – wherever they are in the world – that process the personally identifiable information (PII) of EU residents must comply with the Regulation. Failure to do so could result in fines of up to €20 million or 4% of annual global turnover, whichever is greater.

US organizations that process EU residents’ personal data can comply with the GDPR via the EU-US Privacy Shield, which replaced the EU-US Safe Harbor framework in 2016. The Privacy Shield is based on the DPD, and will likely be updated once the GDPR is applied in May 2018.

This book provides a detailed commentary on the GDPR, explains the changes you need to make to your data protection and information security regimes, and tells you exactly what you need to do to avoid severe financial penalties.

Product overview

EU GDPR – An Implementation and Compliance Guide is a clear and comprehensive guide to this new data protection law, explaining the Regulation, and setting out the obligations of data processors and controllers in terms you can understand.

Topics covered include:

• The role of the data protection officer (DPO) – including whether you need one and what they should do.
• Risk management and data protection impact assessments (DPIAs), including how, when and why to conduct a DPIA.
• Data subjects’ rights, including consent and the withdrawal of consent; subject access requests and how to handle them; and data controllers’ and processors’ obligations.
• International data transfers to “third countries” – including guidance on adequacy decisions and appropriate safeguards; the EU-US Privacy Shield; international organizations; limited transfers; and Cloud providers.
• How to adjust your data protection processes to transition to GDPR compliance, and the best way of demonstrating that compliance.
• A full index of the Regulation to help you find the articles and stipulations relevant to your organization.

The GDPR will have a significant impact on organizational data protection regimes around the world. EU GDPR – An Implementation and Compliance Guide shows you exactly what you need to do to comply with the new law.

About the authors

IT Governance is a leading global provider of IT governance, risk management, and compliance expertise, and we pride ourselves on our ability to deliver a broad range of integrated, high-quality solutions that meet the real-world needs of our international client base.

Our privacy team – led by Alan Calder, Richard Campo, and Adrian Ross – has substantial experience in privacy, data protection, compliance, and information security. This experience, and our understanding of the background and drivers for the GDPR, are combined in this manual to provide the world’s first guide to implementing the new data protection regulation.

Spark in Action

2016-11-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marko Bonaci , Petar Zecevic

AI/ML Analytics API Big Data DevOps Docker Java Python Scala Spark SQL Data Streaming

Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. Fully updated for Spark 2.0.

About the Technology

Big data systems distribute datasets across clusters of machines, making it a challenge to efficiently query, stream, and interpret them. Spark can help. It is a processing system designed specifically for distributed data. It provides easy-to-use interfaces, along with the performance you need for production-quality analytics and machine learning. Spark 2 also adds improved programming APIs, better performance, and countless other upgrades.

About the Book

Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. You'll get comfortable with the Spark CLI as you work through a few introductory examples. Then, you'll start programming Spark using its core APIs. Along the way, you'll work with structured data using Spark SQL, process near-real-time streaming data, apply machine learning algorithms, and munge graph data using Spark GraphX. For a zero-effort startup, you can download the preconfigured virtual machine ready for you to try the book's code.

What's Inside

• Updated for Spark 2.0
• Real-life case studies
• Spark DevOps with Docker
• Examples in Scala, and online in Java and Python

About the Reader

Written for experienced programmers with some background in big data or machine learning.

About the Authors

Petar Zečević and Marko Bonaći are seasoned developers heavily involved in the Spark community.

Quotes

Dig in and get your hands dirty with one of the hottest data processing engines today. A great guide. - Jonathan Sharley, Pandora Media

Must-have! Speed up your learning of Spark as a distributed computing framework. - Robert Ormandi, Yahoo!

An easy-to-follow, step-by-step guide. - Gaurav Bhardwaj, 3Pillar Global

An ambitiously comprehensive overview of Spark and its diverse ecosystem. - Jonathan Miller, Optensity

