IBM z14 Technical Introduction

2017-07-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Esra Ufacik , Frank Packheiser , John Troy , Bill White , Octavian Lascu , Michal Kordyzon , Hervey Kamga , Bo XU

Agile/Scrum Analytics IBM Cyber Security data data-engineering

Abstract This IBM® Redpaper Redbooks® publication introduces the latest IBM Z platform, the IBM z14®. It includes information about the Z environment and how it helps integrate data and transactions more securely, and can infuse insight for faster and more accurate business decisions. The z14 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to the digital era and the trust economy. These capabilities include:

- Securing data with pervasive encryption
- Transforming a transactional platform into a data powerhouse
- Getting more out of the platform with IT Operational Analytics
- Providing resilience with key to zero downtime
- Accelerating digital transformation with agile service delivery
- Revolutionizing business processes
- Blending open source and Z technologies

This book explains how this system uses both new innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and security. With the z14 as the base, applications can run in a trusted, reliable, and secure environment that both improves operations and lessens business risk.

Mastering Apache Spark 2.x - Second Edition

2017-07-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Romeo Kienzler

AI/ML Analytics Big Data Data Analytics IBM Kubernetes Scala Spark SQL apache-spark data data-engineering

Mastering Apache Spark 2.x is the essential guide to harnessing the power of big data processing. Dive into real-time data analytics, machine learning, and cluster computing using Apache Spark's advanced features and modules like Spark SQL and MLlib.

What this Book will help me do

- Gain proficiency in Spark's batch and real-time data processing with SparkSQL.
- Master techniques for machine learning and deep learning using SparkML and SystemML.
- Understand the principles of Spark's graph processing with GraphX and GraphFrames.
- Learn to deploy Apache Spark efficiently on platforms like Kubernetes and IBM Cloud.
- Optimize Spark cluster performance by configuring parameters effectively.

Author(s)

Romeo Kienzler is a seasoned professional in big data and machine learning technologies. With years of experience in cloud-based distributed systems, Romeo brings practical insights into leveraging Apache Spark. He combines his deep technical expertise with a clear and engaging writing style.

Who is it for?

This book is tailored for intermediate Apache Spark users eager to deepen their knowledge in Spark 2.x's advanced features. Ideal for data engineers and big data professionals seeking to enhance their analytics pipelines with Spark. A basic understanding of Spark and Scala is necessary. If you're aiming to optimize Spark for real-world applications, this book is crafted for you.

SQL Server 2016 High Availability Unleashed (includes Content Update Program)

2017-07-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Paul Bertucci , Raju Shreewastava

AWS Azure BI Big Data DWH Microsoft SQL data data-engineering microsoft-sql-server relational-databases

Book + Content Update Program

SQL Server 2016 High Availability Unleashed provides start-to-finish coverage of SQL Server’s powerful high availability (HA) solutions for your traditional on-premise databases, cloud-based databases (Azure or AWS), hybrid databases (on-premise coupled with the cloud), and your emerging Big Data solutions. This complete guide introduces an easy-to-follow, formal HA methodology that has been refined over the past several years and helps you identify the right HA solution for your needs. There is also additional coverage of both disaster recovery and business continuity architectures and considerations. You are provided with step-by-step guides, examples, and sample code to help you set up, manage, and administer these highly available solutions. All examples are based on existing production deployments at major Fortune 500 companies around the globe. This book is for all intermediate-to-advanced SQL Server and Big Data professionals, but is also organized so that the first few chapters are great foundation reading for CIOs, CTOs, and even some tech-savvy CFOs.

- Learn a formal, high availability methodology for understanding and selecting the right HA solution for your needs
- Deep dive into Microsoft Cluster Services
- Use selective data replication topologies
- Explore thorough details on AlwaysOn and availability groups
- Learn about HA options with log shipping and database mirroring/snapshots
- Get details on Microsoft Azure for Big Data and Azure SQL
- Explore business continuity and disaster recovery
- Learn about on-premise, cloud, and hybrid deployments
- Address all types of database needs, including online transaction processing, data warehouse and business intelligence, and Big Data
- Explore the future of HA and disaster recovery

In addition, this book is part of InformIT’s exciting Content Update Program, which provides content updates for major technology improvements! As significant updates are made to SQL Server, sections of this book will be updated or new sections will be added to match the updates to the technologies. As updates become available, they will be delivered to you via a free Web Edition of this book, which can be accessed with any Internet connection. To learn more, visit informit.com/cup.

How to access the Web Edition: Follow the instructions inside to learn how to register your book to access the FREE Web Edition.

* The companion material is not available with the online edition on O'Reilly Learning

IBM Spectrum Accelerate Deployment, Usage, and Maintenance

2017-07-19 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Markus Oscheka , Bertrand Dufrasne , Abilio Oliveira , Grant Kabobel

Agile/Scrum IBM data data-engineering

Abstract This edition applies to IBM® Spectrum Accelerate V11.5.4. IBM Spectrum Accelerate™, a member of IBM Spectrum Storage™, is an agile, software-defined storage solution for enterprise and cloud that builds on the customer-proven and mature IBM XIV® storage software. The key characteristic of Spectrum Accelerate is that it can be easily deployed and run on purpose-built or existing hardware that is chosen by the customer. IBM Spectrum Accelerate enables rapid deployment of high-performance and scalable block data storage infrastructure over commodity hardware on-premises or off-premises. This IBM Redbooks® publication provides a broad understanding of IBM Spectrum Accelerate. The book introduces Spectrum Accelerate and describes planning and preparation that are essential for a successful deployment of the solution. The deployment is described through a step-by-step approach, by using a graphical user interface (GUI) based method or a simple command-line interface (CLI) based procedure. Chapters in this book describe the logical configuration of the system, host support and business continuity functions, and migration. Although it makes many references to the XIV storage software, the book also emphasizes where IBM Spectrum Accelerate differs from XIV. Finally, a substantial portion of the book is dedicated to maintenance and troubleshooting to provide detailed guidance for the customer support personnel.

Moving Hadoop to the Cloud

2017-07-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bill Havanki

Analytics Hadoop Hive Microsoft Cyber Security Spark data data-engineering

Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines. This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features, not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them.

- Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks
- Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage
- Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require
- Explore use cases for high availability, relational data with Hive, and complex analytics with Spark
- Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance

Learning SAP Analytics Cloud

2017-07-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by David Lai , Riaz Ahmed

Analytics BI SAP data data-engineering

Discover the power of SAP Analytics Cloud in solving business intelligence challenges through concise and clear instruction. This book is the essential guide for beginners, providing you a comprehensive understanding of the platform's features and capabilities. By the end, you'll master creating reports, models, and dashboards, making data-driven decisions with confidence.

What this Book will help me do

- Learn how to navigate and utilize the SAP Analytics Cloud interface effectively.
- Create data models using various sources like Excel or text files for comprehensive insights.
- Design and compile visually engaging stories, reports, and dashboards effortlessly.
- Master collaborative and presentation tools inside SAP Digital Boardroom.
- Understand how to plan, predict, and analyze seamlessly within a single platform.

Author(s)

Riaz Ahmed is an experienced SAP consultant and analytics professional, bringing years of practical experience in BI tools and enterprise analytics. As an expert in SAP Analytics Cloud, Ahmed has guided numerous teams in deploying effective analytics solutions. Their writing aims to demystify complex tools for learners.

Who is it for?

This book is ideal for IT professionals, business analysts, and newcomers eager to understand SAP Analytics Cloud. Beginner-level BI developers and managers seeking guided steps for mastering this platform will find it invaluable. If you aim to enhance your career in cloud-based analytics, this book is tailored for you.

Analytics.CLUB Boston Data Science in Information Security Panel

2017-07-06 · The Future of Data Podcast | conversation with leaders, influencers, and change makers in the World of Data & Analytics Listen

podcast_episode

by Mark Gerner , Kalpesh Sheth (Yaxa) , Bob Rudis (Rapid7)

AI/ML Analytics Big Data Data Science Cyber Security

The security challenges of a particular business may often be proportional to the amount of data it needs to capture, process, and interpret. As businesses grow, their security needs become ever more complex and challenging as the volume, velocity, and variety of data increase. Forward-thinking organizations are using data science to better process and interpret vast data stores, both on-premise and in the cloud, to identify threats and intrusions to their local networks and beyond.

Join us for a dynamic discussion among practitioners with deep experience in data science or information security, including:

• Bob Rudis, Chief Security Data Scientist, Rapid7, frequent blogger at rud.is, co-author of Data Driven Security, and ardent R open source contributor. Previously, Bob was at Verizon and responsible for the Data Breach Investigations Report (DBIR), known in the security industry as "an unparalleled source of information on cybersecurity threats."

• Mark Gerner, Sr. Economic Data Scientist / Analytics Leader with 10+ years of experience designing, implementing, and communicating the results of analyses in support of customer engagement, strategic planning, and programmatic portfolio management related activities.

• Kalpesh Sheth, Co-founder & CEO, Yaxa, with 20+ years of technical expertise in data networking, network security, Intelligence Surveillance and Reconnaissance (ISR), and Cluster Computing. Before co-founding Yaxa, Sheth was Senior Technical Director at DRS Technologies (acquired by Finmeccanica S.p.A.), Director at RiverDelta Networks (acquired by Motorola and now part of Arris), and fifth employee of Digital Technology (acquired by Agilent Technologies). He is a co-author of VITA 41.6, an ANSI standard, and has spoken at numerous trade conferences as an expert panel member.

Venue Sponsor: @BoozAllen Media Sponsor: X.TAO.ai

About #Podcast:

The FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners on the show to discuss their journey in creating the data-driven future.

Want to join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords: FutureOfData Data Analytics Leadership Podcast Big Data Strategy

Frank Kane's Taming Big Data with Apache Spark and Python

2017-06-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Frank Kane (Sundog Software)

AI/ML AWS Amazon EMR Big Data Python Spark Data Streaming apache-spark data data-engineering

This book introduces you to the world of Big Data processing using Apache Spark and Python. You will learn to set up and run Spark on different systems, process massive datasets, and create solutions to real-world Big Data challenges with over 15 hands-on examples included.

What this Book will help me do

- Understand the basics of Apache Spark and its ecosystem.
- Learn how to process large datasets with Spark RDDs using Python.
- Implement machine learning models with Spark's MLlib library.
- Master real-time data processing with Spark Streaming modules.
- Deploy and run Spark jobs on cloud clusters using AWS EMR.

Author(s)

Frank Kane spent 9 years working at Amazon and IMDb, handling and solving real-world machine learning and Big Data problems. Today, as an instructional designer and educator, he brings his wealth of experience to learners around the globe by creating accessible, practical learning resources. His teaching is clear, engaging, and designed to prepare students for real-world applications.

Who is it for?

This book is ideal for data scientists or data analysts seeking to delve into Big Data processing with Apache Spark. Readers who have foundational knowledge of Python, as well as some understanding of data processing principles, will find this book useful to sharpen their skills further. It is designed for those eager to learn the practical applications of Big Data tools in today's industry environments. By the end of this book, you should feel confident tackling Big Data challenges using Spark and Python.

Learning Elasticsearch

2017-06-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Abhishek Andhavarapu

Analytics API ELK Kibana data data-engineering elasticsearch search

This comprehensive guide to Elasticsearch will teach you how to build robust and scalable search and analytics applications using Elasticsearch 5.x. You will learn the fundamentals of Elasticsearch, including its APIs and tools, and how to apply them to real-world problems. By the end of the book, you will have a solid grasp of Elasticsearch and be ready to implement your own solutions.

What this Book will help me do

- Master the setup and configuration of Elasticsearch and Kibana.
- Learn to efficiently query and analyze both structured and unstructured data.
- Understand how to use Elasticsearch aggregations to perform advanced analytics.
- Gain knowledge of advanced search features including geospatial queries and autocomplete.
- Explore the Elastic Stack and learn deployment best practices and cloud hosting options.

Author(s)

Abhishek Andhavarapu is an expert in database technology and distributed systems, with years of experience in Elasticsearch. Their passion for search technologies is reflected in their clear and practical teaching style. They've written this guide to help developers of all levels get up to speed with Elasticsearch quickly and comprehensively.

Who is it for?

This book is perfect for software developers looking to implement effective search and analytics solutions. It's ideal for those who are new to Elasticsearch as well as for professionals familiar with other search tools like Lucene or Solr. The book assumes basic programming knowledge but no prior experience with Elasticsearch.

MS Build 2017

2017-06-09 · Data Skeptic Listen

podcast_episode

by Kyle Polich , David Carmona , Rohan Kumar (Microsoft)

AI/ML Microsoft

This episode recaps the Microsoft Build Conference. Kyle recently attended and shares some thoughts on cloud, databases, cognitive services, and artificial intelligence. The episode includes interviews with Rohan Kumar and David Carmona.

Decision Support, Analytics, and Business Intelligence, Third Edition

2017-06-08 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Daniel J. Power , Ciara Heavin

Analytics BI Big Data IoT business-intelligence data data-science

Rapid technology change is impacting organizations large and small. Mobile and Cloud computing, the Internet of Things (IoT), and “Big Data” are driving forces in organizational digital transformation. Decision support and analytics are available to many people in a business or organization. Business professionals need to learn about and understand computerized decision support for organizations to succeed. This text is targeted to busy managers and students who need to grasp the basics of computerized decision support, including: What is analytics? What is a decision support system? What is “Big Data”? What are “Big Data” business use cases? Overall, it addresses 61 fundamental questions. In a short period of time, readers can “get up to speed” on decision support, analytics, and business intelligence. The book then provides a quick reference to important recurring questions.

Apache Spark 2.x Cookbook

2017-05-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rishi Yadav (Roost.ai)

AI/ML Analytics Big Data Data Analytics Kafka Scala Spark Data Streaming apache-spark data data-engineering

Discover how to harness the power of Apache Spark 2.x for your Big Data processing projects. In this book, you will explore over 70 cloud-ready recipes that will guide you to perform distributed data analytics, structured streaming, machine learning, and much more.

What this Book will help me do

- Effectively install and configure Apache Spark with various cluster managers and platforms.
- Set up and utilize development environments tailored for Spark applications.
- Operate on schema-aware data using RDDs, DataFrames, and Datasets.
- Perform real-time streaming analytics with sources such as Apache Kafka.
- Leverage MLlib for supervised learning, unsupervised learning, and recommendation systems.

Author(s)

Rishi Yadav is a seasoned data engineer with a deep understanding of Big Data tools and technologies, particularly Apache Spark. With years of experience in the field of distributed computing and data analysis, Yadav brings practical insights and techniques to enrich the learning experience of readers.

Who is it for?

This book is ideal for data engineers, data scientists, and Big Data professionals who are keen to enhance their Apache Spark 2.x skills. If you're working with distributed processing and want to solve complex data challenges, this book addresses practical problems. Note that a basic understanding of Scala is recommended to get the most out of this resource.

Business Intelligence Tools for Small Companies: A Guide to Free and Low-Cost Solutions

2017-05-31 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Juan Valladares , Albert Nogués

Agile/Scrum AWS BI Big Data Dashboard DWH ERP ETL/ELT KPI MariaDB MySQL Oracle +7 more

Learn how to transition from Excel-based business intelligence (BI) analysis to enterprise stacks of open-source BI tools. Select and implement the best free and freemium open-source BI tools for your company's needs and design, implement, and integrate BI automation across the full stack using agile methodologies. Business Intelligence Tools for Small Companies provides hands-on demonstrations of open-source tools suitable for the BI requirements of small businesses. The authors draw on their deep experience as BI consultants, developers, and administrators to guide you through the extract-transform-load/data warehousing (ETL/DWH) sequence of extracting data from an enterprise resource planning (ERP) database freely available on the Internet, transforming the data, manipulating them, and loading them into a relational database. The authors demonstrate how to extract, report, and dashboard key performance indicators (KPIs) in a visually appealing format from the relational database management system (RDBMS). They model the selection and implementation of free and freemium tools such as Pentaho Data Integrator and Talend for ELT, Oracle XE and MySQL/MariaDB for RDBMS, and Qliksense, Power BI, and MicroStrategy Desktop for reporting. This richly illustrated guide models the deployment of a small company BI stack on an inexpensive cloud platform such as AWS. 
What You'll Learn

You will learn how to manage, integrate, and automate the processes of BI by selecting and implementing tools to:

- Implement and manage the business intelligence/data warehousing (BI/DWH) infrastructure
- Extract data from any enterprise resource planning (ERP) tool
- Process and integrate BI data using open-source extract-transform-load (ETL) tools
- Query, report, and analyze BI data using open-source visualization and dashboard tools
- Use a MOLAP tool to define next year's budget, integrating real data with target scenarios
- Deploy BI solutions and big data experiments inexpensively on cloud platforms

Who This Book Is For

Engineers, DBAs, analysts, consultants, and managers at small companies with limited resources but whose BI requirements have outgrown the limitations of Excel spreadsheets; personnel in mid-sized companies with established BI systems who are exploring technological updates and more cost-efficient solutions

Mastering Ceph

2017-05-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Nick Fisk

Ansible ceph data data-engineering

Mastering Ceph offers a comprehensive guide to the Ceph distributed storage system, empowering you to implement and manage scalable storage solutions effectively. As you delve into the chapters, you'll gain the practical experience needed to handle Ceph with confidence, achieve resource optimization, and ensure high availability for critical applications.

What this Book will help me do

- Understand and utilize Ceph's advanced capabilities such as erasure coding and tiering for storage efficiency.
- Implement and manage scalable and resilient Ceph clusters effectively, easing resource allocation.
- Use tools like Ansible and Vagrant to deploy Ceph clusters quickly and reproducibly.
- Enhance your troubleshooting skills to resolve complex storage issues and ensure cluster stability.
- Develop applications to integrate with Ceph using Librados and distributed computation classes.

Author(s)

This book was authored by Nick Fisk, an experienced professional in cloud and distributed storage systems. Known for their expertise in Ceph, Fisk shares practical insights developed over years of working as an administrator and developer. Through their accessible and systematic writing, they guide readers to overcome real-world storage challenges.

Who is it for?

This detailed guide is ideal for developers and system administrators familiar with deploying Ceph, who want to deepen their understanding of its advanced features. If you're aiming to optimize performance and design robust storage solutions, this is the book for you. Prior experience with Ceph is recommended to fully benefit from the book's insights.

Oracle on IBM z Systems

2017-05-22 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Helene Grosch , David J Simpson , Armelle Chevé , Moshe Reder , Narjisse Zaki , Lydia Parziale , Sam Amsavelu

IBM Linux Oracle data data-engineering oracle-database-solutions

Abstract Oracle Database 12c Release 1 running on Linux is available for deployment on IBM® z Systems®. The enterprise-grade Linux on IBM z Systems solution is designed to add value to Oracle Database solutions, including the new functions that are introduced in Oracle Database 12c. In this IBM Redbooks® publication, we explore the IBM and Oracle Alliance and describe how Oracle Database benefits from IBM z Systems®. We then explain how to set up Linux guests to install Oracle Database 12c. We also describe how to use the Oracle Enterprise Manager Cloud Control Agent to manage Oracle Database 12c Release 1. We also describe a successful consolidation project from sizing to migration, performance management topics, and high availability. Finally, we end with a chapter about surrounding Oracle with Open Source software. The audience for this publication includes database consultants, installers, administrators, and system programmers. This publication is not meant to replace Oracle documentation, but to supplement it with our experiences while installing and using Oracle products.

Oracle on LinuxONE

2017-05-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Helene Grosch , David J Simpson , Armelle Chevé , Moshe Reder , Narjisse Zaki , Lydia Parziale , Sam Amsavelu

IBM Linux Oracle data data-engineering oracle-database-solutions

Abstract Oracle Database 12c Release 1 running on Linux is available for deployment on IBM® LinuxONE. The enterprise-grade Linux on LinuxONE solution is designed to add value to Oracle Database solutions, including the new functions that are introduced in Oracle Database 12c. In this IBM Redbooks® publication, we explore the IBM and Oracle Alliance and describe how Oracle Database benefits from LinuxONE. We then explain how to set up Linux guests to install Oracle Database 12c. We also describe how to use the Oracle Enterprise Manager Cloud Control Agent to manage Oracle Database 12c Release 1. We also describe a successful consolidation project from sizing to migration, performance management topics, and high availability. Finally, we end with a chapter about surrounding Oracle with Open Source software. The audience for this publication includes database consultants, installers, administrators, and system programmers. This publication is not meant to replace Oracle documentation, but to supplement it with our experiences while installing and using Oracle products.

Mastering Machine Learning with R - Second Edition

2017-04-24 · O'Reilly Data Science Books O'Reilly Amazon

book

by Vikram Dhillon , Miroslav Kopecky , Doug Ortiz , Cory Lesmeister

AI/ML data data-science data-science-tools r

Dive into the world of advanced machine learning techniques with "Mastering Machine Learning with R, Second Edition." This comprehensive guide equips you with the skills to implement sophisticated algorithms and create powerful prediction models using R 3.x. You will explore topics such as supervised and unsupervised learning, decision trees, ensemble methods, and deep learning.

What this Book will help me do

- Implement machine learning workflows using a variety of R packages like XGBoost.
- Effectively use linear and logistic regression for statistical analysis and pattern recognition.
- Develop skills in advanced methods such as support vector machines and neural networks.
- Learn actionable techniques to create recommendation engines and perform text mining.
- Gain hands-on experience running R-based machine learning analyses on cloud platforms.

Author(s)

Cory Lesmeister, a seasoned data scientist, combines extensive hands-on experience and a passion for teaching to deliver technical concepts in a practical, engaging manner. With a strong background in statistical analysis and machine learning, they are dedicated to providing readers with actionable knowledge and step-by-step guidance.

Who is it for?

This book is ideal for data scientists, analysts, and machine learning practitioners aiming to deepen their expertise in R. Readers should have a fundamental understanding of machine learning concepts and a basic knowledge of R programming. If you're looking to master advanced learning methods and apply them effectively, this book is tailored for you.

Sams Teach Yourself Hadoop in 24 Hours

2017-04-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jeffrey Aven

API Big Data Hadoop HDFS Hive Java Spark data data-engineering

Apache Hadoop is the technology at the heart of the Big Data revolution, and Hadoop skills are in enormous demand. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques you'll need to deploy each key component of a Hadoop platform in your local environment or in the cloud, building a fully functional Hadoop cluster and using it with real programs and datasets. Each short, easy lesson builds on all that's come before, helping you master all of Hadoop's essentials, and extend it to meet your unique challenges. Apache Hadoop in 24 Hours, Sams Teach Yourself covers all this, and much more:

- Understanding Hadoop and the Hadoop Distributed File System (HDFS)
- Importing data into Hadoop, and processing it there
- Mastering basic MapReduce Java programming, and using advanced MapReduce API concepts
- Making the most of Apache Pig and Apache Hive
- Implementing and administering YARN
- Taking advantage of the full Hadoop ecosystem
- Managing Hadoop clusters with Apache Ambari
- Working with the Hadoop User Environment (HUE)
- Scaling, securing, and troubleshooting Hadoop environments
- Integrating Hadoop into the enterprise
- Deploying Hadoop in the cloud
- Getting started with Apache Spark

Step-by-step instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; "Did You Know?" tips offer insider advice and shortcuts; and "Watch Out!" alerts help you avoid pitfalls. By the time you're finished, you'll be comfortable using Apache Hadoop to solve a wide spectrum of Big Data problems.

Exam Ref 70-761 Querying Data with Transact-SQL, 1st Edition

2017-04-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Itzik Ben-Gan

Azure Data Management JSON Microsoft SQL XML data data-engineering microsoft-sql-server relational-databases transact-sql

Prepare for Microsoft Exam 70-761 and help demonstrate your real-world mastery of SQL Server 2016 Transact-SQL data management, queries, and database programming. Designed for experienced IT professionals ready to advance their status, Exam Ref focuses on the critical-thinking and decision-making acumen needed for success at the MCSA level.

Focus on the expertise measured by these objectives:

- Filter, sort, join, aggregate, and modify data
- Use subqueries, table expressions, grouping sets, and pivoting
- Query temporal and non-relational data, and output XML or JSON
- Create views, user-defined functions, and stored procedures
- Implement error handling, transactions, data types, and nulls

This Microsoft Exam Ref:

- Organizes its coverage by exam objectives
- Features strategic, what-if scenarios to challenge you
- Assumes you have experience working with SQL Server as a database administrator, system engineer, or developer
- Includes downloadable sample database and code for SQL Server 2016 SP1 (or later) and Azure SQL Database

Querying Data with Transact-SQL

About the Exam

Exam 70-761 focuses on the skills and knowledge necessary to manage and query data and to program databases with Transact-SQL in SQL Server 2016.

About Microsoft Certification

Passing this exam earns you credit toward a Microsoft Certified Solutions Associate (MCSA) certification that demonstrates your mastery of essential skills for building and implementing on-premises and cloud-based databases across organizations. Exam 70-762 (Developing SQL Databases) is also required for MCSA: SQL 2016 Database Development certification. See full details at: microsoft.com/learning

Oracle Database 12c Release 2 Performance Tuning Tips & Techniques

2017-03-22 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Richard Niemiec

Oracle Cyber Security SQL data data-engineering oracle-database-solutions

Proven Database Optimization Solutions: Fully Updated for Oracle Database 12c Release 2

Systematically identify and eliminate database performance problems with help from Oracle Certified Master Richard Niemiec. Filled with real-world case studies and best practices, Oracle Database 12c Release 2 Performance Tuning Tips and Techniques details the latest monitoring, troubleshooting, and optimization methods. Find out how to identify and fix bottlenecks on premises and in the cloud, configure storage devices, execute effective queries, and develop bug-free SQL and PL/SQL code. Testing, reporting, and security enhancements are also covered in this Oracle Press guide.

• Properly index and partition Oracle Database 12c Release 2
• Work effectively with Oracle Cloud, Oracle Exadata, and Oracle Enterprise Manager
• Efficiently manage disk drives, ASM, RAID arrays, and memory
• Tune queries with Oracle SQL hints and the Trace utility
• Troubleshoot databases using V$ views and X$ tables
• Create your first cloud database service and prepare for hybrid cloud
• Generate reports using Oracle’s Statspack and Automatic Workload Repository tools
• Use sar, vmstat, and iostat to monitor operating system statistics

talk-data.com

Cloud Computing
