data-engineering

Dynamic SQL: Applications, Performance, and Security in Microsoft SQL Server

2018-12-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Edward Pollack

Analytics BI Microsoft Cyber Security SQL SQL Server data microsoft-sql-server relational-databases

Take a deep dive into the many uses of dynamic SQL in Microsoft SQL Server. This edition has been updated to use the newest features in SQL Server 2016 and SQL Server 2017 as well as incorporating the changing landscape of analytics and database administration. Code examples have been updated with new system objects and functions to improve efficiency and maintainability. Executing dynamic SQL is key to large-scale searching based on user-entered criteria. Dynamic SQL can generate lists of values and even code with minimal impact on performance. Dynamic SQL enables dynamic pivoting of data for business intelligence solutions as well as customizing of database objects. Yet dynamic SQL is feared by many due to concerns over SQL injection or code maintainability. Dynamic SQL: Applications, Performance, and Security in Microsoft SQL Server helps you bring the productivity and user-satisfaction of flexible and responsive applications to your organization safely and securely. Your organization’s increased ability to respond to rapidly changing business scenarios will build competitive advantage in an increasingly crowded and competitive global marketplace. With a focus on new applications and modern database architecture, this edition illustrates that dynamic SQL continues to evolve and be a valuable tool for administration, performance optimization, and analytics. What You'll Learn Build flexible applications that respond to changing business needs Take advantage of creative, innovative, and productive uses of dynamic SQL Know about SQL injection and be confident in your defenses against it Address performance concerns in stored procedures and dynamic SQL Troubleshoot and debug dynamic SQL to ensure correct results Automate your administration of features within SQL Server Who This Book is For Developers and database administrators looking to hone and build their T-SQL coding skills. 
The book is ideal for developers wanting to plumb the depths of application flexibility and troubleshoot performance issues involving dynamic SQL. The book is also ideal for programmers wanting to learn what dynamic SQL is about and how it can help them deliver competitive advantage to their organizations.

BizTalk Server 2016: Performance Tuning and Optimization

2018-12-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Agustín Mántaras

data enterprise-service-bus microsoft-biztalk-server streaming-messaging

Gain an in-depth view of optimizing the performance of BizTalk Server. This book provides best practices and techniques for improving development of mission-critical solutions. You'll see how the BizTalk Server engine works and how to proactively detect and remedy potential bottlenecks before they occur. The book starts with an overview of the BizTalk Server internal mechanisms that will help you understand the optimizations detailed throughout the book. You'll then see how the mechanisms can be applied to a BizTalk Server environment to improve low and high latency throughput scenarios. A section on testing BizTalk Server solutions will guide you through the most frequently adopted techniques used to develop solutions such as performance and unit testing as part of the development cycle. With BizTalk Server 2016 you'll see how to apply side-by-side versioning to your solutions to reduce the chances of downtime. You'll also review instrumentation techniques using Event Tracing for Windows and business activity monitoring (BAM). While the book is focused on the latest version of BizTalk Server, most of the topics discussed will also work with BizTalk Server 2013 R2. What You'll Learn Review BizTalk Server internals and how the message engine works Understand BizTalk Server architecture Gather and analyze BizTalk Server performance data Develop BizTalk Server performance solutions Use advanced troubleshooting tools to help diagnose your platform Who This Book Is For Those who have strong BizTalk and .NET Framework knowledge and want to get their BizTalk Server knowledge to the next level.

Machine Learning with Apache Spark Quick Start Guide

2018-12-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jillur Quddus

AI/ML Analytics Big Data Spark apache-spark data

"Machine Learning with Apache Spark Quick Start Guide" introduces you to the fundamental concepts and tools needed to harness the power of Apache Spark for data processing and machine learning. This book combines practical examples and real-world scenarios to show you how to manage big data efficiently while uncovering actionable insights through advanced analytics. What this Book will help me do Understand the role of Apache Spark in the big data ecosystem. Set up and configure an Apache Spark development environment. Learn and implement supervised and unsupervised learning models using Spark MLlib. Apply advanced analytical algorithms to real-world big data problems. Develop and deploy real-time machine learning pipelines with Apache Spark. Author(s) None Quddus is an experienced practitioner in the fields of big data, distributed technologies, and machine learning. With a career dedicated to using advanced analytics to solve real-world problems, Quddus brings practical expertise to each topic addressed. Their approachable writing style ensures readers can apply concepts effectively, even in complex scenarios. Who is it for? This book is ideal for business analysts, data analysts, and data scientists who are eager to gain hands-on experience with big data technologies. Whether you are new to Apache Spark or looking to expand your knowledge of its machine learning capabilities, this guide provides the tools and insights necessary to achieve those goals. Technical professionals wanting to develop their skills in processing and analyzing big data will find this resource invaluable.

Fast Data Architectures for Streaming Applications, 2nd Edition

2018-12-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dean Wampler

Flink Big Data ELK IoT Kafka Spark Data Streaming data streaming-messaging

Why have stream-oriented data systems become so popular, when batch-oriented systems have served big data needs for many years? In the updated edition of this report, Dean Wampler examines the rise of streaming systems for handling time-sensitive problems—such as detecting fraudulent financial activity as it happens. You’ll explore the characteristics of fast data architectures, along with several open source tools for implementing them. Batch processing isn’t going away, but exclusive use of these systems is now a competitive disadvantage. You’ll learn that, while fast data architectures using tools such as Kafka, Akka, Spark, and Flink are much harder to build, they represent the state of the art for dealing with mountains of data that require immediate attention. Learn how a basic fast data architecture works, step-by-step Examine how Kafka’s data backplane combines the best abstractions of log-oriented and message queue systems for integrating components Evaluate four streaming engines, including Kafka Streams, Akka Streams, Spark, and Flink Learn which streaming engines work best for different use cases Get recommendations for making real-world streaming systems responsive, resilient, elastic, and message driven Explore an example IoT streaming application that includes telemetry ingestion and anomaly detection

Apache Spark 2: Data Processing and Real-Time Analytics

2018-12-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Romeo Kienzler , Sridhar Alla , Md. Rezaul Karim , Siamak Amirghodsi

AI/ML Analytics Big Data Data Analytics Scala Spark SQL Data Streaming apache-spark data

Build efficient data flow and machine learning programs with this flexible, multi-functional open-source cluster-computing framework Key Features Master the art of real-time big data processing and machine learning Explore a wide range of use-cases to analyze large data Discover ways to optimize your work by using many features of Spark 2.x and Scala Book Description Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own data flow and machine learning programs on this platform. You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools. By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. This Learning Path includes content from the following Packt products: Mastering Apache Spark 2.x by Romeo Kienzler Scala and Spark for Big Data Analytics by Md. 
Rezaul Karim, Sridhar Alla Apache Spark 2.x Machine Learning Cookbook by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei What you will learn Get to grips with all the features of Apache Spark 2.x Perform highly optimized real-time big data processing Use ML and DL techniques with Spark MLlib and third-party tools Analyze structured and unstructured data using SparkSQL and GraphX Understand tuning, debugging, and monitoring of big data applications Build scalable and fault-tolerant streaming applications Develop scalable recommendation engines Who this book is for If you are an intermediate-level Spark developer looking to master the advanced capabilities and use-cases of Apache Spark 2.x, this Learning Path is ideal for you. Big data professionals who want to learn how to integrate and use the features of Apache Spark and build a strong big data pipeline will also find this Learning Path useful. To grasp the concepts explained in this Learning Path, you must know the fundamentals of Apache Spark and Scala.

Apache Superset Quick Start Guide

2018-12-19 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Shashank Shekhar

BI Dashboard DataViz MySQL Cyber Security SQL Superset data relational-databases

Apache Superset Quick Start Guide teaches you how to leverage Apache Superset to create interactive and insightful data visualizations. With this book, you'll understand how to integrate Superset with popular databases and build user-friendly dashboards tailored for business intelligence needs. What this Book will help me do Set up and configure Apache Superset for data visualization tasks. Integrate data from SQL databases into Superset for dashboards. Design dashboards tailored to represent business metrics and insights. Use Superset's visualization techniques to explore and present various datasets. Understand and apply user role management and security features in Superset. Author(s) Shashank Shekhar is an experienced data visualization and business intelligence specialist with years of experience in working with Apache Superset. They have written several guides on utilizing open-source tools for enterprise needs. Their technical expertise and approachable writing style make this guide practical and engaging. Who is it for? This book is geared towards data analysts, business intelligence professionals, and developers. Beginners to Superset can quickly grasp the fundamentals, while those with prior experience in data visualization will appreciate the advanced techniques. It's perfect for anyone looking to enhance their data storytelling and dashboard design skills.

Vertically Integrated Architectures: Versioned Data Models, Implicit Services, and Persistence-Aware Programming

2018-12-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jos Jong

Computer Science data data-models

Understand how and why the separation between layers and tiers in service-oriented architectures holds software developers back from being truly productive, and how you can remedy that problem. Strong processes and development tools can help developers write more complex software, but large amounts of code can still be directly deduced from the underlying database model, hampering developer productivity. In a world with a shortage of developers, this is bad news. More code also increases maintenance costs and the risk of bugs, meaning less time is spent improving the quality of systems. You will learn that by making relationships first-class citizens within an item/relationship model, you can develop an extremely compact query language, inspired by natural language. You will also learn how this model can serve as both a database schema and an object model upon which to build business logic. Implicit services free you from writing code for standard read/write operations, while still supporting fine-grained authorization. Vertically Integrated Architectures explains how functional schema mappings can solve database migrations and service versioning at the same time, and how all this can support any client, from free-format to fully vertically integrated types. Unleash the potential and use VIA to drastically increase developer productivity and quality. 
What You'll Learn See how the separation between application server and database in a SOA-based architecture might be justifiable from a historical perspective, but can also hold us back Examine how the vertical integration of application logic and database functionality can drastically increase developer productivity and quality Review why application developers only need to write pure business logic if an architecture takes care of basic read/write client-server communication and data persistence Understand why a set-oriented and persistence-aware programming language would not only make it easier to build applications, but would also enable the fully optimized execution of incoming service requests Who This Book Is For Software architects, senior software developers, computer science professionals and students, and the open source community.

IBM Tape Library Guide for Open Systems

2018-12-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Michael Engelbrecht , Larry Coyne , Simon Browne , Illarion Borisevich

IBM data

Abstract This IBM® Redbooks® publication presents a general introduction to the latest IBM tape and tape library technologies. Featured tape technologies include the IBM LTO Ultrium and Enterprise 3592 tape drives, and their implementation in IBM tape libraries. This 16th edition introduces the new TS1160 tape drive with up to 20 TB capacity on JE media and the latest updates to the IBM TS4500 and TS4300 tape libraries. It includes generalized sections about Small Computer System Interface (SCSI) and Fibre Channel connections, and multipath architecture configurations. This book also covers tools and techniques for library management. It is intended for anyone who wants to understand more about IBM tape products and their implementation. It is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists. If you do not have a background in computer tape storage products, you might need to read other sources of information. In the interest of being concise, topics that are generally understood are not covered in detail.

Machine Learning with PySpark: With Natural Language Processing and Recommender Systems

2018-12-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Pramod Singh

AI/ML Data Science NLP PySpark Spark apache-spark data

Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution and then covers the entire spectrum of traditional machine learning algorithms along with natural language processing and recommender systems using PySpark. Machine Learning with PySpark shows you how to build supervised machine learning models such as linear regression, logistic regression, decision trees, and random forest. You’ll also see unsupervised machine learning models such as K-means and hierarchical clustering. A major portion of the book focuses on feature engineering to create useful features with PySpark to train the machine learning models. The natural language processing section covers text processing, text mining, and embedding for classification. After reading this book, you will understand how to use PySpark’s machine learning library to build and train various machine learning models. Additionally, you’ll become comfortable with related PySpark components, such as data ingestion, data processing, and data analysis, that you can use to develop data-driven intelligent applications. What You Will Learn Build a spectrum of supervised and unsupervised machine learning algorithms Implement machine learning algorithms with Spark MLlib libraries Develop a recommender system with Spark MLlib libraries Handle issues related to feature engineering, class balance, bias and variance, and cross validation for building an optimal fit model Who This Book Is For Data science and machine learning professionals.

Practical Apache Spark: Using the Scala API

2018-12-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dharanitharan Ganesan , Subhashini Chellappan

AI/ML API Hive Kafka Scala Spark SQL Data Streaming apache-spark data

Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka with examples. You’ll follow a learn-to-do-by-yourself approach to learning – learn the concepts, practice the code snippets in Scala, and complete the assignments given to get an overall exposure. On completion, you’ll have knowledge of the functional programming aspects of Scala, and hands-on expertise in various Spark components. You’ll also become familiar with machine learning algorithms with real-time usage. What You Will Learn Discover the functional programming features of Scala Understand the complete architecture of Spark and its components Integrate Apache Spark with Hive and Kafka Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries Work with different machine learning concepts and libraries using Spark's MLlib packages Who This Book Is For Developers and professionals who deal with batch and stream data processing.

IBM Power Systems RAID Solutions Introduction and Technical Overview

2018-12-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Scott Vetter , Harihara Balakrishnan , Swarna Narendra Babu

IBM SAS data

This IBM® Redpaper™ publication gives an overview and technical introduction to IBM Power Systems™ RAID solutions. The book is organized to start with an introduction to Redundant Array of Independent Disks (RAID), and various RAID levels with their benefits. A brief comparison of Direct Attached Storage (DAS) and networked storage systems such as SAN / NAS is provided with a focus on emerging applications that typically use the DAS model over networked storage models. The book focuses on IBM Power Systems I/O architecture and various SAS RAID adapters that are supported in IBM POWER8™ processor-based systems. A detailed description of the SAS adapters, along with their feature comparison tables, is included in Chapter 3, "RAID adapters for IBM Power Systems" on page 45. The book is aimed at readers who have the responsibility of configuring IBM Power Systems for individual solution requirements. This audience includes IT Architects, IBM Technical Sales Teams, IBM Business Partner Solution Architects and Technical Sales teams, and systems administrators who need to understand the SAS RAID hardware and RAID software solutions supported in POWER8 processor-based systems.

Dynamic Oracle Performance Analytics: Using Normalized Metrics to Improve Database Speed

2018-12-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Roger Cornejo

Analytics Big Data Oracle data oracle-database-solutions

Use an innovative approach that relies on big data and advanced analytical techniques to analyze and improve Oracle Database performance. The approach used in this book represents a step-change paradigm shift away from traditional methods. Instead of relying on a few hand-picked, favorite metrics, or wading through multiple specialized tables of information such as those found in an automatic workload repository (AWR) report, you will draw on all available data, applying big data methods and analytical techniques to help the performance tuner draw impactful, focused performance improvement conclusions. This book briefly reviews past and present practices, along with available tools, to help you recognize areas where improvements can be made. The book then guides you through a step-by-step method that can be used to take advantage of all available metrics to identify problem areas and work toward improving them. The method presented simplifies the tuning process and solves the problem of metric overload. You will learn how to: collect and normalize data, generate deltas that are useful in performing statistical analysis, create and use a taxonomy to enhance your understanding of problem performance areas in your database and its applications, and create a root cause analysis report that enables understanding of a specific performance problem and its likely solutions. 
What You'll Learn Collect and prepare metrics for analysis from a wide array of sources Apply statistical techniques to select relevant metrics Create a taxonomy to provide additional insight into problem areas Provide a metrics-based root cause analysis regarding the performance issue Generate an actionable tuning plan prioritized according to problem areas Monitor performance using database-specific normal ranges Who This Book Is For Professional tuners: responsible for maintaining the efficient operation of large-scale databases who wish to focus on analysis, who want to expand their repertoire to include a big data methodology and use metrics without being overwhelmed, who desire to provide accurate root cause analysis and avoid the cyclical fix-test cycles that are inevitable when speculation is used

IBM TS4500 R5 Tape Library Guide

2018-12-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Michael Engelbrecht , Larry Coyne , Simon Browne , Illarion Borisevich , Robert Beiderbeck

Cloud Computing ELK IBM Cyber Security data

Abstract The IBM® TS4500 (TS4500) tape library is a next-generation tape solution that offers higher storage density and integrated management than previous solutions. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today’s and tomorrow’s data growth requires. It has the cost-effectiveness and the manageability to grow with business data needs, while you preserve existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot because the TS4500 can store up to 11 petabytes (PB) of uncompressed data in a single frame library or scale up to 2 PB per square foot to over 350 PB. The TS4500 offers the following benefits: High availability: Dual active accessors with integrated service bays reduce inactive service space by 40%. The Elastic Capacity option can be used to completely eliminate inactive service space. Flexibility to grow: The TS4500 library can grow from the right side and the left side of the first L frame because models can be placed in any active position. Increased capacity: The TS4500 can grow from a single L frame up to another 17 expansion frames with a capacity of over 23,000 cartridges. High-density (HD) generation 1 frames from the TS3500 library can be redeployed in a TS4500. Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations. Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library. Support for IBM TS1160 while also supporting TS1155, TS1150, and TS1140 tape drive: The TS1160 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. 
The TS1160 offers high-performance, flexible data storage with support for data encryption. Also, this enhanced fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. The new TS1160 Tape Drive Model 60E delivers a dual 10 Gb or 25 Gb Ethernet host attachment interface that is optimized for cloud-based and hyperscale environments. The TS1160 Tape Drive Model 60F delivers a native data rate of 400 MBps, the same load/ready, locate speeds, and access times as the TS1155, and includes dual-port 16 Gb Fibre Channel support. Support of the IBM Linear Tape-Open (LTO) Ultrium 8 tape drive: The LTO Ultrium 8 offering represents significant improvements in capacity, performance, and reliability over the previous generation, LTO Ultrium 7, while still protecting your investment in the previous technology. Support of LTO 8 Type M cartridge (M8): The LTO Program is introducing a new capability with LTO-8 drives: the ability of the LTO-8 drive to write 9 TB on a brand new LTO-7 cartridge instead of 6 TB as specified by the LTO-7 format. Such a cartridge is called an LTO-7 initialized LTO-8 Type M cartridge. Integrated TS7700 back-end Fibre Channel (FC) switches are available. Up to four library-managed encryption (LME) key paths per logical library are available. This book describes the TS4500 components, feature codes, specifications, supported tape drives, encryption, new integrated management console (IMC), and command-line interface (CLI). You learn how to accomplish the following specific tasks: Improve storage density with increased expansion frame capacity up to 2.4 times and support 33% more tape drives per frame. Manage storage by using the ALMS feature. Improve business continuity and disaster recovery with dual active accessor, automatic control path failover, and data path failover. Help ensure security and regulatory compliance with tape-drive encryption and Write Once Read Many (WORM) media. 
Support IBM LTO Ultrium 8, 7, 6, and 5, IBM TS1160, TS1155, TS1150, and TS1140 tape drives. Provide a flexible upgrade path for users who want to expand their tape storage as their needs grow. Reduce the storage footprint and simplify cabling with 10 U of rack space on top of the library. This guide is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Introducing the IBM DS8882F Rack Mounted Storage System

2018-12-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sherry Brunson , Stephen Manthorpe , Bert Dufrasne

IBM data

This IBM® Redpaper™ presents and positions the DS8882F. The DS8882F adds a modular rack-mountable enterprise storage system to the DS8880 family of all-flash enterprise storage systems. The modular system can be integrated into 16U contiguous space of an existing IBM z14™ Model ZR1 (z14 Model ZR1), IBM LinuxONE™ Rockhopper II (z14 Model LR1), or other standard 19-inch wide rack. The DS8882F allows you to take advantage of the performance boost of DS8880 all-flash enterprise systems and advanced features while limiting datacenter footprint and power infrastructure requirements.

Migrating to MariaDB: Toward an Open Source Database Solution

2018-12-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by William Wood

MariaDB MySQL Cyber Security data relational-databases

Mitigate the risks involved in migrating away from a proprietary database platform toward MariaDB’s open source database engine. This book will help you assess the risks and the work involved, and ensure a successful migration. Migrating to MariaDB describes the process and lessons learned during a migration from a proprietary database management engine to the MariaDB open source solution. The book discusses the drivers for making the decision and change, walking you through all aspects of the process from evaluating the licensing, navigating the pitfalls and hurdles of a migration, through to final implementation on the new platform. The book highlights the cost-effectiveness of MariaDB and how the licensing worries are simplified in comparison to running on a proprietary platform. You’ll learn to do your own risk assessment, to identify database and application code that may need to be modified or re-implemented, and to identify MariaDB features to provide the security and failover protection needed by corporate customers. Let the author’s experience in migrating a financial firm to MariaDB inform your own efforts, helping you to develop a road map for both technical and political success within your own organization as you migrate away from proprietary lock-in toward MariaDB’s open source solution. What You'll Learn Evaluate and compare licensing costs between proprietary databases and MariaDB Perform a proper risk assessment to inform your planning and execution of the migration Build a migration road map from the book’s example that is specific to your situation Make needed application changes and migrate data to the MariaDB open source database engine Who This Book Is For Technical professionals (including database administrators, programmers, and technical management) who are interested in migrating away from a proprietary database platform toward MariaDB’s open source database engine and need to assess the risks and the work involved

IBM DS8880 High-Performance Flash Enclosure Gen2

2018-12-04 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Tamas Toser , Axel Westphal , Stephen Manthorpe

IBM data

This IBM® Redpaper™ publication describes the IBM DS8880 High-Performance Flash Enclosure (HPFE) Gen2 architecture and configuration, as of DS8880 Release 8.51. The DS8880 HPFE Gen2 is a 2U Redundant Array of Independent Disks (RAID) flash enclosure with associated Flash RAID adapters that can be used exclusively with DS8880 models. The flash enclosure and Flash RAID adapters are installed in pairs. Each storage enclosure pair can support 16, 32, or 48 encryption-capable flash drives (2.5-inch, 63.5 mm form factor).

IBM Storage Networking SAN768C-6 Product Guide

2018-12-04 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jon Tate

IBM Fabric Cyber Security data

This IBM® Redbooks® Product Guide describes the IBM Storage Networking SAN768C-6. IBM Storage Networking SAN768C-6 has the industry's highest port density for a storage area network (SAN) director and features 768 line-rate 32 gigabits per second (Gbps) or 16 Gbps Fibre Channel ports. Designed to support multiprotocol workloads, IBM Storage Networking SAN768C-6 enables SAN consolidation and collapsed-core solutions for large enterprises, which reduces the number of managed switches and leads to easy-to-manage deployments. IBM Storage Networking SAN768C-6 supports the 48-Port 32 Gbps Fibre Channel Switching Module, the 48-Port 16 Gbps Fibre Channel Switching Module, the 48-port 10 Gbps FCoE Switching Module, the 24-port 40 Gbps FCoE switching module, and the 24/10-port SAN Extension Module. By reducing the number of front-panel ports that are used on inter-switch links (ISLs), it also offers room for future growth. IBM Storage Networking SAN768C-6 addresses the mounting storage requirements of today's large virtualized data centers. As a director-class SAN switch, IBM Storage Networking SAN768C-6 uses the same operating system and management interface as other IBM data center switches. It brings intelligent capabilities to a high-performance, protocol-independent switch fabric, and delivers uncompromising availability, security, scalability, simplified management, and the flexibility to integrate new technologies. You can use IBM Storage Networking SAN768C-6 to transparently deploy unified fabrics with Fibre Channel and Fibre Channel over Ethernet (FCoE) connectivity to achieve low total cost of ownership (TCO). 
For mission-critical enterprise storage networks that require secure, robust, cost-effective business-continuance services, the FCIP extension module is designed to deliver outstanding SAN extension performance, reducing latency for disk and tape operations with FCIP acceleration features, including FCIP write acceleration and FCIP tape write and read acceleration.

Hands-On Big Data Modeling

2018-11-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Tao Wei, Suresh Kumar Mukhiya, Lee James (Domo)

BI Big Data Data Management Data Modelling Python SQL data data-models

This book, Hands-On Big Data Modeling, provides you with practical guidance on data modeling techniques, focusing particularly on the challenges of big data. You will learn the concepts behind various data models, explore tools and platforms for efficient data management, and gain hands-on experience with structured and unstructured data.

What this Book will help me do: Master the fundamental concepts of big data and its challenges. Explore advanced data modeling techniques using SQL, Python, and R. Design effective models for structured, semi-structured, and unstructured data types. Apply data modeling to real-world datasets like social media and sensor data. Optimize data models for performance and scalability in various big data platforms.

Author(s): The authors of this book are experienced data architects and engineers with a strong background in developing scalable data solutions. They bring their collective expertise to simplify complex concepts in big data modeling, ensuring readers can effectively apply these techniques in their projects.

Who is it for? This book is intended for data architects, business intelligence professionals, and any programmer interested in understanding and applying big data modeling concepts. If you are already familiar with basic data management principles and want to enhance your skills, this book is perfect for you. You will learn to tackle real-world datasets and create scalable models. Additionally, it is suitable for professionals transitioning to working with big data frameworks.

Hands-On Geospatial Analysis with R and QGIS

2018-11-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Shammunul Islam

AI/ML GIS data geographic-information-system-gis geographic information system (gis) location-data

Dive into the intricate world of geospatial data with "Hands-On Geospatial Analysis with R and QGIS". This book guides readers through managing, analyzing, and visualizing spatial data using the popular tools R and QGIS. Packed with practical examples, it empowers you to effectively handle GIS and remote sensing data in your projects.

What this Book will help me do: Understand how to install and set up R and QGIS environments for geospatial tasks. Learn the fundamentals of spatial data processing, including management, visualization, and analysis. Create compelling geospatial visualizations using R packages like ggplot2 and tools in QGIS. Master raster data handling and leverage the QGIS graphical modeler for automating geoprocessing tasks. Apply machine learning techniques to geospatial problems such as landslide susceptibility mapping using real-world data.

Author(s): Hamson and Islam are experts in the field of geospatial analysis and provide practical, actionable insights throughout this book. With extensive experience in GIS and remote sensing technologies, they focus on guiding readers from basic principles to advanced applications. Their collaborative teaching style ensures clarity and accessibility for learners at different skill levels.

Who is it for? This book is ideal for geographers, environmental scientists, and other professionals working with spatial data. Beginner to intermediate-level readers will find it approachable, with step-by-step instructions to build their expertise. While prior familiarity with R or QGIS can be helpful, it is not required. The book is tailored for those eager to expand their skills in geospatial data analysis and visualization.

Hands-On Data Science with SQL Server 2017

2018-11-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vladimír Mužný, Marek Chmel

Analytics Azure BI Big Data Data Science Power BI Python SQL data

In "Hands-On Data Science with SQL Server 2017," you will discover how to implement end-to-end data analysis workflows, leveraging SQL Server's robust capabilities. This book guides you through collecting, cleaning, and transforming data, querying for insights, creating compelling visualizations, and even constructing predictive models for sophisticated analytics.

What this Book will help me do: Grasp the essential data science processes and how SQL Server supports them. Conduct data analysis and create interactive visualizations using Power BI. Build, train, and assess predictive models using SQL Server tools. Integrate SQL Server with R, Python, and Azure for enhanced functionality. Apply best practices for managing and transforming big data with SQL Server.

Author(s): Marek Chmel and Vladimír Mužný bring their extensive experience in data science and database management to this book. Marek is a seasoned database specialist with a strong background in SQL, while Vladimír is known for his instructional expertise in analytics and data manipulation. Together, they focus on providing actionable insights and practical examples tailored for data professionals.

Who is it for? This book is an ideal resource for aspiring and seasoned data scientists, data analysts, and database professionals aiming to deepen their expertise in SQL Server for data science workflows. Beginners with fundamental SQL knowledge will find it a guided entry into data science applications. It is especially suited for those who aim to implement data-driven solutions in their roles while leveraging SQL's capabilities.
