data

Neo4j Graph Data Modelling

2015-07-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mahesh K Lal

Data Modelling Neo4j data-engineering graph-databases

Neo4j Graph Data Modelling provides practical guidance in designing and implementing graph databases using Neo4j. This book walks you through modeling concepts, database evolution, and performance optimization. You'll learn how to model real-world domains, write Cypher queries, and adapt your database as requirements change.

What this Book will help me do
• Model data effectively using Neo4j to represent complex relationships.
• Translate real-world problems into graph database designs efficiently.
• Write optimized Cypher queries to retrieve and manipulate data.
• Improve database performance through thoughtful design practices.
• Adapt and evolve databases seamlessly as application needs change.

Author(s)
Mahesh K Lal is an experienced developer and database specialist with a deep understanding of graph data modeling. With a focus on practical and accessible instruction, Mahesh's work provides actionable insights into database design. Neo4j Graph Data Modelling reflects his years of hands-on experience with Neo4j.

Who is it for?
This book is designed for software developers and data professionals looking to explore graph databases. If you aim to effectively model real-world situations using Neo4j or optimize database queries, this guide is for you. Prior experience with databases is helpful but not mandatory.

Oracle Goldengate 12c Implementers Guide

2015-07-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by John P Jeffries

Oracle data-engineering oracle-database-solutions

Master Oracle GoldenGate 12c to manage high-volume data replication and integration in real time. This guide provides you with comprehensive knowledge and skills to optimize database processes through GoldenGate's capabilities.

What this Book will help me do
• Install and configure Oracle GoldenGate 12c within your environment effectively.
• Leverage GoldenGate's high-availability features for robust system setups.
• Optimize replication processes with advanced configuration and performance tuning techniques.
• Troubleshoot common GoldenGate issues to ensure seamless operations.
• Apply best practices for GoldenGate in enterprise database architectures.

Author(s)
John P Jeffries is a seasoned Oracle database consultant with over a decade of experience specializing in high-availability architectures and data replication. His mission is to make complex systems accessible through clear and detailed instructional writing.

Who is it for?
This book is designed for Oracle database administrators who want to integrate GoldenGate into their architecture. It is also ideal for solution architects building robust systems and project managers overseeing database projects. A basic understanding of Oracle databases is assumed, but no prior knowledge of GoldenGate is required.

Spark Cookbook

2015-07-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rishi Yadav (Roost.ai)

AI/ML Analytics Big Data Java Python Spark SQL Data Streaming apache-spark data-engineering

Spark Cookbook is your practical guide to mastering Apache Spark, encompassing a comprehensive set of patterns and examples. Through more than 60 recipes, you will gain actionable insights into using Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX effectively for your big data needs.

What this Book will help me do
• Understand how to install and configure Apache Spark in various environments.
• Build data pipelines and perform real-time analytics with Spark Streaming.
• Utilize Spark SQL for interactive data querying and reporting.
• Apply machine learning workflows using MLlib, including supervised and unsupervised models.
• Develop optimized big data solutions and integrate them into enterprise platforms.

Author(s)
Rishi Yadav, the author of Spark Cookbook, is an experienced data engineer and technical expert with deep insights into big data processing frameworks. Yadav has spent years working with Spark and its ecosystem, providing practical guidance to developers and data scientists alike. This book reflects their commitment to sharing actionable knowledge.

Who is it for?
This book is designed for data engineers, developers, and data scientists who work with big data systems and wish to utilize Apache Spark effectively. Whether you're looking to optimize existing Spark applications or explore its libraries for new use cases, this book will provide the guidance you need. A basic familiarity with big data concepts and programming in languages like Java or Python is recommended to make the most out of this book.

ElasticSearch Blueprints

2015-07-24 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vineeth Mohan

Analytics ELK data-engineering elasticsearch search

Dive into search technology with "ElasticSearch Blueprints"! This is the perfect project-based guide to help you master Elasticsearch. You will learn how to build and design scalable, effective search solutions, improve search relevancy, manage data efficiently, perform analytics, and visualize your data in comprehensive ways.

What this Book will help me do
• Build and fine-tune scalable search engine features with Elasticsearch.
• Design and implement accurate ecommerce search solutions using filters.
• Analyze and visualize data with Elasticsearch's powerful data aggregation capabilities.
• Increase search relevancy and enhance user query assistance using analyzers.
• Incorporate enhanced data organization methods, including parent-child relationships.

Author(s)
Vineeth Mohan is an experienced professional specializing in search technologies. With a strong technical background, they have engaged deeply with Elasticsearch, creating solutions that address practical challenges. Their approach focuses on making technical topics accessible, guiding readers step-by-step through projects.

Who is it for?
This book is tailored for data professionals, application developers, and enthusiasts eager to delve into search technologies. Whether you're beginning with Elasticsearch or aiming to refine your skills, this guide will advance your expertise. By working through practical cases, you'll gain confidence in using Elasticsearch effectively to meet diverse requirements.

Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture

2015-07-20 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by George J. Trujillo Jr., Charles Kim, Rommel Garcia, Justin Murray, Steven Jones

Big Data Cloud Computing Data Management Hadoop HDFS Linux Cyber Security SQL data-engineering

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility

Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution.

First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices. Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.

Coverage includes the following:
• Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
• Understanding YARN resource management, HDFS storage, and I/O
• Designing data ingestion, movement, and organization for modern enterprise data platforms
• Defining SQL engine strategies to meet strict SLAs
• Considering security, data isolation, and scheduling for multitenant environments
• Deploying Hadoop as a service in the cloud
• Reviewing the essential concepts, capabilities, and terminology of virtualization
• Applying current best practices, guidelines, and key metrics for Hadoop virtualization
• Managing multiple Hadoop frameworks and products as one unified system
• Virtualizing master and worker nodes to maximize availability and performance
• Installing and configuring Linux for a Hadoop environment

Exposure-Response Modeling

2015-07-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jixian Wang

data-engineering data-models

This book explores a wide range of topics in exposure-response modeling, from traditional PKPD modeling to other areas in drug development and beyond. It incorporates numerous examples and software programs for implementing novel methods. The book emphasizes dose adjustment and treatment adaptation based on dynamic exposure-response models, illustrates how to apply causal inference to exposure-response modeling in pharmacometrics and epidemiology, and links exposure-response modeling to clinical decision making through model-based decision analysis.

Building web applications with Python and Neo4j

2015-07-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sumit Gupta

Neo4j Python data-engineering graph-databases

Expand your Python web development expertise by integrating Neo4j into your applications. Through this book, you'll journey from understanding Neo4j's fundamentals to building powerful Python-based applications using tools like Flask, Py2neo, and Django. Learn how to model, query, and update graph data effectively.

What this Book will help me do
• Gain an in-depth understanding of Neo4j installation, licensing, and tools.
• Master using Cypher for querying and modifying graph data models.
• Learn how to integrate Python with Neo4j effectively using Py2neo.
• Build RESTful services with Flask leveraging Neo4j for structured data.
• Create robust Django applications using graph-based data models with Neomodel.

Author(s)
Sumit Gupta is a seasoned Python developer with a strong background in graph database design and integration. He has extensive experience using Neo4j to create efficient, scalable applications for real-world problems. His hands-on approach combines practical examples with the depth of knowledge required to develop expertise.

Who is it for?
This book is ideal for Python developers with an interest in enhancing their applications through graph database technology. If you possess a moderate understanding of Python and wish to explore Neo4j for creating smarter, more interconnected data-driven solutions, this book is for you. You should be comfortable with basic programming concepts to fully benefit from this book.

Toad for Oracle Unleashed

2015-07-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dan Hotka, Bert Scalzo

HTML Oracle SQL data-engineering database-management-tools toad

Bert Scalzo and Dan Hotka have written the definitive, up-to-date guide to Version 12.x, Dell’s powerful new release of Toad for Oracle. Packed with step-by-step recipes, detailed screen shots, and hands-on exercises, Toad for Oracle Unleashed shows both developers and DBAs how to maximize their productivity. Drawing on their unsurpassed experience running Toad in production Oracle environments, Scalzo and Hotka thoroughly cover every area of Toad’s functionality. You’ll find practical insights into each of Toad’s most useful tools, from App Designer to Doc Generator, ER Diagrammer to Code Road Map. The authors offer proven solutions you can apply immediately to solve a wide variety of problems, from maintaining code integrity to automating performance and scalability testing.

Learn how to…
• Install and launch Toad, connect to a database, and explore Toad’s new features
• Customize Toad to optimize productivity in your environment
• Use the Editor Window to execute SQL and PL/SQL, and view, save, or convert data
• Browse your schema, and create and edit objects
• Quickly generate useful reports with FastReport and Report Manager
• Clarify your database’s tables and data with the powerful Entity Relationship Diagrammer (ERD) and HTML documentation generator
• Work more efficiently with PL/SQL using code templates, snippets, and shortcuts
• Automate actions and applications with Automation Designer
• Perform key DBA tasks including database health checks, tablespace management, database and schema comparisons, and object rebuilding
• Identify and optimize poorly performing SQL and applications

ON THE WEB: Download all examples and source code presented in this book from informit.com/title/9780134131856 as it becomes available.

Hadoop Application Architectures

2015-07-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mark Grover (Stemma), Ted Malaska, Jonathan Seidman, Gwen Shapira

Data Management Hadoop data-engineering

Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.

Base SAS 9.4 Procedures Guide, Fourth Edition, 4th Edition

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by SAS Institute SAS

SAS analytics-platforms data-science

Contains the complete reference for all Base SAS procedures. Provides information about what each procedure does and, if relevant, the kind of output that it produces.

Data Analysis with Competing Risks and Intermediate States

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ronald B. Geskus

data-science data-science-tasks statistics

This practical and thorough book explains when and how to use models and techniques for the analysis of competing risks and intermediate states. It covers the most recent insights on estimation techniques and discusses in detail how to interpret the obtained results.

Information Professionals' Career Confidential

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ulla de Stricker

data-science data-science-as-a-profession

Based in part on a selection of the author's past blog postings, Information Professionals' Career Confidential is a convenient, browsable, and illuminating pocket compendium of insights on topics relevant for information and knowledge professionals at any stage of their careers. This book collects comments on matters of interest to new and experienced information professionals alike in 1-2 minute “quick takes,” inviting further thought. Topics range from the value of knowledge management and effective communication in organizations to assessing employers’ perception of information professionals and how best to increase one’s value through professional organizations and volunteering. This unique resource will be illuminating for anyone in library and information science, career development, or knowledge and information management.

• Raises questions, in a lively and concise manner, relevant for information professionals
• Offers readers the opportunity to read entries one at a time for reflection, or to read the entire book and then go back to certain entries to consolidate the meaning
• Presents ideas and concepts from thoughtful perspectives in a style designed to make professionals and students reflect on their own careers

SAS 9.4 Macro Language: Reference, Fourth Edition, 4th Edition

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by SAS Institute SAS

SAS analytics-platforms data-science

Explains how to increase the modularity, flexibility, and maintainability of your SAS code using the SAS macro facility. Provides complete information about macro language elements,

SAS 9.4 SQL Procedure User's Guide, Second Edition, 2nd Edition

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by SAS Institute SAS

SAS SQL analytics-platforms data-science

Describes the basics of using the SQL procedure and provides comprehensive reference information. The usage information includes retrieving data from single and multiple tables; selecting specific data from tables; subsetting, ordering, and summarizing data; updating tables; combining tables to create new tables and useful reports; performing queries on database management system (DBMS) tables; using PROC SQL with the SAS macro facility; and debugging and optimizing PROC SQL code. The reference information includes statements, dictionary components, and system options.

SAS and Hadoop Technology: Overview

2015-07-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by SAS Institute SAS

Hadoop SAS data-engineering

Provides overview information for SAS and Hadoop technologies and explains how SAS and Hadoop work together. Use this document as a starting point to learn about the SAS technology that interacts with Hadoop.

Web Scraping with Python

2015-07-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ryan Mitchell

API Python Cyber Security data-science data-science-tasks web-scraping

Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.

Leadership and Women in Statistics

2015-07-13 · O'Reilly Data Science Books O'Reilly Amazon

book

by Yulia R. Gel, Ingram Olkin, Amanda L. Golbeck

data-science data-science-tasks statistics

This unique and insightful book examines leadership within the roles of statistician and data scientist from international and diverse perspectives. Featuring contributions from leadership experts and statisticians at various stages on the career jungle gym, the text supplies a greater understanding of leadership within teams, research consulting, and project management. It encourages reflection on leadership behaviors, promoting natural and organizational leadership, identifying existing opportunities to foster creative outputs and develop strong leadership voices, and explaining how to convert a passion for statistical science into visionary, ethical, and transformational leadership.

Accumulo

2015-07-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Michael Wall, Billie Rinaldi, Aaron Cordova

Cyber Security accumulo data-engineering nosql-databases

Get up to speed on Apache Accumulo, the flexible, high-performance key/value store created by the National Security Agency (NSA) and based on Google’s BigTable data storage system. Written by former NSA team members, this comprehensive tutorial and reference covers Accumulo architecture, application development, table design, and cell-level security. With clear information on system administration, performance tuning, and best practices, this book is ideal for developers seeking to write Accumulo applications, administrators charged with installing and maintaining Accumulo, and other professionals interested in what Accumulo has to offer. You will find everything you need to use this system fully.

Implementing IBM FlashSystem 840

2015-07-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Karen Orlando, Jon Herd, Detlef Helmbrecht, Carsten Larsen, Matt Levan

Cloud Computing IBM data-engineering

Almost all technological components in the data center are getting faster: central processing units, networks, storage area networks (SANs), and memory. All of them have improved their speed by a minimum of 10X; some of them by 100X, for example, data networks. However, spinning disk performance has only increased by 1.2 times. IBM® FlashSystem™ 840 version 1.3 closes this gap. The FlashSystem 840 is optimized for the data center to enable organizations of all sizes to strategically harness the value of stored data. It provides flexible capacity and extreme performance for the most demanding applications, including virtualized or bare-metal online transaction processing (OLTP) and online analytical processing (OLAP) databases, virtual desktop infrastructures (VDI), technical computing applications, and cloud environments. The system accelerates response times with IBM MicroLatency® access times as low as 90 µs write latency and 135 µs read latency to enable faster decision making. The introduction of a low capacity 1 TB flash module allows the FlashSystem 840 to be configured in capacity points as low as 2 TB in protected RAID 5 mode. Coupled with 10 GB iSCSI, the FlashSystem is positioned to bring extreme performance to small and medium-sized businesses (SMB) and growth markets. Implementing the IBM FlashSystem® 840 provides value that goes beyond those benefits that are seen on disk-based arrays. These benefits include better user experience, server and application consolidation, development cycle reduction, application scalability, data center footprint savings, and improved price performance economics. This IBM Redbooks® publication discusses IBM FlashSystem 840 version 1.3. It provides in-depth knowledge of the product architecture, software and hardware, its implementation, and hints and tips. 
Also illustrated are use cases that show real-world solutions for tiering, flash-only, and preferred read, as well as examples of the benefits gained by integrating the FlashSystem storage into business environments. Also described are product integration scenarios running the IBM FlashSystem 840 with the IBM SAN Volume Controller and the IBM Storwize® family of products, such as the V7000, V5000, and V3700, as well as considerations when integrating with the IBM FlashSystem 840. Preferred practice guidance is provided for your FlashSystem environment with IBM 16 Gbps b-type products and features, focusing on Fibre Channel design. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand and learn how to implement this exciting technology.

Beginning Oracle Database 12c Administration: From Novice to Professional, Second Edition

2015-07-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ignatius Fernandez

Linux Oracle RDBMS data-engineering oracle-database-solutions

Beginning Oracle Database 12c Administration is your entry point into a successful and satisfying career as an Oracle Database Administrator. The chapters of this book are logically organized into four parts closely tracking the way your database administration career will naturally evolve. Part 1 "Database Concepts" gives necessary background in relational database theory and Oracle Database concepts, Part 2 "Database Implementation" teaches how to implement an Oracle database correctly, Part 3 "Database Support" exposes you to the daily routine of a database administrator, and Part 4 "Database Tuning" introduces the fine art of performance tuning. Beginning Oracle Database 12c Administration provides information that you won't find in other books on Oracle Database. You'll discover not only technical information, but also guidance on work practices that are as vital to your success as are your technical skills. The author's favorite chapter is "The Big Picture and the Ten Deliverables." (It is the editor’s favorite chapter too!) If you take the lessons in that chapter to heart, you can quickly become a much better Oracle database administrator than you ever thought possible. You will grasp the key aspects of theory behind relational database management systems and learn how to: Install and configure an Oracle database, and ensure that it’s properly licensed; Execute common management tasks in a Linux environment; Defend against data loss by implementing sound backup and recovery practices; and Improve database and query performance.

talk-data.com
