data

Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data

2015-01-27 · O'Reilly Data Science Books O'Reilly Amazon

book

by EMC Education Services

Analytics Big Data Data Analytics Data Science data-science

Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities, methods, and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software. This book will help you: become a contributor on a data science team; deploy a structured lifecycle approach to data analytics problems; apply appropriate analytic techniques and tools to analyzing big data; learn how to tell a compelling story with data to drive business action; and prepare for EMC Proven Professional Data Science Certification. Corresponding data sets are available at www.wiley.com/go/9781118876138. Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!

Graph Analysis and Visualization: Discovering Business Opportunity in Linked Data

2015-01-27 · O'Reilly Data Visualization Books O'Reilly Amazon

book

by Richard Brath , David Jonker

Big Data Marketing Python Cyber Security data-science data-science-tasks graph-analytics

Wring more out of the data with a scientific approach to analysis. Graph Analysis and Visualization brings graph theory out of the lab and into the real world. Using sophisticated methods and tools that span analysis functions, this guide shows you how to exploit graph and network analytic techniques to enable the discovery of new business insights and opportunities. Published in full color, the book describes the process of creating powerful visualizations using a rich and engaging set of examples from sports, finance, marketing, security, social media, and more. You will find practical guidance toward pattern identification and using various data sources, including Big Data, plus clear instruction on the use of software and programming. The companion website offers data sets, full code examples in Python, and links to all the tools covered in the book. Science has already reaped the benefit of network and graph theory, which has powered breakthroughs in physics, economics, genetics, and more. This book brings those proven techniques into the world of business, finance, strategy, and design, helping extract more information from data and better communicate the results to decision-makers. Study graphical examples of networks using clear and insightful visualizations; analyze specifically curated, easy-to-use data sets from various industries; learn the software tools and programming languages that extract insights from data; and follow code examples in the popular Python programming language. There is a tremendous body of scientific work on network and graph theory, but very little of it directly applies to analyst functions outside of the core sciences - until now. Written for those seeking empirically based, systematic analysis methods and powerful tools that apply outside the lab, Graph Analysis and Visualization is a thorough, authoritative resource.

Implementing High Availability and Disaster Recovery in IBM PureApplication Systems V2

2015-01-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rajeev Gandhi , Addison Goering , Margaret Ticknor , Venkata Gadepalli , Stanley Shieh , Bertrand Portier , Sung-Ik Son , Hendrik Van Run

IBM data-engineering

This IBM Redbooks publication describes and demonstrates common, prescriptive scenarios for setting up disaster recovery for common workloads using IBM WebSphere Application Server, IBM DB2, and WebSphere MQ between two IBM PureApplication System racks using the features in PureApplication System V2. The intended audience for this book is pattern developers and operations team members who are setting up production systems using software patterns from IBM that must be highly available or able to recover from a disaster (defined as the complete loss of a data center).

Solr Cookbook - Third Edition

2015-01-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rafal Kuc

data-engineering search solr

Master Apache Solr with the comprehensive 'Solr Cookbook - Third Edition', which introduces over 100 practical recipes to help you exploit the full potential of Apache Solr versions 4.x to 5. By following this book, you'll gain actionable insights and solutions to solve real-world problems effectively with Solr. What this book will help you do: Effectively index data from various sources and formats into Solr for optimized searches. Utilize and configure faceting to enhance aggregated data insights. Implement and configure SolrCloud for scalable and robust search infrastructures. Identify and resolve performance bottlenecks in Solr and Solr clusters. Develop and deploy advanced query features like autocomplete and document highlighting. Author(s): Rafal Kuc is a seasoned software architect with years of experience working with Apache Solr in production environments. He specializes in search technologies, distributed systems, and empowering developers with actionable knowledge. Rafal approaches writing with a practical mindset, focusing on how to solve real-world challenges efficiently. Who is it for? This book is ideal for intermediate Solr developers, system architects, or IT professionals responsible for search systems. It assumes a basic familiarity with Solr but provides deep dives into advanced functionalities and configurations. Readers looking to enhance their understanding of Solr 4.x and 5.x capabilities will find this book valuable. Whether you're improving search performance or exploring new Solr features, this book guides you step-by-step.

Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining

2015-01-20 · O'Reilly Data Science Books O'Reilly Amazon

book

by Dominic Nyhuis , Simon Munzert , Christian Rubba , Peter Meissner

Data Collection HTML JSON SQL XML data-science data-science-tasks web-scraping

A hands-on guide to web scraping and text mining for both beginners and experienced users of R. Introduces fundamental concepts of the main architecture of the web and databases and covers HTTP, HTML, XML, JSON, and SQL. Provides basic techniques to query web documents and data sets (XPath and regular expressions). An extensive set of exercises is presented to guide the reader through each technique. Explores both supervised and unsupervised techniques as well as advanced techniques such as data scraping and text management. Case studies are featured throughout along with examples for each technique presented. R code and solutions to exercises featured in the book are provided on a supporting website.

Getting Started with IBM InfoSphere Optim Workload Replay for DB2

2015-01-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Leif Pedersen , Hassi Norlen , Whei-Jen Chen , John Vonau , Tom Toomire , Patrick Titzler , Nisanti Mohanraj

IBM Linux SQL Unix data-engineering ibm-db2 relational-databases

This IBM® Redbooks® publication will help you install, configure, and use IBM InfoSphere® Optim™ Workload Replay (InfoSphere Workload Replay), a web-based tool that lets you capture real production SQL workload data and then replay the workload data in a pre-production environment. With InfoSphere Workload Replay, you can set up and run realistic tests for enterprise database changes without the need to create a complex client and application infrastructure to mimic your production environment. The publication goes through the steps to install and configure the InfoSphere Workload Replay appliance and related database components for IBM DB2® for Linux, UNIX, and Windows and for DB2 for IBM z/OS®. The capture, replay, and reporting process, including user ID and roles management, is described in detail to quickly get you up and running. Ongoing operations, such as appliance health monitoring, starting and stopping the product, and backup and restore in your day-to-day management of the product, extensive troubleshooting information, and information about how to integrate InfoSphere Workload Replay with other InfoSphere products are covered in separate chapters.

Implementing the IBM Storwize V7000 Gen2

2015-01-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Nancy Kinney , Lev Sturmer , Jon Tate , Morten Dannemand , Massimo Rosati

Big Data IBM data-engineering

Data is the new currency of business, the most critical asset of the modern organization. In fact, enterprises that can gain business insights from their data are twice as likely to outperform their competitors. Nevertheless, 72% of them have not started, or are only planning, big data activities. In addition, organizations often spend too much money and time managing where their data is stored. The average firm purchases 24% more storage every year, but uses less than half of the capacity that it already has. The IBM® Storwize® family, including the IBM SAN Volume Controller Data Platform, is a storage virtualization system that enables a single point of control for storage resources. This functionality helps support improved business application availability and greater resource use. The following list describes the business objectives of this system: To manage storage resources in your information technology (IT) infrastructure To make sure that those resources are used to the advantage of your business To do it quickly, efficiently, and in real time, while avoiding increases in administrative costs Storwize functions benefit all virtualized storage. For example, IBM Easy Tier® optimizes use of flash memory. In addition, IBM Real-time Compression™ enhances efficiency even further by enabling the storage of up to five times as much active primary data in the same physical disk space. Finally, high-performance thin provisioning helps automate provisioning. These benefits can help extend the useful life of existing storage assets, reducing costs. Integrating these functions into Storwize also means that they are designed to operate smoothly together, reducing management effort. This IBM Redbooks® publication provides information about the latest features and functions of the Storwize V7000 Gen2 and software version 7.3 implementation, architectural improvements, and Easy Tier.

Data Driven

2015-01-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Hilary Mason (Hidden Door) , DJ Patil (GreatPoint Ventures)

Big Data Hadoop data-engineering

Succeeding with data isn’t just a matter of putting Hadoop in your machine room, or hiring some physicists with crazy math skills. It requires you to develop a data culture that involves people throughout the organization. In this O’Reilly report, DJ Patil and Hilary Mason outline the steps you need to take if your company is to be truly data-driven—including the questions you should ask and the methods you should adopt. You’ll not only learn examples of how Google, LinkedIn, and Facebook use their data, but also how Walmart, UPS, and other organizations took advantage of this resource long before the advent of Big Data. No matter how you approach it, building a data culture is the key to success in the 21st century. You’ll explore: Data scientist skills—and why every company needs a Spock How the benefits of giving company-wide access to data outweigh the costs Why data-driven organizations use the scientific method to explore and solve data problems Key questions to help you develop a research-specific process for tackling important issues What to consider when assembling your data team Developing processes to keep your data team (and company) engaged Choosing technologies that are powerful, support teamwork, and easy to use and learn

Data Privacy for the Smart Grid

2015-01-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Christine Hertzog , Rebecca Herold

Cyber Security data-engineering data-security-privacy data security & privacy

Privacy for the Smart Grid provides easy-to-understand guidance on data privacy issues and the implications for creating privacy risk management programs, along with privacy policies and practices required to ensure Smart Grid privacy. It addresses privacy in electric, natural gas, and water grids from two different perspectives of the topic, one from a Smart Grid expert and another from a privacy and information security expert. While considering privacy in the Smart Grid, the book also examines the data created by Smart Grid technologies and machine-to-machine applications.

Business Applications of Multiple Regression, Second Edition

2015-01-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ronny Richardson

Microsoft data-science data-science-tasks regression-analysis statistics

This second edition of Business Applications of Multiple Regression describes the use of the statistical procedure called multiple regression in business situations, including forecasting and understanding the relationships between variables. The book assumes a basic understanding of statistics but reviews correlation analysis and simple regression to prepare the reader to understand and use multiple regression. The techniques described in the book are illustrated using both Microsoft Excel and a professional statistical program. Along the way, several real-world data sets are analyzed in detail to better prepare the reader for working with actual data in a business environment. This book will be a useful guide to managers at all levels who need to understand and make decisions based on data analysis performed using multiple regression. It also provides the beginning analyst with the detailed understanding required to use multiple regression to analyze data sets.

Digital Privacy in the Marketplace

2015-01-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by George Milne

data-engineering data-security-privacy data security & privacy

Digital Privacy in the Marketplace focuses on the data exchanges between marketers and consumers, with special attention to the privacy challenges that are brought about by new information technologies. The purpose of this book is to provide a background source to help the reader think more deeply about the impact of privacy issues on both consumers and marketers. It covers topics such as why privacy is needed; the technological, historical, and academic theories of privacy; how market exchange affects privacy; what privacy harms and protections are available; and what the likely future of privacy is.

Key Management Models, 3rd Edition

2015-01-14 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Gerben Van den Berg , Paul Pietersma

data-engineering data-models

This best selling management book is a true classic. If you want to be a model manager, keep this new, even better 3rd edition close at hand. Key Management Models has the winning combination of brevity and clarity, giving you short, practical overviews of the top classic and cutting edge management models in an easy-to-use, ready reference format. Whether you want to remind yourself about models you’ve already come across, or want to find new ones, you’ll find yourself referring back to it again and again. It's the essential guide to all the management models you’ll ever need to know about. Includes the classic and essential management models from the previous editions. Thoroughly updated to include cutting edge new models. Two-colour illustrations and case studies throughout.

Getting a Big Data Job For Dummies

2015-01-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jason Williamson

Big Data data-engineering

Hone your analytic talents and become part of the next big thing Getting a Big Data Job For Dummies is the ultimate guide to landing a position in one of the fastest-growing fields in the modern economy. Learn exactly what "big data" means, why it's so important across all industries, and how you can obtain one of the most sought-after skill sets of the decade. This book walks you through the process of identifying your ideal big data job, shaping the perfect resume, and nailing the interview, all in one easy-to-read guide. Companies from all industries, including finance, technology, medicine, and defense, are harnessing massive amounts of data to reap a competitive advantage. The demand for big data professionals is growing every year, and experts forecast an estimated 1.9 million additional U.S. jobs in big data by 2015. Whether your niche is developing the technology, handling the data, or analyzing the results, turning your attention to a career in big data can lead to a more secure, more lucrative career path. Getting a Big Data Job For Dummies provides an overview of the big data career arc, and then shows you how to get your foot in the door with topics like: The education you need to succeed The range of big data career path options An overview of major big data employers A plan to develop your job-landing strategy Your analytic inclinations may be your ticket to long-lasting success. In a highly competitive job market, developing your data skills can create a situation where you pick your employer rather than the other way around. If you're ready to get in on the ground floor of the next big thing, Getting a Big Data Job For Dummies will teach you everything you need to know to get started today.

Oracle Database 12c Security

2015-01-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Scott Gaetjen , William Maroulis , David Knox

Cloud Computing Oracle Cyber Security data-engineering oracle-database-solutions

Best Practices for Comprehensive Oracle Database Security. Written by renowned experts from Oracle's National Security Group, Oracle Database 12c Security provides proven techniques for designing, implementing, and certifying secure Oracle Database systems in a multitenant architecture. The strategies are also applicable to standalone databases. This Oracle Press guide addresses everything from infrastructure to audit lifecycle and describes how to apply security measures in a holistic manner. The latest security features of Oracle Database 12c are explored in detail with practical and easy-to-understand examples. Connect users to databases in a secure manner; manage identity, authentication, and access control; implement database application security; provide security policies across enterprise applications using Real Application Security; control data access with Oracle Virtual Private Database; control sensitive data using data redaction and transparent sensitive data protection; control data access with Oracle Label Security; use Oracle Database Vault and Transparent Data Encryption for compliance, cybersecurity, and insider threats; implement auditing technologies, including Unified Audit Trail; and manage security policies and monitor a secure database environment with Oracle Enterprise Manager Cloud Control.

IBM TS4500 Tape Library Guide

2015-01-08 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Larry Coyne , Michael Engelbrecht

IBM data-engineering

The IBM® TS4500 tape library is a next-generation tape solution that offers higher storage density and integrated management. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth require, with the cost-effectiveness and the manageability to grow with business data needs, while preserving existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot, because the TS4500 can store up to 5.5 PBs of data in a single 10 square foot library frame, which is up to 3.4 times more capacity than the IBM TS3500 tape library. This guide describes TS4500 components, feature codes, specifications, supported tape drives, encryption, the new integrated management console, and the command-line interface (CLI) and provides instructions for several specific tasks. It is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Visio Services Quick Guide: Using Visio with Sharepoint 2013 and Office 365

2015-01-07 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sahil Malik , Srini Sistla

API JavaScript SQL data-science data-science-tasks data-visualization microsoft-visio

In this fast-paced 100-page guide, you’ll learn to load, display and interact with dynamic, data-powered Visio diagrams in SharePoint 2013 or Office 365. Visio Services Quick Guide gives you the tools to build anything from a simple project workflow to an organizational infrastructure diagram, powered by real data from SharePoint or SQL Server. Colleagues can load your diagrams entirely in the browser, meaning that a single Visio client installation is enough to get started. Readers with JavaScript experience will also find out how to get additional control over Visio diagrams using the JavaScript mashup API, and how to build a custom data provider. The final chapter covers some useful information on administering Visio Services. Get started bringing your Visio diagrams to life with the Visio Services Quick Guide.

PHP and MySQL Web Development: A Beginner’s Guide

2015-01-05 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marty Matthews

HTML JavaScript MySQL SQL data-engineering relational-databases

Essential Skills—Made Easy! PHP and MySQL Web Development: A Beginner's Guide takes you from building static web pages to creating comprehensive database-driven web applications. The book reviews HTML, CSS, and JavaScript and then explores PHP--its structure, control statements, arrays, functions, use with forms, and file handling capabilities. Next, the book examines MySQL, including SQL, the MySQL command set, and how to use it with PHP to create a relational database and build secure, database-driven web applications. This practical resource features complete, step-by-step examples with code that you can use as templates for your own projects. Designed for Easy Learning: Key Skills & Concepts--Chapter-opening lists of specific skills covered in the chapter; Try This--Hands-on exercises that show you how to apply your skills; Notes--Extra information related to the topic being covered; Tips--Helpful reminders or alternate ways of doing things; Cautions--Errors and pitfalls to avoid; Self Tests--End-of-chapter quizzes to reinforce your skills; Annotated Syntax--Example code with commentary that describes the programming techniques being illustrated. Ready-to-use code is available at www.mhprofessional.com.

Practical Neo4j

2015-01-05 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Gregory Jordan

Big Data Data Modelling Java Neo4j NoSQL Python data-engineering graph-databases

Why have developers at places like Facebook and Twitter increasingly turned to graph databases to manage their highly connected big data? The short answer is that graphs offer superior speed and flexibility to get the job done. It’s time you added skills in graph databases to your toolkit. In Practical Neo4j, database expert Greg Jordan guides you through the background and basics of graph databases and gets you quickly up and running with Neo4j, the most prominent graph database on the market today. Jordan walks you through the data modeling stages for projects such as social networks, recommendation engines, and geo-based applications. The book also dives into the configuration steps as well as the language options used to create your Neo4j-backed applications. Neo4j runs some of the largest connected datasets in the world, and developing with it offers you a fast, proven NoSQL database option. Besides those working for social media, database, and networking companies of all sizes, academics and researchers will find Neo4j a powerful research tool that can help connect large sets of diverse data and provide insights that would otherwise remain hidden. Using Practical Neo4j, you will learn how to harness that power and create elegant solutions that address complex data problems. This book explains the basics of graph databases; demonstrates how to configure and maintain Neo4j; shows how to import data into Neo4j from a variety of sources; and provides a working example of a Neo4j-based application using an array of language options including Java, .Net, PHP, Python, Spring, and Ruby. As you’ll discover, Neo4j offers a blend of simplicity and speed while allowing data relationships to maintain first-class status. That’s one reason among many that such a wide range of industries and fields have turned to graph databases to analyze deep, dense relationships. After reading this book, you’ll have a potent, elegant tool you can use to develop projects profitably and improve your career options.

Running Applications on Oracle Exadata

2015-01-05 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Joyjeet Banerjee

Oracle data-engineering oracle-database-solutions

Maximize Application Performance on Oracle Exadata. Written by an enterprise architect specializing in applications on Oracle's engineered systems, Running Applications on Oracle Exadata: Tuning Tips & Techniques reveals proven methods for configuring and tuning Oracle Exadata to achieve peak results from applications. You'll get complete details on application migration, consolidation, and administration. Deliver unparalleled enterprise application performance on Oracle Exadata using the best practices provided in this Oracle Press guide. Understand Oracle Exadata architecture, hardware components, and software features; achieve peak performance from online transaction processing (OLTP) systems; size Oracle Exadata for applications using comparative and predictive methods; migrate and consolidate applications to Oracle Exadata; monitor, manage, and administer all Oracle Exadata components to ensure high availability and performance; develop and implement a backup and recovery strategy; and learn best practices for running applications on Oracle Exadata. Code examples in the book are available for download at OraclePressBooks.com.

Beginning SQL Server for Developers, Fourth Edition

2014-12-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Robin Dewson

BI Microsoft SQL data-engineering

Beginning SQL Server for Developers is the perfect book for developers new to SQL Server and planning to create and deploy applications against Microsoft’s market-leading database system for the Windows platform. Now in its fourth edition, the book is enhanced to cover the very latest developments in SQL Server, including the in-memory features that are introduced in SQL Server 2014. Within the book, there are plenty of examples of tasks that developers routinely perform. You’ll learn to create tables and indexes, and be introduced to best practices for securing your valuable data. You’ll learn design tradeoffs and find out how to make sound decisions resulting in scalable databases and maintainable code. SQL Server 2014 introduces in-memory tables and stored procedures. It's now possible to accelerate applications by creating tables (and their indexes) that reside entirely in memory, and never on disk. These new, in-memory structures differ from caching mechanisms of the past, and make possible the extraordinarily swift execution of certain types of queries such as are used in business intelligence applications. Beginning SQL Server for Developers helps you realize the promises of this new feature set while avoiding pitfalls that can occur when mixing in-memory tables and code with traditional, disk-based tables and code. Beginning SQL Server for Developers takes you through the entire database development process, from installing the software to creating a database to writing the code to connect to that database and move data in and out. By the end of the book, you’ll be able to design and create solid and reliable database solutions using SQL Server. Takes you through the entire database application development lifecycle Includes brand new coverage of the in-memory features Introduces the freely-available Express Edition

talk-data.com
