talk-data.com

Topic: data · 5765 tagged activities

Activities

5765 activities · Newest first · Activity trend: peak of 3 per quarter, 2020-Q1 to 2026-Q2

Big Data, Open Data and Data Development

The world has become digital, and technological advances have multiplied the channels through which data are accessed, processed, and disseminated. These technologies have now reached a certain maturity: data are available to everyone, anywhere on the planet. In 2014 there were 2.9 billion Internet users, 41% of the world population. Making sense of this multitude of data demands knowledge; we must educate, inform, and train the public.

The development of related technologies, such as the Internet, social networks, and cloud computing (digital factories), has increased the volumes of data available. Each individual now creates, consumes, and uses digital information: more than 3.4 million e-mails are sent worldwide every second, or 107,000 billion annually (roughly 14,600 per person per year), although more than 70% are spam. Billions of pieces of content are shared on social networks such as Facebook, more than 2.46 million every minute. We spend more than 4.8 hours a day on the Internet from a computer and 2.1 hours from a mobile device.

Data is produced in real time, arriving as a continuous stream from a multitude of generally heterogeneous sources. This accumulation of data of all types (audio, video, files, photos, etc.) generates new activities whose aim is to analyze this enormous mass of information. It is then necessary to adapt and to try new approaches, methods, knowledge, and ways of working, which in turn create new properties and new challenges, since indexing and search (SEO) logic must be designed and implemented.

At the company level, this mass of data is difficult to manage, and interpreting it is the first challenge. It affects the people who must "manipulate" the data and requires specific infrastructure for creation, storage, processing, analysis, and retrieval. The biggest challenge lies in extracting value from data that arrives in such quantity, diversity, and speed.
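The e-mail figures quoted above are internally consistent; a quick back-of-envelope check (the world-population estimate of 7.3 billion is an assumption, not stated in the text):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600           # about 31.5 million seconds

emails_per_second = 3.4e6                    # figure quoted in the text
annual = emails_per_second * SECONDS_PER_YEAR
print(f"{annual / 1e12:.0f} trillion e-mails per year")   # ~107 trillion = 107,000 billion

world_population = 7.3e9                     # assumed mid-2010s estimate
per_person = annual / world_population
print(f"{per_person:,.0f} e-mails per person per year")   # ~14,600
```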

Exploratory Factor Analysis with SAS

Explore the mysteries of Exploratory Factor Analysis (EFA) using SAS, with an applied and user-friendly approach.

Exploratory Factor Analysis with SAS focuses solely on EFA, presenting a thorough and modern treatment of the available options in accessible language, targeted to the practicing statistician or researcher. The book provides real-world examples using real data, guidance for implementing best practices in SAS, interpretation of results for end users, and supplementary resources on the book's author page. Faculty teaching with this book can use these resources in their classes, and individual readers can learn at their own pace, reinforcing their comprehension as they go.

Exploratory Factor Analysis with SAS reviews each of the major steps in EFA: data cleaning, extraction, rotation, interpretation, and replication. The last step, replication, is discussed less frequently in the context of EFA but, as we show, the results are of considerable use. Finally, two other practices that are commonly applied in EFA, estimation of factor scores and higher-order factors, are reviewed. Best practices are highlighted throughout the chapters.
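The book works in SAS, but the extraction step can be illustrated in Python with a small sketch (simulated data, NumPy only; the eigenvalue-greater-than-one rule shown is Kaiser's criterion, one common heuristic for choosing the number of factors):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulate 200 observations of 6 items driven by 2 latent factors
latent = rng.normal(size=(200, 2))
loadings = np.array([[0.8, 0.0], [0.7, 0.1], [0.9, 0.0],
                     [0.0, 0.8], [0.1, 0.7], [0.0, 0.9]])
items = latent @ loadings.T + 0.4 * rng.normal(size=(200, 6))

# Extraction step: eigenvalues of the item correlation matrix;
# Kaiser's criterion retains factors with eigenvalue > 1
corr = np.corrcoef(items, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]      # descending order
n_factors = int((eigvals > 1.0).sum())
print(n_factors)                              # number of factors to retain
```

With this simulated structure, the criterion recovers the two latent factors; on real data, extraction would be followed by rotation and interpretation as the book describes.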

A rudimentary working knowledge of SAS is required but no familiarity with EFA or with the SAS routines that are related to EFA is assumed.

Using SAS University Edition? You can use the code and data sets provided with this book. This helpful link will get you started: http://support.sas.com/publishing/import_ue.data.html

Emerging Trends in Applications and Infrastructures for Computational Biology, Bioinformatics, and Systems Biology

Emerging Trends in Applications and Infrastructures for Computational Biology, Bioinformatics, and Systems Biology: Systems and Applications covers the latest trends in the field with special emphasis on their applications. The first part covers the major areas of computational biology: the development and application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques for the study of biological and behavioral systems. The second part covers bioinformatics, an interdisciplinary field concerned with methods for storing, retrieving, organizing, and analyzing biological data, and explores the software tools used to generate useful biological knowledge. The third part, on systems biology, explores how to obtain, integrate, and analyze complex datasets from multiple experimental sources using interdisciplinary tools and techniques. The final section focuses on big data: datasets so large and complex that they become difficult to process with conventional database management systems or traditional data processing applications.

Explores all the latest advances in this fast-developing field from an applied perspective · Provides the only coherent and comprehensive treatment of the subject available · Covers the algorithm development, software design, and database applications that have been developed to foster research

World-Class Warehousing and Material Handling, 2nd Edition

The classic guide to warehouse operations, now fully revised and updated with the latest strategies, best practices, and case studies. Under the influence of e-commerce, supply chain collaboration, globalization, and quick response, warehouses today are being asked to do more with less. The expectation now is that warehouses execute more, smaller transactions; handle and store more items; provide more product and service customization; process more returns; offer more value-added services; and receive and ship more international orders. Compounding the difficulty of meeting this increased demand is the fact that warehouses now have less time to process an order, less margin for error, and fewer skilled personnel. How can a warehouse not only stay afloat but thrive in today's marketplace? Efficiency and accuracy are the keys to success in warehousing. Despite today's just-in-time production mentality and efforts to eliminate warehouses and their inventory carrying costs, effective warehousing continues to play a critical bottom-line role for companies worldwide. World-Class Warehousing and Material Handling, 2nd Edition is the first widely published methodology for warehouse problem solving across all areas of the supply chain, providing an organized set of principles that can be used to streamline all types of warehousing operations. Readers will discover state-of-the-art tools, metrics, and methodologies for dramatically increasing the effectiveness, accuracy, and overall productivity of warehousing operations.
This comprehensive resource provides authoritative answers on such topics as: the seven principles of world-class warehousing · warehouse activity profiling · warehouse performance measures · warehouse automation and computerization · receiving, storage, and retrieval operations · picking, packing, and humanizing warehouse operations. Written by one of today's recognized logistics thought leaders, this fully updated resource presents timeless insights for planning and managing 21st-century warehouse operations.

About the Author
Dr. Ed Frazelle is President and CEO of Logistics Resources International and Executive Director of The RightChain Institute. He is also the founding director of The Logistics Institute at Georgia Tech, the world's largest center for supply chain research and professional education.

Systems Analysis and Synthesis

Systems Analysis and Synthesis: Bridging Computer Science and Information Technology presents several new graph-theoretical methods that relate system design to core computer science concepts, and enable correct systems to be synthesized from specifications. Based on material refined in the author's university courses, the book has immediate applicability for working system engineers or recent graduates who understand computer technology, but have the unfamiliar task of applying their knowledge to a real business problem. Starting with a comparison of synthesis and analysis, the book explains the fundamental building blocks of systems (atoms and events) and takes a graph-theoretical approach to database design to encourage a well-designed schema. The author explains how database systems work (useful both when working with a commercial database management system and when hand-crafting data structures) and how events control the way data flows through a system. Later chapters deal with system dynamics and modelling, rule-based systems, user psychology, and project management, to round out readers' ability to understand and solve business problems.

Bridges computer science theory with practical business problems to lead readers from requirements to a working system without error or backtracking · Explains use-definition analysis to derive process graphs and avoid large-scale designs that don't quite work · Demonstrates functional dependency graphs to allow databases to be designed without painful iteration · Includes chapters on system dynamics and modeling, rule-based systems, user psychology, and project management
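The functional dependency graphs mentioned above can be sketched in Python (the attribute names and dependencies below are hypothetical, and the book's own notation may differ); topologically ordering the graph puts determinants before the attributes they determine:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical functional dependencies: attribute -> attributes it determines
fds = {
    "order_id":    ["customer_id", "order_date"],
    "customer_id": ["customer_name", "region"],
    "region":      ["sales_rep"],
}

# TopologicalSorter expects node -> predecessors, so invert the FD edges
graph = {}
for determinant, dependents in fds.items():
    for attr in dependents:
        graph.setdefault(attr, set()).add(determinant)

order = list(TopologicalSorter(graph).static_order())
# Determinants come before the attributes they determine, suggesting a
# table per determinant without iterative normalization
print(order)
```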

The SAS Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques

The SAS Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques is intended for programmers of all skill levels. Learn how to link multiple reports, add graphics and logos, and manipulate table of contents values to help refine your programs, macrotize where possible, troubleshoot easily, and get great-looking reports every time. From beginner to advanced, the examples in this book will help you harness all the power and capability of PROC REPORT.

With dozens of useful examples, this book is unique in three ways. First, it describes the default behavior of table of contents nodes and labels and how to change the nodes from within PROC REPORT; that chapter also explains how to use PROC DOCUMENT in conjunction with PROC REPORT. Second, an entire chapter is dedicated to troubleshooting the errors, warnings, and notes produced by PROC REPORT, including explanations of what generated each log message and how to avoid it. Third, the book explains how to preprocess your data to get the best output from PROC REPORT, and it explores reports that require multiple steps to create. Whether you work in banking and finance, pharmaceuticals, the health and life sciences, or government, this handbook is sure to be your new favorite reporting reference.

Clinical Graphs Using SAS

SAS users in the Health and Life Sciences industry need to create complex graphs to analyze biostatistics data and clinical data, and they need to submit drugs for approval to the FDA. Graphs used in the HLS industry are complex in nature and require innovative usage of the graphics features. Clinical Graphs Using SAS® provides the knowledge, the code, and real-world examples that enable you to create common clinical graphs using SAS graphics tools, such as the Statistical Graphics procedures and the Graph Template Language.

This book describes detailed processes for creating many commonly used graphs in the Health and Life Sciences industry. For SAS® 9.3 and SAS® 9.4, it covers the many improvements in the graphics features supported by the Statistical Graphics procedures and the Graph Template Language, many of which are a direct result of the needs of the Health and Life Sciences community. With the features added in SAS® 9.4, these graphs become considerably easier to create.

Topics covered include the SGPLOT procedure, the SGPANEL procedure, and the Graph Template Language, used to create graphs such as forest plots, swimmer plots, and survival plots.
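As an illustration of the data behind a forest plot (not code from the book, which uses SAS; the hazard ratios and standard errors below are hypothetical), each row is a point estimate with a 95% confidence interval:

```python
import math

# Hypothetical subgroup results: (name, log hazard ratio, standard error)
subgroups = [("Arm A", -0.35, 0.12), ("Arm B", -0.10, 0.20), ("Overall", -0.28, 0.10)]

rows = []
for name, log_hr, se in subgroups:
    hr = math.exp(log_hr)
    lo = math.exp(log_hr - 1.96 * se)   # lower bound of the 95% CI
    hi = math.exp(log_hr + 1.96 * se)   # upper bound of the 95% CI
    rows.append((name, hr, lo, hi))
    print(f"{name:8s} HR={hr:.2f}  95% CI [{lo:.2f}, {hi:.2f}]")
```

A forest plot simply draws each of these rows as a marker with a horizontal whisker, with a reference line at HR = 1.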

Spark

Production-targeted Spark guidance with real-world use cases. Spark: Big Data Cluster Computing in Production goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big-data clustering in production. Written by an expert team well known in the big data community, this book walks you through the challenges in moving from proof-of-concept or demo Spark applications to live Spark in production. Real use cases provide deep insight into common problems, limitations, challenges, and opportunities, while expert tips and tricks help you get the most out of Spark performance. Coverage includes Spark SQL, Tachyon, Kerberos, MLlib, YARN, and Mesos, with clear, actionable guidance on resource scheduling, database connectors, streaming, security, and much more.

Spark has become the tool of choice for many big data problems, with more active contributors than any other Apache Software Foundation project. General introductory books abound, but this book is the first to provide deep insight and real-world advice on using Spark in production. Specific guidance, expert tips, and invaluable foresight make this guide an incredibly useful resource for real production settings: review Spark hardware requirements and estimate cluster size · gain insight from real-world production use cases · tighten security, schedule resources, and fine-tune performance · overcome common problems encountered using Spark in production.

Spark works with other big data tools, including MapReduce and Hadoop, and uses languages you already know, like Java, Scala, Python, and R. Lightning speed makes Spark too good to pass up, but understanding its limitations and challenges in advance goes a long way toward easing actual production implementation. Spark: Big Data Cluster Computing in Production tells you everything you need to know, with real-world production insight and expert guidance, tips, and tricks.
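Cluster sizing of the kind the book discusses is often a back-of-envelope calculation; here is a rough Python sketch (every figure is an illustrative assumption, not guidance from the book):

```python
import math

# Back-of-envelope cluster sizing; all figures are illustrative assumptions
raw_data_tb = 10            # input data per job, in TB
replication = 3             # HDFS-style replication factor
spare_factor = 1.3          # headroom for shuffle spill and growth

storage_needed_tb = raw_data_tb * replication * spare_factor
disk_per_node_tb = 8
min_nodes_storage = math.ceil(storage_needed_tb / disk_per_node_tb)

# Memory side: fraction of the working set you want cached in RAM
cached_fraction = 0.25
ram_per_node_gb = 256
usable_ram_fraction = 0.6   # leave room for the OS and framework overhead
min_nodes_memory = math.ceil(raw_data_tb * 1024 * cached_fraction
                             / (ram_per_node_gb * usable_ram_fraction))

nodes = max(min_nodes_storage, min_nodes_memory)
print(nodes)  # the memory requirement dominates with these figures
```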

The DS2 Procedure: SAS Programming Methods at Work

The issue facing most SAS programmers today is not that data space has become bigger ("Big Data"), but that our programming problem space has become bigger. Through the power of DS2, this book shows programmers how easily they can manage complex problems using modular coding techniques.

The DS2 Procedure: SAS Programming Methods at Work outlines the basic structure of a DS2 program and teaches you how each component can help you address problems. The DS2 programming language in SAS 9.4 simplifies and speeds data preparation with user-defined methods, methods and attributes stored in shareable packages, and threaded execution on multicore symmetric multiprocessing (SMP) and massively parallel processing (MPP) machines. This book is intended for all Base SAS programmers looking to learn about DS2; readers need only an introductory level of SAS to get started. Topics covered include introductions to object-oriented programming methods, DATA step programs, user-defined methods, predefined packages, and threaded processing.
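DS2 itself is a SAS language, but its threaded model, in which each thread runs the same program over its own slice of the data and the partial results are combined, can be loosely mimicked in Python (an analogy only, not DS2 code):

```python
from concurrent.futures import ThreadPoolExecutor

rows = list(range(1, 101))                  # stand-in for a data set

def process_partition(partition):
    # Each "thread program" reduces its own slice of the data
    return sum(r * r for r in partition)

# Split the rows into 4 partitions, one per worker thread
partitions = [rows[i::4] for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(process_partition, partitions))

total = sum(partials)
print(total)  # same answer as a single-threaded sum of squares
```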

Getting Started with RethinkDB

Dive into the world of NoSQL databases with RethinkDB, a modern and powerful document-oriented database designed for developing real-time applications. Through this book, you'll explore essential RethinkDB features and learn how to integrate it seamlessly with Node.js, enabling you to build and deploy responsive web apps.

What this book will help me do: master the basics of installing and configuring RethinkDB on your system · learn how to use the intuitive ReQL language to perform complex queries · set up and manage RethinkDB clusters by mastering sharding and replication · optimize database performance using indexing and advanced query techniques · develop interactive real-time applications by integrating RethinkDB with Node.js.

About the Author
Tiepolo is an experienced developer and educator specializing in real-time database technologies. With extensive expertise in NoSQL solutions and hands-on experience in software engineering, he combines a teacher's clarity with a programmer's practicality to make complex topics accessible.

Who is it for? This book is tailored for developers eager to grasp RethinkDB, particularly those interested in building real-time applications. If you are new to database programming, you'll find accessible guidance here; developers with basic experience in JavaScript or Node.js will gain further insight into real-world applications of those skills.

Data and Electric Power

Traditional engineering is built upon a world of knowledge and scientific laws, with components and systems that operate predictably. But what happens when a large number of these devices are interconnected? You get a complex system that's no longer deterministic, but probabilistic. That's happening today in many industries, including manufacturing, petroleum, transportation, and energy. In this O'Reilly report, Sean Patrick Murphy, Chief Data Scientist at PingThings, describes how data science is helping electric utilities make sense of a stochastic world filled with increasing uncertainty, including fundamental changes to the energy market and random phenomena such as weather and solar activity. Murphy also reviews several cutting-edge tools for storing and processing big data that he's used in his work with electric utilities, tools that can help traditional engineers pursue a data-driven approach in many industries.

Topics in this report include: key drivers that have changed the electric grid from a deterministic machine into a probabilistic system · fundamental differences that put traditional engineering and data science at odds with one another · why the time is right for engineering organizations to adopt a complete data-driven approach · contemporary tools that traditional engineers can use to store and process big data · a PingThings case study on dealing with random geomagnetic disturbances to the energy grid
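The report's central premise, that many individually predictable components add up to a probabilistic system, is easy to demonstrate with a small Monte Carlo sketch (device counts and wattages are hypothetical):

```python
import random

random.seed(42)

# Each device is deterministic given its state, but states arrive randomly:
# model 10,000 interconnected loads, each drawing 60 W when on (p = 0.3)
def aggregate_demand(n_devices=10_000, p_on=0.3, watts=60):
    on = sum(1 for _ in range(n_devices) if random.random() < p_on)
    return on * watts

samples = [aggregate_demand() for _ in range(200)]
mean_kw = sum(samples) / len(samples) / 1000
print(f"mean demand ~ {mean_kw:.0f} kW")   # fluctuates around 180 kW
```

No single device is random once you know its state, yet the aggregate can only be described by a distribution, which is exactly the shift from deterministic to probabilistic thinking the report describes.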

Finding Profit in Your Organization's Data

Using log data to create value isn’t new to mechanized industries. But in today’s data-driven environment—particularly with the rise of the Internet of Things—this type of data exhaust can be converted from inactive, latent assets to critical-path components of an overall production ecosystem. In this report, Cameron Turner provides three real-world case studies in which his company, The Data Guild, served as a product co-development consultancy. You’ll learn how an energy efficiency firm, a tech company, and a healthcare organization combined their historical logs with newly generated sensor data from the IoT. By leveraging machine learning to proactively identify efficiency and opportunity through prediction and recommendation, each company was able to deploy an ROI-generating solution and gain a significant business advantage. This report also provides advice for successfully implementing IoT data, as well as key factors to consider when performing data analysis.

Going Pro in Data Science

Digging for answers to your pressing business questions probably won't resemble those tidy case studies that lead you step by step from data collection to cool insights. Data science is not so clear-cut in the real world. Instead of high-quality data with the right velocity, variety, and volume, many data scientists have to work with missing or sketchy information extracted from people in the organization. In this O'Reilly report, Jerry Overton, Distinguished Engineer at global IT leader DXC, introduces practices for making good decisions in a messy and complicated world. What he simply calls "data science that works" is a trial-and-error process of creating and testing hypotheses, gathering evidence, and drawing conclusions. These skills are far more useful for practicing data scientists than, say, mastering the details of a machine-learning algorithm. Adapted and expanded from a series of articles Overton published on O'Reilly Radar and on the CSC Blog, each chapter is ideal for current and aspiring data scientists who want to go pro, as well as IT execs and managers looking to hire in this field.

The report covers: using the scientific method to gain a competitive advantage · the skill set to look for when choosing a data scientist · why practical induction is a key part of thinking like a data scientist · best practices for writing solid code in your data science gig · how agile experimentation lets you find answers (or dead ends) much faster · advice for surviving (and even thriving) as a data scientist in your organization
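The hypothesis-testing loop Overton describes can be as simple as a permutation test; here is a stdlib-only sketch with made-up measurements (not an example from the report):

```python
import random
import statistics

random.seed(0)

control = [12.1, 11.8, 12.4, 11.9, 12.0, 12.2]
variant = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]
observed = statistics.mean(variant) - statistics.mean(control)

# Permutation test: if the labels don't matter, shuffling them should
# produce differences as large as the observed one fairly often
pooled = control + variant
count = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[6:]) - statistics.mean(pooled[:6])
    if diff >= observed:
        count += 1
p_value = count / trials
print(f"observed diff {observed:.2f}, p ~ {p_value:.4f}")
```

A small p-value is evidence against "the labels don't matter"; the trial-and-error loop is then to refine the hypothesis and gather more evidence.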

Hadoop: What You Need to Know

Hadoop has revolutionized data processing and enterprise data warehousing, but its explosive growth has come with a large amount of uncertainty, hype, and confusion. With this report, enterprise decision makers will receive a concise crash course on what Hadoop is and why it’s important. Hadoop represents a major shift from traditional enterprise data warehousing and data analytics, and its technology can be daunting at first. Donald Miner, founder of the data science firm Miner & Kasch, covers just enough ground so you can make intelligent decisions about Hadoop in your enterprise. By the end of this report, you’ll know the basics of technologies such as HDFS, MapReduce, and YARN, without becoming mired in the details. Not only will you learn the basics of how Hadoop works and why it’s such an important technology, you’ll get examples of how you should probably be using it.
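The MapReduce model the report introduces can be shown in miniature with plain Python (this is the programming model only, not Hadoop itself):

```python
from collections import defaultdict

lines = ["big data is big", "hadoop processes big data"]

# Map phase: emit (word, 1) pairs from each input line
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: group the pairs by key
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: combine each key's values
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts["big"])  # → 3
```

Hadoop runs the same three phases, but distributes the map and reduce work across a cluster and stores the inputs and outputs in HDFS.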

Self-Service Analytics

Organizations today are swimming in data, but most of them manage to analyze only a fraction of what they collect. To help build a stronger data-driven culture, many organizations are adopting a new approach called self-service analytics. This O’Reilly report examines how this approach provides data access to more people across a company, allowing business users to work with data themselves and create their own customized analyses. The result? More eyes looking at more data in more ways. Along with the perceived benefits, author Sandra Swanson also delves into the potential pitfalls of self-service analytics: balancing greater data access with concerns about security, data governance, and siloed data stores. Read this report and gain insights from enterprise tech (Yahoo), government (the City of Chicago), and disruptive retail (Warby Parker and Talend). Learn how these organizations are handling self-service analytics in practice. Sandra Swanson is a Chicago-based writer who’s covered technology, science, and business for dozens of publications, including ScientificAmerican.com. Connect with her on Twitter (@saswanson) or at www.saswanson.com.

Ten Signs of Data Science Maturity

How well prepared is your organization to innovate using data science? In this report, two leading data scientists at the consulting firm Booz Allen Hamilton describe ten characteristics of a mature data science capability. After spending years helping clients such as the US government and commercial organizations worldwide build innovative data science capabilities, Peter Guerra and Dr. Kirk Borne identified these characteristics to help you measure your company's competence in this area.

This report provides a detailed discussion of each of the ten signs of data science maturity, which, among many other things, encourage you to: give members of your organization access to all your available data · use Agile and leverage "DataOps" (DevOps for data product development) · help your data science team sharpen its skills through open or internal competitions · personify data science as a way of doing things, not a thing to do

Hiding Behind the Keyboard

Hiding Behind the Keyboard: Uncovering Covert Communication Methods with Forensic Analysis exposes the latest electronic covert communication techniques used by cybercriminals, along with the investigative methods needed to identify them. The book shows how to use the Internet for legitimate covert communication, while giving investigators the information they need for detecting cybercriminals who attempt to hide their true identity. Intended for practitioners and investigators, the book offers concrete examples of how to communicate securely, serving as an ideal reference both for those who truly need protection and for those who investigate cybercriminals.

Covers high-level strategies, what they can achieve, and how to implement them · Shows discovery and mitigation methods using examples, court cases, and more · Explores how social media sites and gaming technologies can be used for illicit communications activities · Explores currently used technologies, such as Tails and Tor, that help users stay anonymous online

Data Simplification

Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools. This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data. Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools needed to achieve data simplification: open source tools, free utilities, and snippets of code that can be reused and repurposed; natural language processing and machine translation as tools to simplify data; and data summarization and visualization and the role they play in making data useful for the end user.

Discusses data simplification principles, methods, and tools that must be studied and mastered · Provides open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data · Explains how to best utilize indexes to search, retrieve, and analyze textual data · Shows the data scientist how to apply ontologies, classifications, classes, properties, and instances to data using tried-and-true methods
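Berman's point about using indexes to search and retrieve textual data can be illustrated with a minimal inverted index in Python (a generic sketch, not code from the book):

```python
from collections import defaultdict

docs = {
    1: "simplify complex data before analysis",
    2: "open source tools simplify data",
    3: "visualization makes data useful",
}

# Inverted index: term -> set of document ids containing it
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(*terms):
    """Return ids of documents containing every query term."""
    sets = [index[t] for t in terms]
    return sorted(set.intersection(*sets)) if sets else []

print(search("simplify", "data"))  # → [1, 2]
```

Once built, the index answers queries without rescanning the text, which is what makes large text collections searchable in practice.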

Learning QGIS, Third Edition

Learning QGIS, Third Edition, serves as a comprehensive guide for GIS users looking to enhance their skills using the QGIS platform. By following the structured, step-by-step instructions, you'll master data visualization, manipulation, and advanced mapping techniques. The book emphasizes practical knowledge, enabling you to efficiently handle both data processing and cartographic output.

What this book will help me do: install and effectively navigate the QGIS software interface · load, visualize, and manage vector and raster spatial data from various sources · create, edit, and analyze spatial datasets with precision using QGIS tools · perform and automate complex geoprocessing tasks using the Processing toolbox · configure advanced cartographic outputs, including printable maps tailored to your needs.

About the Author
Anita Graser, a notable GIS expert, brings her extensive experience in open-source geospatial technologies to this book. She is a core developer of QGIS and regularly publishes content on GIS applications and spatial analysis. Anita excels at presenting complex concepts in a user-friendly manner, making advanced GIS techniques accessible to learners of diverse backgrounds.

Who is it for? This book is tailored for GIS professionals, consultants, and developers looking to expand their expertise in QGIS. Whether you're familiar with GIS principles or are an experienced user of other platforms, this book helps bridge the gap to using QGIS effectively. If you're aiming to enhance your mapping and geospatial analysis capabilities, this guide is well suited to your ambitions.

Gnuplot in Action, Second Edition

Gnuplot in Action, Second Edition is a major revision of this popular and authoritative guide for developers, engineers, and scientists who want to learn and use gnuplot effectively. Fully updated for gnuplot version 5, the book includes four pages of color illustrations and four bonus appendixes available in the eBook.

About the Technology
Gnuplot is an open-source graphics program that helps you analyze, interpret, and present numerical data. Available for Unix, Mac, and Windows, it is well-maintained, mature, and totally free.

About the Book
Gnuplot in Action, Second Edition is a major revision of this authoritative guide for developers, engineers, and scientists. The book starts with a tutorial introduction, followed by a systematic overview of gnuplot's core features and full coverage of gnuplot's advanced capabilities. Experienced readers will appreciate the discussion of gnuplot 5's features, including new plot types, improved text and color handling, and support for interactive, web-based display formats. The book concludes with chapters on graphical effects and general techniques for understanding data with graphs. It includes four pages of color illustrations. 3D graphics, false-color plots, heatmaps, and multivariate visualizations are covered in chapter-length appendixes available in the eBook.

What's Inside
Creating different types of graphs in detail · Animations, scripting, batch operations · Extensive discussion of terminals · Updated to cover gnuplot version 5

About the Reader
No prior experience with gnuplot is required. This book concentrates on practical applications of gnuplot relevant to users of all levels.

About the Author
Philipp K. Janert, PhD, is a programmer and scientist. He is the author of several books on data analysis and applied math and has been a gnuplot power user and developer for over 20 years.

Quotes
The highly anticipated, updated version of my go-to-for-everything book on gnuplot. - Ryan Balfanz, Shift Medical, Inc.
The essential guide for newcomers and the definitive handbook for advanced users. - Zoltán Vörös, University of Innsbruck

Learn how to use gnuplot to convert meaningful data into attention-grabbing visualizations that communicate your message quickly and accurately. - David Kerns, Rincon Research Corporation

An accessible guide to gnuplot and best practices of everyday data visualization. - Wesley R. Elsberry, PhD, RealPage, Inc.
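Since gnuplot is driven by scripts, a common workflow is to generate those scripts from another language; a minimal Python sketch (the file names are arbitrary, and the gnuplot options shown are standard but deliberately minimal):

```python
# Sketch: generate a minimal gnuplot script from Python. Gnuplot itself must
# be installed separately to render the result; the data here is made up.
points = [(0, 0.0), (1, 0.8), (2, 0.9), (3, 0.1)]

with open("signal.dat", "w") as f:
    f.writelines(f"{x} {y}\n" for x, y in points)

script = "\n".join([
    'set terminal pngcairo size 640,480',
    'set output "signal.png"',
    'set xlabel "time"',
    'set ylabel "amplitude"',
    'plot "signal.dat" using 1:2 with linespoints title "signal"',
])
with open("plot.gp", "w") as f:
    f.write(script)
print("run: gnuplot plot.gp")
```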