talk-data.com

Topic: data (5765 tagged)

Activities

5765 activities · Newest first

Advanced Object-Oriented Programming in R: Statistical Programming for Data Science, Analysis and Finance

Learn how to write object-oriented programs in R and how to construct classes and class hierarchies in the three object-oriented systems available in R. This book gives an introduction to object-oriented programming in the R programming language and shows you how to use and apply R in an object-oriented manner. You will then be able to use this powerful programming style in your own statistical programming projects to write flexible and extendable software. After reading Advanced Object-Oriented Programming in R, you'll come away with a practical project that you can reuse in your own analytics coding endeavors. You'll then be able to visualize your data as objects that have state and then manipulate those objects with polymorphic or generic methods. Your projects will benefit from the high degree of flexibility provided by polymorphism, where the choice of concrete method to execute depends on the type of data being manipulated. What You'll Learn Define and use classes and generic functions using R Work with the R class hierarchies Benefit from implementation reuse Handle operator overloading Apply the S4 and R6 classes Who This Book Is For Experienced programmers and those with at least some prior experience with the R programming language.
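
As a rough illustration of the polymorphism mentioned in this description, where the method that runs depends on the type of the data, here is a minimal sketch. The book itself works in R; this example instead uses Python's `functools.singledispatch` as an analogue of a generic function, and the `describe` function and its cases are hypothetical names chosen only for illustration.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    """Generic function: fallback used when no specific method is registered."""
    return f"object of type {type(value).__name__}"


@describe.register
def _(value: int):
    # Method selected when the argument is an int
    return f"integer {value}"


@describe.register
def _(value: list):
    # Method selected when the argument is a list
    return f"list of {len(value)} items"


print(describe(3))
print(describe([1, 2, 3]))
print(describe(2.5))  # no float method registered, so the fallback runs
```

The caller always writes `describe(x)`; which implementation executes is decided by the type of `x`, which is the idea behind generic functions in general.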

Introduction to Google Analytics: A Guide for Absolute Beginners

Develop your digital/online marketing skills and learn web analytics to understand the performance of websites and ad campaigns. Approaches covered will be immediately useful for business or nonprofit organizations. If you are completely new to Google Analytics and you want to learn the basics, this guide will introduce you to the content quickly. Web analytics is critical to online marketers as they seek to track return on investment and optimize their websites. Introduction to Google Analytics covers the basics of Google Analytics, starting with creating a blog, and monitoring the number of people who see the blog posts and where they come from. What You'll Learn Understand basic techniques to generate traffic for a blog or website Review the performance of a website or campaign Set up a Shopify account to track ROI Create and maximize AdWords to track conversion Discover opportunities offered by Google, including the Google Individual Qualification Who This Book Is For Those who need to get up to speed on Google Analytics tools and techniques for business or personal use. This book is also suitable as a student reference.

R: Mining Spatial, Text, Web, and Social Media Data

Create data mining algorithms About This Book Develop a strong strategy to solve predictive modeling problems using the most popular data mining algorithms Real-world case studies will take you from novice to intermediate to apply data mining techniques Deploy cutting-edge sentiment analysis techniques to real-world social media data using R Who This Book Is For This Learning Path is for R developers who are looking to make a career in data analysis or data mining. Those who come across data mining problems of different complexities from web, text, numerical, political, and social media domains will find all information in this single learning path. What You Will Learn Discover how to manipulate data in R Get to know top classification algorithms written in R Explore solutions written in R based on R Hadoop projects Apply data management skills in handling large data sets Acquire knowledge about neural network concepts and their applications in data mining Create predictive models for classification, prediction, and recommendation Use various libraries on R CRAN for data mining Discover more about data potential, the pitfalls, and inferential gotchas Gain an insight into the concepts of supervised and unsupervised learning Delve into exploratory data analysis Understand the minute details of sentiment analysis In Detail Data mining is the first step to understanding data and making sense of heaps of data. Properly mined data forms the basis of all data analysis and computing performed on it. This learning path will take you from the very basics of data mining to advanced data mining techniques, and will end up with a specialized branch of data mining: social media mining. You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, association, and correlation while working with R programs. You will discover how to write code for various prediction models, stream data, and time-series data.
You will also be introduced to solutions written in R based on R Hadoop projects. Now that you are comfortable with data mining with R, you will move on to implementing your knowledge with the help of end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle any issues you might encounter during projects. After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially-generated data while providing a theoretical background to accurately interpret your findings. You will be shown R code and examples of data that can be used as a springboard as you get the chance to undertake your own analyses of business, social, or political data. This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products: Learning Data Mining with R by Bater Makhabel R Data Mining Blueprints by Pradeepta Mishra Social Media Mining with R by Nathan Danneman and Richard Heimann Style and approach A complete package that will take you from the basics of data mining to advanced data mining techniques, and will end up with a specialized branch of data mining: social media mining. Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code file.

Delivering Embedded Analytics in Modern Applications

Organizations are rapidly consuming more data than ever before, and to drive their competitive advantage, they’re demanding interactive visualizations and interactive analyses of that data be embedded in their applications and business processes. This will enable them to make faster and more effective decisions based on data, not guesses. This practical book examines the considerations that software developers, product managers, and vendors need to take into account when making visualization and analytics a seamlessly integrated part of the applications they deliver, as well as the impact of migrating their applications to modern data platforms. Authors Federico Castanedo (Vodafone Group) and Andy Oram (O’Reilly Media) explore the basic requirements for embedding domain expertise with fast, powerful, and interactive visual analytics that will delight and inform customers more than spreadsheets and custom-generated charts. Particular focus is placed on the characteristics of effective visual analytics for big and fast data. Learn the impact of trends driving embedded analytics Review examples of big data applications and their analytics requirements in retail, direct service, cybersecurity, the Internet of Things, and logistics Explore requirements for embedding visual analytics in modern data environments, including collection, storage, retrieval, data models, speed, microservices, parallelism, and interactivity Take a deep dive into the characteristics of effective visual analytics and criteria for evaluating modern embedded analytics tools Use a self-assessment rating chart to determine the value of your organization’s BI in the modern data setting

Development Workflows for Data Scientists

Data science teams often borrow best practices from software development, but since the product of a data science project is insight, not code, software development workflows are not a perfect fit. How can data scientists create workflows tailored to their needs? Through interviews with several data-driven organizations, this practical report reveals how data science teams are improving the way they define, enforce, and automate a development workflow. Data science workflows differ from team to team because their tasks, goals, and skills vary so much. In this report, author Ciara Byrne talked to teams from BinaryEdge, Airbnb, GitHub, Scotiabank, Fast Forward Labs, Datascope, and others about their approaches to the data science process, including their procedures for: Defining team structure and roles Asking interesting questions Examining previous work Collecting, exploring, and modeling data Testing, documenting, and deploying code to production Communicating the results With this report, you’ll also examine a complete data science workflow developed by the team from Swiss cybersecurity firm BinaryEdge that includes steps for preliminary data analysis, exploratory data analysis, knowledge discovery, and visualization.

MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence

Get started with MATLAB for deep learning and AI with this in-depth primer. In this book, you start with machine learning fundamentals, then move on to neural networks, deep learning, and then convolutional neural networks. In a blend of fundamentals and applications, MATLAB Deep Learning employs MATLAB as the underlying programming language and tool for the examples and case studies in this book. With this book, you'll be able to tackle some of today's real world big data, smart bots, and other complex data problems. You'll see how deep learning is a complex and more intelligent aspect of machine learning for modern smart data analysis and usage. What You'll Learn Use MATLAB for deep learning Discover neural networks and multi-layer neural networks Work with convolution and pooling layers Build a MNIST example with these layers Who This Book Is For Those who want to learn deep learning using MATLAB. Some MATLAB experience may be useful.

Understanding Message Brokers

Messaging is one of the more poorly understood areas of IT; most developers and architects have only a passing familiarity with how broker-based messaging technologies work. This practical report not only helps you get up to speed on the essentials of messaging, but also compares two of today’s most popular messaging technologies—Apache ActiveMQ and Apache Kafka. Author and consultant Jakub Korab describes use cases and design choices that lead developers to very different approaches for developing message-based systems. You’ll come away with a high-level understanding of both ActiveMQ and Kafka, including how they should and should not be used, how they handle concerns such as throughput and high-availability, and what to look out for when considering other messaging technologies in future. Understand the types of problems that messaging systems address Explore three primary messaging patterns: point-to-point, publish-subscribe, and a hybrid of both Dive into ActiveMQ, a classic broker-centric design implemented through Java libraries that works for a broad range of messaging use cases Examine Kafka, a distributed system that can be scaled to provide massive performance and fault tolerance through replication Learn the mechanical complexities that message-based systems need to address, and some patterns you can apply to deal with those complexities
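
To make the difference between two of the messaging patterns named in this description concrete, here is a minimal in-memory sketch. It is a toy model only: it does not use ActiveMQ or Kafka, and the `Queue` and `Topic` classes and their methods are hypothetical names introduced for illustration, not either broker's API.

```python
from collections import deque


class Queue:
    """Point-to-point: each message is delivered to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Taking the message removes it, so no other consumer can see it
        return self._messages.popleft() if self._messages else None


class Topic:
    """Publish-subscribe: each message is delivered to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


queue = Queue()
queue.send("order-1")
first = queue.receive()   # one consumer takes the message
second = queue.receive()  # nothing is left for a second consumer

inbox_a, inbox_b = [], []
topic = Topic()
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("price-update")  # both subscribers receive a copy
```

A real broker adds persistence, acknowledgements, and delivery guarantees on top of these basic shapes; the sketch leaves all of that out.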

R for Everyone: Advanced Analytics and Graphics, 2nd Edition

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone, Second Edition, is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, manipulation, and visualization; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. After all this you'll make your code reproducible with LaTeX, RMarkdown, and Shiny. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most.
Coverage includes Explore R, RStudio, and R packages Use R for math: variable types, vectors, calling functions, and more Exploit data structures, including data.frames, matrices, and lists Read many different types of data Create attractive, intuitive statistical graphics Write user-defined functions Control program flow with if, ifelse, and complex checks Improve program efficiency with group manipulations Combine and reshape multiple datasets Manipulate strings using R's facilities and regular expressions Create normal, binomial, and Poisson probability distributions Build linear, generalized linear, and nonlinear models Program basic statistics: mean, standard deviation, and t-tests Train machine learning models Assess the quality of models and variable selection Prevent overfitting and perform variable selection, using the Elastic Net and Bayesian methods Analyze univariate and multivariate time series data Group data via K-means and hierarchical clustering Prepare reports, slideshows, and web pages with knitr Display interactive data with RMarkdown and htmlwidgets Implement dashboards with Shiny Build reusable R packages with devtools and Rcpp

Agile Data Science 2.0

Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools. Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You’ll learn an iterative approach that lets you quickly change the kind of analysis you’re doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization. Build value from your data in a series of agile sprints, using the data-value pyramid Extract features for statistical models from a single dataset Visualize data with charts, and expose different aspects through interactive reports Use historical data to predict the future via classification and regression Translate predictions into actions Get feedback from users after each sprint to keep your project on track

Practical GIS

Practical GIS introduces you to the world of Geographic Information Systems (GIS) using accessible, open source tools. From setting up your GIS environment to creating and analyzing spatial data and publishing it online, this book covers everything you need to perform both beginner and advanced GIS tasks. What this Book will help me do Understand the fundamentals of GIS and use open source tools effectively. Be able to collect, store, query, and manage spatial data efficiently. Perform advanced spatial analyses and solve real-world GIS problems practically. Learn how to publish and share GIS data and results using QGIS Server and GeoServer. Create web maps using lightweight web mapping libraries like Leaflet. Author(s) The authors of Practical GIS bring years of professional experience in GIS and data analysis, combining technical know-how with a teaching approach accessible to a wide range of learners. They strive to convey complex GIS concepts simply and practically, fostering a hands-on learning experience. Who is it for? This book is ideal for IT professionals new to GIS or those considering entering the GIS field. If you're looking for a cost-effective way to learn GIS without investing in expensive commercial software or formal training, Practical GIS provides the knowledge and tools you need. Beginners and intermediate learners alike will find this book to be a helpful stepping stone in mastering GIS.

Advanced Analytics with Spark, 2nd Edition

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming. You’ll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques—including classification, clustering, collaborative filtering, and anomaly detection—to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you’ll find the book’s patterns useful for working on your own data applications. With this book, you will: Familiarize yourself with the Spark programming model Become comfortable within the Spark ecosystem Learn general approaches in data science Examine complete implementations that analyze large public data sets Discover which machine learning tools make sense for particular problems Acquire code that can be adapted to many uses

Data Science with Java

Data Science is booming thanks to R and Python, but Java brings the robustness, convenience, and ability to scale critical to today’s data science applications. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java. You’ll learn the critical roles that data IO, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout this book, you’ll find code examples you can use in your applications. Examine methods for obtaining, cleaning, and arranging data into its purest form Understand the matrix structure that your data should take Learn basic concepts for testing the origin and validity of data Transform your data into stable and usable numerical values Understand supervised and unsupervised learning algorithms, and methods for evaluating their success Get up and running with MapReduce, using customized components suitable for data science algorithms

Decision Support, Analytics, and Business Intelligence, Third Edition

Rapid technology change is impacting organizations large and small. Mobile and Cloud computing, the Internet of Things (IoT), and “Big Data” are driving forces in organizational digital transformation. Decision support and analytics are available to many people in a business or organization. Business professionals need to learn about and understand computerized decision support for organizations to succeed. This text is targeted to busy managers and students who need to grasp the basics of computerized decision support, including: What is analytics? What is a decision support system? What is “Big Data”? What are “Big Data” business use cases? Overall, it addresses 61 fundamental questions. In a short period of time, readers can “get up to speed” on decision support, analytics, and business intelligence. The book then provides a quick reference to important recurring questions.

Business in Real-Time Using Azure IoT and Cortana Intelligence Suite: Driving Your Digital Transformation

Learn how today’s businesses can transform themselves by leveraging real-time data and advanced machine learning analytics. This book provides prescriptive guidance for architects and developers on the design and development of modern Internet of Things (IoT) and Advanced Analytics solutions. In addition, Business in Real-Time Using Azure IoT and Cortana Intelligence Suite offers patterns and practices for those looking to engage their customers and partners through Software-as-a-Service solutions that work on any device. Whether you're working in Health & Life Sciences, Manufacturing, Retail, Smart Cities and Buildings or Process Control, there exists a common platform from which you can create your targeted vertical solutions. Business in Real-Time Using Azure IoT and Cortana Intelligence Suite uses a reference architecture as a road map. Building on Azure’s PaaS services, you'll see how a solution architecture unfolds that demonstrates a complete end-to-end IoT and Advanced Analytics scenario. What You'll Learn: Automate your software product life cycle using PowerShell, Azure Resource Manager Templates, and Visual Studio Team Services Implement smart devices using Node.JS and C# Use Azure Streaming Analytics to ingest millions of events Provide both "Hot" and "Cold" path outputs for real-time alerts, data transformations, and aggregation analytics Implement batch processing using Azure Data Factory Create a new form of Actionable Intelligence (AI) to drive mission critical business processes Provide rich Data Visualizations across a wide variety of mobile and web devices Who This Book is For: Solution Architects, Software Developers, Data Architects, Data Scientists, and CIO/CTA Technical Leadership Professionals

Metaprogramming in R: Advanced Statistical Programming for Data Science, Analysis and Finance

Learn how to manipulate functions and expressions to modify how the R language interprets itself. This book is an introduction to metaprogramming in the R language, so you will write programs to manipulate other programs. Metaprogramming in R shows you how to treat code as data that you can generate, analyze, or modify. R is a very high-level language where all operations are functions and all functions are data that can be manipulated. This book shows you how to leverage R's natural flexibility in how function calls and expressions are evaluated, to create small domain-specific languages to extend R within the R language itself. What You'll Learn Find out about the anatomy of a function in R Look inside a function call Work with R expressions and environments Manipulate expressions in R Use substitutions Who This Book Is For Those with at least some experience with R and certainly for those with experience in other programming languages

Apache Spark 2.x Cookbook

Discover how to harness the power of Apache Spark 2.x for your Big Data processing projects. In this book, you will explore over 70 cloud-ready recipes that will guide you to perform distributed data analytics, structured streaming, machine learning, and much more. What this Book will help me do Effectively install and configure Apache Spark with various cluster managers and platforms. Set up and utilize development environments tailored for Spark applications. Operate on schema-aware data using RDDs, DataFrames, and Datasets. Perform real-time streaming analytics with sources such as Apache Kafka. Leverage MLlib for supervised learning, unsupervised learning, and recommendation systems. Author(s) Yadav is a seasoned data engineer with a deep understanding of Big Data tools and technologies, particularly Apache Spark. With years of experience in the field of distributed computing and data analysis, Yadav brings practical insights and techniques to enrich the learning experience of readers. Who is it for? This book is ideal for data engineers, data scientists, and Big Data professionals who are keen to enhance their Apache Spark 2.x skills. If you're working with distributed processing and want to solve complex data challenges, this book addresses practical problems. Note that a basic understanding of Scala is recommended to get the most out of this resource.

Business Intelligence Tools for Small Companies: A Guide to Free and Low-Cost Solutions

Learn how to transition from Excel-based business intelligence (BI) analysis to enterprise stacks of open-source BI tools. Select and implement the best free and freemium open-source BI tools for your company's needs and design, implement, and integrate BI automation across the full stack using agile methodologies. Business Intelligence Tools for Small Companies provides hands-on demonstrations of open-source tools suitable for the BI requirements of small businesses. The authors draw on their deep experience as BI consultants, developers, and administrators to guide you through the extract-transform-load/data warehousing (ETL/DWH) sequence of extracting data from an enterprise resource planning (ERP) database freely available on the Internet, transforming the data, manipulating them, and loading them into a relational database. The authors demonstrate how to extract, report, and dashboard key performance indicators (KPIs) in a visually appealing format from the relational database management system (RDBMS). They model the selection and implementation of free and freemium tools such as Pentaho Data Integrator and Talend for ELT, Oracle XE and MySQL/MariaDB for RDBMS, and Qliksense, Power BI, and MicroStrategy Desktop for reporting. This richly illustrated guide models the deployment of a small company BI stack on an inexpensive cloud platform such as AWS. 
What You'll Learn You will learn how to manage, integrate, and automate the processes of BI by selecting and implementing tools to: Implement and manage the business intelligence/data warehousing (BI/DWH) infrastructure Extract data from any enterprise resource planning (ERP) tool Process and integrate BI data using open-source extract-transform-load (ETL) tools Query, report, and analyze BI data using open-source visualization and dashboard tools Use a MOLAP tool to define next year's budget, integrating real data with target scenarios Deploy BI solutions and big data experiments inexpensively on cloud platforms Who This Book Is For Engineers, DBAs, analysts, consultants, and managers at small companies with limited resources but whose BI requirements have outgrown the limitations of Excel spreadsheets; personnel in mid-sized companies with established BI systems who are exploring technological updates and more cost-efficient solutions

Data Lake for Enterprises

"Data Lake for Enterprises" is a comprehensive guide to building data lakes using the Lambda Architecture. It introduces big data technologies like Hadoop, Spark, and Flume, showing how to use them effectively to manage and leverage enterprise-scale data. You'll gain the skills to design and implement data systems that handle complex data challenges. What this Book will help me do Master the use of Lambda Architecture to create scalable and effective data management systems. Understand and implement technologies like Hadoop, Spark, Kafka, and Flume in an enterprise data lake. Integrate batch and stream processing techniques using big data tools for comprehensive data analysis. Optimize data lakes for performance and reliability with practical insights and techniques. Implement real-world use cases of data lakes and machine learning for predictive data insights. Author(s) Mishra, John, and Pankaj Misra are recognized experts in big data systems with a strong background in designing and deploying data solutions. With a clear and methodical teaching style, they bring years of experience to this book, providing readers with the tools and knowledge required to excel in enterprise big data initiatives. Who is it for? This book is ideal for software developers, data architects, and IT professionals looking to integrate a data lake strategy into their enterprises. It caters to readers with a foundational understanding of Java and big data concepts, aiming to advance their practical knowledge of building scalable data systems. If you're eager to delve into cutting-edge technologies and transform enterprise data management, this book is for you.

Mastering PostGIS

"Mastering PostGIS" is your guide to unlocking the powerful capabilities of the PostGIS spatial database system. Across 328 pages, this book takes you through the essentials of spatial data handling, from importing, analyzing, and exporting datasets to building fully-functional GIS applications. You'll explore concepts such as spatial querying, data types, and integrating PostGIS with powerful tools like GeoServer and OpenLayers. What this Book will help me do Understand the fundamentals of PostGIS and its role in GIS workflows. Gain hands-on experience in SQL-based spatial queries and data manipulation. Develop the ability to integrate PostGIS with web platforms like Node.js, GeoServer, and OpenLayers. Discover strategies for spatial data ETL (Extract, Transform, Load) processes and live updates. Build scalable, efficient GIS applications leveraging PostGIS's capabilities. Author(s) George Silva, Mikiewicz, and Michal Mackiewicz are experts in GIS systems and database technologies with years of experience working with spatial databases such as PostGIS. Passionate about imparting practical knowledge, they provide clear, hands-on examples in every chapter to help you master spatial database solutions. Who is it for? This book is perfect for GIS developers and analysts looking to deepen their knowledge of PostGIS. If you aim to enhance your skills in designing GIS applications or performing spatial data analysis, this is your ideal resource. Prior experience with PostgreSQL and a basic installation of PostGIS are recommended for readers.

Design and Analysis of Experiments, 9th Edition

Design and Analysis of Experiments, 9th Edition continues to help senior and graduate students in engineering, business, and statistics, as well as working practitioners, to design and analyze experiments for improving the quality, efficiency and performance of working systems. This bestselling text maintains its comprehensive coverage by including: new examples, exercises, and problems (including in the areas of biochemistry and biotechnology); new topics and problems in the area of response surface; new topics in nested and split-plot design; and an emphasis on the residual maximum likelihood method throughout the book.