talk-data.com

Topic

AWS

Amazon Web Services (AWS)

cloud cloud provider infrastructure services

15 tagged

Activity Trend: peak of 190 per quarter, 2020-Q1 to 2026-Q1

Activities

Filtering by: O'Reilly Data Science Books

Mastering Microsoft Fabric: SAASification of Analytics

Learn and explore the capabilities of Microsoft Fabric, the latest evolution in cloud analytics suites. This book will help you understand how users can leverage a Microsoft Office-like experience for data management and advanced analytics. The book starts with an overview of the evolution of analytics from on-premises to cloud infrastructure as a service (IaaS), platform as a service (PaaS), and now software as a service (SaaS), and provides an introduction to Microsoft Fabric. You will learn how to provision Microsoft Fabric in your tenant, along with the key capabilities of SaaS analytics products and the advantages of using Fabric in an enterprise analytics platform. OneLake and Lakehouse for data engineering are discussed, as well as OneLake for data science. Author Ghosh teaches you about the data warehouse offerings inside Microsoft Fabric and the new data integration experience, which brings Azure Data Factory and the Power Query Editor of Power BI together in a single platform. Also demonstrated is Real-Time Analytics in Fabric, including capabilities such as Kusto query and database. You will understand how the new event stream feature integrates with OneLake and other computations. You will also learn how to configure the real-time alert capability in a zero-code manner and go through the Power BI experience in the Fabric workspace. Fabric pricing and licensing are also covered. After reading this book, you will understand the capabilities of Microsoft Fabric and its integration with current and upcoming Azure OpenAI capabilities.
What You Will Learn
Build OneLake for all data like OneDrive for Microsoft Office
Leverage shortcuts for cross-cloud data virtualization in Azure and AWS
Understand upcoming OpenAI integration
Discover new event streaming and Kusto query inside Fabric real-time analytics
Utilize seamless tooling for machine learning and data science
Who This Book Is For
Citizen users and experts in the data engineering and data science fields, along with chief AI officers

Effective Data Science Infrastructure

Simplify data science infrastructure to give data scientists an efficient path from prototype to production.
In Effective Data Science Infrastructure you will learn how to:
Design data science infrastructure that boosts productivity
Handle compute and orchestration in the cloud
Deploy machine learning to production
Monitor and manage performance and results
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, Conda, and Docker
Architect complex applications for multiple teams and large datasets
Customize and grow data science infrastructure
Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. In it, you’ll master scalable techniques for data storage, computation, experiment tracking, and orchestration that are relevant to companies of all shapes and sizes. You’ll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python. The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.
About the Technology
Growing data science projects from prototype to production requires reliable infrastructure. Using the powerful new techniques and tooling in this book, you can stand up an infrastructure stack that will scale with any organization, from startups to the largest enterprises.
About the Book
Effective Data Science Infrastructure teaches you to build data pipelines and project workflows that will supercharge data scientists and their projects. Based on state-of-the-art tools and concepts that power the data operations of Netflix, this book introduces a customizable cloud-based approach to model development and MLOps that you can easily adapt to your company’s specific needs. As you roll out these practical processes, your teams will produce better and faster results when applying data science and machine learning to a wide array of business problems.
What's Inside
Handle compute and orchestration in the cloud
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, AWS, and the Python data ecosystem
Architect complex applications that require large datasets and models, and a team of data scientists
About the Reader
For infrastructure engineers and engineering-minded data scientists who are familiar with Python.
About the Author
At Netflix, Ville Tuulos designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.
Quotes
By reading and referring to this book, I’m confident you will learn how to make your machine learning operations much more efficient and productive. - From the Foreword by Travis Oliphant, Author of NumPy, Founder of Anaconda, PyData, and NumFOCUS
Effective Data Science Infrastructure is a brilliant book. It’s a must-have for every data science team. - Ninoslav Cerkez, Logit
More data science. Less headaches. - Dr. Abel Alejandro Coronado Iruegas, National Institute of Statistics and Geography of Mexico
Indispensable. A copy should be on every data engineer’s bookshelf. - Matthew Copple, Grand River Analytics

Reproducible Data Science with Pachyderm

Dive into the world of reproducible data science with Pachyderm, a specialized platform designed for version-controlled data pipelines. By following this book, 'Reproducible Data Science with Pachyderm,' you'll gain the skills to implement robust, scalable machine learning workflows with Pachyderm 2.0, covering setup, integration, and advanced use cases.
What this Book will help me do
Build scalable, version-controlled data pipelines with Pachyderm's unique features.
Understand the principles behind reproducible data science and implement them effectively.
Deploy Pachyderm on AWS, Google Cloud, and Azure while integrating with popular tools.
Create and manage end-to-end machine learning workflows, including hyperparameter tuning.
Leverage advanced integrations, such as Pachyderm Notebooks and language clients like Python and Go.
Author(s)
Svetlana Karslioglu is a seasoned data scientist with extensive experience in constructing scalable machine learning and data processing systems. With years in both practical implementation and educational endeavors, she has a talent for breaking down complex concepts into accessible learning paths. Her approach is hands-on and results-oriented, aimed at empowering professionals to excel in the field of data science.
Who is it for?
This book is intended for data scientists, machine learning engineers, and data engineers who are keen to ensure reproducibility in their workflows. Ideal readers may have familiarity with data science basics and some exposure to Kubernetes and programming languages like Python. By studying the book, learners will gain confidence in implementing Pachyderm for scalable and reliable data pipelines.

Time Series Analysis on AWS

Time Series Analysis on AWS is your guide to building and deploying powerful forecasting models and identifying anomalies in your time series data. With this book, you will explore effective strategies for modern time series analysis using Amazon Web Services' powerful AI/ML tools.
What this Book will help me do
Master the fundamental concepts of time series and its applications using industry-relevant examples.
Understand time series forecasting with Amazon Forecast and how to deliver actionable business insights.
Build and deploy anomaly detection systems using Amazon Lookout for Equipment for predictive maintenance.
Learn to utilize Amazon Lookout for Metrics to identify business operational anomalies effectively.
Gain practical experience applying AWS ML tools to real-world time series data challenges.
Author(s)
Hoarau is a data scientist with extensive experience in utilizing machine learning to solve real-world problems. Combining strong programming skills with domain expertise, they focus on developing applications leveraging AWS AI services. This book reflects their passion for making technical topics accessible and actionable for professionals.
Who is it for?
This book is ideal for data analysts, business analysts, and data scientists eager to enhance their skills in time series analysis. It suits readers familiar with statistical concepts but new to machine learning. If you're aiming to solve business problems using data and AWS tools, this resource is tailored for you.

Actionable Insights with Amazon QuickSight

Discover the power of Amazon QuickSight with this comprehensive guide. Learn to create stunning data visualizations, integrate machine learning insights, and automate operations to optimize your data analytics workflows. This book offers practical guidance on utilizing QuickSight to develop insightful and interactive business intelligence solutions.
What this Book will help me do
Understand the role of Amazon QuickSight within the AWS analytics ecosystem.
Learn to configure data sources and develop visualizations effectively.
Gain skills in adding interactivity to dashboards using custom controls and parameters.
Incorporate machine learning capabilities into your dashboards, including forecasting and anomaly detection.
Explore advanced features like QuickSight APIs and embedded multi-tenant analytics design.
Author(s)
Samatas is an AWS-certified big data solutions architect with years of experience in designing and implementing scalable analytics solutions. With a clear and practical approach, Samatas teaches how to effectively leverage Amazon QuickSight for efficient and insightful business intelligence applications. Their expertise ensures readers will gain actionable skills.
Who is it for?
This book is ideal for business intelligence (BI) developers and data analysts looking to deepen their expertise in creating interactive dashboards using Amazon QuickSight. It is a perfect guide for professionals aiming to explore machine learning integration in BI solutions. Familiarity with basic data visualization concepts is recommended, but no prior experience with Amazon QuickSight is needed.

Serverless Analytics with Amazon Athena

Delve into the serverless world of Amazon Athena with the comprehensive book 'Serverless Analytics with Amazon Athena'. This guide introduces you to the power of Athena, showing you how to efficiently query data in Amazon S3 using SQL without the hassle of managing infrastructure. With clear instructions and practical examples, you'll master querying structured, unstructured, and semi-structured data seamlessly.
What this Book will help me do
Effectively query and analyze both structured and unstructured data stored in S3 using Amazon Athena.
Integrate Athena with other AWS services to create powerful, secure, and cost-efficient data workflows.
Develop ETL pipelines and machine learning workflows leveraging Athena's compatibility with AWS Glue.
Monitor and troubleshoot Athena queries for consistent performance and build scalable serverless data solutions.
Implement security best practices and optimize costs when managing your Athena-driven data solutions.
Author(s)
Virtuoso, along with co-authors Mert Turkay Hocanin and Wishnick, brings a wealth of experience in cloud solutions, serverless technologies, and data engineering. They excel in demystifying complex technical topics and have a passion for empowering readers with practical skills and knowledge.
Who is it for?
This book is tailored for business intelligence analysts, application developers, and system administrators who want to harness Amazon Athena for seamless, cost-efficient data analytics. It suits individuals with basic SQL knowledge looking to expand their capabilities in querying and processing data. Whether you're managing growing datasets or building data-driven applications, this book provides the know-how to get it right.

Getting Started with Streamlit for Data Science

Getting Started with Streamlit for Data Science is your essential guide to quickly and efficiently building dynamic data science web applications in Python using Streamlit. Whether you're embedding machine learning models, visualizing data, or deploying projects, this book helps you excel in creating and sharing interactive apps with ease.
What this Book will help me do
Set up a development environment to create your first Streamlit application.
Implement and visualize dynamic data workflows by integrating various Python libraries into Streamlit.
Develop and showcase machine learning models within Streamlit for clear and interactive presentations.
Deploy your projects effortlessly using platforms like Streamlit Sharing, Heroku, and AWS.
Utilize tools like Streamlit Components and themes to enhance the aesthetics and usability of your apps.
Author(s)
Tyler Richards is a data science expert with extensive experience in leveraging technology to present complex data models in an understandable way. He brings practical solutions to readers, aiming to empower them with the tools they need to succeed in the field of data science. Tyler adopts a hands-on teaching method with illustrative examples to ensure clarity and easy learning.
Who is it for?
This book is designed for anyone involved in data science, from beginners just starting in the field to experienced professionals who want to learn to create interactive web applications using Streamlit. Ideal for those with a working knowledge of Python, this resource will help you streamline your workflows and enhance your project presentations.

Data Science on AWS

With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level up your skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance.
Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more
Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot
Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment
Tie everything together into a repeatable machine learning operations pipeline
Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka
Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more

Metabase Up and Running

Metabase Up and Running is your go-to guide for mastering Metabase, the open-source business intelligence tool. You'll progress from the basics of installation and setup to connecting data sources and creating insightful visualizations and dashboards. By the end, you'll be confident in implementing Metabase in your organization for impactful decision-making.
What this Book will help me do
Understand how to securely deploy and configure Metabase on Amazon Web Services.
Master the creation of dashboards, reports, and visualizations using Metabase's tools.
Gain expertise in user and permissions management within Metabase.
Learn to use Metabase's SQL console for advanced database interactions.
Acquire skills to embed Metabase within applications and automate reports via email or Slack.
Author(s)
Abraham, an experienced tool specialist, is passionate about teaching others how to leverage data tools effectively. With a background in business analytics, Abraham has guided companies of all sizes. Their approachable writing style ensures a learning journey that is both informative and engaging.
Who is it for?
This book is ideal for business analysts and data professionals looking to amplify their business intelligence capabilities using Metabase. Readers should have some understanding of data analytics principles. Whether you're starting in analytics or seeking advanced automation, this book offers valuable guidance to meet your goals.

Advanced R 4 Data Programming and the Cloud: Using PostgreSQL, AWS, and Shiny

Program for data analysis using R and learn practical skills to make your work more efficient. This revised book explores how to automate running code and the creation of reports to share your results, as well as writing functions and packages. It includes key R 4 features such as a new color palette for charts, an enhanced reference counting system, and normalization of matrix and array types where matrix objects now formally inherit from the array class, eliminating inconsistencies. Advanced R 4 Data Programming and the Cloud is not designed to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is designed to be a practical guide moving beyond merely using R; it shows you how to program in R to automate tasks. This book will teach you how to manipulate data in modern R structures and includes connecting R to databases such as PostgreSQL, cloud services such as Amazon Web Services (AWS), and digital dashboards such as Shiny. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.
What You Will Learn
Write and document R functions using R 4
Make an R package and share it via GitHub or privately
Add tests to R code to ensure it works as intended
Use R to talk directly to databases and do complex data management
Run R in the Amazon cloud
Deploy a Shiny digital dashboard
Generate presentation-ready tables and reports using R
Who This Book Is For
Working professionals, researchers, and students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to take their R coding and programming to the next level.

Learn Grafana 7.0

"Learn Grafana 7.0" is the ultimate beginner's guide to leveraging Grafana's capabilities for analytics and interactive dashboards. You'll master real-time data monitoring and visualization, and learn how to query and explore metrics with a hands-on approach to Grafana 7.0's new features.
What this Book will help me do
Learn to install and configure Grafana from scratch, preparing you for real-world data analysis tasks.
Navigate and utilize the Graph panel in Grafana effectively, ensuring clear and actionable visual insights.
Incorporate advanced dashboard features such as annotations, templates, and links to enhance data monitoring.
Integrate Grafana with major cloud providers like AWS and Azure for robust monitoring solutions.
Implement secure user authentication and fine-tuned permissions for managing teams and sharing insights safely.
Author(s)
Salituro, the author of "Learn Grafana 7.0," is a data visualization expert with years of experience in software development and analytics. Salituro focuses on creating understandable and accessible resources for developers and analysts of all skill levels, bringing a hands-on practical approach to technical learning.
Who is it for?
This book is perfect for data analysts, business intelligence developers, and administrators looking to build skills in data visualization and monitoring with Grafana 7.0. If you're eager to create interactive dashboards and learn practical applications of Grafana's features, this book is for you. Beginners to Grafana are fully accommodated, though familiarity with data visualization principles is beneficial. For those seeking to monitor cloud services like AWS with Grafana, this book is indispensable.

Data Science with Python and Dask

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!
About the Technology
An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.
About the Book
Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.
What's Inside
Working with large, structured and unstructured datasets
Visualization with Seaborn and Datashader
Implementing your own algorithms
Building distributed apps with Dask Distributed
Packaging and deploying Dask apps
About the Reader
For data scientists and developers with experience using Python and the PyData stack.
About the Author
Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company. We interviewed Jesse as a part of our Six Questions series.
Quotes
The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work. - Al Krinker, United States Patent and Trademark Office
An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst. - Jeremy Loscheider, Panera Bread
A greatly paced introduction to Dask with real-world datasets. - George Thomas, R&D Architecture Manhattan Associates
The ultimate resource to quickly get up and running with Dask and parallel processing in Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine

Applied Supervised Learning with R

Applied Supervised Learning with R equips you with the essential knowledge and practical skills to leverage machine learning techniques for solving business problems using R. With this book, you'll gain hands-on experience in implementing various supervised learning models, assessing their performance, and selecting the best-suited method for your objectives.
What this Book will help me do
Gain expertise in identifying and framing business problems suitable for supervised learning.
Acquire skills in data wrangling and visualization using R packages like dplyr and ggplot2.
Master techniques for tuning hyperparameters to optimize machine learning models.
Understand methods for feature selection and dimensionality reduction to enhance model performance.
Learn how to deploy machine learning models to production environments, such as AWS Lambda.
Author(s)
Karthik Ramasubramanian and Jojo Moolayil are both seasoned data science practitioners and educators who bring a wealth of experience in machine learning and analytics. With a deep understanding of R and its applications in real-world scenarios, they offer practical insights and actionable examples to their readers. Their teaching style focuses on clarity and practical application.
Who is it for?
This book is ideal for data analysts, data scientists, and data engineers at a beginner to intermediate level who aim to master supervised machine learning with R. Readers should have basic knowledge of statistics, probability, and R programming. It is designed for those eager to apply machine learning techniques to real-world problems and improve their decision-making capabilities.

Python Web Scraping Cookbook

Python Web Scraping Cookbook is your comprehensive guide to building efficient and functional web scraping tools using Python. With practical recipes, you'll learn to overcome the challenges of dynamic content, captchas, and irregular web structures while deploying scalable solutions.
What this Book will help me do
Master the use of Python libraries like BeautifulSoup and Scrapy for scraping data.
Perfect techniques for handling JavaScript-heavy sites using Selenium.
Learn to overcome web scraping challenges, such as captchas and rate-limiting.
Design scalable scraping pipelines with cloud deployment in AWS.
Understand web data extraction techniques with XPath, CSS selectors, and more.
Author(s)
Michael Heydt is a seasoned software engineer and technical author with a focus on data engineering and cloud solutions. Having worked with Python extensively, he brings real-world insights into web scraping. His practical approach simplifies complex concepts.
Who is it for?
This book is perfect for Python developers and data enthusiasts keen to master web scraping techniques. If you're a programmer with a working knowledge of Python scripting and wish to scrape, analyze, and utilize web data efficiently, this book is for you.

A Developer’s Guide to Amazon SimpleDB

The Complete Guide to Building Cloud Computing Solutions with Amazon SimpleDB Using SimpleDB, any organization can leverage Amazon Web Services (AWS), Amazon’s powerful cloud-based computing platform–and dramatically reduce the cost and resources associated with application infrastructure. Now, for the first time, there’s a complete developer’s guide to building production solutions with Amazon SimpleDB. Pioneering SimpleDB developer Mocky Habeeb brings together all the hard-to-find information you need to succeed. Mocky tours the SimpleDB platform and APIs, explains their essential characteristics and tradeoffs, and helps you determine whether your applications are appropriate for SimpleDB. Next, he walks you through all aspects of writing, deploying, querying, optimizing, and securing Amazon SimpleDB applications–from the basics through advanced techniques. Throughout, Mocky draws on his unsurpassed experience supporting developers on SimpleDB’s official Web forums. He offers practical tips and answers that can’t be found anywhere else, and presents extensive working sample code–from snippets to complete applications. 
With A Developer’s Guide to Amazon SimpleDB you will be able to
Evaluate whether a project is suited for Amazon SimpleDB
Write SimpleDB applications that take full advantage of SimpleDB’s availability, scalability, and flexibility
Effectively manage the entire SimpleDB application lifecycle
Deploy cloud computing applications faster and more easily
Work with SELECT and bulk data operations
Fine-tune queries to optimize performance
Integrate SimpleDB security into existing organizational security plans
Write and enhance runtime SimpleDB clients
Build complete applications using AJAX and SimpleDB
Understand low-level issues involved in writing clients and frameworks
Solve common SimpleDB usage problems and avoid hidden pitfalls
This book will be an indispensable resource for every IT professional evaluating or using SimpleDB to build cloud-computing applications, clients, or frameworks.