talk-data.com

Topic

S3

Amazon S3

object_storage cloud_storage aws

5 tagged

Activity Trend: peak of 11 per quarter (2020-Q1 to 2026-Q1)

Activities

Filtering by: O'Reilly Data Science Books

Serverless Analytics with Amazon Athena

Delve into the serverless world of Amazon Athena with 'Serverless Analytics with Amazon Athena'. This guide introduces Athena and shows how to query data in Amazon S3 using SQL without managing infrastructure. With clear instructions and practical examples, you'll learn to query structured, semi-structured, and unstructured data.

What this book will help me do:
- Query and analyze both structured and unstructured data stored in S3 using Amazon Athena.
- Integrate Athena with other AWS services to create powerful, secure, and cost-efficient data workflows.
- Develop ETL pipelines and machine learning workflows leveraging Athena's compatibility with AWS Glue.
- Monitor and troubleshoot Athena queries for consistent performance and build scalable serverless data solutions.
- Implement security best practices and optimize costs when managing your Athena-driven data solutions.

Author(s): Anthony Virtuoso, along with co-authors Mert Turkay Hocanin and Aaron Wishnick, brings a wealth of experience in cloud solutions, serverless technologies, and data engineering. They excel in demystifying complex technical topics and have a passion for empowering readers with practical skills and knowledge.

Who is it for? This book is tailored for business intelligence analysts, application developers, and system administrators who want to use Amazon Athena for cost-efficient data analytics. It suits readers with basic SQL knowledge who want to expand their capabilities in querying and processing data. Whether you're managing growing datasets or building data-driven applications, this book provides the know-how to get it right.

Learning Apache Drill

Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3, and works with Hive metastores as well as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster.

In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you’ll learn how Drill helps you analyze data more effectively to drive down time to insight.

- Use Drill to clean, prepare, and summarize delimited data for further analysis
- Query file types including logfiles, Parquet, JSON, and other complex formats
- Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL
- Connect to Drill programmatically using a variety of languages
- Use Drill even with challenging or ambiguous file formats
- Perform sophisticated analysis by extending Drill’s functionality with user-defined functions
- Facilitate data analysis for network security, image metadata, and machine learning

Learning R Programming

This book provides a comprehensive introduction to R programming, a powerful tool for data science and statistics. Throughout the book, readers will explore programming constructs, data structures, and popular R packages, gaining the skills needed for practical applications and problem-solving.

What this book will help me do:
- Understand R's foundational concepts like variables, data types, and functions.
- Learn how to use R for data analysis, visualization, and machine learning tasks.
- Develop advanced R skills such as meta-programming and performance optimization.
- Master object-oriented programming using R's S3, S4, and R6 systems.
- Gain confidence in using R to create web scraping scripts and interactive reports.

Author(s): Kun Ren, an experienced software developer and educator, specializes in languages for data analysis, including R. With years of practical experience and teaching R programming, they bring clarity and depth to complex topics. Their approachable writing style ensures learners at any level can engage effectively.

Who is it for? This book is ideal for professionals in data science, statistics, and related fields with basic programming skills who want to learn R programming. It caters to beginners and to those consolidating their knowledge of R who aim to develop practical skills for data manipulation and analysis.

Sams Teach Yourself R in 24 Hours

In just 24 lessons of one hour or less, Sams Teach Yourself R in 24 Hours helps you learn all the R skills you need to solve a wide spectrum of real-world data analysis problems. You’ll master the entire data analysis workflow, learning to build code that’s efficient, reproducible, and suitable for sharing with others.

This book’s straightforward, step-by-step approach teaches you how to import, manipulate, summarize, model, and plot data with R; formalize your analytical code; and build powerful R packages using current best practices. Practical, hands-on examples show you how to apply what you learn. Quizzes and exercises help you test your knowledge and stretch your skills.

Learn how to:
- Install, configure, and explore the R environment, including RStudio
- Use basic R syntax, objects, and packages
- Create and manage data structures, including vectors, matrices, and arrays
- Understand lists and data frames
- Work with dates, times, and factors
- Use common R functions, and learn to write your own
- Import and export data and connect to databases and spreadsheets
- Use the popular tidyr, dplyr and data.table packages
- Write more efficient R code with profiling, vectorization, and initialization
- Plot data and extend your graphical capabilities with ggplot2 and Lattice graphics
- Develop common types of models
- Construct high-quality packages, both simple and complex
- Write R classes: S3, S4, and Reference Classes
- Use R to generate dynamic reports
- Build web applications with Shiny

Register your book at informit.com/register for convenient access to updates and corrections as they become available. This book’s source code can be found at http://www.mango-solutions.com/wp/teach-yourself-r-in-24-hours-book/.

Hands-On Programming with R

Learn how to program by diving into the R language, and then use your newfound skills to solve practical data science problems. With this book, you’ll learn how to load data, assemble and disassemble data objects, navigate R’s environment system, write your own functions, and use all of R’s programming tools.

RStudio Master Instructor Garrett Grolemund not only teaches you how to program, but also shows you how to get more from R than just visualizing and modeling data. You’ll gain valuable programming skills and support your work as a data scientist at the same time.

- Work hands-on with three practical data analysis projects based on casino games
- Store, retrieve, and change data values in your computer’s memory
- Write programs and simulations that outperform those written by typical R users
- Use R programming tools such as if else statements, for loops, and S3 classes
- Learn how to write lightning-fast vectorized R code
- Take advantage of R’s package system and debugging tools
- Practice and apply R programming concepts as you learn them