talk-data.com

Showing 2 results

Activities & events

Title & Speakers Event
Jonathan Rioux – author

Think big about your data! PySpark brings the powerful Spark big data processing engine to the Python ecosystem, letting you seamlessly scale up your data tasks and create lightning-fast pipelines.

In Data Analysis with Python and PySpark you will learn how to:
- Manage your data as it scales across multiple machines
- Scale up your data programs with full confidence
- Read and write data to and from a variety of sources and formats
- Deal with messy data with PySpark’s data manipulation functionality
- Discover new data sets and perform exploratory data analysis
- Build automated data pipelines that transform, summarize, and get insights from data
- Troubleshoot common PySpark errors
- Create reliable long-running jobs

Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects. Packed with relevant examples and essential techniques, this practical book teaches you to build pipelines for reporting, machine learning, and other data-centric tasks. Quick exercises in every chapter help you practice what you’ve learned and rapidly start implementing PySpark in your data systems. No previous knowledge of Spark is required.

About the Technology
The Spark data processing engine is an amazing analytics factory: raw data comes in, insight comes out. PySpark wraps Spark’s core engine with a Python-based API. It helps simplify Spark’s steep learning curve and makes this powerful tool available to anyone working in the Python data ecosystem.

About the Book
Data Analysis with Python and PySpark helps you solve the daily challenges of data science with PySpark. You’ll learn how to scale your processing capabilities across multiple machines while ingesting data from any source, whether that’s Hadoop clusters, cloud data storage, or local data files. Once you’ve covered the fundamentals, you’ll explore the full versatility of PySpark by building machine learning pipelines and blending Python, pandas, and PySpark code.

What's Inside
- Organizing your PySpark code
- Managing your data, no matter the size
- Scaling up your data programs with full confidence
- Troubleshooting common data pipeline problems
- Creating reliable long-running jobs

About the Reader
Written for data scientists and data engineers comfortable with Python.

About the Author
As an ML director for a data-driven software company, Jonathan Rioux uses PySpark daily. He teaches the software to data scientists, engineers, and data-savvy business analysts.

Quotes
A clear and in-depth introduction for truly tackling big data with Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine
The perfect way to learn how to analyze and master huge datasets. - Gary Bake, Brambles
Covers both basic and more advanced topics of PySpark, with a good balance between theory and hands-on. - Philippe Van Bergenl, P² Consulting
For beginner to pro, a well-written book to help understand PySpark. - Raushan Kumar Jha, Microsoft

data data-engineering apache-spark PySpark AI/ML Analytics API Big Data Cloud Computing Data Science Hadoop Microsoft Pandas Python Spark
O'Reilly Data Engineering Books
Elijah Meeks – author

D3.js in Action, Second Edition is completely revised and updated for D3 v4 and ES6. It's a practical tutorial for creating interactive graphics and data-driven applications using D3.

About the Technology
Visualizing complex data is hard. Visualizing complex data on the web is darn near impossible without D3.js. D3 is a JavaScript library that provides a simple but powerful data visualization API over HTML, CSS, and SVG. Start with a structure, dataset, or algorithm; mix in D3; and you can programmatically generate static, animated, or interactive images that scale to any screen or browser. It's easy, and after a little practice, you'll be blown away by how beautiful your results can be!

About the Book
D3.js in Action, Second Edition is a completely updated revision of Manning's bestselling guide to data visualization with D3. You'll explore dozens of real-world examples in full color, including force and network diagrams, workflow illustrations, geospatial constructions, and more! Along the way, you'll pick up best practices for building interactive graphics, animations, and live data representations. You'll also step through a fully interactive application created with D3 and React.

What's Inside
- Rich full-color diagrams and illustrations
- Updated for D3 v4 and ES6
- Reusable layouts and components
- Geospatial data visualizations
- Mixed-mode rendering

About the Reader
Suitable for web developers with HTML, CSS, and JavaScript skills. No specialized data science skills required.

About the Author
Elijah Meeks is a senior data visualization engineer at Netflix.

Quotes
From basic to complex, this book gives you the tools to create beautiful data visualizations. - Claudio Rodriguez, Cox Media Group
The best reference for one of the most useful DataViz tools. - Jonathan Rioux, TD Insurance
From toy examples to techniques for real projects. Shows how all the pieces fit together. - Scott McKissock, USAID
A clever way to immerse yourself in the D3.js world. - Felipe Vildoso Castillo, University of Chile

data data-science data-science-tasks data-visualization d3 API Data Science DataViz HTML JavaScript React
O'Reilly Data Science Books