talk-data.com

Topic

Keras

deep_learning neural_networks machine_learning

23 tagged

Activity Trend: peak of 2 per quarter, 2020-Q1 to 2026-Q1

Activities

23 activities · Newest first

Bridging Accessibility and AI: Sign Language Recognition & Inclusive Design with Sheida Rashidi

As AI continues to shape human-computer interaction, there’s a growing opportunity and responsibility to ensure these technologies serve everyone, including people with communication disabilities. In this talk, I will present my ongoing work in developing a real-time American Sign Language (ASL) recognition system, and explore how integrating accessible design principles into AI research can expand both usability and impact.

The core of the talk will cover the Sign Language Recogniser project (available on GitHub), in which I used MediaPipe Studio together with TensorFlow, Keras, and OpenCV to train a model that classifies ASL letters from hand-tracking features.

I’ll share the methodology: data collection, feature extraction via MediaPipe, model training, and demo/testing results. I’ll also discuss challenges encountered, such as dealing with gesture variability, lighting and camera differences, latency constraints, and model generalization.
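The feature-extraction step can be pictured with a minimal sketch. It assumes each hand arrives as 21 (x, y) landmark pairs, the number MediaPipe's hand tracking reports, and it makes the features independent of where the hand sits in the frame and how large it appears. The function name and the normalization scheme are illustrative assumptions, not taken from the Sign Language Recogniser project.

```python
from typing import List, Sequence, Tuple

NUM_LANDMARKS = 21  # MediaPipe hand tracking reports 21 landmarks per hand


def landmarks_to_features(points: Sequence[Tuple[float, float]]) -> List[float]:
    """Turn (x, y) hand landmarks into a translation- and scale-invariant vector.

    Hypothetical helper for illustration: the wrist (landmark 0) becomes the
    origin, and all offsets are divided by the largest absolute offset.
    """
    if len(points) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(points)}")

    wrist_x, wrist_y = points[0]
    # Offsets relative to the wrist remove the hand's position in the frame.
    offsets = [(x - wrist_x, y - wrist_y) for x, y in points]

    # Dividing by the largest offset removes the hand's apparent size.
    scale = max(max(abs(dx), abs(dy)) for dx, dy in offsets)
    if scale == 0.0:
        scale = 1.0  # degenerate input: all landmarks coincide

    features: List[float] = []
    for dx, dy in offsets:
        features.extend((dx / scale, dy / scale))
    return features
```

The resulting 42-value vector is the kind of input a small dense classifier could be trained on; a real pipeline would likely also handle the z coordinate and left/right handedness.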

Beyond the technical implementation, I’ll reflect on the broader implications: how accessibility-focused AI projects can promote inclusion, how design decisions affect trust and usability, and how women in AI & data science can lead innovation that is both rigorous and socially meaningful. Attendees will leave with actionable insights for building inclusive AI systems, especially in domains involving rich human modalities such as gesture or sign.

Large language models are often too large to run on personal machines, requiring specialized hardware with massive memory. Quantization provides a way to shrink models, speed them up, and reduce memory usage - all while retaining most of their accuracy.

This talk introduces the fundamentals of neural network quantization and its key techniques, and demonstrates how to apply them using Keras’s extensible quantization framework.
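As a rough illustration of the idea, the sketch below applies symmetric (abs-max) 8-bit quantization to a short list of weights in plain Python. It is a generic textbook scheme, not Keras's implementation, and the function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
weights = [0.82, -1.27, 0.003, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Each weight now fits in one byte instead of four (float32), at the cost
# of a rounding error of at most scale / 2 per weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The trade-off shows up directly: storage drops by roughly a factor of four relative to float32, while small weights such as 0.003 collapse to zero. Practical schemes usually use per-channel scales to limit that loss.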

Deep Learning with Python, Third Edition

The bestselling book on Python deep learning, now covering generative AI, Keras 3, PyTorch, and JAX! Deep Learning with Python, Third Edition puts the power of deep learning in your hands. This new edition includes the latest Keras and TensorFlow features, generative AI models, and added coverage of PyTorch and JAX. Learn directly from the creator of Keras and step confidently into the world of deep learning with Python.

In Deep Learning with Python, Third Edition you’ll discover:
- Deep learning from first principles
- The latest features of Keras 3
- A primer on JAX, PyTorch, and TensorFlow
- Image classification and image segmentation
- Time series forecasting
- Large language models
- Text classification and machine translation
- Text and image generation: build your own GPT and diffusion models!
- Scaling and tuning models

With over 100,000 copies sold, Deep Learning with Python makes it possible for developers, data scientists, and machine learning enthusiasts to put deep learning into action. In this expanded and updated third edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. You'll master state-of-the-art deep learning tools and techniques, from the latest features of Keras 3 to building AI models that can generate text and images.

About the Technology

In less than a decade, deep learning has changed the world, twice. First, Python-based libraries like Keras, TensorFlow, and PyTorch elevated neural networks from lab experiments to high-performance production systems deployed at scale. And now, through large language models and other generative AI tools, deep learning is again transforming business and society. In this new edition, Keras creator François Chollet invites you into this amazing subject in the fluid, mentoring style of a true insider.

About the Book

Deep Learning with Python, Third Edition makes the concepts behind deep learning and generative AI understandable and approachable. This complete rewrite of the bestselling original includes fresh chapters on transformers, building your own GPT-like LLM, and generating images with diffusion models. Each chapter introduces practical projects and code examples that build your understanding of deep learning, layer by layer.

What's Inside
- Hands-on, code-first learning
- Comprehensive, from basics to generative AI
- Intuitive and easy math explanations
- Examples in Keras, PyTorch, JAX, and TensorFlow

About the Reader

For readers with intermediate Python skills. No previous experience with machine learning or linear algebra required.

About the Authors

François Chollet is the co-founder of Ndea and the creator of Keras. Matthew Watson is a software engineer at Google working on Gemini and a core maintainer of Keras.

Quotes

Perfect for anyone interested in learning by doing from one of the industry greats. - Anthony Goldbloom, Founder of Kaggle

A sharp, deeply practical guide that teaches you how to think from first principles to build models that actually work. - Santiago Valdarrama, Founder of ml.school

The most up-to-date and complete guide to deep learning you’ll find today! - Aran Komatsuzaki, EleutherAI

Masterfully conveys the true essence of neural networks. A rare case in recent years of outstanding technical writing. - Salvatore Sanfilippo, Creator of Redis

Cutting Edge Football Analytics using Polars, Keras and Spektral

Football analytics has rapidly evolved over the past five years, becoming a crucial part of professional and fan discourse. While much of the cutting-edge research remains hidden behind the fences of club training grounds, a growing ecosystem of open-source tools now enables anyone to develop advanced football analytics models.

In this talk, I'll showcase key open-source libraries—Polars for high-performance data processing, Keras for deep learning, and Spektral for Graph Neural Networks (GNNs)—to analyze millions of player coordinates from publicly available high-frequency positional tracking data. I'll demonstrate how these tools can be used to build in-game prediction models and extract advanced football metrics that only the most advanced football clubs currently use.
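To make the graph-based framing concrete, here is a minimal sketch of how one frame of tracking data might be turned into the adjacency matrix a GNN consumes. The distance threshold, the sample coordinates, and the function name are assumptions for illustration; the abstract does not say how the talk's graphs are built.

```python
import math
from typing import List, Tuple


def frame_to_adjacency(
    positions: List[Tuple[float, float]], max_distance: float
) -> List[List[int]]:
    """Connect players whose distance is at most max_distance."""
    n = len(positions)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= max_distance:
                adjacency[i][j] = adjacency[j][i] = 1  # undirected edge
    return adjacency


# Hypothetical pitch coordinates (in metres) for four players in one frame.
frame = [(10.0, 30.0), (12.0, 34.0), (40.0, 20.0), (41.0, 22.0)]
adjacency = frame_to_adjacency(frame, max_distance=5.0)
```

With millions of coordinates, the per-frame loop would normally be replaced by vectorized operations, which is where a library such as Polars fits in; the nested loop here only keeps the idea readable.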

Data Without Labels

Discover all-practical implementations of the key algorithms and models for handling unlabeled data. Full of case studies demonstrating how to apply each technique to real-world problems.

In Data Without Labels you’ll learn:
- Fundamental building blocks and concepts of machine learning and unsupervised learning
- Data cleaning for structured and unstructured data like text and images
- Clustering algorithms like K-means, hierarchical clustering, DBSCAN, Gaussian Mixture Models, and spectral clustering
- Dimensionality reduction methods like Principal Component Analysis (PCA), SVD, multidimensional scaling, and t-SNE
- Association rule algorithms like Apriori, ECLAT, and SPADE
- Unsupervised time series clustering, Gaussian Mixture Models, and statistical methods
- Building neural networks such as GANs and autoencoders
- Working with Python tools and libraries like scikit-learn, NumPy, pandas, Matplotlib, Seaborn, Keras, TensorFlow, and Flask
- How to interpret the results of unsupervised learning
- Choosing the right algorithm for your problem
- Deploying unsupervised learning to production
- Maintenance and refresh of an ML solution

Data Without Labels introduces mathematical techniques, key algorithms, and Python implementations that will help you build machine learning models for unannotated data. You’ll discover hands-off and unsupervised machine learning approaches that can still untangle raw, real-world datasets and support sound strategic decisions for your business. Don’t get bogged down in theory: the book bridges the gap between complex math and practical Python implementations, covering end-to-end model development all the way through to production deployment. You’ll discover the business use cases for machine learning and unsupervised learning, and access insightful research papers to complete your knowledge.

About the Technology

Generative AI, predictive algorithms, fraud detection, and many other analysis tasks rely on cheap and plentiful unlabeled data. Machine learning on data without labels, or unsupervised learning, turns raw text, images, and numbers into insights about your customers, accurate computer vision, and high-quality datasets for training AI models. This book will show you how.

About the Book

Data Without Labels is a comprehensive guide to unsupervised learning, offering a deep dive into its mathematical foundations, algorithms, and practical applications. It presents practical examples from retail, aviation, and banking using fully annotated Python code. You’ll explore core techniques like clustering and dimensionality reduction along with advanced topics like autoencoders and GANs. As you go, you’ll learn where to apply unsupervised learning in business applications and discover how to develop your own machine learning models end-to-end.

What's Inside
- Master unsupervised learning algorithms
- Real-world business applications
- Curate AI training datasets
- Explore autoencoders and GANs applications

About the Reader

Intended for data science professionals. Assumes knowledge of Python and basic machine learning.

About the Author

Vaibhav Verdhan is a seasoned data science professional with extensive experience working on data science projects in a large pharmaceutical company.

Quotes

An invaluable resource for anyone navigating the complexities of unsupervised learning. A must-have. - Ganna Pogrebna, The Alan Turing Institute

Empowers the reader to unlock the hidden potential within their data. - Sonny Shergill, AstraZeneca

A must-have for teams working with unstructured data. Cuts through the fog of theory. Explains the theory and delivers practical solutions. - Leonardo Gomes da Silva, onGRID Sports Technology

The Bible for unsupervised learning! Full of real-world applications, clear explanations, and excellent Python implementations. - Gary Bake, Falconhurst Technologies

Machine Learning for Tabular Data

Business runs on tabular data in databases, spreadsheets, and logs. Crunch that data using deep learning, gradient boosting, and other machine learning techniques. Machine Learning for Tabular Data teaches you to train insightful machine learning models on common tabular business data sources such as spreadsheets, databases, and logs. You’ll discover how to use XGBoost and LightGBM on tabular data, optimize deep learning libraries like TensorFlow and PyTorch for tabular data, and use cloud tools like Vertex AI to create an automated MLOps pipeline.

Machine Learning for Tabular Data will teach you how to:
- Pick the right machine learning approach for your data
- Apply deep learning to tabular data
- Deploy tabular machine learning locally and in the cloud
- Build pipelines to automatically train and maintain a model

Machine Learning for Tabular Data covers classic machine learning techniques like gradient boosting, and more contemporary deep learning approaches. By the time you’re finished, you’ll be equipped with the skills to apply machine learning to the kinds of data you work with every day.

About the Technology

Machine learning can accelerate everyday business chores like account reconciliation, demand forecasting, and customer service automation, not to mention more exotic challenges like fraud detection, predictive maintenance, and personalized marketing. This book shows you how to unlock the vital information stored in spreadsheets, ledgers, databases and other tabular data sources using gradient boosting, deep learning, and generative AI.

About the Book

Machine Learning for Tabular Data delivers practical ML techniques to upgrade every stage of the business data analysis pipeline. In it, you’ll explore examples like using XGBoost and Keras to predict short-term rental prices, deploying a local ML model with Python and Flask, and streamlining workflows using large language models (LLMs). Along the way, you’ll learn to make your models both more powerful and more explainable.

What's Inside
- Master XGBoost
- Apply deep learning to tabular data
- Deploy models locally and in the cloud
- Build pipelines to train and maintain models

About the Reader

For readers experienced with Python and the basics of machine learning.

About the Authors

Mark Ryan is the AI Lead of the Developer Knowledge Platform at Google. A three-time Kaggle Grandmaster, Luca Massaron is a Google Developer Expert (GDE) in machine learning and AI. He has published 17 other books.

Deep Learning and AI Superhero

"Deep Learning and AI Superhero" is an extensive resource for mastering the core concepts and advanced techniques in AI and deep learning using TensorFlow, Keras, and PyTorch. This comprehensive guide walks you through topics from foundational neural network concepts to implementing real-world machine learning solutions. You will gain hands-on experience and theoretical knowledge to elevate your AI development skills.

What this Book will help me do
- Develop a solid foundation in neural networks, their structure, and their training methodologies.
- Understand and implement deep learning models using TensorFlow and Keras effectively.
- Gain experience using PyTorch for creating, training, and optimizing advanced machine learning models.
- Learn advanced applications such as CNNs for computer vision, RNNs for sequential data, and Transformers for natural language processing.
- Deploy AI models on cloud and edge platforms through practical examples and optimized workflows.

Author(s)

Cuantum Technologies LLC has established itself as a pioneer in creating educational resources for advanced AI technologies. Their team consists of experts and practitioners in the field, combining years of industry and academic experience. Their books are crafted to ensure readers can practically apply cutting-edge AI techniques with clarity and confidence.

Who is it for?

This book is ideally suited for software developers, AI enthusiasts, and data scientists who have a basic understanding of programming and machine learning concepts. It's perfect for those seeking to enhance their skills and tackle real-world AI challenges. Whether your goals are professional development, research, or personal learning, you'll find practical and detailed guidance throughout this book.

Generative AI is rapidly growing in business and the popular imagination. Google Cloud was at the forefront of this revolution with the introduction of the Transformer architecture in 2017 and, more recently, with the release of Gemini models. This session introduces JAX, a powerful framework and ecosystem for large model development, which we use to develop our Gemini models, and Keras, an easy-to-use, higher-level API for deep learning and gen AI.


Low-Code AI

Take a data-first and use-case-driven approach with Low-Code AI to understand machine learning and deep learning concepts. This hands-on guide presents three problem-focused ways to learn: no-code ML using AutoML, low-code using BigQuery ML, and custom code using scikit-learn and Keras. In each case, you'll learn key ML concepts by using real-world datasets with realistic problems. Business and data analysts get a project-based introduction to ML/AI using a detailed, data-driven approach: loading and analyzing data; feeding data into an ML model; building, training, and testing; and deploying the model into production. Authors Michael Abel and Gwendolyn Stripling show you how to build machine learning models for retail, healthcare, financial services, energy, and telecommunications.

You'll learn how to:
- Distinguish between structured and unstructured data and the challenges they present
- Visualize and analyze data
- Preprocess data for input into a machine learning model
- Differentiate between the regression and classification supervised learning models
- Compare different ML model types and architectures, from no code to low code to custom training
- Design, implement, and tune ML models
- Export data to a GitHub repository for data management and governance

Python Data Analytics: With Pandas, NumPy, and Matplotlib

Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This third edition is fully updated for the latest version of Python and its related libraries, and includes coverage of social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation.

Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Third Edition is an invaluable reference with its examples of storing, accessing, and analyzing data.

What You'll Learn
- Understand the core concepts of data analysis and the Python ecosystem
- Go in depth with pandas for reading, writing, and processing data
- Use tools and techniques for data visualization and image analysis
- Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch

Who This Book Is For

Experienced Python developers who need to learn about Pythonic tools for data analysis

US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight

The US Army Corps of Engineers (USACE) is responsible for maintaining and improving nearly 12,000 miles of shallow-draft (9'-14') inland and intracoastal waterways, 13,000 miles of deep-draft (14' and greater) coastal channels, and 400 ports, harbors, and turning basins throughout the United States. Because these components of the national waterway network are considered assets to both US commerce and national security, they must be carefully managed to keep marine traffic operating safely and efficiently.

The National DQM Program is tasked with providing USACE a nationally standardized remote monitoring and documentation system across multiple vessel types with timely data access, reporting, dredge certifications, data quality control, and data management. Government systems have often lagged behind commercial systems in modernization efforts, and the emergence of the cloud and Data Lakehouse Architectures has empowered USACE to successfully move into the modern data era.

This session incorporates aspects of these topics:
- Data Lakehouse Architecture: Delta Lake, platform security and privacy, serverless, administration, data warehouse, Data Lake, Apache Iceberg, Data Mesh
- GIS: H3, MOSAIC, spatial analysis
- Data engineering: data pipelines, orchestration, CDC, medallion architecture, Databricks Workflows, data munging, ETL/ELT, lakehouses, data lakes, Parquet, Data Mesh, Apache Spark™ internals
- Data streaming: Apache Spark Structured Streaming, real-time ingestion, real-time ETL, real-time ML, real-time analytics, and real-time applications, Delta Live Tables
- ML: PyTorch, TensorFlow, Keras, scikit-learn, Python and R ecosystems
- Data governance: security, compliance, RMF, NIST
- Data sharing: sharing and collaboration, Delta Sharing, data cleanliness, APIs

Talk by: Jeff Mroz


Applied Machine Learning and AI for Engineers

While many introductory guides to AI are calculus books in disguise, this one mostly eschews the math. Instead, author Jeff Prosise helps engineers and software developers build an intuitive understanding of AI to solve business problems. Need to create a system to detect the sounds of illegal logging in the rainforest, analyze text for sentiment, or predict early failures in rotating machinery? This practical book teaches you the skills necessary to put AI and machine learning to work at your company. Applied Machine Learning and AI for Engineers provides examples and illustrations from the AI and ML course Prosise teaches at companies and research institutions worldwide. There's no fluff and no scary equations, just a fast start for engineers and software developers, complete with hands-on examples.

This book helps you:
- Learn what machine learning and deep learning are and what they can accomplish
- Understand how popular learning algorithms work and when to apply them
- Build machine learning models in Python with scikit-learn, and neural networks with Keras and TensorFlow
- Train and score regression models and binary and multiclass classification models
- Build facial recognition models and object detection models
- Build language models that respond to natural-language queries and translate text to other languages
- Use Cognitive Services to infuse AI into the apps that you write

Deep Learning with Python, Second Edition

Printed in full color! Unlock the groundbreaking advances of deep learning with this extensively revised new edition of the bestselling original. Learn directly from the creator of Keras and master practical Python deep learning techniques that are easy to apply in the real world.

In Deep Learning with Python, Second Edition you will learn:
- Deep learning from first principles
- Image classification and image segmentation
- Time series forecasting
- Text classification and machine translation
- Text generation, neural style transfer, and image generation

Deep Learning with Python has taught thousands of readers how to put the full capabilities of deep learning into action. This extensively revised full color second edition introduces deep learning using Python and Keras, and is loaded with insights for both novice and experienced ML practitioners. You’ll learn practical techniques that are easy to apply in the real world, and important theory for perfecting neural networks.

About the Technology

Recent innovations in deep learning unlock exciting new software capabilities like automated language translation, image recognition, and more. Deep learning is quickly becoming essential knowledge for every software developer, and modern tools like Keras and TensorFlow put it within your reach, even if you have no background in mathematics or data science. This book shows you how to get started.

About the Book

Deep Learning with Python, Second Edition introduces the field of deep learning using Python and the powerful Keras library. In this revised and expanded new edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. As you move through this book, you’ll build your understanding through intuitive explanations, crisp color illustrations, and clear examples. You’ll quickly pick up the skills you need to start developing deep-learning applications.

What's Inside
- Deep learning from first principles
- Image classification and image segmentation
- Time series forecasting
- Text classification and machine translation
- Text generation, neural style transfer, and image generation
- Printed in full color throughout

About the Reader

For readers with intermediate Python skills. No previous experience with Keras, TensorFlow, or machine learning is required.

About the Author

François Chollet is a software engineer at Google and creator of the Keras deep-learning library.

Quotes

Chollet is a master of pedagogy and explains complex concepts with minimal fuss, cutting through the math with practical Python code. He is also an experienced ML researcher and his insights on various model architectures or training tips are a joy to read. - Martin Görner, Google

Immerse yourself into this exciting introduction to the topic with lots of real-world examples. A must-read for every deep learning practitioner. - Sayak Paul, Carted

The modern classic just got better. - Edmon Begoli, Oak Ridge National Laboratory

Truly the bible of deep learning. - Yiannis Paraskevopoulos, University of West Attica

Next-Generation Machine Learning with Spark: Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations.

What You Will Learn
- Be introduced to machine learning, Spark, and Spark MLlib 2.4.x
- Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries
- Detect anomalies with the Isolation Forest algorithm for Spark
- Use the Spark NLP and Stanford CoreNLP libraries that support multiple languages
- Optimize your ML workload with the Alluxio in-memory data accelerator for Spark
- Use GraphX and GraphFrames for Graph Analysis
- Perform image recognition using convolutional neural networks
- Utilize the Keras framework and distributed deep learning libraries with Spark

Who This Book Is For

Data scientists and machine learning engineers who want to take their knowledge to the next level and use Spark and more powerful, next-generation algorithms and libraries beyond what is available in the standard Spark MLlib library; also serves as a primer for aspiring data scientists and engineers who need an introduction to machine learning, Spark, and Spark MLlib.

Machine Learning for Finance

Dive deep into how machine learning is transforming the financial industry with 'Machine Learning for Finance'. This comprehensive guide explores cutting-edge concepts in machine learning while providing practical insights and Python code examples to help readers apply these techniques to real-world financial scenarios. Whether tackling fraud detection, financial forecasting, or sentiment analysis, this book equips you with the understanding and tools needed to excel.

What this Book will help me do
- Understand and implement machine learning techniques for structured data, natural language, images, and text.
- Learn Python-based tools and libraries such as scikit-learn, Keras, and TensorFlow for financial data analysis.
- Apply machine learning for tasks like predicting financial trends, detecting fraud, and customer sentiment analysis.
- Explore advanced topics such as neural networks, generative adversarial networks (GANs), and reinforcement learning.
- Gain hands-on experience with machine learning debugging, product launch preparation, and addressing bias in data.

Author(s)

James Le and Jannes Klaas are experts in machine learning applications in financial technology. Jannes has extensive experience training financial professionals on implementing machine learning strategies in their work and pairs this with a deep academic understanding of the topic. Their dedication to empowering readers to confidently integrate AI and machine learning into financial applications shines through in this user-focused, richly detailed book.

Who is it for?

This book is tailored for financial professionals, data scientists, and enthusiasts aiming to harness machine learning's potential in finance. Readers should have a foundational understanding of mathematics, statistics, and Python programming. If you work in financial services and are curious about applications ranging from fraud detection to trend forecasting, this resource is for you. It's designed for those looking to advance their skills and make impactful contributions in financial technology.

Summary

Machine learning is a class of technologies that promise to revolutionize business. Unfortunately, it can be difficult to identify and execute on ways that it can be used in large companies. Kevin Dewalt founded Prolego to help Fortune 500 companies build, launch, and maintain their first machine learning projects so that they can remain competitive in our landscape of constant change. In this episode he discusses why machine learning projects require a new set of capabilities, how to build a team from internal and external candidates, and how an example project progressed through each phase of maturity. This was a great conversation for anyone who wants to understand the benefits and tradeoffs of machine learning for their own projects and how to put it into practice.

Introduction

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
- Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
- To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
- Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
- Your host is Tobias Macey and today I’m interviewing Kevin Dewalt about his experiences at Prolego, building machine learning projects for Fortune 500 companies.

Interview

Introduction How did you get involved in the area of data management? For the benefit of software engineers and team leaders who are new to machine learning, can you briefly describe what machine learning is and why is it relevant to them? What is your primary mission at Prolego and how did you identify, execute on, and establish a presence in your particular market?

How much of your sales process is spent on educating your clients about what AI or ML are and the benefits that these technologies can provide?

What have you found to be the technical skills and capacity necessary for being successful in building and deploying a machine learning project?

When engaging with a client, what have you found to be the most common areas of technical capacity or knowledge that are needed?

Everyone talks about a talent shortage in machine learning. Can you suggest a recruiting or skills development process for companies which need to build out their data engineering practice? What challenges will teams typically encounter when creating an efficient working relationship between data scientists and data engineers? Can you briefly describe a successful project of developing a first ML model and putting it into production?

What is the breakdown of how much time was spent on different activities such as data wrangling, model development, and data engineering pipeline development?

When releasing to production, can you share the types of metrics that you track to ensure the health and proper functioning of the models?

What does a deployable artifact for a machine learning/deep learning application look like?

What basic technology stack is necessary for putting the first ML models into production?

How does the build vs. buy debate break down in this space and what products do you typically recommend to your clients?

What are the major risks associated with deploying ML models and how can a team mitigate them?

Suppose a software engineer wants to break into ML. What data engineering skills would you suggest they learn? How should they position themselves for the right opportunity?

Contact Info

Email: Kevin Dewalt [email protected] and Russ Rands [email protected]

Connect on LinkedIn: Kevin Dewalt and Russ Rands

Twitter: @kevindewalt

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Prolego

Download our book: Become an AI Company in 90 Days

Google Rules Of ML

AI Winter

Machine Learning

Supervised Learning

O’Reilly Strata Conference

GE Rebranding Commercials

Jez Humble: Stop Hiring Devops Experts (And Start Growing Them)

SQL

ORM

Django

RoR

TensorFlow

PyTorch

Keras

Data Engineering Podcast Episode About Data Teams

DevOps For Data Teams – DevOps Days Boston Presentation by Tobias

Jupyter Notebook

Data Engineering Podcast: Notebooks at Netflix

Pandas

Podcast Interview

Joel Grus

JupyterCon Presentation Data Science From Scratch

Expensify

Airflow

James Meickle Interview

Git

Jenkins

Continuous Integration

Practical Deep Learning For Coders Course by Jeremy Howard

Data Carpentry

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

Hands-On Deep Learning with Apache Spark

"Hands-On Deep Learning with Apache Spark" is an essential resource for mastering distributed deep learning frameworks and applications on Apache Spark. Through practical examples and guided tutorials, this book teaches you to deploy scalable deep learning solutions for handling complex data challenges efficiently. What this Book will help me do Understand how to set up Apache Spark for deep learning workflows. Gain practical insight into implementing neural networks, including CNNs and RNNs, on distributed platforms. Learn to train and optimize models using popular frameworks like TensorFlow and Keras. Develop expertise in analyzing large datasets with textual and image-based deep learning methods. Acquire skills to deploy trained models for real-world applications in distributed environments. Author(s) None Iozzia is an accomplished software engineer and data scientist with a strong background in distributed computing and machine learning. With years of experience working with Apache Spark and deep learning technologies, None brings a wealth of practical knowledge to the table. Their passion for providing clear, hands-on guidance makes this book an approachable and valuable resource for learners of all levels. Who is it for? This book is aimed at Scala developers, data scientists, and data analysts who are looking to extend their skill set to include distributed deep learning on Apache Spark. It's ideally suited for readers familiar with machine learning basics and those with prior exposure to Apache Spark workflows. If you aim to create scalable machine learning solutions that handle complex data, this book offers precisely what you need.

Python Data Analytics: With Pandas, NumPy, and Matplotlib

Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This revision is fully updated with new content on social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation.

Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Second Edition is an invaluable reference with its examples of storing, accessing, and analyzing data.

What You'll Learn

Understand the core concepts of data analysis and the Python ecosystem

Go in depth with pandas for reading, writing, and processing data

Use tools and techniques for data visualization and image analysis

Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch

Who This Book Is For

Experienced Python developers who need to learn about Pythonic tools for data analysis