Python

Statistical Tableau

2024-05-02 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ethan Lang

AI/ML Analytics Data Analytics Tableau data data-science data-science-tasks statistics

In today's data-driven world, understanding statistical models is crucial for effective analysis and decision making. Whether you're a beginner or an experienced user, this book equips you with the foundational knowledge to grasp and implement statistical models within Tableau. Gain the confidence to speak fluently about the models you employ, driving adoption of your insights and analysis across your organization. As AI continues to revolutionize industries, possessing the skills to leverage statistical models is no longer optional—it's a necessity. Stay ahead of the curve and harness the full potential of your data by mastering the ability to interpret and utilize the insights generated by these models. Whether you're a data enthusiast, analyst, or business professional, this book empowers you to navigate the ever-evolving landscape of data analytics with confidence and proficiency. Start your journey toward data mastery today. In this book, you will learn: The basics of foundational statistical modeling with Tableau How to prove your analysis is statistically significant How to calculate and interpret confidence intervals Best practices for incorporating statistics into data visualizations How to connect external analytics resources from Tableau using R and Python

Protocol Buffers Handbook

2024-04-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Clément Jean

API Protobuf data data-engineering protocol-buffers storage-formats

The "Protocol Buffers Handbook" by Clément Jean offers an in-depth exploration of Protocol Buffers (Protobuf), a powerful data serialization format. Learn everything from syntax and schema evolution to custom validations and cross-language integrations. With practical examples in Go and Python, this guide empowers you to efficiently serialize and manage structured data across platforms. What this Book will help me do Develop advanced skills in using Protocol Buffers (Protobuf) for efficient data serialization. Master the key concepts of Protobuf syntax and schema evolution for compatibility. Learn to create custom validation plugins and tailor Protobuf processes. Integrate Protobuf with multiple programming environments, including Go and Python. Automate Protobuf projects using tools like Buf and Bazel to streamline workflows. Author(s) Clément Jean is a skilled programmer and technical writer specializing in data serialization and distributed systems. With substantial experience in developing scalable microservices, he shares valuable insights into using Protocol Buffers effectively. Through this book, Clément offers a hands-on approach to Protobuf, blending theory with practical examples derived from real-world scenarios. Who is it for? This book is perfect for software engineers, system integrators, and data architects who aim to optimize data serialization and APIs, regardless of their programming language expertise. Beginners will grasp foundational Protobuf concepts, while experienced developers will extend their knowledge to advanced, practical applications. Those working with microservices and heavily data-dependent systems will find this book especially relevant.

Mastering Marketing Data Science

2024-04-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Iain Brown

AI/ML Analytics Data Collection Data Science GenAI Marketing NLP SAS data data-science

Unlock the Power of Data: Transform Your Marketing Strategies with Data Science In the digital age, understanding the symbiosis between marketing and data science is not just an advantage; it's a necessity. In Mastering Marketing Data Science: A Comprehensive Guide for Today's Marketers, Dr. Iain Brown, a leading expert in data science and marketing analytics, offers a comprehensive journey through the cutting-edge methodologies and applications that are defining the future of marketing. This book bridges the gap between theoretical data science concepts and their practical applications in marketing, providing readers with the tools and insights needed to elevate their strategies in a data-driven world. Whether you're a master's student, a marketing professional, or a data scientist keen on applying your skills in a marketing context, this guide will empower you with a deep understanding of marketing data science principles and the competence to apply these principles effectively. Comprehensive Coverage: From data collection to predictive analytics, NLP, and beyond, explore every facet of marketing data science. Practical Applications: Engage with real-world examples, hands-on exercises in both Python & SAS, and actionable insights to apply in your marketing campaigns. Expert Guidance: Benefit from Dr. Iain Brown's decade of experience as he shares cutting-edge techniques and ethical considerations in marketing data science. Future-Ready Skills: Learn about the latest advancements, including generative AI, to stay ahead in the rapidly evolving marketing landscape. Accessible Learning: Tailored for both beginners and seasoned professionals, this book ensures a smooth learning curve with a clear, engaging narrative. Mastering Marketing Data Science is designed as a comprehensive how-to guide, weaving together theory and practice to offer a dynamic, workbook-style learning experience. Dr. Brown's voice and expertise guide you through the complexities of marketing data science, making sophisticated concepts accessible and actionable.

Build Your Second Brain One Piece At A Time

2024-04-28 · Data Engineering Podcast Listen

podcast_episode

by Tsavo Knott (Pieces) , Tobias Macey

AI/ML Analytics Cloud Computing Dagster Data Collection Data Engineering Data Lake Data Lakehouse Delta GenAI Hudi Iceberg +3 more

Summary Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. In order to simplify the integration of AI capabilities into developer workflows Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use. In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain.

Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementDagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.Your host is Tobias Macey and today I'm interviewing Tsavo Knott about Pieces, a personal AI toolkit to improve the efficiency of developersInterview IntroductionHow did you get involved in machine learning?Can you describe what Pieces is and the story behind it?The past few months have seen an endless series of personalized AI tools launched. What are the features and focus of Pieces that might encourage someone to use it over the alternatives?model selectionsarchitecture of Pieces applicationlocal vs. hybrid vs. online modelsmodel update/delivery processdata preparation/serving for models in context of Pieces appapplication of AI to developer workflowstypes of workflows that people are building with piecesWhat are the most interesting, innovative, or unexpected ways that you have seen Pieces used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pieces?When is Pieces the wrong choice?What do you have planned for the future of Pieces?Contact Info LinkedInParting Question From your perspective, what is the biggest barrier to adoption of machine learning today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story.Links PiecesNPU == Neural Processing UnitTensor ChipLoRA == Low Rank AdaptationGenerative Adversarial NetworksMistralEmacsVimNeoVimDartFlutte

Episode 179: CheckGrade, ACCU & CppNorth

2024-04-26 · ADSP: Algorithms + Data Structures = Programs Listen

podcast_episode

by Conor Hoekstra , Bryce's mom , Bryce Adelstein Lelbach (NVIDIA)

GitHub

In this episode, Conor and Bryce chat about the CheckGrade problem, ACCU, CppNorth and have a guest appearance from Bryce’s mom! Link to Episode 179 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-04-17 Date Released: 2024-04-26 CheckGrade TweetRuff (Python Tool)Software Unscripted Episode 69: Making Parsing I/O Bound with Daniel LemireThe simdjson libraryCppNorthComposition Intuition - Conor Hoekstra - CppNorth 2023Intro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Python Stream Processing Made Simple

2024-04-25 · Stream Processing Meetup at SumUp Berlin

talk

by Tun Shwe (Quix)

Building a RAG-powered AI chat app with Python and VS Code

2024-04-24 · VS Code Day 2024

talk

AI/ML RAG

Hands-on FiftyOne

2024-04-24 · April 24 - Getting Started with FiftyOne Workshop

workshop

by Dan Gural (Voxel51)

fiftyone

Hands-on FiftyOne: Load datasets from the FiftyOne Dataset Zoo; Navigate the FiftyOne App; Programmatically inspect attributes of a dataset; Add new sample and custom attributes to a dataset; Generate and evaluate model predictions; Save insightful views into the data.

FiftyOne Basics

2024-04-24 · April 24 - Getting Started with FiftyOne Workshop

workshop

by Dan Gural (Voxel51)

fiftyone

First part of the workshop covering FiftyOne basics (terms, architecture, installation, and general usage), plus an overview of useful workflows to explore, understand, and curate your data, and how FiftyOne represents and semantically slices unstructured computer vision data.

Getting started with FiftyOne

2024-04-24 · April 24 - Getting Started with FiftyOne Workshop

workshop

by Dan Gural (Voxel51)

computer vision fiftyone

Hands-on workshop covering FiftyOne basics (terms, architecture, installation, and general usage) and useful workflows to explore, understand, and curate data, followed by a hands-on introduction to FiftyOne (loading datasets from the FiftyOne Dataset Zoo, navigating the FiftyOne App, inspecting dataset attributes, adding samples and custom attributes, generating and evaluating model predictions, and saving insightful views).

Getting started with FiftyOne workshop

2024-04-24 · April 24 - Getting Started with FiftyOne Workshop

workshop

by Dan Gural (Voxel51)

fiftyone

A hands-on workshop introducing FiftyOne basics (terms, architecture, installation, and general usage) and workflows to explore, understand, and curate data, followed by a second half focused on a hands-on introduction to FiftyOne: loading datasets from the FiftyOne Dataset Zoo, navigating the FiftyOne App, programmatically inspecting attributes, adding new samples and custom attributes, generating and evaluating model predictions, and saving insightful views.

Python 3 Data Visualization Using Google Gemini

2024-04-22 · O'Reilly Data Visualization Books O'Reilly Amazon

book

by Oswald Campesato

DataViz LLM Matplotlib Pandas Seaborn programming-languages software-development

This book offers a comprehensive guide to leveraging Python-based data visualization techniques with the innovative capabilities of Google Gemini. Tailored for individuals proficient in Python seeking to enhance their visualization skills, it explores essential libraries like Pandas, Matplotlib, and Seaborn, along with insights into the innovative Gemini platform. With a focus on practicality and efficiency, it delivers a rapid yet thorough exploration of data visualization methodologies, supported by Gemini-generated code samples. Companion files with source code and figures are available for downloading. FEATURES: Covers Python-based data visualization libraries and techniques Includes practical examples and Gemini-generated code samples for efficient learning Integrates Google Gemini for advanced data visualization capabilities Sets up a conducive development environment for a seamless coding experience Includes companion files for downloading with source code and figures

Making Email Better With AI At Shortwave

2024-04-21 · Data Engineering Podcast Listen

podcast_episode

by Andrew Lee (Shortwave) , Tobias Macey

AI/ML Analytics Cloud Computing Dagster Data Engineering Data Lake Data Lakehouse Data Management Delta GenAI Hudi Iceberg +3 more

Summary

Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave he was focused on making email more productive. When AI started gaining adoption he realized that he had even more potential for a transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free! Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Andrew Lee about his work on Shortwave, an AI powered email client

Interview

Introduction How did you get involved in the area of data management? Can you describe what Shortwave is and the story behind it?

What is the core problem that you are addressing with Shortwave?

Email has been a central part of communication and business productivity for decades now. What are the overall themes that continue to be problematic? What are the strengths that email maintains as a protocol and ecosystem? From a product perspective, what are the data challenges that are posed by email? Can you describe how you have architected the Shortwave platform?

How have the design and goals of the product changed since you started it? What are the ways that the advent and evolution of language models have influenced your product roadmap?

How do you manage the personalization of the AI functionality in your system for each user/team? For users and teams who are using Shortwave, how does it change their workflow and communication patterns? Can you describe how I would use Shortwave for managing the workflow of evaluating, planning, and promoting my podcast episodes? What are the most interesting, innovative, or unexpected ways that you have seen Shortwave used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Shortwave? When is Shortwave the wrong choice? What do you have planned for the future of Shortwave?

Contact Info

LinkedIn Blog

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with mach

Pulumi fundamentals: Infrastructure as Code with Python for AWS

2024-04-18 · Live Workshop: Getting Started with Infrastructure as Code on AWS, in Python

workshop

by Marina Novikova (AWS) , Diana Esteves (Pulumi)

AWS Pulumi

A hands-on workshop introducing Pulumi and how to use Python to provision AWS resources. Topics include the Pulumi programming model and how to provision, update, and destroy AWS resources.

Llama.cpp for fun and (maybe) profit

2024-04-16 · AI and Deep Learning for Enterprise #15

talk

by ian osvald (Mor Consulting)

cpu gpu llama.cpp llms quantised models

Running models locally on the CPU and possibly a GPU means we can experiment with the latest quantised models on real client data without anything leaving the machine. We can explore text question answering, image analysis and calling these tools via a Python API for rapid PoC experimentation. This quickly exposes the ways that LLMs go weird and maybe that helps us avoid some of the examples of early LLM deployments making embarrassing mistakes!

Data Science Fundamentals with R, Python, and Open Data

2024-04-16 · O'Reilly Data Science Books O'Reilly Amazon

book

by Marco Cremonini

Computer Science CSV Data Science programming-languages software-development

Data Science Fundamentals with R, Python, and Open Data Introduction to essential concepts and techniques of the fundamentals of R and Python needed to start data science projects Organized with a strong focus on open data, Data Science Fundamentals with R, Python, and Open Data discusses concepts, techniques, tools, and first steps to carry out data science projects, with a focus on Python and RStudio, reflecting a clear industry trend emerging towards the integration of the two. The text examines intricacies and inconsistencies often found in real data, explaining how to recognize them and guiding readers through possible solutions, and enables readers to handle real data confidently and apply transformations to reorganize, indexing, aggregate, and elaborate. This book is full of reader interactivity, with a companion website hosting supplementary material including datasets used in the examples and complete running code (R scripts and Jupyter notebooks) of all examples. Exam-style questions are implemented and multiple choice questions to support the readers’ active learning. Each chapter presents one or more case studies. Written by a highly qualified academic, Data Science Fundamentals with R, Python, and Open Data discuss sample topics such as: Data organization and operations on data frames, covering reading CSV dataset and common errors, and slicing, creating, and deleting columns in R Logical conditions and row selection, covering selection of rows with logical condition and operations on dates, strings, and missing values Pivoting operations and wide form-long form transformations, indexing by groups with multiple variables, and indexing by group and aggregations Conditional statements and iterations, multicolumn functions and operations, data frame joins, and handling data in list/dictionary format Data Science Fundamentals with R, Python, and Open Data is a highly accessible learning resource for students from heterogeneous disciplines where Data Science and quantitative, computational methods are gaining popularity, along with hard sciences not closely related to computer science, and medical fields using stochastic and quantitative models.

Software Engineering for Data Scientists

2024-04-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Catherine Nelson

API Data Science NumPy Pandas data data-science

Data science happens in code. The ability to write reproducible, robust, scaleable code is key to a data science project's success—and is absolutely essential for those working with production code. This practical book bridges the gap between data science and software engineering, and clearly explains how to apply the best practices from software engineering to data science. Examples are provided in Python, drawn from popular packages such as NumPy and pandas. If you want to write better data science code, this guide covers the essential topics that are often missing from introductory data science or coding classes, including how to: Understand data structures and object-oriented programming Clearly and skillfully document your code Package and share your code Integrate data science code with a larger code base Learn how to write APIs Create secure code Apply best practices to common tasks such as testing, error handling, and logging Work more effectively with software engineers Write more efficient, maintainable, and robust code in Python Put your data science projects into production And more

Identify Application Vulnerabilities with Security Command Center

2024-04-11 · Google Cloud Next '24

session

by Sean Lubbers (Google Cloud)

Cloud Computing GCP Kubernetes Cyber Security

Web Security Scanner identifies security vulnerabilities in your App Engine, Google Kubernetes Engine (GKE), and Compute Engine web applications. This service crawls your application, following all links within the scope of your starting URLs, and attempts to exercise as many user inputs and event handlers as possible. It can automatically scan and detect four common vulnerabilities, including cross-site-scripting (XSS), flash injection, mixed content (HTTP in HTTPS), and outdated/insecure libraries. In this spotlight lab, you will use Web Security Scanner—one of Security Command Center's built-in services—to scan a Python Flask application for vulnerabilities.

Click the blue “Learn more” button above to tap into special offers designed to help you implement what you are learning at Google Cloud Next 25.

BigQuery Studio and BigFrames: Unlocking the Power of Scalable Data Science

2024-04-11 · Google Cloud Next '24

session

by Jeff Nelson (Google Cloud)

AI/ML Analytics API BigQuery Cloud Computing Data Management Data Science GCP

BigQuery Studio and BigFrames are a powerful combination for scalable data science and analytics. Unify data management, analysis, and collaboration with BigQuery Studio’s intuitive interface. Scale data science and machine learning with BigFrames’ powerful Python API. Get deeper insights, faster.

Click the blue “Learn more” button above to tap into special offers designed to help you implement what you are learning at Google Cloud Next 25.

Cloud-powered, API-first testing with Testcontainers and Kotlin

2024-04-10 · Google Cloud Next '24

session

by Oleg Šelajev (Docker)

API BigQuery Cloud Computing GCP Cloud Run Java JavaScript Pub/Sub Rust

Adequately testing systems that use Google Cloud services can be a serious challenge. In this session we’ll show you how to shift testing to an API-first approach using Testcontainers. This approach helps us improve the feedback cycle and reliability for both our inner-dev loop and our competitive intelligence cycle. We’ll go through an end-to-end example that uses BigQuery and PubSub, Cloud Build, and Cloud Run. Examples will use Kotlin but it could be accomplished with other languages including Rust, Go, JavaScript, Python, Java, and more.

Click the blue “Learn more” button above to tap into special offers designed to help you implement what you are learning at Google Cloud Next 25.

talk-data.com

Activity Trend

Top Events

Top Speakers

Statistical Tableau

Protocol Buffers Handbook

Mastering Marketing Data Science

Build Your Second Brain One Piece At A Time

Episode 179: CheckGrade, ACCU & CppNorth

Python Stream Processing Made Simple

Building a RAG-powered AI chat app with Python and VS Code

Hands-on FiftyOne

FiftyOne Basics

Getting started with FiftyOne

Getting started with FiftyOne workshop

Python 3 Data Visualization Using Google Gemini

Making Email Better With AI At Shortwave

Pulumi fundamentals: Infrastructure as Code with Python for AWS

Llama.cpp for fun and (maybe) profit

Data Science Fundamentals with R, Python, and Open Data

Software Engineering for Data Scientists

Identify Application Vulnerabilities with Security Command Center

BigQuery Studio and BigFrames: Unlocking the Power of Scalable Data Science

Cloud-powered, API-first testing with Testcontainers and Kotlin