talk-data.com

Topic

Data Science

machine_learning statistics analytics


Activity Trend

Peak: 68 activities per quarter (2020-Q1 to 2026-Q1)

Activities

1516 activities · Newest first

podcast_episode
by Nikisha Alcindor (STEM Educational Institute (SEI))

Studies have shown that companies lacking racial diversity also lag in their ability to innovate, which makes it important for any organization to prioritize an inclusive workplace culture and welcome more women and underrepresented groups into data. This is why Nikisha Alcindor's work is so vital to the future of the data science industry. Nikisha is the President and Founder of the STEM Educational Institute (SEI), a nonprofit corporation that equips underrepresented high school students with the technological skills needed to build generational wealth and be effective in the workforce. Nikisha is a strategic management leader with expertise in organizational change, investing, and fundraising. She is a recipient of the 2021 Dean Huss Teaching Award, a board member of the Upper Manhattan Empowerment Zone, and has taught a master class at Columbia Business School as well as giving several guest lectures at Columbia University. Throughout the episode, we discuss SEI's three-pillar approach to education, the rising importance of STEM-based careers, why financial literacy is crucial to a student's success, SEI's partnership with DataCamp, contextualizing educational and upskilling programs to your organization's specific population, how data leaders can positively communicate upskilling initiatives, and much more.

This episode explores the intersection of neuroscience and data science with three experts in the field, Drs. John Darrell van Horn, Tanya Evans, and Teague Henry. As we know, the brain is complicated. People have been charting paths through the brain for decades, making breakthroughs and discoveries that have changed the world. In recent years though, new methodologies in brain research have made significant impacts. Advances in computing power, as well as techniques like machine learning, neural networks, and computer vision, have allowed researchers to ask questions and make discoveries that were not possible even ten years ago. Given these new approaches to studying the world’s most complicated organ, one could say that brain science is data science. Our guests make a compelling case.

Applied Geospatial Data Science with Python

"Applied Geospatial Data Science with Python" introduces readers to the power of integrating geospatial data into data science workflows. This book equips you with practical methods for processing, analyzing, and visualizing spatial data to solve real-world problems. Through hands-on examples and clear, actionable advice, you will master the art of spatial data analysis using Python. What this Book will help me do Learn to process, analyze, and visualize geospatial data using Python libraries. Develop a foundational understanding of GIS and geospatial data science principles. Gain skills in building geospatial AI and machine learning models for specific use cases. Apply geospatial data workflows to practical scenarios like optimization and clustering. Create a portfolio of geospatial data science projects relevant across different industries. Author(s) David S. Jordan is an experienced data scientist with years of expertise in GIS and geospatial analytics. With a passion for making complex topics accessible, David leverages his deep technical knowledge to provide practical, hands-on instruction. His approach emphasizes real-world applications and encourages learners to develop confidence as they work with geospatial data. Who is it for? This book is perfect for data scientists looking to integrate geospatial data analysis into their existing workflows, and GIS professionals seeking to expand into data science. If you already have a basic knowledge of Python for data analysis or data science and want to explore how to work effectively with geospatial data to drive impactful solutions, this is the book for you.

Leading Biotech Data Teams

With hundreds of startups founded each year, the relatively new field of data-focused biotech, or TechBio, is growing rapidly. But without enough experienced practitioners to go around, most organizations hire data scientists with minimal biotech experience and lab scientists who've taken a crash course in data science. This arrangement is problematic: the way lab scientists and data scientists think and work is fundamentally different. But there is a solution. This report introduces biocode principles to help these scientists reframe the way they think about their role, their team's role, and the tools they use to fulfill those roles. Lab and data scientists alike will learn how to address the underlying issues so they can focus on solving these technology problems together. Each of the following chapters presents a vital biocode principle:

"Defining Objectives" explores how to broaden the way teams view their work, shifting from purely technical objectives to organizational-level scientific objectives

"Building Collaborations" encourages teams to focus their energy on collaboration with partner teams rather than guard their time for technical work

"Deploying Tooling" covers ways to coordinate each team's work with the cadence of experiments and lab work

We talked about: 

Dania’s background
Founding the AI Guild
Datalift Summit
Coming up with meetup topics
Diversity in Berlin
Other types of diversity besides gender
The pitfalls of lacking diversity
Creating an environment where people can safely share their experiences
How the AI Guild helps organizations become more diverse
How the AI Guild finds women in the fields of AI and data science
Advice for people in underrepresented groups
Organizing a welcoming environment and creating a code of conduct
AI Guild’s consulting work and community
AI Guild team
Dania’s resource recommendations
Upcoming Datalift Summit

Links:

Call for Speakers for the #datalift summit (Berlin, 14 to 16 June 2023): https://eu1.hubs.ly/H02RXvX0
Coded Bias documentary on Netflix: https://www.netflix.com/de/title/81328723
Book Weapons of Math Destruction by Cathy O'Neil: https://en.wikipedia.org/wiki/Weapons_of_Math_Destruction
Book Lean In by Sheryl Sandberg: https://en.wikipedia.org/wiki/Lean_In

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

The Kaggle Workbook

"The Kaggle Workbook" is an engaging and practical guide for anyone looking to excel in Kaggle competitions by learning from real past case studies and hands-on exercises. Inside, you'll dive deep into key data science concepts, explore how Kaggle Grandmasters tackle challenges, and apply new skills to your own projects. What this Book will help me do Master the methodology used in past Kaggle competitions for real-world applications. Discover and implement advanced data science techniques such as gradient boosting and NLP. Build a portfolio that demonstrates hands-on experience solving complex data problems. Learn time-series forecasting and computer vision by exploring detailed case studies. Develop a practical mindset for competitive data science problem solving. Author(s) Konrad Banachewicz and Luca Massaron bring their expertise as Kaggle Grandmasters to the pages of this book. With extensive experience in data science and collaborative problem-solving, they guide readers through practical exercises with a clear, approachable style. Their passion for sharing knowledge shines through in every chapter. Who is it for? "The Kaggle Workbook" is ideal for aspiring and experienced data scientists who want to sharpen their competitive data science skills. It caters to those with a foundational knowledge of data science and an interest in enhancing it through practical exercises. The book is a perfect fit for anyone aiming to succeed in Kaggle competitions, whether starting out or advancing further.

This is the story of how a warehouse worker pivoted to senior data engineer in just 18 months while tripling her salary. In this episode of The Data Career Podcast, Avery Smith sits down with Kedeisha Bryan to hear how she landed a data job and made that leap.

🌟 Join the data project club!

Use code “25OFF” to get 25% off (first 50 members).

📊 Come to my next free “How to Land Your First Data Job” training

🏫 Check out my 10-week data analytics bootcamp

Kedeisha’s Links:

Connect on LinkedIn
Join Data in Motion Community

Timestamps:

(9:19) - Why you need a sponsor in your life

(11:21) - You need common ground to network genuinely

(27:02) - Tired of sending job applications? Network instead

(30:16) - Know that this is a long game

Connect with Avery:

📺 Subscribe on YouTube: https://www.youtube.com/c/AverySmithDataCareerJumpstart/videos
🎙 Listen to My Podcast: https://podcasts.apple.com/us/podcast/data-career-podcast/id1547386535
👔 Connect with me on LinkedIn: https://www.linkedin.com/in/averyjsmith/
📸 Instagram: https://www.instagram.com/datacareerjumpstart/
🎵 TikTok: https://www.tiktok.com/@verydata

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Experimentation for Engineers

Optimize the performance of your systems with practical experiments used by engineers in the world’s most competitive industries. In Experimentation for Engineers: From A/B testing to Bayesian optimization you will learn how to: design, run, and analyze an A/B test; break the "feedback loops" caused by periodic retraining of ML models; increase experimentation rate with multi-armed bandits; tune multiple parameters experimentally with Bayesian optimization; clearly define business metrics used for decision-making; and identify and avoid the common pitfalls of experimentation.

Experimentation for Engineers: From A/B testing to Bayesian optimization is a toolbox of techniques for evaluating new features and fine-tuning parameters. You’ll start with a deep dive into methods like A/B testing, and then graduate to advanced techniques used to measure performance in industries such as finance and social media. Learn how to evaluate the changes you make to your system and ensure that your testing doesn’t undermine revenue or other business metrics. By the time you’re done, you’ll be able to seamlessly deploy experiments in production while avoiding common pitfalls.

About the Technology: Does my software really work? Did my changes make things better or worse? Should I trade features for performance? Experimentation is the only way to answer questions like these. This unique book reveals sophisticated experimentation practices developed and proven in the world’s most competitive industries that will help you enhance machine learning systems, software applications, and quantitative trading solutions.

About the Book: Experimentation for Engineers: From A/B testing to Bayesian optimization delivers a toolbox of processes for optimizing software systems. You’ll start by learning the limits of A/B testing, and then graduate to advanced experimentation strategies that take advantage of machine learning and probabilistic methods. The skills you’ll master in this practical guide will help you minimize the costs of experimentation and quickly reveal which approaches and features deliver the best business results.

What's Inside: design, run, and analyze an A/B test; break the “feedback loops” caused by periodic retraining of ML models; increase experimentation rate with multi-armed bandits; tune multiple parameters experimentally with Bayesian optimization.

About the Reader: For ML and software engineers looking to extract the most value from their systems. Examples in Python and NumPy.

About the Author: David Sweet has worked as a quantitative trader at GETCO and a machine learning engineer at Instagram. He teaches in the AI and Data Science master's programs at Yeshiva University.

Quotes:
"Putting an ‘improved’ version of a system into production can be really risky. This book focuses you on what is important!" (Simone Sguazza, University of Applied Sciences and Arts of Southern Switzerland)
"A must-have for anyone setting up experiments, from A/B tests to contextual bandits and Bayesian optimization." (Maxim Volgin, KLM)
"Shows a non-mathematical programmer exactly what they need to write powerful mathematically-based testing algorithms." (Patrick Goetz, The University of Texas at Austin)
"Gives you the tools you need to get the most out of your experiments." (Marc-Anthony Taylor, Raiffeisen Bank International)
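The "design, run, and analyze an A/B test" workflow mentioned above can be sketched in a few lines of plain Python. This is one common analysis choice (a two-sided two-proportion z-test), not the book's specific method, and the conversion counts are invented:

```python
from math import sqrt, erf

def ab_test_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test: did variant B convert differently from A?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# 500 of 10,000 control users converted vs 600 of 10,000 treated users
p = ab_test_pvalue(500, 10_000, 600, 10_000)
print(f"p-value = {p:.4f}")
```

A small p-value (here well under 0.05) suggests the lift is unlikely to be noise; the book's later chapters cover techniques like bandits and Bayesian optimization that go beyond this fixed-horizon test.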

The most common application for data science is to solve problems within your own organization, and as professionals become more data literate, they rely less and less on others to solve their problems and unlock professional growth and career advancement. But in the world of consulting, data science is used to solve other people’s problems, which adds an additional layer of complexity since consultants aren’t always given all of the tools they need to do the job right. Enter Pratik Agrawal, a Partner at Kearney Analytics leading the automotive and industrial transportation sector. In this episode, we are taking a look at how data science is applied in the consulting industry and what skills are critical to be a successful data science consultant.  As a software engineer and data scientist with over a decade of experience in the consulting world at companies like Boston Consulting Group and IRI, Pratik has a deep understanding of how to navigate the industry and how data science can be leveraged in it, as well as expertise in digital transformation projects and strategy. Throughout the episode, we discuss common problems that consultants encounter, the skills needed to be successful as a consultant, the different approaches to analytics in consulting versus in an organization, how to handle context switching when juggling multiple projects, what makes consulting feel exciting and challenging, and much more.

We talked about:

Tatiana’s background
Going from academia to healthcare to the tech industry
What staff engineers do
Transferring skills from academia to industry and learning new ones
The importance of having mentors
Skipping junior and mid-level straight into the staff role
Convincing employers that you can take on a lead role
Seeing failure as a learning opportunity
Preparing for coding interviews
Preparing for behavioral and system design interviews
The importance of having a network and doing mock interviews
How much do staff engineers work with building pipelines, data science, ETL, MLOps, etc.?
Context switching
Advice for those going from academia to industry
The most exciting thing about working as an AI staff engineer
Tatiana’s book recommendations

Links:

LinkedIn: https://www.linkedin.com/in/tatigabru/
Twitter: https://twitter.com/tatigabru
GitHub: https://github.com/tatigabru
Website: http://tatigabru.com/

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

One of the toughest parts of any data project is experimentation, not just because you need to choose the right testing method to confirm the project’s effectiveness, but because you also need to make sure you are testing the right hypothesis and measuring the right KPIs to ensure you receive accurate results. One of the most effective methods for data experimentation is A/B testing, and Anjali Mehra, Senior Director of Product Analytics, Data Science, Experimentation, and Instrumentation at DocuSign, is no stranger to how A/B testing can impact multiple parts of any organization. Throughout her career, she has also worked in marketing analytics and customer analytics at companies like Shutterfly, Wayfair, and Constant Contact. Throughout the episode, we discuss DocuSign’s analytics goals, how A/B testing works, how to gamify data experimentation, how A/B testing helps with new initiative validation, examples of A/B testing with data projects, how organizations can get started with data experimentation, and much more.
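One practical question behind the A/B testing discussed in this episode is how many users a test needs before its result means anything. A hedged sketch using the standard two-proportion sample-size approximation (the baseline rate and lift below are hypothetical):

```python
from statistics import NormalDist
from math import ceil

def samples_per_variant(p_base, mde, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect an absolute lift `mde`
    over a baseline conversion rate `p_base` with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p2 = p_base + mde
    # Sum of Bernoulli variances for the two variants
    variance = p_base * (1 - p_base) + p2 * (1 - p2)
    return ceil(((z_alpha + z_beta) ** 2 * variance) / mde ** 2)

# Detecting a 1-point absolute lift on a 5% baseline at 80% power
print(samples_per_variant(0.05, 0.01))
```

Halving the detectable lift roughly quadruples the required sample, which is why small effects need long-running experiments.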

We invited Grupo Boticário to share how they apply Data Science models in retail in practice. Their Data Science team walked us through some case studies of recommendation models for e-commerce, and how they work with salespeople by giving them tools to sell more.

The projects they presented to show a bit of how they work with data were: BotiEssence, which suggests products to end consumers; BotiColorista, which classifies colour trends; and Lyra, an initiative to make the product performance team's processes more data-driven. It consists of delivering Data Science and BI solutions to areas such as Formulation Stability and Safety, Ecotoxicity, Consumer Science, Materials Technology, and Cosmetovigilance.

The results of State of Data Brazil are now live! It is the largest survey of the data market in Brazil. To download the report, click here.

Meet our guests: Carlos Fonseca on LinkedIn, Giuliana de Jong on LinkedIn, Raphael Corrêa on LinkedIn

Data Mining and Predictive Analytics for Business Decisions

With many recent advances in data science, we have many more tools and techniques available for data analysts to extract information from data sets. This book will assist data analysts to move up from simple tools such as Excel for descriptive analytics to answer more sophisticated questions using machine learning. Most of the exercises use R and Python, but rather than focus on coding algorithms, the book employs interactive interfaces to these tools to perform the analysis.

Using the CRISP-DM data mining standard, the early chapters cover conducting the preparatory steps in data mining: translating business information needs into framed analytical questions and data preparation. The Jamovi and JASP interfaces are used with R, and the Orange3 data mining interface with Python. Where appropriate, Voyant and other open-source programs are used for text analytics. The techniques covered in this book range from basic descriptive statistics, such as summarization and tabulation, to more sophisticated predictive techniques, such as linear and logistic regression, clustering, classification, and text analytics. Includes companion files with case study files, solution spreadsheets, data sets and charts, etc. from the book.

Features:
Covers basic descriptive statistics, such as summarization and tabulation, to more sophisticated predictive techniques, such as linear and logistic regression, clustering, classification, and text analytics
Uses R, Python, Jamovi and JASP interfaces, and the Orange3 data mining interface
Includes companion files with the case study files from the book, solution spreadsheets, data sets, etc.
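As a taste of the predictive techniques such a book covers, simple linear regression has a closed form that fits in a few lines of Python (the spend/sales figures below are invented for illustration, not from the book's data sets):

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b*x for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx  # intercept makes the line pass through the means
    return a, b

# Hypothetical ad spend (k$) vs weekly sales (k$)
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]
a, b = fit_line(spend, sales)
print(a, b)  # intercept 1.0, slope 2.0
```

Tools like Jamovi, JASP, and Orange3 compute exactly this fit (plus diagnostics) behind their point-and-click interfaces.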

You just learned SQL, Python, or Tableau, but you don’t know how to build your first data science project? In this episode, Avery shares a 3-step guide to building your first data science project.

🌟 Join the data project club!

Use code “25OFF” to get 25% off (first 50 members).

📊 Come to my next free “How to Land Your First Data Job” training

🏫 Check out my 10-week data analytics bootcamp

Timestamps:

(1:28) - Art is theft, and so is the data science project

(4:02) - Find ideas on Towards Data Science Medium

(5:32) - Read a few articles to get inspiration

(6:05) - Avery’s strategy is doing 30 projects in 30 days

(9:08) - How academia finds inspiration to write

(11:01) - Take Avery’s project, replicate and do it

Mentioned Links:

Building 30 Data Science Projects in 30 days: https://youtu.be/kKmA9ihIg20

30 Data Science Projects Resources: https://www.datacareerjumpstart.com/30projectsresourcesignup

I Used Data Science to UNCOVER McDonald’s Healthiest Meal: https://youtu.be/3bbFc1225-4

Connect with Avery:

📺 Subscribe on YouTube: https://www.youtube.com/c/AverySmithDataCareerJumpstart/videos
🎙 Listen to My Podcast: https://podcasts.apple.com/us/podcast/data-career-podcast/id1547386535
👔 Connect with me on LinkedIn: https://www.linkedin.com/in/averyjsmith/
📸 Instagram: https://www.instagram.com/datacareerjumpstart/
🎵 TikTok: https://www.tiktok.com/@verydata

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

As the digital landscape evolves, privacy concerns and regulations are becoming increasingly important for advertisers. With the decline of third-party cookies and the rise of individual data usage consent, measuring advertising attention is more crucial than ever. One of the biggest challenges for advertisers in a cookie-less world is accurately measuring the effectiveness of their campaigns. Without cookies, it's harder to track user behaviour and understand how ads are performing. Alternative methods such as viewability, brand lift studies, and surveys can be helpful, but they provide vague and delayed signals about advertising effectiveness. How can advertisers measure the attention and effectiveness of their advertising in real time? To answer this question, I recently spoke to John Hawkins, Chief Scientist at Playground XYZ. Playground XYZ provides a machine learning-based platform for measuring and maximising attention on digital ads. The company’s Attention Intelligence Platform is a unique technology that uses over 40 different signals to track user attention as it happens. In this episode of Leaders of Analytics, we discuss:

How Playground’s attention measurement platform works in practice
The importance of attention time in a world without cookies, where privacy and consent are increasingly mandated
Dealing with the complexities of multi-layered machine learning pipelines and convincing stakeholders of their value
How data science professionals can foster the right non-data-science skills that will make them true unicorns, and much more.

John on LinkedIn: https://www.linkedin.com/in/hawkinsjohnc/
John's book, Getting Data Science Done.

R All-in-One For Dummies

A deep dive into the programming language of choice for statistics and data With R All-in-One For Dummies, you get five mini-books in one, offering a complete and thorough resource on the R programming language and a road map for making sense of the sea of data we're all swimming in. Maybe you're pursuing a career in data science, maybe you're looking to infuse a little statistics know-how into your existing career, or maybe you're just R-curious. This book has your back. Along with providing an overview of coding in R and how to work with the language, this book delves into the types of projects and applications R programmers tend to tackle the most. You'll find coverage of statistical analysis, machine learning, and data management with R. Grasp the basics of the R programming language and write your first lines of code Understand how R programmers use code to analyze data and perform statistical analysis Use R to create data visualizations and machine learning programs Work through sample projects to hone your R coding skill This is an excellent all-in-one resource for beginning coders who'd like to move into the data space by knowing more about R.

Summary

This podcast started almost exactly six years ago, and the technology landscape was much different than it is now. In that time there have been a number of generational shifts in how data engineering is done. In this episode I reflect on some of the major themes and take a brief look forward at some of the upcoming changes.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Your host is Tobias Macey, and today I'm reflecting on the major trends in data engineering over the past 6 years.

Interview

Introduction
6 years of running the Data Engineering Podcast
Around the first time that data engineering was discussed as a role

Followed on from hype about "data science"

Hadoop era
Streaming
Lambda and Kappa architectures

Not really referenced anymore

"Big Data" era of capture everything has shifted to focusing on data that presents value

Regulatory environment increases risk, better tools introduce more capability to understand what data is useful

Data catalogs

Amundsen and Alation

Orchestration engine

Oozie, etc. -> Airflow and Luigi -> Dagster, Prefect, Flyte, etc.
Orchestration is now a part of most vertical tools

Cloud data warehouses
Data lakes
DataOps and MLOps
Data quality to data observability
Metadata for everything

Data catalog -> data discovery -> active metadata

Business intelligence

Read-only reports to metric/semantic layers
Embedded analytics and data APIs

Rise of ELT

dbt
Corresponding introduction of reverse ETL

What are the most interesting, unexpected, or challenging lessons that you have learned while running the podcast?
What do you have planned for the future of the podcast?

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Sponsored By: Materialize

Looking for the simplest way to get the freshest data possible to your teams? Because let's face it: if real-time were easy, everyone would be using it. Look no further than Materialize, the streaming database you already know how to use.

Materialize’s PostgreSQL-compatible interface lets users leverage the tools they already use, with unsurpassed simplicity enabled by full ANSI SQL support. Delivered as a single platform with the separation of storage and compute, strict-serializability, active replication, horizontal scalability and workload isolation — Materialize is now the fastest way to build products with streaming data, drastically reducing the time, expertise, cost and maintenance traditionally associated with implementation of real-time features.

Sign up now for early access to Materialize and get started with the power of streaming data with the same simplicity and low implementation cost as batch cloud data warehouses.

Go to materialize.com

Support Data Engineering Podcast

We talked about:

Chris’s background
Switching careers multiple times
Freedom at companies
Chris’s role as an internal consultant
Chris’s sabbatical
ChatGPT
How being a generalist helped Chris in his career
The cons of being a generalist and the importance of T-shaped expertise
The importance of learning things you’re interested in
Tips to enjoy learning new things
Recruiting generalists
The job market for generalists vs for specialists
Narrowing down your interests
Chris’s book recommendations

Links:

Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman
Andrej Karpathy, former Senior Director of AI at Tesla, who's now focused on teaching and sharing his knowledge: https://www.youtube.com/@AndrejKarpathy
Beautifully done videos on the engineering of things in the real world: https://www.youtube.com/@RealEngineering
Chris' website: https://szafranek.net/
Zalando Tech Radar: https://opensource.zalando.com/tech-radar/
Modal Labs, a new way of deploying code to the cloud, also useful for testing ML code on GPUs: https://modal.com
Excellent Twitter account to follow to learn more about prompt engineering for ChatGPT: https://twitter.com/goodside
Image prompts for Midjourney: https://twitter.com/GuyP
Machine Learning Workflows in Production - Krzysztof Szafranek: https://www.youtube.com/watch?v=CO4Gqd95j6k
From Data Science to DataOps: https://datatalks.club/podcast/s11e03-from-data-science-to-dataops.html

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Kris Ewald will give you an overview of innovations in Data Science you should be aware of. If data-driven insights are key to competitiveness, you need to keep innovating on how you collect, manage, and challenge data. With plenty of other talks about very specific tools and data analytics frameworks, this talk will instead aim to inspire you to apply new approaches to your data science: it'll give you a list of topics you should care about and pay attention to. Expect to hear about zero-knowledge proofs, homomorphic encryption, DAGs, blockchain, and data as value objects.

Graph Data Science with Neo4j

"Graph Data Science with Neo4j" teaches you how to utilize Neo4j 5 and its Graph Data Science Library 2.0 for analyzing and making predictions with graph data. By integrating graph algorithms into actionable machine learning pipelines using Python, you'll harness the power of graph-based data models. What this Book will help me do Query and manipulate graph data using Cypher in Neo4j. Design and implement graph datasets using your data and public sources. Utilize graph-specific algorithms for tasks such as link prediction. Integrate graph data science pipelines into machine learning projects. Understand and apply predictive modeling using the GDS Library. Author(s) None Scifo, the author of "Graph Data Science with Neo4j," is an experienced data scientist with expertise in graph databases and advanced machine learning techniques. Their technical approach combines practical implementation with clear, step-by-step guidance to provide readers the skills they need to excel. Who is it for? This book is ideal for data scientists and analysts familiar with basic Neo4j concepts and Python-based data science workflows who wish to deepen their skills in graph algorithms and machine learning integration. It is particularly suited for professionals aiming to advance their expertise in graph data science for practical applications.