talk-data.com

Topic

LLM

Large Language Models (LLM)

nlp ai machine_learning

1405 tagged

Activity Trend: 158 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1405 activities · Newest first

In this podcast episode, we talked with Bartosz Mikulski about Data Intensive AI.

About the Speaker: Bartosz is an AI and data engineer. He specializes in moving AI projects from the good-enough-for-a-demo phase to production by building a testing infrastructure and fixing the issues detected by tests. On top of that, he teaches programmers and non-programmers how to use AI. He contributed one chapter to the book 97 Things Every Data Engineer Should Know, and he was a speaker at several conferences, including Data Natives, Berlin Buzzwords, and Global AI Developer Days. 

In this episode, we discuss Bartosz’s career journey, the importance of testing in data pipelines, and how AI tools like ChatGPT and Cursor are transforming development workflows. From prompt engineering to building Chrome extensions with AI, we dive into practical use cases, tools, and insights for anyone working in data-intensive AI projects. Whether you’re a data engineer, AI enthusiast, or just curious about the future of AI in tech, this episode offers valuable takeaways and real-world experiences.
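The testing theme above can be made concrete with a small sketch. Everything here is illustrative, not code from the episode: a toy transform step plus the kind of data-quality assertions (schema, nulls, ranges) you might run in a pytest suite before promoting a pipeline's output.

```python
# Minimal sketch of data-pipeline testing (illustrative; not Bartosz's actual code).
# The idea: validate schema, nulls, and ranges on a pipeline's output before shipping it.

def transform(records):
    """Toy pipeline step: keep valid rows and normalize the amount field."""
    out = []
    for r in records:
        if r.get("user_id") is None:
            continue  # drop rows missing a required key
        out.append({"user_id": r["user_id"], "amount": round(float(r["amount"]), 2)})
    return out

def check_output(rows):
    """Data-quality assertions, the kind you would put in a pytest suite."""
    assert all(set(r) == {"user_id", "amount"} for r in rows), "unexpected schema"
    assert all(r["user_id"] is not None for r in rows), "null user_id leaked through"
    assert all(r["amount"] >= 0 for r in rows), "negative amounts are invalid"

raw = [
    {"user_id": 1, "amount": "10.0"},
    {"user_id": None, "amount": "3.0"},  # invalid row, should be dropped
    {"user_id": 2, "amount": "7.5"},
]
clean = transform(raw)
check_output(clean)
print(len(clean))  # 2 valid rows survive
```

The same checks scale up naturally: in a real pipeline they would run against a sample or a staging table rather than an in-memory list.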

0:00 Introduction to Bartosz and his background
4:00 Bartosz’s career journey from Java development to AI engineering
9:05 The importance of testing in data engineering
11:19 How to create tests for data pipelines
13:14 Tools and approaches for testing data pipelines
17:10 Choosing Spark for data engineering projects
19:05 The connection between data engineering and AI tools
21:39 Use cases of AI in data engineering and MLOps
25:13 Prompt engineering techniques and best practices
31:45 Prompt compression and caching in AI models
33:35 Thoughts on DeepSeek and open-source AI models
35:54 Using AI for lead classification and LinkedIn automation
41:04 Building Chrome extensions with AI integration
43:51 Comparing Cursor and GitHub Copilot for coding
47:11 Using ChatGPT and Perplexity for AI-assisted tasks
52:09 Hosting static websites and using AI for development
54:27 How blogging helps attract clients and share knowledge
58:15 Using AI to assist with writing and content creation

🔗 CONNECT WITH Bartosz LinkedIn: https://www.linkedin.com/in/mikulskibartosz/ Github: https://github.com/mikulskibartosz Website: https://mikulskibartosz.name/blog/

🔗 CONNECT WITH DataTalksClub Join the community - https://datatalks.club/slack.html Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ Check other upcoming events - https://lu.ma/dtc-events LinkedIn - https://www.linkedin.com/company/datatalks-club/ Twitter - https://twitter.com/DataTalksClub Website - https://datatalks.club/

Send us a text. Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. DataTopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don’t), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data—unplugged style!

In this episode:
OpenAI asks White House for AI regulation relief: OpenAI seeks federal-level AI policy exceptions in exchange for transparency. But is this a sign they’re losing momentum?
Hot take: GPT-4.5 is a ‘nothing burger’: Is GPT-4.5 actually an upgrade, or just a well-marketed rerun?
Claude 3.7 & blowing €100 in two days: One of the hosts tests Claude extensively—and racks up a pricey bill. Was it worth it?
OpenAI’s Deep Research: How does OpenAI’s new research tool compare to Perplexity?
AI cracks superbug problem in two days: AI speeds up decades of scientific research—should we be impressed or concerned?
European tech coalition demands ‘radical action’ on digital sovereignty: Big names like Airbus and Proton push for homegrown European tech.
Migrating from AWS to a European cloud: A real-world case study on cutting costs by 62%—is it worth the trade-offs?
Docs by the French government: A Notion alternative for open-source government collaboration.
Why people hate note-taking apps: A deep dive into the frustrations with Notion, Obsidian, and alternatives.
Model Context Protocol (MCP): How MCP is changing AI tool integrations—and why OpenAI isn’t on board (yet).
OpenRouter.ai: The one-stop API for switching between AI models. Does it live up to the hype?
OTDiamond.ai: A multi-LLM approach that picks the best model for your queries to balance cost and performance.
Are you polite to AI?: Study finds most people say "please" to ChatGPT—good manners or fear of the AI uprising?
AI refusing to do your work?: A hilarious case of an AI refusing to generate code because it "wants you to learn."
And finally, a big announcement—DataTopics Unplugged is evolving! Stay tuned for an updated format and a fresh take on tech discussions.

Supported by Our Partners
• WorkOS — The modern identity platform for B2B SaaS.
• Vanta — Automate compliance and simplify security with Vanta.

Linux is the most widespread operating system globally, but how is it built? Few people are better placed to answer this than Greg Kroah-Hartman: a Linux kernel maintainer for 25 years, and one of the three Linux Foundation Fellows (the other two are Linus Torvalds and Shuah Khan). Greg manages the Linux kernel’s stable releases and is a maintainer of multiple kernel subsystems. We cover the inner workings of Linux kernel development, exploring everything from how changes get implemented to why its community-driven approach produces such reliable software. Greg shares insights about the kernel's unique trust model and makes a case for why engineers should contribute to open-source projects.

We go into:
• How widespread is Linux?
• What is the Linux kernel responsible for – and why is it a monolith?
• How does a kernel change get merged? A walkthrough
• The 9-week development cycle for the Linux kernel
• Testing the Linux kernel
• Why is Linux so widespread?
• The career benefits of open-source contribution
• And much more!

Timestamps
(00:00) Intro
(02:23) How widespread is Linux?
(06:00) The difference in complexity across devices powered by Linux
(09:20) What is the Linux kernel?
(14:00) Why trust is so important in Linux kernel development
(16:02) A walk-through of a kernel change
(23:20) How Linux kernel development cycles work
(29:55) The testing process for the kernel and KernelCI
(31:55) A case for the open-source development process
(35:44) Linux kernel branches: stable vs. development
(38:32) Challenges of maintaining older Linux code
(40:30) How Linux handles bug fixes
(44:40) The range of work Linux kernel engineers do
(48:33) Greg’s review process and its parallels with Uber’s RFC process
(51:48) The Linux kernel within companies like IBM
(53:52) Why Linux is so widespread
(56:50) How the Linux kernel project runs without product managers
(1:02:01) The pros and cons of using Rust in the Linux kernel
(1:09:55) How LLMs are utilized in bug fixes and coding in Linux
(1:12:13) The value of contributing to the Linux kernel or any open-source project
(1:16:40) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
What TPMs do and what software engineers can learn from them
The past and future of modern backend practices
Backstage: an open-source developer portal

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Data Hackers News is on the air! The hottest topics of the week, with the main news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast! Press play and listen to this week's Data Hackers News now!

To keep up with everything happening in the data world, subscribe to the weekly newsletter: https://www.datahackers.news/

Meet our Data Hackers News commentators: Monique Femme, Paulo Vasconcellos

Other Data Hackers channels: Site, LinkedIn, Instagram, TikTok, YouTube

A challenge I frequently hear about from subscribers to my insights mailing list is how to design B2B data products for multiple user types with differing needs. From dashboards to custom apps and commercial analytics / AI products, data product teams often struggle to create a single solution that meets the diverse needs of technical and business users in B2B settings. If you're encountering this issue, you're not alone!

In this episode, I share my advice for tackling this challenge, including the gift of saying "no." What are the patterns you should be looking out for in your customer research? How can you choose what to focus on with limited resources? What are the design choices you should avoid when trying to build these products? I’m hoping that by the end of this episode, you’ll have some strategies to help reduce the size of this challenge—particularly if you lack a dedicated UX team to help you sort through your various user/stakeholder demands.

Highlights/ Skip to 

The importance of proper user research and clustering “jobs to be done” around business importance vs. task frequency—ignoring the rest until your solution can show measurable value (4:29)
What “level” of skill to design for, and why “as simple as possible” isn’t what I generally recommend (13:44)
When it may be advantageous to use role or feature-based permissions to hide/show/change certain aspects, UI elements, or features (19:50)
Leveraging AI and LLMs in-product to allow learning about the user and progressive disclosure and customization of UIs (26:44)
Leveraging the “old” solution of rapid prototyping—which is now faster than ever with AI, and can accelerate learning (capturing user feedback) (31:14)
5 things I do not recommend doing when trying to satisfy multiple user types in your B2B AI or analytics product (34:14)

Quotes from Today’s Episode

If you're not talking to your users and stakeholders sufficiently, you're going to have a really tough time building a successful data product for one user – let alone for multiple personas. Listen for repeating patterns in what your users are trying to achieve (the tasks they are doing). Focus on the jobs and tasks they do most frequently or the ones that bring the most value to their business. Forget about the rest until you've proven that your solution delivers real value for those core needs. It's more about understanding the problems and needs, not just the solutions. The solutions tend to be easier to design when the problem space is well understood. Users often suggest solutions, but it's our job to focus on the core problem we're trying to solve; simply entering any inbound request verbatim into JIRA and then “eating away” at the list is not usually a reliable strategy. (5:52)

I generally recommend not going for “as easy as possible” at the cost of shallow value. Instead, design for some “mid-level” ability, understanding that this may make early user experiences with the product more difficult. Why? Oversimplification can mislead, because data is complex, problems are multivariate, and data isn't always ideal. There are also “n” number of “not-first” impressions users will have with your product, which means there is only one “first impression.” The idea, conceptually, is to design an amazing experience for the “n” experiences, but not to the point that users never realize value and give up on the product. While I'd prefer no friction, technical products sometimes have to have a little friction up front; however, don't use this as an excuse for poor design. This is hard to get right even when you have design resources, and it’s why UX design matters: thinking this through ends up determining, in part, whether users obtain the promise of value you made to them. (14:21)

As an alternative to rigid role- and feature-based permissions in B2B data products, you might consider leveraging AI and/or LLMs in your UI as a means of simplifying and customizing the UI for particular users. This approach allows users to interrogate the product about the UI, customize the UI, and even lets the product learn over time about the user’s questions (jobs to be done) such that it becomes organically customized to their needs. This is in contrast to the rigid buckets that role- and permission-based customization presents. However, as discussed in my previous episode (164 - “The Hidden UX Taxes that AI and LLM Features Impose on B2B Customers Without Your Knowledge”), designing effective AI features and capabilities can also make things worse due to the probabilistic nature of the responses GenAI produces. As such, this approach may benefit from a UX designer or researcher familiar with designing data products. Understanding what “quality” means to the user, and how to measure it, is especially critical if you’re going to leverage AI and LLMs to make the product UX better. (20:13)

The old solution of rapid prototyping is even more valuable now, because it’s possible to prototype even faster. However, prototyping is not just about learning whether your solution is on track. Whether you use AI or pencil and paper, prototyping early in the product development process should be framed as a “prop to get users talking.” In other words, it is a prop to facilitate problem and need clarity, not solution clarity. Its purpose is to spark conversation and determine if you're solving the right problem. As you iterate, your need to continually validate the problem should shrink, which will present itself in the form of consistent feedback from end users. This is the point where you know you can focus on the design of the solution. Innovation happens when we learn, so the goal is to increase your learning velocity. (31:35)

Have you ever been caught in the trap of prioritizing feature requests based on volume? I get it. It's tempting to give the people what they think they want. For example, imagine ten users clamoring for control over specific parameters in your machine learning forecasting model. You could give them that control, thinking you're solving the problem because, hey, that's what they asked for! But did you stop to ask why they want that control? The reasons behind those requests could be wildly different. By simply handing over the keys to all the model parameters, you might be creating a whole new set of problems: users now face a "usability tax," trying to figure out which parameters to lock and which to let float. The key takeaway? Focus on how frequently the same problems occur across your users, not just how frequently a given tactic or “solution” method (i.e., a “model,” “dashboard,” or “feature”) appears in a stakeholder or user request. Remember, problems are often disguised as solutions. We've got to dig deeper and uncover the real needs, not just address the symptoms. (36:19)

Summary: In this episode of Data and AI with Mukundan, the host discusses the creation and impact of an AI life planner designed to enhance productivity and time management. The conversation covers the technology behind the planner, including the use of GPT-4, the Google Calendar API, and the Pomodoro technique, as well as the personal transformation experienced by the host as a result of implementing this tool.

Takeaways:
Most of us struggle with time management.
AI can help optimize our schedules.
The AI life planner analyzes daily habits.
It syncs with Google Calendar for seamless planning.
Reminders are sent via Slack API integration.
A Pomodoro timer helps maintain focus.
The planner allows for real-time adjustments.
Productivity can skyrocket with the right tools.
You can build your own AI life planner.
Engaging with the audience for feedback is important.

If you want to see exactly how I built this AI Life Planner, check out my full guide here: https://mukundansankar.substack.com/p/i-never-thought-i-had-my-life-together
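As a rough illustration of the Pomodoro scheduling the planner performs, here is a minimal sketch that slices a block of time into work/break intervals. The function and its defaults are invented for illustration; the actual planner drives this through GPT-4 and the Google Calendar API.

```python
from datetime import datetime, timedelta

def pomodoro_slots(start, total_minutes, work=25, rest=5):
    """Slice a block of time into Pomodoro work/break intervals.

    Illustrative only: the episode's planner does this via GPT-4 and the
    Google Calendar API; this sketch just computes the intervals locally.
    """
    slots, t = [], start
    remaining = total_minutes
    while remaining > 0:
        chunk = min(work, remaining)
        slots.append(("work", t, t + timedelta(minutes=chunk)))
        t += timedelta(minutes=chunk)
        remaining -= chunk
        if remaining > 0:  # no trailing break after the last work chunk
            slots.append(("break", t, t + timedelta(minutes=rest)))
            t += timedelta(minutes=rest)
    return slots

start = datetime(2025, 1, 6, 9, 0)
for kind, s, e in pomodoro_slots(start, 60):
    print(kind, s.strftime("%H:%M"), "-", e.strftime("%H:%M"))
```

A real planner would then push each interval to a calendar and fire the Slack reminders the episode mentions.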

What the app looks like: https://youtu.be/pyyWV7-Ty5w?feature=shared

In this podcast episode, we talked with Nemanja Radojkovic about MLOps in Corporations and Startups.

About the Speaker: Nemanja Radojkovic is a Senior Machine Learning Engineer at Euroclear.

In this event, we’re diving into the world of MLOps, comparing life in startups versus big corporations. Joining us again is Nemanja, a seasoned machine learning engineer with experience spanning Fortune 500 companies and agile startups. We explore the challenges of scaling MLOps on a shoestring budget, the trade-offs between corporate stability and startup agility, and practical advice for engineers deciding between these two career paths, whether you’re navigating legacy frameworks or experimenting with cutting-edge tools.

1:00 MLOps in corporations versus startups
6:03 The agility and pace of startups
7:54 MLOps on a shoestring budget
12:54 Cloud solutions for startups
15:06 Challenges of cloud complexity versus on-premise
19:19 Selecting tools and avoiding vendor lock-in
22:22 Choosing between a startup and a corporation
27:30 Flexibility and risks in startups
29:37 Bureaucracy and processes in corporations
33:17 The role of frameworks in corporations
34:32 Advantages of large teams in corporations
40:01 Challenges of technical debt in startups
43:12 Career advice for junior data scientists
44:10 Tools and frameworks for MLOps projects
49:00 Balancing new and old technologies in skill development
55:43 Data engineering challenges and reliability in LLMs
57:09 On-premise vs. cloud solutions in data-sensitive industries
59:29 Alternatives like Dask for distributed systems

🔗 CONNECT WITH NEMANJA
LinkedIn - / radojkovic
GitHub - https://github.com/baskervilski

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

In this session we will go over how we created GneissWeb and discuss tools and techniques used. We will provide code examples that you can try at your leisure.

👉 > 2% avg improvement in benchmark performance over FineWeb
👉 Huggingface page
👉 Data prep kit detailed recipe
👉 Data prep kit bloom filter for quick reproduction
👉 Recipe models for reproduction
👉 Announcement
👉 Paper

At IBM, responsible AI implies transparency in training data: Introducing GneissWeb (pronounced “niceWeb”), a state-of-the-art LLM pre-training dataset with ~10 trillion tokens derived from FineWeb, with open recipes, results, and tools for reproduction!
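The "bloom filter for quick reproduction" linked above suggests probabilistic membership testing over a large document set. The sketch below is a generic Bloom filter, not IBM's Data Prep Kit implementation, but it shows the underlying idea: no false negatives, a tunable false-positive rate, and a tiny memory footprint compared to storing the documents themselves.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter sketch. GneissWeb ships a prebuilt filter via the
    Data Prep Kit; this generic version just illustrates the technique:
    probabilistic set membership with no false negatives."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent bit positions from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
bf.add("doc-001")
print("doc-001" in bf)  # True: members are always found
print("doc-999" in bf)  # almost certainly False at this load factor
```

For dataset reproduction, a shared filter like this lets anyone cheaply check whether a given document ID belongs to the released subset without downloading the full corpus.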

Hands-On APIs for AI and Data Science

Are you ready to grow your skills in AI and data science? A great place to start is learning to build and use APIs in real-world data and AI projects. API skills have become essential for AI and data science success because they are used in a variety of ways in these fields. With this practical book, data scientists and software developers will gain hands-on experience developing and using APIs with the Python programming language and popular frameworks like FastAPI and Streamlit. As you complete the chapters in the book, you'll be creating portfolio projects that teach you how to:
Design APIs that data scientists and AIs love
Develop APIs using Python and FastAPI
Deploy APIs using multiple cloud providers
Create data science projects such as visualizations and models using APIs as a data source
Access APIs using generative AI and LLMs
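The "develop APIs using Python" skill the book covers can be sketched even without a framework. The book teaches this with FastAPI; the standard-library version below (the /summary route and its payload are invented for illustration) shows the same request/response idea and runs anywhere Python does.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal data-serving API sketch using only the standard library.
# (The book builds this kind of endpoint with FastAPI; the route and
# payload here are invented for illustration.)

SUMMARY = {"rows": 3, "mean_amount": 14.5}  # toy "data science" result

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/summary":
            body = json.dumps(SUMMARY).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/summary"
with urllib.request.urlopen(url) as resp:
    data = json.loads(resp.read())
print(data)
server.shutdown()
```

In FastAPI the handler collapses to a decorated function with automatic JSON serialization and validation, which is why the book reaches for it; the request/response contract is the same.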

Are you prepared for the hidden UX taxes that AI and LLM features might be imposing on your B2B customers—without your knowledge? Are you certain that your AI product or features are truly delivering value, or are there unseen taxes working against your users and your product and business? In this episode, I’m delving into some of the UX challenges that I think need to be addressed when implementing LLM and AI features in B2B products.

While AI seems to offer the chance of significantly enhanced productivity, it also introduces a new layer of complexity for UX design. This complexity is not limited to the challenges of designing in a probabilistic medium (i.e., ML/AI); it also lies in being able to define what “quality” means. When the product team does not have a shared understanding of what a measurably better UX outcome looks like, improved sales and user adoption are less likely to follow.

I’ll also discuss aspects of designing for AI that may be invisible on the surface. How might AI-powered products change the work of B2B users? What are some of the traps I see startup clients and founders I advise in MIT’s Sandbox venture fund fall into?

If you’re a product leader in B2B / enterprise software and want to make sure your AI capabilities don’t end up creating more damage than value for users,  this episode will help!  

Highlights/ Skip to 

Improving your AI model accuracy improves outputs—but customers only care about outcomes (4:02)
AI-driven productivity gains also put the customer’s “next problem” in their face sooner. Are you addressing the most urgent problem they now have—or the one they used to have? (7:35)
Products that win will combine AI with tastefully designed deterministic software—because doing everything for everyone well is impossible, and most models alone aren’t products (12:55)
Just because your AI app or LLM feature can do “X” doesn’t mean people will want it or change their behavior (16:26)
AI agents sound great—but there is a human UX too, and it must enable trust and intervention at the right times (22:14)
Not overheard from customers: “I would buy this/use this if it had AI” (26:52)
Adaptive UIs sound like they’ll solve everything—but to reduce friction, they need to adapt to the person, not just the format of model outputs (30:20)
Introducing AI adds more states and scenarios that your product may need to support, which may not be obvious right away (37:56)

Quotes from Today’s Episode

Product leaders have to decide how much effort and resources to put into model improvements versus improving the user’s experience. Obviously, model quality is important in certain contexts and regulated industries, but when GenAI errors and confabulations are lower risk to the user (i.e., they create minor friction or inconveniences), the broader user experience that you facilitate might be what actually determines the true value of your AI features or product. Model accuracy alone is not necessarily going to lead to happier users or increased adoption. ML models can be quantifiably tested for accuracy with structured tests, but the fact that they’re easier to test for quality than something like UX doesn’t mean users value those improvements more. The product will stand a better chance of creating business value when it is clearly demonstrating that it is improving your users’ lives. (5:25)

When designing AI agents, there is still a human UX, a beneficiary, in the loop. They have an experience, whether you designed it with intention or not. How much transparency needs to be given to users when an agent does work for them? Should users be able to intervene when the AI is doing this type of work? Handling errors is something we do in all software, but what about retraining and learning so that future user experiences are better? Is the system learning anything while it’s going through this, and can I tell if it’s learning what I want or need it to learn? What about humans in the loop who might interact with or be affected by the work the agent is doing, even if they aren’t the agent’s owner or “user”? Whose outcomes matter here? At what cost? (22:51)

Customers primarily care about things like raising or changing their status, making more money, making their job easier, saving time, etc. In fact, I believe a product marketed with GenAI may eventually signal a burden to customers, thanks to the inflated and unmet expectations around AI that is poorly implemented in the product UX. Don’t assume it’s going to be bought just because it uses AI in a novel way. Customers aren’t sitting around wishing for “disruption” from your product; quite the opposite. AI or not, you need to make the customer the hero. Your AI will shine when it delivers an outsized UX outcome for your users. (27:49)

What kind of UX are you delivering right out of the box when a customer tries out your AI product or feature? Did you design it for tire kicking, playing around, and user stress testing, or just an idealistic happy path? GenAI features inside B2B products should surface capabilities and constraints, particularly around where users can create value for themselves quickly. Natural hints and well-designed prompt nudges in LLMs, for example, are important to users and to your product team, because you’re setting a more realistic expectation of what’s possible with customers and helping them get to an outcome sooner. You’re also teaching them how to use your solution to get the most value, without asking them to go read a manual. (38:21)

Send us a text. Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. DataTopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. This week, we dive into the latest in AI-assisted coding, software quality, and the ongoing debate on whether LLMs will replace developers—or just make their lives easier:
My LLM Codegen workflow atm: A deep dive into using LLMs for coding, including structured workflows, tool recommendations, and the fine line between automation and chaos.
Cline & Cursor: Exploring VSCode extensions and AI-powered coding tools that aim to supercharge development—but are they game-changers or just fancy autocomplete?
To avoid being replaced by LLMs, do what they can’t: A thought-provoking take on the future of programming, the value of human intuition, and how to stay ahead in an AI-driven world.
The wired brain: Why we should stop using glowing-brain stock images to talk about AI—and what that says about how we understand machine intelligence.
A year of uv: Reflecting on a year of uv, the rising star of Python package managers. Should you switch? Maybe. Probably.
Posting: A look at a fun GitHub project that makes sharing online a little more structured.
Software Quality: AI may generate code, but does it generate good code? A discussion on testing, maintainability, and avoiding spaghetti.
movingWithTheTimes: A bit of programmer humor to lighten the mood—because tech discussions need memes too.

Summary: In this episode of the Data Engineering Podcast, Gleb Mezhanskiy, CEO and co-founder of Datafold, talks about the intersection of AI and data engineering. He discusses the challenges and opportunities of integrating AI into data engineering, particularly using large language models (LLMs) to enhance productivity and reduce manual toil. The conversation covers the potential of AI to transform data engineering tasks, such as text-to-SQL interfaces and creating semantic graphs to improve data accessibility, and explores practical applications of LLMs in automating code reviews, testing, and understanding data lineage.

Announcements:
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm interviewing Gleb Mezhanskiy about AI and data engineering.

Interview:
Introduction
How did you get involved in the area of data management?
The modern data stack is dead
Where is AI in the data stack?
"Buy our tool to ship AI"
Opportunities for LLMs in DE workflows

Contact Info:
LinkedIn

Parting Question:
From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements:
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links:
Datafold
Copilot
Cursor IDE
AI Agents
DataChat
AI Engineering Podcast Episode
Metrics Layer
Emacs
LangChain
LangGraph
CrewAI

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

I tested DeepSeek, an emerging AI platform that makes ChatGPT look ancient! I asked it to outline a comprehensive roadmap for becoming a data analyst. What it said scared me (spoiler: it basically copied my SPN Method)!

Listen to NEXT: My interview with StatQuest! https://www.youtube.com/watch?v=nqtQUg4mZ9I

💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace the interview with confidence 👉 https://www.datacareerjumpstart.com/interviewsimulator

⌚ TIMESTAMPS
00:00 - Introduction
01:05 - Skills
01:27 - Do you need a degree? DeepSeek answers
01:59 - Projects and portfolio
02:43 - Networking and job search strategies
04:55 - Interview preparation
10:15 - FindADataJob.com and PremiumDataJobs.com
11:30 - InterviewSimulator.io

🔗 CONNECT WITH AVERY
🎥 YouTube Channel: https://www.youtube.com/@averysmith
🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith/
📸 Instagram: https://instagram.com/datacareerjumpstart
🎵 TikTok: https://www.tiktok.com/@verydata
💻 Website: https://www.datacareerjumpstart.com/

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://DataCareerJumpstart.com/daa

Send us a text. Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. DataTopics Unplugged is your go-to spot for relaxed discussions on tech, news, data, and society. This week, we’re unpacking everything from AI-powered vacations (or the lack thereof) to corporate drama, and even a deep dive into the quirks of COBOL. Join Morillo, Bart, and Alex as they navigate the latest happenings in data and tech, including:
Airbnb AI: The CEO of Airbnb thinks AI trip planning is still a pipe dream. Is he right?
Anthropic’s next AI model: A new Claude model could be just weeks away, promising a hybrid of deep reasoning and speed.
OpenAI’s roadmap: Sam Altman lays out vague but ambitious plans, blurring the lines between AI models.
Elon vs. OpenAI: Musk offers $97B for OpenAI, Altman claps back. Just another day in AI power struggles.
RIP Viktor Antonov: The legendary art lead behind Half-Life 2 and Dishonored passes away at 52.
Project Sid AI agents: 1,000 AI agents left to their own devices in Minecraft… What could go wrong?
DeepSeek R1 breaks speed records: The latest AI model boasts a staggering 198 tokens per second.
Perplexity’s Deep Research is now free: A game-changer for AI-powered search? We discuss.
COBOL and the mystery of 1875-05-20: Why do old systems default to weird dates?
Polars Cloud: A new distributed architecture to run Polars anywhere.
Pickle AI avatars: Deepfake yourself into meetings. Ethical? Useful? Just plain weird?
Vim after Bram: How the legendary text editor is surviving after its creator’s passing.
Working Fast and Slow: A take on productivity, deep focus, and why some days just don’t work.
We were wrong about GPUs: Fly.io admits they misjudged the demand for GPU-powered workloads.

Data Hackers News is on the air! The hottest topics of the week, with the main news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast!

Press play and listen to this week's Data Hackers News now!

To keep up with everything happening in the data world, subscribe to the weekly newsletter: