talk-data.com

Topic: DevOps
Tags: software_development, it_operations, continuous_delivery
Tagged activities: 41

Activity Trend: peak 25/qtr, 2020-Q1 to 2026-Q1

Activities (41, newest first):

Dive into the powerful world of mainframes! Chief Product Officer of IBM Z and LinuxONE, Tina Tarquinio, reveals the truth behind those eight nines of uptime and explores how mainframes are evolving with AI, hybrid cloud, and future-proofing strategies for mission-critical business decisions.

Discover the cutting-edge innovations transforming enterprise computing—from on-chip AIU and Spyre AI accelerators enabling real-time inferencing at transaction speed, to how LinuxONE is redefining hybrid cloud architecture. Tina discusses DevOps integration, AI-powered code assistants revolutionizing mainframe development, compelling AI use cases, and shares her bold predictions for the mainframe’s next 100 years. Plus, career advice from a tech leader and what she does for fun!

00:46 Tina Tarquinio
03:18 The Most Mainframe Surprise
09:12 What IS the Mainframe Really? 8 Nines!
14:40 On Chip AIU, Spyre Inferencing
18:11 Mainframes with Hybrid Cloud
19:11 The LinuxONE Pitch
19:59 Exciting Mainframe Innovations
22:09 DevOps
23:36 Code Assistants
26:03 AI Use Case
27:49 Future Proofing Decisions
37:17 Regulations
38:45 Bold Prediction
38:58 Mainframe 100
40:48 Career Advice
42:24 For Fun

LinkedIn: linkedin.com/in/tina-tarquinio
Website: https://www.ibm.com/products/z

#MakingDataSimple #IBMz #Mainframe #LinuxONE #AIInferencing #SpyreAccelerator #HybridCloud #EnterpriseAI #DevOps #AICodeAssistant #EightNines #TinaTarquinio #MainframeModernization #AIUChip #FutureProofing #TechLeadership #WatsonxCodeAssistant #CloudComputing #TelumII

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Summary

In this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of Flex AI, about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. Join them as they discuss Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes layers, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares insights into Flex AI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.

Pre-amble

I hope you enjoy this cross-over episode of the AI Engineering Podcast, another show that I run to act as your guide to the fast-moving world of building scalable and maintainable AI systems. As generative AI models have grown more powerful and are being applied to a broader range of use cases, the lines between data and AI engineering are becoming increasingly blurry. The responsibilities of data teams are being extended into the realm of context engineering, as well as designing and supporting new infrastructure elements that serve the needs of agentic applications. This episode is an example of the types of work that are not easily categorized into one or the other camp.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads.

Interview

Introduction
How did you get involved in machine learning?
Can you describe what FlexAI is and the story behind it?
What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
Can you describe the design and architecture of the FlexAI platform?
How has the implementation evolved from when you first started working on it?
For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
When is FlexAI the wrong choice?
What do you have planned for the future of FlexAI?

Contact Info

LinkedIn

Parting Question

From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links

Flex AI
Aurora Super Computer
CoreWeave
Kubernetes
CUDA
ROCm
Tensor Processing Unit (TPU)
PyTorch
Triton
Trainium
ASIC == Application Specific Integrated Circuit
SOC == System On a Chip
Loveable
FlexAI Blueprints
Tenstorrent

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

What does AI transformation really look like inside a 180-year-old company? In this episode of Data Unchained, we are joined by Younes Hairej, founder and CEO of Aokumo Inc, a trailblazing company helping enterprises in Japan and beyond bridge the gap between business intent and AI execution. From deploying autonomous AI agents that eliminate the need for dashboards and YAML, to revitalizing siloed, analog systems in manufacturing, Younes shares what it takes to modernize legacy infrastructure without starting over. Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic Music promoted by https://www.free-stock-music.com Creative Commons Attribution 3.0 Unported License https://creativecommons.org/licenses/by/3.0/deed.en_US

#ArtificialIntelligence #EnterpriseAI #AITransformation #Kubernetes #DevOps #GenAI #DigitalTransformation #OpenSourceAI #DataInfrastructure #BusinessInnovation #AIInJapan #LegacyModernization #MetadataStrategy #AIOrchestration #CloudNative #AIAutomation #DataGovernance #MLOps #IntelligentAgents #TechLeadership

Hosted on Acast. See acast.com/privacy for more information.

In this season of the Analytics Engineering podcast, Tristan is digging deep into the world of developer tools and databases. There are few more widely used developer tools than Docker. From its launch back in 2013, Docker has completely changed how developers ship applications.  In this episode, Tristan talks to Solomon Hykes, the founder and creator of Docker. They trace Docker's rise from startup obscurity to becoming foundational infrastructure in modern software development. Solomon explains the technical underpinnings of containerization, the pivotal shift from platform-as-a-service to open-source engine, and why Docker's developer experience was so revolutionary.  The conversation also dives into his next venture Dagger, and how it aims to solve the messy, overlooked workflows of software delivery. Bonus: Solomon shares how AI agents are reshaping how CI/CD gets done and why the next revolution in DevOps might already be here. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

Summary

In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings. From his roots as an Oracle developer to leading the data platform at a major online travel company, Chakravarthy shares insights on managing diverse database technologies and providing databases as a service to streamline operations. He explains how his team has transitioned from DevOps to a platform engineering approach, centralizing expertise and automating repetitive tasks with AWS Service Catalog. Join them as they discuss the challenges of migrating legacy systems, integrating AI and ML for automation, and the importance of organizational buy-in in driving data platform success.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

This is a pharmaceutical ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome — also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business — automatically detecting anomalies before your CEO does. It’s 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what “done” looks like — so you can stop fighting over column names, and start trusting your data again. Whether you’re a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today to get a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda’s launch week. It starts June 9th.

Your host is Tobias Macey and today I'm interviewing Chakri Kotaru about scaling successful data operations through standardized platform offerings.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by outlining the different ways that you have seen teams you work with fail due to lack of structure and opinionated design?
Why NoSQL?
Pairing different styles of NoSQL for different problems
Useful patterns for each NoSQL style (document, column family, graph, etc.)
Challenges in platform automation and scaling edge cases
What challenges do you anticipate as a result of the new pressures of AI applications?
What are the most interesting, innovative, or unexpected ways that you have seen platform engineering practices applied to data systems?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on data platform engineering?
When is NoSQL the wrong choice?
What do you have planned for the future of platform principles for enabling data teams/data applications?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

Riak
DynamoDB
SQL Server
Cassandra
ScyllaDB
CAP Theorem
Terraform
AWS Service Catalog
Blog Post

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Today, we’re joined by Ted Elliott, Chief Executive Officer of Copado, the leader in AI-powered DevOps for business applications. We talk about:

Impacts of AI agents over the next 5 years
Ted’s AI-generated Dr. Seuss book based on walks with his dog
The power of small data with AI, despite many believing more data is the answer
The challenge of being disciplined to enter only good data
Gaming out SaaS company ideas with AI, such as a virtual venture capitalist

Supported by Our Partners
• Sonar — Trust your developers – verify your AI-generated code.
• Vanta — Automate compliance and simplify security with Vanta.

In today's episode of The Pragmatic Engineer, I'm joined by Charity Majors, a well-known observability expert – as well as someone with strong and grounded opinions. Charity is the co-author of "Observability Engineering" and brings extensive experience as an operations and database engineer and an engineering manager. She is the cofounder and CTO of observability scaleup Honeycomb. Our conversation explores the ever-changing world of observability, covering these topics:

• What is observability? Charity’s take
• What is “Observability 2.0?”
• Why Charity is a fan of platform teams
• Why DevOps is an overloaded term: and probably no longer relevant
• What is cardinality? And why does it impact the cost of observability so much?
• How OpenTelemetry solves for vendor lock-in
• Why Honeycomb wrote its own database
• Why having good observability should be a prerequisite to adding AI code or using AI agents
• And more!

Timestamps

(00:00) Intro
(04:20) Charity’s inspiration for writing Observability Engineering
(08:20) An overview of Scuba at Facebook
(09:16) A software engineer’s definition of observability
(13:15) Observability basics
(15:10) The three pillars model
(17:09) Observability 2.0 and the shift to unified storage
(22:50) Who owns observability and the advantage of platform teams
(25:05) Why DevOps is becoming unnecessary
(27:01) The difficulty of observability
(29:01) Why observability is so expensive
(30:49) An explanation of cardinality and its impact on cost
(34:26) How to manage cost with tools that use structured data
(38:35) The common worry of vendor lock-in
(40:01) An explanation of OpenTelemetry
(43:45) What developers get wrong about observability
(45:40) A case for using SLOs and how they help you avoid micromanagement
(48:25) Why Honeycomb had to write their database
(51:56) Companies who have thrived despite ignoring conventional wisdom
(53:35) Observability and AI
(59:20) Vendors vs. open source
(1:00:45) What metrics are good for
(1:02:31) RUM (Real User Monitoring)
(1:03:40) The challenges of mobile observability
(1:05:51) When to implement observability at your startup
(1:07:49) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• How Uber Built its Observability Platform: https://newsletter.pragmaticengineer.com/p/how-uber-built-its-observability-platform
• Building an Observability Startup: https://newsletter.pragmaticengineer.com/p/chronosphere
• How to debug large distributed systems: https://newsletter.pragmaticengineer.com/p/antithesis
• Shipping to production: https://newsletter.pragmaticengineer.com/p/shipping-to-production

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
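A quick aside on the cardinality discussion above: the cost problem Charity describes comes from metrics systems minting a new time series for every distinct label value, while event-based tools treat the same identifiers as ordinary fields on one wide event. The sketch below is a minimal, hypothetical illustration using the OpenTelemetry Python SDK (which the episode mentions); the service, span, and attribute names are made up, and this is not code from the episode or from Honeycomb.

```python
# Minimal sketch: one "wide event" (a span) carrying both low- and
# high-cardinality attributes. All names here are hypothetical.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))  # print spans to stdout
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")

with tracer.start_as_current_span("handle_checkout") as span:
    # Low cardinality: a small, bounded set of possible values.
    span.set_attribute("http.method", "POST")
    span.set_attribute("http.status_code", 200)
    # High cardinality: effectively unbounded value sets. As metric labels these
    # would each mint a new time series (the cost explosion discussed at 30:49);
    # as fields on a single event they are just extra columns to query.
    span.set_attribute("user.id", "u-48213")
    span.set_attribute("cart.id", "c-77f3a9")
```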

In this podcast episode, we talked with Agita Jaunzeme about career choices, transitions, and promotions in and out of tech.

About the Speaker:

Agita has designed a career spanning DevOps/DataOps engineering, management, community building, education, and facilitation. She has worked on projects across corporate, startup, open source, and non-governmental sectors. Following her passion, she founded an NGO focusing on the inclusion of expats and locals in Porto. Embodying the values of innovation, automation, and continuous learning, Agita provides practical insights on promotions, career pivots, and aligning work with passion and purpose.

During this event, Agita discussed her career journey, starting with her transition from art school to programming and later into DevOps, eventually taking on leadership roles. She explored the challenges of burnout and the importance of volunteering, founding an NGO to support inclusion, gender equality, and sustainability. The conversation also covered key topics like mentorship, the differences between data engineering and data science, and the dynamics of managing volunteers versus employees. Additionally, Agita shared insights on community management, developer relations, and the importance of product vision and team collaboration.

0:00 Introduction and Welcome
1:28 Guest Introduction: Agita’s Background and Career Highlights
3:05 Transition to Tech: From Art School to Programming
5:40 Exploring DevOps and Growing into Leadership Roles
7:24 Burnout, Volunteering, and Founding an NGO
11:00 Volunteering and Mentorship Initiatives
14:00 Discovering Programming Skills and Early Career Challenges
15:50 Automating Work Processes and Earning a Promotion
19:00 Transitioning from DevOps to Volunteering and Project Management
24:00 Managing Volunteers vs. Employees and Building Organizational Skills
31:07 Personality traits in engineering vs. data roles
33:14 Differences in focus between data engineers and data scientists
36:24 Transitioning from volunteering to corporate work
37:38 The role and responsibilities of a community manager
39:06 Community management vs. developer relations activities
41:01 Product vision and team collaboration
43:35 Starting an NGO and legal processes
46:13 NGO goals: inclusion, gender equality, and sustainability
49:02 Community meetups and activities
51:57 Living off-grid in a forest and sustainability
55:02 Unemployment party and brainstorming session
59:03 Unemployment party: the process and structure

🔗 CONNECT WITH AGITA JAUNZEME
LinkedIn: /agita

🔗 CONNECT WITH DataTalksClub
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Datalike Substack: https://datalike.substack.com/
LinkedIn: /datatalks-club

Transcript:

Host: Hi everyone, welcome to our event. This event is brought to you by DataTalks.Club, which is a community of people who love data, and we have weekly events; today's is one of them. I guess we are also a community of people who like to wake up early, if you're from the States, right, Christopher? Or maybe not so much, because this is the time we usually have our events. For our guests and presenters from the States we usually do it in the evening, Berlin time, but unfortunately that kind of slipped my mind. Anyway, we have a lot of events, and you can check them via the link in the description. I don't think there are a lot of them on that link right now, but we will be adding more and more; I think we have five or six interviews scheduled, so keep an eye on that. Do not forget to subscribe to our YouTube channel; this way you will get notified about all our future streams, which will be as awesome as the one today. And, very important, do not forget to join our community, where you can hang out with other data enthusiasts. During today's interview you can ask any question: there's a pinned link in the live chat, so click on that link, ask your question, and we will be covering these questions during the interview. Now I will stop sharing my screen. And there is a message here, Christopher, from you. We have this on YouTube, so people have not seen what you wrote, but there is a message to anyone who's watching right now from Christopher, saying "hello everyone".

Chris: Okay, I should look on YouTube then.

Host: You don't need to; you'll need to focus on answering questions, and I'll be keeping an eye on all the questions. So, if you're ready, we can start.

Chris: I'm ready.

Host: And you prefer Christopher, not Chris, right?

Chris: Chris is fine, Chris is fine. It's a bit shorter.

Host: Okay. So this week we'll talk about DataOps. Maybe it's a tradition that we talk about DataOps once per year, but we actually skipped one year, because we haven't had Chris for some time. So today we have a very special guest, Christopher. Christopher is the co-founder, CEO, and head chef (or head cook) at DataKitchen, with 25 years of experience. Maybe this is outdated, because by now you probably have more, and maybe you stopped counting, I don't know, but with tons of years of experience in analytics and software engineering. Christopher is known as the co-author of the DataOps Cookbook and the DataOps Manifesto. And it's not the first time we have Christopher here on the podcast: we interviewed him two years ago, also about DataOps, and this one will be about DataOps too. So we'll catch up and see what actually changed in these two years. Welcome to the interview!

Chris: Well, thank you for having me. I'm happy to be here, talking all things related to DataOps, and why bother with DataOps, and happy to talk about the company and what's changed. Excited.

Host: Yeah, so let's dive in. The questions for today's interview were prepared by Johanna Berer, as always; thanks, Johanna, for your help. Before we start with our main topic for today, DataOps, let's start with your background. Can you tell us about your career journey so far? For those who have not listened to the previous podcast, maybe you can talk about yourself, and for those who did listen, maybe give a summary of what has changed in the last two years.

Chris: Will do. So, my name is Chris, and I'm sort of an engineer. I spent about the first 15 years of my career in software, working on and building some AI systems and some non-AI systems, at the US's NASA and MIT Lincoln Lab, then some startups, and then Microsoft. And then, about 2005, I got the data bug. My kids were small, and I thought, oh, this data thing will be easy, and I'd be able to go home for dinner at 5 and life would be fine, because I was a big...

Host: You started your own company, right?

Chris: And it didn't work out that way. What was interesting is that, for me, the problem wasn't doing the data. We had smart people who did data science and data engineering, the act of creating things. It was the systems around the data that were hard. It was really hard to not have errors in production. I had a BlackBerry at the time and this long drive to work, and I would not look at my BlackBerry all morning. I'd sit in the parking lot, take a deep breath, look at my BlackBerry, and go: uh oh, is there going to be any problems today? If there wasn't, I'd walk in very happy, and if there was, I'd have to brace myself. And then the second problem was that the team I worked for just couldn't go fast enough. The customers were super demanding; they always thought things should be faster, and we were always behind. So how do you live in that world, where things are breaking left and right, you're terrified of making errors, and second, you just can't go fast enough?

Host: And this is the pre-Hadoop era, right? Before all this big data tech.

Chris: Yeah, before. We were using SQL Server, and, since we had smart people, we built an engine in SQL Server that made SQL Server a columnar database. We built a columnar database inside of SQL Server, in order to make certain things fast. And it's not bad; the principles are the same. Before Hadoop it's still a database; there are still indexes, there are still queries, things like that. At the time you would use OLAP engines; we didn't use those, but those reports, or the models, are not that different. We had a rack of servers instead of the cloud. What I took from that was that it's just hard to run a team of people doing data and analytics. I took it from a manager's perspective: I started to read Deming and to think about the work that we do as a factory, a factory that produces insight and not automobiles. How do you run that factory so it produces things that are of good quality? And then second, since I had come from software, I've been very influenced by the DevOps movement: how you automate deployment, how you run in an agile way, how you change things quickly, and how you innovate. Those two things, running a really good, solid production line that has very low errors, and changing that production line very, very often, are kind of opposites, right? So how do you, as a manager and technically, approach that? Then, 10 years ago, we started DataKitchen. We've always been a profitable company, so we started off with some customers and started building some software, and we realized that we couldn't work any other way, and that the way we work wasn't understood by a lot of people, so we had to write a book and a manifesto to share our methods. So we've been in business now a little over 10 years.

Host: Oh, that's cool. So let's talk about DataOps. You mentioned DevOps and how you were inspired by it. By the way, do you remember roughly when DevOps started to appear, when people started calling these principles, and the tools around them, DevOps?

Chris: Yeah. Well, first of all, I had a boss in 1990 at NASA who had this idea: build a little, test a little, learn a lot. That was his mantra, and it made a lot of sense. Then the Agile Software Manifesto came out in 2001, which is very similar. And then the first real DevOps was a guy at Twitter who started to do automated deployment, push a button, and that was around 2009-ish, and I think the first DevOps meetup was around then. So it's been about 15 years, I guess.

Host: I was trying to do the math: I started my career in 2010, and my first job was as a Java developer. I remember, for some things, we would just SFTP to the machine, put the JAR archive there, and keep our fingers crossed that it doesn't break. It was not really... I wouldn't call it DevOps, right?

Chris: You were deploying; you had a deploy process, I'd put it that way. And it was documented too: put the JAR on production, cross your fingers.

Host: I think there was a page on some internal wiki that described, with passwords and everything, what you should do.

Chris: And I think what's interesting is why that changed. We laugh at it now, but why didn't you invest in automating deployment, or in a whole bunch of automated regression tests that would run? Because I think in software now it would be rare that people wouldn't use CI/CD, that they wouldn't have some automated tests, you know, functional regression tests, that would be the

Summary

In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. Chris delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability. He emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what DataKitchen is and the story behind it?
You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today?
Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen?
The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data?
What are the challenges that never went away?
You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects?
What are the areas of overlap with existing tools and what are the unique capabilities that you are offering?
Can you talk through the technical implementation of your new observability and quality testing platform?
What does the onboarding and integration process look like?
Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday?
What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps?
What do you have planned for the future of your work at DataKitchen?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

DataKitchen
Podcast Episode
NASA
DataOps Manifesto
Data Reliability Engineering
Data Observability
dbt
DevOps Enterprise Summit
Building The Data Warehouse by Bill Inmon (affiliate link)
dataops-testgen
dataops-observability
Free Data Quality and Data Observability Certification
Databricks
DORA Metrics
DORA for data

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
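The dataops-testgen project mentioned above auto-generates data quality checks. As a rough, hedged illustration of that category of tool (not TestGen's actual API or rule set), here is a minimal Python sketch of the kinds of validations such tools run; the dataset, column names, and thresholds are all hypothetical.

```python
# Hypothetical data-quality checks in the spirit of generated validation
# suites: completeness, uniqueness, range, and freshness. Not TestGen code.
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> list[str]:
    """Return human-readable failures; an empty list means all checks passed."""
    failures = []

    # Completeness: the key column should never be null.
    null_ids = int(df["order_id"].isna().sum())
    if null_ids:
        failures.append(f"order_id has {null_ids} null values")

    # Uniqueness: primary-key style check.
    dupes = int(df["order_id"].duplicated().sum())
    if dupes:
        failures.append(f"order_id has {dupes} duplicate values")

    # Range: amounts should be positive and below a sanity ceiling.
    bad = int(((df["amount"] <= 0) | (df["amount"] > 1_000_000)).sum())
    if bad:
        failures.append(f"amount has {bad} out-of-range values")

    # Freshness: the newest record should be less than a day old.
    lag = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df["updated_at"], utc=True).max()
    if lag > pd.Timedelta(days=1):
        failures.append(f"data is stale: newest record is {lag} old")

    return failures

if __name__ == "__main__":
    orders = pd.read_parquet("orders.parquet")  # hypothetical input table
    for problem in run_quality_checks(orders):
        print("FAIL:", problem)
```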

In today's fast-paced digital world, managing IT operations is more complex than ever. With the rise of cloud services, microservices, and constant software deployments, the pressure on IT teams to keep everything running smoothly is immense. But how do you keep up with the ever-growing flood of data and ensure your systems are always available? AIOps is the use of artificial intelligence to automate and scale IT operations. But what exactly is AIOps, and how can it transform your IT operations? Assaf Resnick is the CEO and Co-Founder of BigPanda. Before founding BigPanda, Assaf was an investor at Sequoia Capital, where he focused on early and growth-stage investing in software, internet, and mobile sectors. Assaf’s time at Sequoia gave him a front-row seat to the challenges of IT scale, complexity, and velocity faced by Operations teams in rapidly scaling and accelerating organizations. This is the problem that Assaf founded BigPanda to solve. In the episode, Richie and Assaf explore AIOps, how AIOps helps manage increasingly complex IT operations, how AIOps differs from DevOps and MLOps, examples of AIOps projects, a real world application of AIOps, the key benefits of AIOps, how to implement AIOps, excitement in the space, how GenAI is improving AIOps and much more.

Links Mentioned in the Show:
BigPanda
Gartner: Market Guide for AIOps Platforms
[Course] Implementing AI Solutions in Business
Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake
Sign up to RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
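One way to make the AIOps idea above concrete: the first job of platforms in this space is compressing a flood of raw alerts into a few actionable incidents. The sketch below is a deliberately naive time-window heuristic in Python, with made-up alert fields; it is not BigPanda's algorithm, which correlates using topology and learned patterns rather than a fixed window.

```python
# Naive illustration of alert correlation: group alerts for the same service
# that arrive within a time window into one incident. Purely illustrative.
from dataclasses import dataclass, field

@dataclass
class Alert:
    timestamp: float  # seconds since epoch
    service: str
    message: str

@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)

def correlate(alerts, window_s=300.0):
    incidents = []
    open_by_service = {}  # service -> most recently opened incident
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_by_service.get(alert.service)
        # Extend the open incident if the last alert for this service was recent.
        if current and alert.timestamp - current.alerts[-1].timestamp <= window_s:
            current.alerts.append(alert)
        else:
            current = Incident(service=alert.service, alerts=[alert])
            incidents.append(current)
            open_by_service[alert.service] = current
    return incidents

if __name__ == "__main__":
    raw = [
        Alert(0, "checkout", "CPU high"),
        Alert(60, "checkout", "p99 latency breach"),  # merged with the first
        Alert(4000, "checkout", "disk full"),         # too far apart: new incident
    ]
    for inc in correlate(raw):
        print(f"{inc.service}: {len(inc.alerts)} alert(s)")
```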

We talked about:

Nemanja’s background
When Nemanja first started working as a data person
Typical problems that MLOps folks solve in the financial sector
What Nemanja currently does as an ML Engineer
The obstacle of implementing new things in financial-sector companies
Going through the hurdles of DevOps
Working with an on-premises cluster
“ML Ops on a Shoestring” (you don’t need fancy stuff to start with MLOps)
Tactical solutions
Platform work and code work
Programming and soft skills needed to be an ML Engineer
The challenges of transitioning from electrical engineering and sales to MLOps
The MLOps tech stack for beginners
Working on projects to determine which skills you need

Links:

LinkedIn: https://www.linkedin.com/in/radojkovic/

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

We talked about:

Maria's background
Marvelous MLOps
Maria's definition of MLOps
Alternate team setups without a central MLOps team
Pragmatic vs non-pragmatic MLOps
Must-have ML tools (categories)
Maturity assessment
What to start with in MLOps
Standardized MLOps
Convincing DevOps to implement
Understanding what the tools are used for instead of knowing all the tools
Maria's next project plans
Is LLM Ops a thing?
What Ahold Delhaize does
Resource recommendations to learn more about MLOps
The importance of data engineering knowledge for ML engineers

Links:

LinkedIn: https://www.linkedin.com/company/marvelous-mlops/

Website: https://marvelousmlops.substack.com/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

On today’s episode, we’re joined by Ben Johnson, Founder and CEO of Particle41, a provider of software and product development solutions crafted by world-class app development, DevOps, and data science teams. We talk about:

What components the CTO owns in a SaaS company
Optimizing the efficiency of dev teams
How much of the CTO role is internal vs. external
How to interview & identify a great CTO candidate

We talked about:

Hugo's background
Why do tools and the companies that run them have wildly different names
Hugo's other projects beside Metaflow
Transitioning from educator to DevRel
What is DevRel?
DevRel vs Marketing
How DevRel coordinates with developers
How DevRel coordinates with marketers
What skills a DevRel needs
The challenges that come with being an educator
Becoming a good writer: nature vs nurture
Hugo's approach to writing and suggestions
Establishing a goal for your content
Choosing a form of media for your content
Is DevRel intercompany or intracompany?
The Vanishing Gradients podcast
Finding Hugo online

Links:

Hugo Bowne-Anderson's GitHub: http://hugobowne.github.io/
Vanishing Gradients: https://vanishinggradients.fireside.fm/
MLOps and DevOps: Why Data Makes It Different: https://www.oreilly.com/radar/mlops-and-devops-why-data-makes-it-different/
Evaluate Metaflow for free, right from your browser: https://outerbounds.com/sandbox/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Summary

A significant portion of the time spent by data engineering teams is on managing the workflows and operations of their pipelines. DataOps has arisen as a parallel set of practices to that of DevOps teams as a means of reducing wasted effort. Agile Data Engine is a platform designed to handle the infrastructure side of the DataOps equation, as well as providing the insights that you need to manage the human side of the workflow. In this episode Tevje Olin explains how the platform is implemented, the features that it provides to reduce the amount of effort required to keep your pipelines running, and how you can start using it in your own team.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enable you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack

Your host is Tobias Macey and today I'm interviewing Tevje Olin about Agile Data Engine, a platform that combines data modeling, transformations, continuous delivery and workload orchestration to help you manage your data products and the whole lifecycle of your warehouse.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Agile Data Engine is and the story behind it?
What are some of the tools and architectures that an organization might be able to replace with Agile Data Engine?
How does the unified experience of Agile Data Engine change the way that teams think about the lifecycle of their data?
What are some of the types of experiments that are enabled by reduced operational overhead?
What does CI/CD look like for a data warehouse?
How is it different from CI/CD for software applications?
Can you describe how Agile Data Engine is architected?
How have the design and goals of the system changed since you first started working on it?
What are the components that you needed to develop in-house to enable your platform goals?
What are the changes in the broader data ecosystem that have had the most influence on your product goals and customer adoption?
Can you describe the workflow for a team that is using Agile Data Engine to power their business analytics?
What are some of the insights that you generate to help your customers understand how to improve their processes or identify new opportunities?
In your "about" page it mentions the unique approaches that you take for warehouse automation. How do your practices differ from the rest of the industry?
How have changes in the adoption/implementation of ML and AI impacted the ways that your customers exercise your platform?
What are the most interesting, innovative, or unexpected ways that you have seen the Agile Data Engine platform used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Agile Data Engine?
When is Agile Data Engine the wrong choice?
What do you have planned for the future of Agile Data Engine?

Guest Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

About Agile Data Engine

Agile Data Engine unlocks the potential of your data to drive business value - in a rapidly changing world. Agile Data Engine is a DataOps Management platform for designing, deploying, operating and managing data products, and managing the whole lifecycle of a data warehouse. It combines data modeling, transformations, continuous delivery and workload orchestration into the same platform.

Links

Agile Data Engine
Bill Inmon
Ralph Kimball
Snowflake
Redshift
BigQuery
Azure Synapse
Airflow

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Sponsored By: RudderStack

RudderStack provides all your customer data pipelines in one platform. You can collect, transform, and route data across your entire stack with its event streaming, ETL, and reverse ETL pipelines.

RudderStack’s warehouse-first approach means it does not store sensitive information, and it allows you to leverage your existing data warehouse/data lake infrastructure to build a single source of truth for every team.

RudderStack also supports real-time use cases. You can implement RudderStack SDKs once, then automatically send events to your warehouse and 150+ business tools, and you’ll never have to worry about API changes again.

Visit dataengineeringpodcast.com/rudderstack to sign up for free today, and snag a free T-Shirt just for being a Data Engineering Podcast listener.

Support Data Engineering Podcast

We talked about:

Christopher’s background
The essence of DataOps
Also known as Agile Analytics Operations or DevOps for Data Science
Defining processes and automating them (defining “done” and “good”)
The balance between heroism and fear (avoiding deferred value)
The Lean approach
Avoiding silos
The 7 steps to DataOps
Wanting to become replaceable
DataOps is doable
Testing tools
DataOps vs MLOps
The Head Chef at Data Kitchen
What’s grilling at Data Kitchen?
The DataOps Cookbook

Links:

DataOps Manifesto website: https://dataopsmanifesto.org/en/
DataOps Cookbook: https://dataops.datakitchen.io/pf-cookbook
Recipes for DataOps Success: https://dataops.datakitchen.io/pf-recipes-for-dataops-success
DataOps Certification Course: https://info.datakitchen.io/training-certification-dataops-fundamentals
DataOps Blog: https://datakitchen.io/blog/
DataOps Maturity Model: https://datakitchen.io/dataops-maturity-model/
DataOps Webinars: https://datakitchen.io/webinars/

Join DataTalks.Club: https://datatalks.club/slack.html  

Our events: https://datatalks.club/events.html

Summary

Putting machine learning models into production and keeping them there requires investing in well-managed systems to manage the full lifecycle of data cleaning, training, deployment and monitoring. This requires a repeatable and evolvable set of processes to keep it functional. The term MLOps has been coined to encapsulate all of these principles and the broader data community is working to establish a set of best practices and useful guidelines for streamlining adoption. In this episode Demetrios Brinkmann and David Aponte share their perspectives on this rapidly changing space and what they have learned from their work building the MLOps community through blog posts, podcasts, and discussion forums.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

This episode is brought to you by Acryl Data, the company behind DataHub, the leading developer-friendly data catalog for the modern data stack. Open Source DataHub is running in production at several companies like Peloton, Optum, Udemy, Zynga and others. Acryl Data provides DataHub as an easy to consume SaaS product which has been adopted by several companies. Sign up for the SaaS product at dataengineeringpodcast.com/acryl

RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder.

Your host is Tobias Macey and today I’m interviewing Demetrios Brinkmann and David Aponte about what you need to know about MLOps as a data engineer.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what MLOps is?
How does it relate to DataOps? DevOps? (Is it just another buzzword?)
What is your interest and involvement in the space of MLOps?
What are the open and active questions in the MLOps community?
Who is responsible for MLOps in an organization?
What is the role of the data engineer in that process?
What are the core capabilities that are necessary to support an "MLOps" workflow?
How do the current platform technologies support the adoption of MLOps workflows?
What are the areas that are currently underdeveloped/underserved?
Can you describe the technical and organizational design/architecture decisions that need to be made when endeavoring to adopt MLOps practices?
What are some of the common requirements for supporting ML workflows?
What are some of the ways that requirements become bespoke to a given organization or project?
What are the opportunities for standardization or consolidation in the tooling for MLOps?
What are the pieces that are always going to require custom engineering?
What are the most interesting, innovative, or unexpected approaches to MLOps workflows/platforms that you have seen?
What are the most interesting, unexpected, or challenging lessons that you

We talked about:

Andreas’s background
Why data engineering is becoming more popular
Who to hire first – a data engineer or a data scientist?
How can I, as a data scientist, learn to build pipelines?
Don’t use too many tools
What is a data pipeline and why do we need it?
What is ingestion?
Can just one person build a data pipeline?
Approaches to building data pipelines for data scientists
Processing frameworks
Common setup for data pipelines — car price prediction
Productionizing the model with the help of a data pipeline
Scheduling
Orchestration
Start simple
Learning DevOps to implement data pipelines
How to choose the right tool
Are Hadoop, Docker, Cloud necessary for a first job/internship?
Is Hadoop still relevant or necessary?
Data engineering academy
How to pick up Cloud skills
Avoid huge datasets when learning
Convincing your employer to do data science
How to find Andreas

Links:

LinkedIn: https://www.linkedin.com/in/andreas-kretz
Data Engineering Cookbook: https://cookbook.learndataengineering.com/
Course: https://learndataengineering.com/

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Summary

The Data industry is changing rapidly, and one of the most active areas of growth is automation of data workflows. Taking cues from the DevOps movement of the past decade, data professionals are orienting around the concept of DataOps. More than just a collection of tools, there are a number of organizational and conceptual changes that a proper DataOps approach depends on. In this episode Kevin Stumpf, CTO of Tecton, Maxime Beauchemin, CEO of Preset, and Lior Gavish, CTO of Monte Carlo, discuss the grand vision and present realities of DataOps. They explain how to think about your data systems in a holistic and maintainable fashion, the security challenges that threaten to derail your efforts, and the power of using metadata as the foundation of everything that you do. If you are wondering how to get control of your data platforms and bring all of your stakeholders onto the same page, then this conversation is for you.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Modern Data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Datafold helps Data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold. Once you sign up and create an alert in Datafold for your company data, they will send you a cool water flask.

RudderStack’s smart customer data pipeline is warehouse-first. It builds your customer data warehouse and your identity graph on your data warehouse, with support for Snowflake, Google BigQuery, Amazon Redshift, and more. Their SDKs and plugins make event streaming easy, and their integrations with cloud applications like Salesforce and ZenDesk help you go beyond event streaming. With RudderStack you can use all of your customer data to answer more difficult questions and then send those insights to your whole customer data stack. Sign up free at dataengineeringpodcast.com/rudder today.

Your host is Tobias Macey and today I’m interviewing Max Beauchemin, Lior Gavish, and Kevin Stumpf about the real world challenges of embracing DataOps practices and systems, and how to keep things secure as you scale.

Interview

Introduction
How did you get involved in the area of data management?
Before we get started, can you each give your definition of what "DataOps" means to you?
How does this differ from "business as usual" in the data industry?
What are some of the things that DataOps isn’t (despite what marketers might say)?
What are the biggest difficulties that you have faced in going from concept to production with a workflow or system intended to power self-serve access to other members