talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
DataOps, Observability, and The Cure for Data Team Blues - Christopher Bergh
2024-08-15 · 08:07
Johanna Berer
– Host/Interviewer
@ DataTalks.Club
,
Christopher Bergh
– CEO and Founder
@ DataKitchen
0:00 hi everyone Welcome to our event this event is brought to you by DataTalks.Club which is a community of people who love 0:06 data and we have weekly events and today's one is one of such events and I guess we 0:12 are also a community of people who like to wake up early if you're from the States right Christopher or maybe not so 0:19 much because this is the time we usually have uh uh our events uh for our guests 0:27 and presenters from the States we usually do it in the evening of Berlin time but yes unfortunately it kind of 0:34 slipped my mind but anyways we have a lot of events you can check them in the 0:41 description like there's a link um I don't think there are a lot of them right now on that link but we will be 0:48 adding more and more I think we have like five or six uh interviews scheduled so um keep an eye on that do not forget 0:56 to subscribe to our YouTube channel this way you will get notified about all our future streams that will be as awesome 1:02 as the one today and of course very important do not forget to join our community where you can hang out with 1:09 other data enthusiasts during today's interview you can ask any question there's a pinned link in live chat so click 1:18 on that link ask your question and we will be covering these questions during the interview now I will stop sharing my 1:27 screen and uh there is there's a a message in uh and Christopher is from 1:34 you so we actually have this on YouTube but so they have not seen what you wrote 1:39 but there is a message from to anyone who's watching this right now from Christopher saying hello everyone can I 1:46 call you Chris or you okay I should go I should uh I should look on YouTube then okay yeah but anyways I'll you don't 1:53 need like you we'll need to focus on answering questions and I'll keep an eye 1:58 I'll be keeping an eye on all the questions so um 2:04 yeah if you're ready we can start I'm ready yeah and you prefer Christopher 2:10 not Chris right Chris 
is fine Chris is fine it's a bit shorter um 2:18 okay so this week we'll talk about DataOps again maybe it's a tradition that we talk about DataOps every like once per 2:25 year but we actually skipped one year so because we did not have we haven't had 2:31 Chris for some time so today we have a very special guest Christopher Christopher is the co-founder CEO and 2:37 head chef or head cook at DataKitchen with 25 years of experience maybe this 2:43 is outdated uh cuz probably now you have more and maybe you stopped counting I 2:48 don't know but like with tons of years of experience in analytics and software engineering Christopher is known as the 2:55 co-author of the DataOps Cookbook and DataOps Manifesto and it's not the 3:00 first time we have Christopher here on the podcast we interviewed him two years ago also about DataOps and this one 3:07 will be about DataOps so we'll catch up and see what actually changed in in 3:13 these two years and yeah so welcome to the interview well thank you for having 3:19 me I'm I'm happy to be here and talking all things related to DataOps and why 3:24 why why bother with DataOps and happy to talk about the company or or what's changed 3:30 excited yeah so let's dive in so the questions for today's interview are prepared by Johanna Berer as always 3:37 thanks Johanna for your help so before we start with our main topic for today 3:42 DataOps uh let's start with your background can you tell us about your career journey so far and also for those who 3:50 have not heard have not listened to the previous podcast maybe you can um talk 3:55 about yourself and also for those who did listen to the previous you can also maybe give a summary of what has changed 4:03 in the last two years so we'll do yeah so um my name is Chris so I guess I'm 4:09 a sort of an engineer so I spent about the first 15 years of my career in 4:15 software sort of working and building some AI systems some non-AI systems uh 4:21 at uh US's NASA and MIT 
Lincoln Lab and then some startups and then um 4:30 Microsoft and then about 2005 I got I got the data bug uh I think you know my 4:35 kids were small and I thought oh this data thing was easy and I'd be able to go home uh for dinner at 5 and life 4:41 would be fine um because I was a big you started your own company right and uh it didn't work out that way 4:50 and um and what was interesting is is for me it the problem wasn't doing the 4:57 data like I we had smart people who did data science and data engineering the act of creating things it was like the 5:04 systems around the data that were hard um things it was really hard to not have 5:11 errors in production and I would sort of be driving to work and I had a BlackBerry at the time and I would not look at my 5:18 BlackBerry all all morning I had this long drive to work and I'd sit in the parking lot and take a deep breath and 5:24 look at my BlackBerry and go uh oh is there going to be any problems today and I'd be and if there wasn't I'd walk in 5:30 very happy um and if there was I'd have to like brace myself um and you know and 5:36 then the second problem is the team I worked for we just couldn't go fast enough the customers were super 5:42 demanding they didn't care they all they always thought things should be faster and we were always behind and so um how 5:50 do you you know how do you live in that world where things are breaking left and right you're terrified of making errors 5:57 um and then second you just can't go fast enough um and it's pre-Hadoop era 6:02 right it's like before all this big data tech yeah before this was we were using 6:08 uh SQL Server um and we actually you know we had smart people so we we we 6:14 built an engine in SQL Server that made SQL Server a columnar 6:20 database so we built a columnar database inside of SQL Server um so uh 6:26 in order to make certain things fast and and uh yeah it was it was really uh it's not 6:33 bad I mean the principles are the same right before 
Hadoop it's it's still a database there's still indexes there's 6:38 still queries um things like that we we uh at the time uh you would use OLAP 6:43 engines we didn't use those but you those reports you know are for models it's it's not that different um you know 6:50 we had a rack of servers instead of the cloud um so yeah and I think so what what I 6:57 took from that was uh it's just hard to run a team of people to do do data and analytics and it's not 7:05 really I I took it from a manager perspective I started to read Deming and 7:11 think about the work that we do as a factory you know and in a factory that produces insight and not automobiles um 7:18 and so how do you run that factory so it produces things that are good of good 7:24 quality and then second since I had come from software I've been very influenced 7:29 by by the DevOps movement how you automate deployment how you run in an agile way how you 7:35 produce um how you how you change things quickly and how you innovate and so 7:41 those two things of like running you know running a really good solid production line that has very low errors 7:47 um and then second changing that production line at at very very often they're kind of opposite right um and so 7:55 how do you how do you as a manager how do you technically approach that and 8:00 then um 10 years ago when we started DataKitchen um we've always been a profitable company and so we started off 8:07 uh with some customers we started building some software and realized that we couldn't work any other way and that 8:13 the way we work wasn't understood by a lot of people so we had to write a book and a manifesto to kind of share our our 8:21 methods and then so yeah we've been in so we've been in business now about a little over 10 8:28 years oh that's cool and uh like what 8:33 uh so let's talk about DataOps and you mentioned DevOps and how you were inspired by that and by the way like do 8:41 you remember roughly when DevOps as a thing 
started to appear like when did people start calling these principles 8:49 and like tools around them as de yeah so Agile Manifesto well first of all the I 8:57 mean I had a boss in 1990 at NASA who had this idea build a 9:03 little test a little learn a lot right that was his mantra and then which made 9:09 made a lot of sense um and so and then the sort of agile software manifesto 9:14 came out which is very similar in 2001 and then um the sort of first real 9:22 DevOps was a guy at Twitter started to do automated deployment you know 9:27 push a button and that was like 2009-ish and so the first I think DevOps 9:33 meetup was around then so it's it's it's been 15 years I guess 6 like I was 9:39 trying to so I started my career in 2010 so I my first job was a Java 9:44 developer and like I remember for some things like we would just uh SFTP to the 9:52 machine and then put the jar archive there and then like keep our fingers crossed that it doesn't break uh uh like 10:00 it was not really the I wouldn't call it this way right you were deploying you 10:06 had a deploy process I put it yeah 10:11 right was that so that was documented too it was like put the jar on production cross your 10:17 fingers I think there was uh like a page on uh some internal wiki uh yeah that 10:25 describes like with passwords and don't like what you should do yeah that was and and I think what's interesting is 10:33 why that changed right and and we laugh at it now but that was why didn't you 10:38 invest in automating deployment or a whole bunch of automated regression 10:44 tests right that would run because I think in software now that would be rare 10:49 that people wouldn't use CI/CD they wouldn't have some automated tests you know functional 10:56 regression tests that would be the |
DataTalks.Club |
|
Chris Bergh - DataOps Deep Dive
2024-08-06 · 06:12
Chris Bergh
– Head Chef
@ Data Kitchen
,
Joe Reis
– founder
@ Ternary Data
Chris Bergh joins me to chat about all things DataOps. We also discuss lean, removing waste from data processes and teams, and much more. DataKitchen: https://datakitchen.io/ DataOps Manifesto: https://dataopsmanifesto.org/en/ |
The Joe Reis Show |
|
The Evolution of DataOps: Insights from DataKitchen's CEO
2024-08-04 · 19:40
Chris Bergh
– CEO
@ DataKitchen
,
Tobias Macey
– host
Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. Chris delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability. He emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers Interview Introduction How did you get involved in the area of data management? Can you describe what DataKitchen is and the story behind it? You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today? Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen? The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data? What are the challenges that never went away? You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects? What are the areas of overlap with existing tools and what are the unique capabilities that you are offering? Can you talk through the technical implementation of your new observability and quality testing platform? What does the onboarding and integration process look like? Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday? What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps? What do you have planned for the future of your work at DataKitchen? Contact Info LinkedIn Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Links DataKitchen Podcast Episode NASA DataOps Manifesto Data Reliability Engineering Data Observability dbt DevOps Enterprise Summit Building The Data Warehouse by Bill Inmon (affiliate link) dataops-testgen, dataops-observability Free Data Quality and Data Observability Certification Databricks DORA Metrics DORA for data The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
DataOps, Observability, and The Cure for Data Team Blues
2024-07-16 · 10:30
Boosting Productivity and Navigating Challenges with LLMs - Christopher Bergh About the event Outline:
About the speaker: Christopher Bergh is the CEO and Head Chef at DataKitchen. Chris has over 30 years of experience in research, software engineering, data analytics, and executive management. At various points in his career, he has been a COO, CTO, VP, and Director of engineering. Chris has an M.S. from Columbia University and a B.S. from the University of Wisconsin-Madison. Chris is a recognized expert on DataOps. He is the co-author of the DataOps Cookbook and DataOps Manifesto and a speaker on DataOps at many industry conferences. Chris began his career at the Massachusetts Institute of Technology's Lincoln Laboratory and NASA Ames Research Center. There, he created software and algorithms that provided aircraft arrival optimization at several major airports in the United States. Chris served as a Peace Corps Volunteer Math Teacher in Botswana, Africa. DataTalks.Club is the place to talk about data. Join our Slack community! |
DataOps, Observability, and The Cure for Data Team Blues
|
|
Taming Complexity In Your Data Driven Organization With DataOps
2020-04-28 · 02:00
Chris Bergh
– Head Chef
@ Data Kitchen
,
Tobias Macey
– host
Summary Data is a critical element to every role in an organization, which is also what makes managing it so challenging. With so many different opinions about which pieces of information are most important, how it needs to be accessed, and what to do with it, many data projects are doomed to failure. In this episode Chris Bergh explains how taking an agile approach to delivering value can drive down the complexity that grows out of the varied needs of the business. Building a DataOps workflow that incorporates fast delivery of well defined projects, continuous testing, and open lines of communication is a proven path to success. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! If DataOps sounds like the perfect antidote to your pipeline woes, DataKitchen is here to help. DataKitchen’s DataOps Platform automates and coordinates all the people, tools, and environments in your entire data analytics organization – everything from orchestration, testing and monitoring to development and deployment. In no time, you’ll reclaim control of your data pipelines so you can start delivering business value instantly, without errors. Go to dataengineeringpodcast.com/datakitchen today to learn more and thank them for supporting the show! 
Your host is Tobias Macey and today I’m welcoming back Chris Bergh to talk about ways that DataOps principles can help to reduce organizational complexity Interview Introduction How did you get involved in the area of data management? How are typical data and analytic teams organized? What are their roles and structure? Can you start by giving an outline of the ways that complexity can manifest in a data organization? What are some of the contributing factors that generate this complexity? How does the size or scale of an organization and their data needs impact the segmentation of responsibilities and roles? How does this organizational complexity play out within a single team? For example between data engineers, data scientists, and production/operations? How do you approach the definition of useful interfaces between different roles or groups within an organization? What are your thoughts on the relationship between the multivariate complexities of data and analytics workflows and the software trend toward microservices as a means of addressing the challenges of organizational communication patterns in the software lifecycle? How does this organizational complexity play out between multiple teams? For example between centralized data team and line of business self service teams? Isn’t organizational complexity just ‘the way it is’? Is there any hope in getting out of meetings and inter team conflict? What are some of the technical elements that are most impactful in reducing the time to delivery for different roles? What are some strategies that you have found to be useful for maintaining a connection to the business need throughout the different stages of the data lifecycle? What are some of the signs or symptoms of problematic complexity that individuals and organizations should keep an eye out for? What role can automated testing play in improving this process? How do the current set of tools contribute to the fragmentation of data wor |
|
|
A DataOps vs DevOps Cookoff In The Data Kitchen
2019-03-18 · 10:00
Chris Bergh
– Head Chef
@ Data Kitchen
,
Tobias Macey
– host
Summary Delivering a data analytics project on time and with accurate information is critical to the success of any business. DataOps is a set of practices to increase the probability of success by creating value early and often, and using feedback loops to keep your project on course. In this episode Chris Bergh, head chef of Data Kitchen, explains how DataOps differs from DevOps, how the industry has begun adopting DataOps, and how to adopt an agile approach to building your data platform. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams. If you are tired of wasting your time cobbling together scripts and workarounds to give your developers, data scientists, and managers the permissions that they need then it’s time to talk to our friends at strongDM. They have built an easy to use platform that lets you leverage your company’s single sign on for your data platform. Go to dataengineeringpodcast.com/strongdm today to find out how you can simplify your systems. 
"There aren’t enough data conferences out there that focus on the community, so that’s why these folks built a better one": Data Council is the premier community powered data platforms & engineering event for software engineers, data engineers, machine learning experts, deep learning researchers & artificial intelligence buffs who want to discover tools & insights to build new products. This year they will host over 50 speakers and 500 attendees (yeah that’s one of the best "Attendee:Speaker" ratios out there) in San Francisco on April 17-18th and are offering a $200 discount to listeners of the Data Engineering Podcast. Use code: DEP-200 at checkout You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to dataengineeringpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat Your host is Tobias Macey and today I’m interviewing Chris Bergh about the current state of DataOps and why it’s more than just DevOps for data Interview Introduction How did you get involved in the area of data management? We talked last year about what DataOps is, but can you give a quick overview of how the industry has changed or updated the definition since then? 
It is easy to draw parallels between DataOps and DevOps, can you provide some clarity as to how they are different? How has the conversat |
|
|
Defining DataOps with Chris Bergh - Episode 26
2018-04-08 · 21:00
Christopher Bergh
– CEO and Founder
@ DataKitchen
,
Tobias Macey
– host
Summary Managing an analytics project can be difficult due to the number of systems involved and the need to ensure that new information can be delivered quickly and reliably. That challenge can be met by adopting practices and principles from lean manufacturing and agile software development, and the cross-functional collaboration, feedback loops, and focus on automation in the DevOps movement. In this episode Christopher Bergh discusses ways that you can start adding reliability and speed to your workflow to deliver results with confidence and consistency. Preamble Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey and today I’m interviewing Christopher Bergh about DataKitchen and the rise of DataOps Interview Introduction How did you get involved in the area of data management? How do you define DataOps? How does it compare to the practices encouraged by the DevOps movement? 
How does it relate to or influence the role of a data engineer? How does a DataOps oriented workflow differ from other existing approaches for building data platforms? One of the aspects of DataOps that you call out is the practice of providing multiple environments to provide a platform for testing the various aspects of the analytics workflow in a non-production context. What are some of the techniques that are available for managing data in appropriate volumes across those deployments? The practice of testing logic as code is fairly well understood and has a large set of existing tools. What have you found to be some of the most effective methods for testing data as it flows through a system? One of the practices of DevOps is to create feedback loops that can be used to ensure that business needs are being met. What are the metrics that you track in your platform to define the value that is being created and how the various steps in the workflow are proceeding toward that goal? In order to keep feedback loops fast it is necessary for tests to run quickly. How do you balance the need for larger quantities of data to be used for verifying scalability/performance against optimizing for cost and speed in non-production environments? How does the DataKitchen platform simplify the process of operationalizing a data analytics workflow? As the need for rapid iteration and deployment of systems to capture, store, process, and analyze data becomes more prevalent how do you foresee that feeding back into the ways that the landscape of data tools are designed and developed? Contact Info LinkedIn @ChrisBergh on Twitter Email Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Links DataOps Manifesto DataKitchen 2017: The Year Of DataOps Air Traffic Control Chief Data Officer (CDO) Gartner W. 
Edwards Deming DevOps Total Quality Management (TQM) Informatica Talend Agile Development Cattle Not Pets IDE (Integrated Devel |
|