talk-data.com

Topic: Analytics

Tags: data_analysis, insights, metrics

4552 tagged activities

Activity Trend: peak of 398 activities per quarter (2020-Q1 to 2026-Q1)

Activities

4552 activities · Newest first

The rapid rise of generative AI is changing how businesses operate, but with this change comes new challenges. How do you navigate the balance between innovation and risk, especially in a regulated industry? As organizations race to adopt AI, it’s crucial to ensure that these technologies are not only transformative but also responsible. What steps can you take to harness AI’s potential while maintaining control and transparency? And how can you build excitement and trust around AI within your organization, ensuring that everyone is ready to embrace this new era?

Steve Holden is the Senior Vice President and Head of Single-Family Analytics at Fannie Mae, leading a team of data science professionals supporting loan underwriting, pricing and acquisition, securitization, loss mitigation, and loan liquidation for the company’s multi-trillion-dollar Single-Family mortgage portfolio. He is also responsible for all Generative AI initiatives across the enterprise. His team provides real-time analytic solutions that guide thousands of daily business decisions necessary to manage this extensive mortgage portfolio. The team comprises experts in econometric models, machine learning, data engineering, data visualization, software engineering, and analytic infrastructure design. Holden previously served as Vice President of Credit Portfolio Management Analytics at Fannie Mae. Before joining Fannie Mae in 1999, he held several analytic leadership roles and worked on economic issues at the Economic Strategy Institute and the U.S. Bureau of Labor Statistics.

In the episode Adel and Steve explore opportunities in generative AI, building a GenAI program, use-case prioritization, driving excitement and engagement for an AI-first culture, skills transformation, governance as a competitive advantage, challenges of scaling AI, future trends in AI, and much more.

Links Mentioned in the Show:
Fannie Mae
Steve’s recent DataCamp Webinar: Bringing Generative AI to the Enterprise
Video: Andrej Karpathy - [1hr Talk] Intro to Large Language Models
Skill Track - AI Business Fundamentals
Related Episode: Generative AI at EY with John Thompson, Head of AI at EY
Rewatch sessions from RADAR: AI Edition

Join the DataFramed team! Data Evangelist · Data & AI Video Creator

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

podcast_episode
by Cris deRitis, Mark Zandi (Moody's Analytics), Marisa DiNatale (Moody's Analytics), Emily Mandel (Moody's Analytics)

The Inside Economics team discusses what they found most encouraging and disquieting in the blizzard of economic releases and events of this past week. Emily Mandel, our state and local government expert, also weighs in on the fiscal health of states and how the economy is performing in states that stand to swing the Presidential election. The group also takes up listener questions; keep the Qs coming.   Guest: Emily Mandel - Associate Director, Senior Economist, Moody's Analytics Hosts: Mark Zandi – Chief Economist, Moody’s Analytics, Cris deRitis – Deputy Chief Economist, Moody’s Analytics, and Marisa DiNatale – Senior Director - Head of Global Forecasting, Moody’s Analytics Follow Mark Zandi on 'X' @MarkZandi, Cris deRitis on LinkedIn, and Marisa DiNatale on LinkedIn

Questions or Comments, please email us at [email protected]. We would love to hear from you.    To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

“Last week was a great year in GenAI,” jokes Mark Ramsey—and it’s a great philosophy to have as LLM tools especially continue to evolve at such a rapid rate. This week, you’ll get to hear my fun and insightful chat with Mark from Ramsey International about the world of large language models (LLMs) and how we make useful UXs out of them in the enterprise. 

Mark shared some fascinating insights about using a company’s website information (data) as a place to pilot an LLM project, avoiding privacy landmines, and how re-ranking of models leads to better LLM response accuracy. We also talked about the importance of real human testing to ensure LLM chatbots and AI tools truly delight users. From amusing anecdotes about the spinning beach ball on macOS to envisioning a future where AI-driven chat interfaces outshine traditional BI tools, this episode is packed with forward-looking ideas and a touch of humor.
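To make the re-ranking idea concrete, here is a minimal retrieve-then-re-rank sketch in Python. It is not Ramsey International's implementation; the model name, candidate passages, and the upstream retrieval step are placeholder assumptions, using the sentence-transformers cross-encoder API.

```python
# Sketch: rough retrieval first, then a cross-encoder re-ranks the candidates
# so only the most relevant passages are sent to the LLM as context.
from sentence_transformers import CrossEncoder

query = "What does the premium support plan include?"
candidates = [  # e.g. top hits from a keyword or vector search over site content
    "Premium support includes 24/7 phone and chat coverage.",
    "Our office is closed on public holidays.",
    "Support tickets are answered within one business day on the basic plan.",
]

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, passage) for passage in candidates])

# Keep the highest-scoring passages as context for the LLM prompt
ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
for passage, score in ranked[:2]:
    print(f"{score:.3f}  {passage}")
```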

Highlights/ Skip to:

(0:50) Why is the world of GenAI evolving so fast? (4:20) How Mark thinks about UX in an LLM application (8:11) How Mark defines “Specialized GenAI” (12:42) Mark’s consulting work with GenAI / LLMs these days (17:29) How GenAI can help the healthcare industry (30:23) Uncovering users’ true feelings about LLM applications (35:02) Are UIs moving backwards as models progress forward? (40:53) How will GenAI impact data and analytics teams? (44:51) Will LLMs be able to consistently leverage RAG and produce proper SQL? (51:04) Where you can find more from Mark and Ramsey International

Quotes from Today’s Episode

“With [GenAI], we have a solution that we’ve built to try to help organizations, and build workflows. We have a workflow that we can run and ask the same question [to a variety of GenAI models] and see how similar the answers are. Depending on the complexity of the question, you can see a lot of variability between the models… [and] we can also run the same question against the different versions of the model and see how it’s improved. Folks want a human-like experience interacting with these models… [and] if the model can start responding in just a few seconds, that gives you much more of a conversational type of experience.” - Mark Ramsey (2:38)

“[People] don’t understand when you interact [with GenAI tools] and it brings tokens back in that streaming fashion, you’re actually seeing inside the brain of the model. Every token it produces is then displayed on the screen, and it gives you that typewriter experience back in the day. If someone has to wait, and all you’re seeing is a logo spinning, from a UX experience standpoint… people feel like the model is much faster if it just starts to produce those results in that streaming fashion. I think in a design, it’s extremely important to take advantage of that [...] as opposed to waiting until the end and delivering the results. Some models support that, and other models don’t.” - Mark Ramsey (4:35)

“All of the data that’s on the website is public information. We’ve done work with several organizations on quickly taking the data that’s on their website, packaging it up into a vector database, and making that be the source for questions that their customers can ask. [Organizations] publish a lot of information on their websites, but people really struggle to get to it. We’ve seen a lot of interest in vectorizing website data, making it available, and having a chat interface for the customer. The customer can ask questions, and it will take them directly to the answer, and then they can use the website as the source information.” - Mark Ramsey (14:04)

“I’m not skeptical at all. I’ve changed much of my [AI chatbot searches] to Perplexity, and I think it’s doing a pretty fantastic job overall in terms of quality. It’s returning an answer with citations, so you have a sense of where it’s sourcing the information from. I think it’s important from a user experience perspective. This is a replacement for broken search, as I really don’t want to read all the web pages and PDFs you have that might be about my chiropractic care query to answer my actual [healthcare] question.” - Brian O’Neill (19:22)

“We’ve all had great experience with customer service, and we’ve all had situations where the customer service was quite poor, and we’re going to have that same thing as we begin to [release more] chatbots. We need to make sure we try to alleviate having those bad experiences, and have an exit. If someone is running into a situation where they’d rather talk to a live person, have that ability to route them to someone else. That’s why the robustness of the model is extremely important in the implementation… and right now, organizations like OpenAI and Anthropic are significantly better at that [human-like] experience.” - Mark Ramsey (23:46)

“There’s two aspects of these models: the training aspect and then using the model to answer questions. I recommend to organizations to always augment their content and don’t just use the training data. You’ll still get that human-like experience that’s built into the model, but you’ll eliminate the hallucinations. If you have a model that has been set up correctly, you shouldn’t have to ask questions in a funky way to get answers.” - Mark Ramsey (39:11)

“People need to understand GenAI is not a predictive algorithm. It is not able to run predictions, it struggles with some math, so that is not the focus for these models. What’s interesting is that you can use the model as a step to get you [the answers]. A lot of the models now support functions… when you ask a question about something that is in a database, it actually uses its knowledge about the schema of the database. It can build the query, run the query to get the data back, and then once it has the data, it can reformat the data into something that is a good response back.” - Mark Ramsey (42:02)
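To illustrate the query-building pattern Mark describes at 42:02, here is a minimal, hedged sketch: the model drafts SQL from the schema, the application runs it, and the model turns the rows into an answer. The llm_generate_sql and llm_summarize helpers are hypothetical stand-ins for whatever model API you use, and the orders table is invented for the example.

```python
import sqlite3

# Hypothetical stand-ins for LLM calls; swap in your model provider of choice.
def llm_generate_sql(question: str, schema: str) -> str:
    # A real implementation would prompt the model with the schema and the question
    # and ask it to return a single SQL statement.
    return "SELECT region, SUM(amount) FROM orders WHERE status = 'open' GROUP BY region"

def llm_summarize(question: str, rows) -> str:
    # A real implementation would ask the model to phrase the rows as a readable answer.
    return f"{len(rows)} region totals found for: {question}"

schema = "CREATE TABLE orders (region TEXT, amount REAL, status TEXT)"
conn = sqlite3.connect(":memory:")
conn.execute(schema)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [("West", 120.0, "open"), ("East", 80.0, "closed"), ("West", 40.0, "open")])

question = "What is the total value of open orders by region?"
sql = llm_generate_sql(question, schema)   # 1. model drafts SQL from the schema
rows = conn.execute(sql).fetchall()        # 2. application runs the query
print(llm_summarize(question, rows))       # 3. model reformats the rows into a response
```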

Links: Mark on LinkedIn · Ramsey International · Email: mark [at] ramsey.international · Ramsey International's YouTube Channel

In this episode, we explore the incredible story of Zach Wilson, a data engineer who skyrocketed his annual salary from $30,000 to $500,000 in just five years. Zach shares his experiences working for top tech companies like Facebook, Netflix, and Airbnb and details his transformation from a struggling drug user to a highly successful engineer.

Connect with Zach Wilson: 🤝 Follow on LinkedIn ▶️ Subscribe on YouTube 📩 Subscribe to Newsletter 🎒 Learn About Bootcamp

Connect with Avery: 📊 Come to my next free “How to Land Your First Data Job” training 🏫 Check out my 10-week data analytics bootcamp 📩 Get my weekly email with helpful data career tips 🧙‍♂️ Ace the Interview with Confidence

Timestamps: (4:29) Key to Career Success (10:39) Is Job Hopping Worth It? (18:16) Data Analyst to Data Engineer (22:19) The Rise of Analytics Engineers (32:21) Zach’s Upbringing (37:32) The Secret to Growth

Connect with Avery: 📺 Subscribe on YouTube 🎙 Listen to My Podcast 👔 Connect with me on LinkedIn 📸 Instagram 🎵 TikTok

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

podcast_episode
by Cris deRitis, Mark Zandi (Moody's Analytics), Brendan LaCerda, Marisa DiNatale (Moody's Analytics)

Brendan LaCerda (not Canadian) joins Mark, Cris & Marisa to discuss a plethora of topics, including the Canadian railroad workers strike, the revisions to the employment numbers, Jerome Powell’s Jackson Hole speech, and the Moody’s Analytics election model. The team takes a few thought-provoking listener questions on the housing market and the savings rate.   Reach out to us with any questions or suggestions at [email protected] Hosts: Mark Zandi – Chief Economist, Moody’s Analytics, Cris deRitis – Deputy Chief Economist, Moody’s Analytics, and Marisa DiNatale – Senior Director - Head of Global Forecasting, Moody’s Analytics Follow Mark Zandi on 'X' @MarkZandi, Cris deRitis on LinkedIn, and Marisa DiNatale on LinkedIn

Questions or Comments, please email us at [email protected]. We would love to hear from you.    To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Get ready for an innovative and disruptive experience! In the first AI-dubbed episode of the Data Hackers Podcast, we explore the most current and relevant debates around MLOps vs. LLMOps. This episode marks a new era for our podcast, bringing content in Portuguese with the precision and naturalness of AI-generated voices.

In this episode of Data Hackers, the largest AI and Data Science community in Brazil, meet Demetrios Brinkmann, co-founder of the MLOps Community, who shares his experiences and insights on how day-to-day work in the field has evolved over time. We discuss the main technical differences between MLOps and LLMOps, and what they mean for professionals who work, or want to work, in this area.

If you are passionate about data, want to understand the latest trends, and want to stay up to date on the future of MLOps, you can't miss this episode!

As a reminder, you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcasts, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in this post!

Our Data Hackers panel:

Paulo Vasconcellos — Co-founder of Data Hackers and Principal Data Scientist at Hotmart. Monique Femme — Head of Community Management at Data Hackers. Gabriel Lages — Co-founder of Data Hackers and Data & Analytics Sr. Director at Hotmart.

References on Medium.

Microsoft Excel is the most widely used data analysis tool on the planet, but somehow it doesn't get the love it deserves.   In this episode, Sonali and Chris talk about how they leverage Excel in their data roles at prominent companies, and share practical tips you can use to add more value to your own organization.   You'll leave the show with an insider's look at where Excel can make a big impact, and actionable advice about what to learn next to take your game to the next level.   What You'll Learn: What makes Excel the most versatile tool used by data professionals How Chris and Sonali are leveraging Excel to drive value at their organizations Tips for where to focus to get the most out of Excel   Register for free to be part of the next live session: https://bit.ly/3XB3A8b   About our guests: Sonali Kumar works as a Data Analyst specializing in HR Analytics. She is also a podcast host of "Success Beyond Our Greatest Fears." Follow Sonali on LinkedIn  

Chris French is a Data Advisor, LinkedIn Learning Instructor, and founder of DataFrenchy Academy Follow Chris on LinkedIn   Follow us on Socials: LinkedIn YouTube Instagram (Mavens of Data) Instagram (Maven Analytics) TikTok Facebook Medium X/Twitter

LLMs and Generative AI for Healthcare

Large language models (LLMs) and generative AI are rapidly changing the healthcare industry. These technologies have the potential to revolutionize healthcare by improving the efficiency, accuracy, and personalization of care. This practical book shows healthcare leaders, researchers, data scientists, and AI engineers the potential of LLMs and generative AI today and in the future, using storytelling and illustrative use cases in healthcare. Authors Kerrie Holley and Manish Mathur, former Google healthcare technology leaders, guide you through the transformative potential of large language models (LLMs) and generative AI in healthcare. From personalized patient care and clinical decision support to drug discovery and public health applications, this comprehensive exploration covers real-world uses and future possibilities of LLMs and generative AI in healthcare.

With this book, you will:
Understand the promise and challenges of LLMs in healthcare
Learn the inner workings of LLMs and generative AI
Explore automation of healthcare use cases for improved operations and patient care using LLMs
Dive into patient experiences and clinical decision-making using generative AI
Review future applications in pharmaceutical R&D, public health, and genomics
Understand ethical considerations and responsible development of LLMs in healthcare

"The authors illustrate generative AI's impact on drug development, presenting real-world examples of its ability to accelerate processes and improve outcomes across the pharmaceutical industry." --Harsh Pandey, VP, Data Analytics & Business Insights, Medidata-Dassault

Kerrie Holley is a retired Google tech executive, IBM Fellow, and VP/CTO at Cisco. Holley's extensive experience includes serving as the first Technology Fellow at United Health Group (UHG), Optum, where he focused on advancing and applying AI, deep learning, and natural language processing in healthcare. Manish Mathur brings over two decades of expertise at the crossroads of healthcare and technology. A former executive at Google and Johnson & Johnson, he now serves as an independent consultant and advisor. He guides payers, providers, and life sciences companies in crafting cutting-edge healthcare solutions.

Polars Cookbook

Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows.

What this book will help me do: Master the basics of Python Polars, including installation and setup. Perform complex data manipulation like pivoting, grouping, and joining. Handle large-scale time series data for accurate analysis. Understand data integration with libraries like pandas and numpy. Optimize workflows for both on-premise and cloud environments.

Author(s): Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter.

Who is it for? This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed.
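As a taste of the kind of recipe the book covers, here is a minimal Polars sketch of grouping, joining, and lazy scanning. It assumes a recent Polars release; the column names and the orders.csv file are made up for illustration, not taken from the book.

```python
import polars as pl

sales = pl.DataFrame({
    "region": ["West", "West", "East", "East"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "revenue": [100, 120, 90, 95],
})

# Group and aggregate
totals = sales.group_by("region").agg(pl.col("revenue").sum().alias("total_revenue"))

# Join against another frame and derive a new column
targets = pl.DataFrame({"region": ["West", "East"], "target": [250, 200]})
report = totals.join(targets, on="region", how="left").with_columns(
    (pl.col("total_revenue") / pl.col("target")).alias("attainment")
)
print(report)

def lazy_example() -> pl.LazyFrame:
    # Lazy API: build a query plan over a (hypothetical) CSV; nothing runs until .collect()
    return (
        pl.scan_csv("orders.csv")
        .filter(pl.col("amount") > 0)
        .group_by("customer_id")
        .agg(pl.col("amount").sum())
    )
```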

In this brilliant episode, Jason Foster delves into the transformative impact of artificial intelligence on businesses and everyday life with Tom Goodwin, a world-renowned trends and transformation expert, keynote speaker, consultant, and author. Tom shares his expert insights on how technology reshapes the rules of business, creates new possibilities, and influences consumer behaviour. From discussing the hype surrounding AI to exploring its practical applications and the importance of adapting to technological advancements, this conversation offers a comprehensive look at the future of AI in the corporate world and beyond.


Cynozure is a leading data, analytics and AI company that helps organisations to reach their data potential. They work with clients on data and AI strategy, data management, data architecture and engineering, analytics and AI, data culture and literacy, and change management and leadership. The company was named one of The Sunday Times' fastest-growing private companies in 2022 and 2023 and named the Best Place to Work in Data by DataIQ in 2023.

Learn three powerful resume hacks to navigate the applicant tracking system (ATS) and significantly increase your chances of landing a data job. Discover how changing your job titles, editing your job experiences, and creatively enhancing bullet points on your resume can help you get noticed in an oversaturated job market.

📊 Come to my next free “How to Land Your First Data Job” training 🏫 Check out my 10-week data analytics bootcamp 📩 Get my weekly email with helpful data career tips 🧙‍♂️ Ace the Interview with Confidence

Timestamps: 01:21 Hack #1 05:12 Hack #2 07:50 Hack #3

Connect with Avery: 📺 Subscribe on YouTube 🎙 Listen to My Podcast 👔 Connect with me on LinkedIn 📸 Instagram 🎵 TikTok

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

DuckDB in Action

Dive into DuckDB and start processing gigabytes of data with ease—all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB—all through hands-on examples.

Open up DuckDB in Action and learn how to: Read and process data from CSV, JSON, and Parquet sources, both local and remote. Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables. Use DuckDB from Python, both with SQL and its "Relational" API, interacting with databases as well as data frames. Prepare, ingest, and query large datasets. Build cloud data pipelines. Extend DuckDB with custom functionality.

Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation—you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines.

About the Technology: DuckDB makes data analytics fast and fun! You don’t need to set up a Spark cluster or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite, and Postgres.

About the Book: DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action.

What's Inside: Prepare, ingest, and query large datasets. Build cloud data pipelines. Extend DuckDB with custom functionality. Fast-paced SQL recap: from simple queries to advanced analytics.

About the Reader: For data pros comfortable with Python and CLI tools.

About the Authors: Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

Quotes:
"I use DuckDB every day, and I still learned a lot about how DuckDB makes things that are hard in most databases easy!" - Jordan Tigani, Founder, MotherDuck
"An excellent resource! Unlocks possibilities for storing, processing, analyzing, and summarizing data at the edge using DuckDB." - Pramod Sadalage, Director, Thoughtworks
"Clear and accessible. A comprehensive resource for harnessing the power of DuckDB for both novices and experienced professionals." - Qiusheng Wu, Associate Professor, University of Tennessee
"Excellent! The book all we ducklings have been waiting for!" - Gunnar Morling, Decodable
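A minimal sketch of the laptop-scale workflow the book describes: query a file in place with SQL from Python and pull the result back as a data frame. The file and column names are invented for the example; read_csv_auto and the connect/execute/fetchdf calls are standard DuckDB.

```python
import duckdb

con = duckdb.connect()  # in-memory database; pass a file path to persist it

# Query a CSV file directly with SQL, no separate ingestion step required
# (trips.csv and its columns are hypothetical)
con.execute("CREATE TABLE trips AS SELECT * FROM read_csv_auto('trips.csv')")

top_zones = con.execute("""
    SELECT pickup_zone,
           count(*)         AS n_trips,
           avg(fare_amount) AS avg_fare
    FROM trips
    GROUP BY pickup_zone
    ORDER BY n_trips DESC
    LIMIT 10
""").fetchdf()  # returns a pandas DataFrame

print(top_zones)
```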

Rehgan Bleile joins me to chat about the challenges and importance of AI governance and adoption. We also discuss the lack of representation of women in conferences, and efforts to create genuine opportunities for women speakers. 
 AlignAI: getalignai.com LinkedIn: https://www.linkedin.com/in/rehganavon/ Women in Analytics: https://www.womeninanalytics.com/

Doing sales better is perhaps the most direct route to making more revenue, so it should be a priority for every business. B2B sales is often very complex, with a mix of emails and video calls and prospects interacting with your website and social content. And you often have multiple people making decisions about a purchase. All this generates massive data—or, more accurately, a mess of data—which very few sales teams manage to harness effectively. How can sales teams make use of data, software, and AI to clean up this mess, work more effectively, and most of all, crush those quarterly targets?

Ellie Fields is the Chief Product and Engineering Officer at Salesloft, leading Product Management, Engineering, and Design. Ellie previously led development teams at Tableau responsible for product strategy and engineering for the collaboration and mobile portfolio. Ellie also launched and led Tableau Public.

In the episode Richie and Ellie explore the digital transformation of sales, how sales technology helps buyers and sellers, metrics for sales success, activity vs outcome metrics, predictive forecasting, AI, customizing sales processes, revenue orchestration, how data impacts sales and management, future trends in sales, and much more.

Links Mentioned in the Show:
Salesloft
Connect with Ellie
Forrester Research
Course - Understanding the EU AI Act
Related Episode: Data & AI at Tesco with Venkat Raghavan, Director of Analytics and Science at Tesco
Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

podcast_episode
by Cris deRitis, Mark Zandi (Moody's Analytics), Marisa DiNatale (Moody's Analytics)

With the release of July’s consumer price index this week, the Inside Economics team discusses the current state of U.S. inflation. As they dig into the underlying details, they debate what they see as root causes. Specifically – are corporations taking advantage of consumers by keeping prices higher than they should? If so, are recent policy proposals to address ‘price gouging’ likely to be effective?  Hosts: Mark Zandi – Chief Economist, Moody’s Analytics, Cris deRitis – Deputy Chief Economist, Moody’s Analytics, and Marisa DiNatale – Senior Director - Head of Global Forecasting, Moody’s Analytics Follow Mark Zandi on 'X' @MarkZandi, Cris deRitis on LinkedIn, and Marisa DiNatale on LinkedIn

Questions or Comments, please email us at [email protected]. We would love to hear from you.    To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Data governance is more important than ever today, but it's something a lot of companies still struggle with.  During this live show, George Firican will demystify data governance, breaking down its core components and sharing practical advice that you can use to make improvements at your organization.    What You'll Learn: Why data governance is more important than ever in 2024 The core pillars of a strong data governance program  Practical tips for launching a new data governance program   Register for free to be part of the next live session: https://bit.ly/3XB3A8b   About our guest: George Firican is the Founder of LightsOnData, and a Data Governance expert and course creator. The Practical Data Governance: Implementation Course  Subscribe to George's YouTube Channel Follow George on LinkedIn

Follow us on Socials: LinkedIn YouTube Instagram (Mavens of Data) Instagram (Maven Analytics) TikTok Facebook Medium X/Twitter

Host: Hi everyone, welcome to our event. This event is brought to you by DataTalks.Club, which is a community of people who love data, and we have weekly events; today is one of them. I guess we are also a community of people who like to wake up early if you're from the States, right, Christopher? Or maybe not so much, because this is the time we usually have our events. For our guests and presenters from the States we usually do it in the evening, Berlin time, but unfortunately that kind of slipped my mind. Anyway, we have a lot of events; you can check them at the link in the description. I don't think there are a lot of them on that link right now, but we will be adding more and more (I think we have five or six interviews scheduled), so keep an eye on that. Don't forget to subscribe to our YouTube channel so you get notified about all our future streams, which will be as awesome as the one today. And, very important, don't forget to join our community, where you can hang out with other data enthusiasts. During today's interview you can ask any question: there's a pinned link in the live chat, so click on that link, ask your question, and we will cover those questions during the interview. Now I will stop sharing my screen. There's a message from Christopher to everyone who's watching right now, saying hello. Can I call you Chris, or...?

Chris: Okay, I should look on YouTube then.

Host: You don't need to; you'll need to focus on answering questions, and I'll be keeping an eye on all the questions. So, if you're ready, we can start.

Chris: I'm ready.

Host: And you prefer Christopher, not Chris, right?

Chris: Chris is fine. It's a bit shorter.

Host: Okay. So this week we'll talk about DataOps again. Maybe it's a tradition that we talk about DataOps once per year, although we actually skipped a year because we haven't had Chris for some time. Today we have a very special guest, Christopher. Christopher is the co-founder, CEO, and head chef at DataKitchen, with 25 years of experience (maybe that's outdated, because you probably have more by now and maybe you stopped counting), but in any case with tons of years of experience in analytics and software engineering. Christopher is known as the co-author of the DataOps Cookbook and the DataOps Manifesto. It's not the first time we've had Christopher on the podcast: we interviewed him two years ago, also about DataOps, and this conversation will be about DataOps too. We'll catch up and see what has actually changed in these two years. So, welcome to the interview.

Chris: Well, thank you for having me. I'm happy to be here and talking all things related to DataOps, why bother with DataOps, and happy to talk about the company and what's changed. Excited.

Host: Yeah, so let's dive in. The questions for today's interview were prepared by Johanna Berer; as always, thanks, Johanna, for your help. Before we start with our main topic for today, DataOps, let's start with your background. Can you tell us about your career journey so far? For those who have not listened to the previous podcast, maybe you can talk about yourself, and for those who did listen, maybe give a summary of what has changed in the last two years.

Chris: Will do. So my name is Chris, and I guess I'm sort of an engineer. I spent about the first 15 years of my career in software, working on and building some AI systems and some non-AI systems at the US's NASA and MIT Lincoln Lab, then some startups, and then Microsoft. Around 2005 I got the data bug. My kids were small, and I thought, oh, this data thing will be easy and I'd be able to go home for dinner at five and life would be fine.

Host: Because you started your own company, right?

Chris: And it didn't work out that way. What was interesting is that, for me, the problem wasn't doing the data. We had smart people who did data science and data engineering, the act of creating things. It was the systems around the data that were hard. It was really hard to not have errors in production. I'd be driving to work (I had a BlackBerry at the time) and I would not look at my BlackBerry all morning. I had this long drive to work, and I'd sit in the parking lot, take a deep breath, look at my BlackBerry, and go, uh oh, is there going to be any problems today? If there wasn't, I'd walk in very happy, and if there was, I'd have to brace myself. And then the second problem was that the team I worked with just couldn't go fast enough. The customers were super demanding; they didn't care, they always thought things should be faster, and we were always behind. So how do you live in that world where things are breaking left and right and you're terrified of making errors, and second, you just can't go fast enough?

Host: And this is the pre-Hadoop era, right? Before all this big data tech.

Chris: Yeah, before all this. We were using SQL Server, and we actually had smart people, so we built an engine that made SQL Server a columnar database; we built a columnar database inside of SQL Server in order to make certain things fast. And it wasn't bad. I mean, the principles are the same: before Hadoop it's still a database, there are still indexes, there are still queries, things like that. At the time you would use OLAP engines; we didn't use those, but those reports, you know, or models, it's not that different. We had a rack of servers instead of the cloud. What I took from that was that it's just hard to run a team of people doing data and analytics. I took it from a manager's perspective: I started to read Deming and think about the work that we do as a factory, a factory that produces insight and not automobiles. So how do you run that factory so it produces things that are of good quality? And then second, since I had come from software, I've been very influenced by the DevOps movement: how you automate deployment, how you run in an agile way, how you change things quickly, and how you innovate. Those two things, running a really good, solid production line that has very low errors and then changing that production line very often, are kind of opposite, right? So how do you, as a manager, and how do you technically, approach that? Then about 10 years ago we started DataKitchen. We've always been a profitable company, so we started off with some customers, started building some software, and realized that we couldn't work any other way, and that the way we work wasn't understood by a lot of people, so we had to write a book and a manifesto to share our methods. So yeah, we've been in business now a little over 10 years.

Host: Oh, that's cool. So let's talk about DataOps. You mentioned DevOps and how you were inspired by it. By the way, do you remember roughly when DevOps started to appear, when people started calling these principles, and the tools around them, DevOps?

Chris: Yeah. Well, first of all, I had a boss in 1990 at NASA who had this idea: build a little, test a little, learn a lot. That was his mantra, and it made a lot of sense. Then the Agile Software Manifesto came out, which is very similar, in 2001. And then the first real DevOps was a guy at Twitter who started doing automated deployment, push a button, and that was around 2009-ish, and I think the first DevOps meetup was around then. So it's been about 15 years, I guess.

Host: I was trying to do the math. I started my career in 2010, and my first job was as a Java developer. I remember for some things we would just SFTP to the machine, put the jar archive there, and keep our fingers crossed that it doesn't break. I wouldn't really call it deployment, right?

Chris: You were deploying! You had a deploy process, let's put it that way.

Host: Yeah, right.

Chris: Was it documented, too? It was like: put the jar on production, cross your fingers.

Host: I think there was a page on some internal wiki that described what you should do, with passwords and all.

Chris: Yeah. And I think what's interesting is why that changed, right? We laugh at it now, but why didn't you invest in automating deployment, or a whole bunch of automated regression tests that would run? Because I think in software now it would be rare that people wouldn't use CI/CD, that they wouldn't have some automated tests, you know, functional regression tests that would be the…
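The automated checks Chris describes carry over directly to DataOps: the same CI habit of running regression tests on every change can be applied to data pipelines. Below is a minimal, illustrative sketch of such a test in pytest style; the DuckDB file, table, and column names are invented for the example, and this is not DataKitchen's tooling.

```python
# A data "regression test" that could run in CI on every pipeline change,
# instead of deploy-and-hope. All names here are hypothetical.
import duckdb

def test_daily_orders_quality():
    con = duckdb.connect("warehouse.duckdb", read_only=True)
    row_count, null_ids, negative_totals = con.execute("""
        SELECT count(*),
               count(*) FILTER (WHERE order_id IS NULL),
               count(*) FILTER (WHERE total < 0)
        FROM daily_orders
    """).fetchone()

    assert row_count > 0, "daily_orders is empty -- upstream load probably failed"
    assert null_ids == 0, "orders with missing IDs reached the warehouse"
    assert negative_totals == 0, "orders with negative totals reached the warehouse"
```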

Streaming Databases

Real-time applications are becoming the norm today. But building a model that works properly requires real-time data from the source, in-flight stream processing, and low-latency serving of its analytics. With this practical book, data engineers, data architects, and data analysts will learn how to use streaming databases to build real-time solutions. Authors Hubert Dulay and Ralph M. Debusmann take you through streaming database fundamentals, including how these databases reduce infrastructure for real-time solutions. You'll learn the difference between streaming databases, stream processing, and real-time online analytical processing (OLAP) databases. And you'll discover when to use push queries versus pull queries, and how to serve synchronous and asynchronous data emanating from streaming databases.

This guide helps you:
Explore stream processing and streaming databases
Learn how to build a real-time solution with a streaming database
Understand how to construct materialized views from any number of streams
Learn how to serve synchronous and asynchronous data
Get started building low-complexity streaming solutions with minimal setup
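To make the materialized-view and pull-query ideas concrete, here is a minimal sketch assuming a Postgres-wire-compatible streaming database (for example Materialize or RisingWave) with an already-defined orders stream. The connection string, source, and column names are assumptions for illustration, not examples from the book.

```python
import psycopg2  # standard Postgres driver; many streaming databases speak the Postgres wire protocol

# Hypothetical connection details
conn = psycopg2.connect("postgresql://user:pass@localhost:6875/db")
conn.autocommit = True
cur = conn.cursor()

# A materialized view the database keeps incrementally up to date as events arrive
# ('orders' is assumed to be an existing streaming source)
cur.execute("""
    CREATE MATERIALIZED VIEW orders_per_minute AS
    SELECT date_trunc('minute', order_ts) AS minute, count(*) AS n_orders
    FROM orders
    GROUP BY 1
""")

# Pull query: ask once, read the current state of the view, and return
cur.execute("SELECT * FROM orders_per_minute ORDER BY minute DESC LIMIT 5")
for row in cur.fetchall():
    print(row)

# A push query would instead subscribe to the view and stream updates to the
# client continuously as they happen (e.g. Materialize's SUBSCRIBE statement).
```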