
Activities & events

Paul Groth – Professor @ University of Amsterdam , Tobias Macey – host

Summary

In this episode of the Data Engineering Podcast Professor Paul Groth, from the University of Amsterdam, talks about his research on knowledge graphs and data engineering. Paul shares his background in AI and data management, discussing the evolution of data provenance and lineage, as well as the challenges of data integration. He explores the impact of large language models (LLMs) on data engineering, highlighting their potential to simplify knowledge graph construction and enhance data integration. The conversation covers the evolving landscape of data architectures, managing semantics and access control, and the interplay between industry and academia in advancing data engineering practices. Paul also shares insights into his work with the Intelligent Data Engineering Lab and the importance of human-AI collaboration in data engineering pipelines.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey and today I'm interviewing Paul Groth about his research on knowledge graphs and data engineering.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing the focus and scope of your academic efforts?
Given your focus on data management for machine learning as part of the INDELab, what are some of the developing trends that practitioners should be aware of?
ML architectures / systems changing (Matteo Interlandi); GPUs for data management
You have spent a large portion of your career working with knowledge graphs, which have largely been a niche area until recently. What are some of the notable changes in the knowledge graph ecosystem that have resulted from the introduction of LLMs?
What are some of the other ways that you are seeing LLMs change the methods of data engineering?
There are numerous vague and anecdotal references to the power of LLMs to unlock value from unstructured data. What are some of the realities that you are seeing in your research?
A majority of the conversations in this podcast are focused on data engineering in the context of a business organization. What are some of the ways that management of research data is disjoint from the methods and constraints that are present in business contexts?
What are the most interesting, innovative, or unexpected ways that you have seen LLMs used in data management?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on data engineering research?
What do you have planned for the future of your research in the context of data engineering, knowledge graphs, and AI?

Contact Info

Website
Email

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

INDELab
Data Provenance
Elsevier
SIGMOD 2025
Digital Twin
Knowledge Graph
WikiData
KuzuDB
Podcast Episode
data.world
Podcast Episode
GraphRAG
SPARQL
Semantic Web
GQL == Graph Query Language
Cypher
Amazon Neptune
RDF == Resource Description Framework
SwellDB
FlockMTL
DuckDB
Podcast Episode
Matteo Interlandi
Paolo Papotti
Neuromorphic Computing
Point Clouds
Longform.ai
BASIL DB

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
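The links above mention Wikidata and SPARQL. As a minimal, illustrative sketch (not material from the episode), the following Python snippet runs the standard "house cat" example query from the Wikidata documentation against the public SPARQL endpoint using the requests library:

```python
# Minimal sketch: querying the public Wikidata SPARQL endpoint with `requests`.
# Illustrative only -- not code discussed in the episode.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

# wdt:P31 = "instance of", wd:Q146 = "house cat" (the canonical example
# query from the Wikidata Query Service documentation).
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""

def run_query(query: str) -> list[dict]:
    """Send a SPARQL query and return the JSON result bindings."""
    resp = requests.get(
        ENDPOINT,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "kg-example/0.1"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]["bindings"]

if __name__ == "__main__":
    for row in run_query(QUERY):
        print(row["itemLabel"]["value"])
```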

AI/ML Data Engineering Data Management Datafold LLM Python
Data Engineering Podcast

Data is your business. Have you unlocked its full potential? If you read nothing else on data strategy, read this book. We've combed through hundreds of Harvard Business Review articles and selected the most important ones to help you maximize your analytics capabilities; harness the power of data, algorithms, and AI; and gain competitive advantage in our hyperconnected world.

This book will inspire you to:
Reap the rewards of digital transformation
Make better data-driven decisions
Design breakout products that generate profitable insights
Address vulnerabilities to cyberattacks and data breaches
Reskill your workforce and build a culture of continuous learning
Win with personalized customer experiences at scale

This collection of articles includes "What's Your Data Strategy?," by Leandro DalleMule and Thomas H. Davenport; "Democratizing Transformation," by Marco Iansiti and Satya Nadella; "Why Companies Should Consolidate Tech Roles in the C-Suite," by Thomas H. Davenport, John Spens, and Saurabh Gupta; "Developing a Digital Mindset," by Tsedal Neeley and Paul Leonardi; "What Does It Actually Take to Build a Data-Driven Culture?," by Mai B. AlOwaish and Thomas C. Redman; "When Data Creates Competitive Advantage," by Andrei Hagiu and Julian Wright; "Building an Insights Engine," by Frank van den Driest, Stan Sthanunathan, and Keith Weed; "Personalization Done Right," by Mark Abraham and David C. Edelman; "Ensure High-Quality Data Powers Your AI," by Thomas C. Redman; "The Ethics of Managing People's Data," by Michael Segalla and Dominique Rouzies; "Where Data-Driven Decision-Making Can Go Wrong," by Michael Luca and Amy C. Edmondson; "Sizing Up Your Cyberrisks," by Thomas J. Parenty and Jack J. Domet; "A Better Way to Put Your Data to Work," by Veeral Desai, Tim Fountaine, and Kayvaun Rowshankish; and "Heavy Machinery Meets AI," by Vijay Govindarajan and Venkat Venkatraman.

HBR's 10 Must Reads are definitive collections of classic ideas, practical advice, and essential thinking from the pages of Harvard Business Review. Exploring topics like disruptive innovation, emotional intelligence, and new technology in our ever-evolving world, these books empower any leader to make bold decisions and inspire others.

data data-science AI/ML Analytics
O'Reilly Data Science Books
Paul Blankley – CTO / Co-founder @ Zenlytic

The modern data stack has improved the lives of data teams everywhere. But has it helped the rest of the business? In this talk, we’ll discuss the business teams’ perspective. Are they actually getting value from the modern data stack? How does it help them do their jobs better? And why do data teams keep questioning if we’re “adding value” with our powerful new tools? Attendees will gain perspective on their data ‘customers’ and learn ideas on how to deliver tangible business value.

Speaker: Paul Blankley, CTO, Zenlytic

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Analytics Cloud Computing dbt Modern Data Stack
Dbt Coalesce 2024

Please note that this event takes place in person on 5 June 2024 at 20:00 BST (London time).

Paul Bilokon, Oleksandr Bilokon, and Nataliya Bilokon have recently published, with a general public audience in mind,

A Brief History of Artificial Intelligence: https://www.amazon.co.uk/Brief-History-Artificial-Intelligence-Thalesians/dp/B0CV898K4M/ (February 2024)

In addition, Oleksandr Bilokon has published

Artificial Intelligence in Shipping and Logistics: https://www.amazon.co.uk/Artificial-Intelligence-Logistics-Thalesians-Technology/dp/B0CX8JYY54/ (March 2024)

We'll be organizing a public lecture followed by a book signing.

Artificial Intelligence (AI) is transforming business, science, technology, and medicine. Achievements that one could only dream of in the 20th century have materialized in the 21st: AlphaGo has beaten Lee Sedol at the game of Go, probably the hardest game that humanity has invented; intelligent machines created by an army of quants (mathematicians and computer scientists) have been gradually displacing human traders on Wall Street and in the City of London; protein folding, a 50-year-old grand challenge in biology, has essentially been solved by AlphaFold, with unprecedented implications for areas like drug design and environmental sustainability.

Even before the advent of AI, technological advances such as social media, Deliveroo, and Uber have transformed the lives of numerous people, sometimes creating new, sometimes eliminating existing jobs. The emergence of large language models (LLMs), such as GPT behind the AI system ChatGPT, means that AI is now almost universally available and will transform your life – for worse or better. It is therefore essential to be informed about AI well before its universal adoption.

The purpose of this book is to make AI accessible to literally everyone: whether you are currently at primary or secondary school, preparing to join or already at college or university, working in academia at master's or PhD level or beyond, doing an AI-related – or a totally unrelated – job, running a business, serving a religious organisation, a charity, a country, and/or a government. Irrespective of whether you are out of the workforce, unemployed, employed, self-employed, or retired, if you are interested in what dangers and opportunities are presented to you personally, and other people around you, by AI, this book is for you. If you are a busy executive or, more generally, a decision maker, this short book (less than 100 pages long) will serve to update you on the current state of the art in AI.

The primary purpose of this book is to inform. The secondary, to entertain and inspire. This is not an academic treatise, so there are (almost) no formulae and no citations. There are some anecdotes that, we hope, you will find intriguing. Granted, you may not have heard of all the references (just as we would not have heard of all the references that you have heard of!), but you have an unfair advantage: access to search engines, such as Google.

Read this book, and you will learn why AI has been created, what it is, how it works and how to use it, as well as what will happen (and what won't happen) if you start adopting AI today.

Book signing: A Brief History of Artificial Intelligence and AI in Shipping

Dear all, We are happy to announce a joint session by two of the greatest and bestest Meetup groups in Berlin - the Berlin DevOps and the Berlin AWS User Group. Registering on both is not necessary 😉

The State of Infrastructure as Code

The idea is to have insightful discussions with the audience about the current Infrastructure as Code landscape and allow practitioners to share their experiences and discuss with others.

If you are keen to tell your story, your successes and failures to others, please fill in this form to register with your tooling of choice - https://forms.gle/T5zUQRtfFgnwCVey8

In the second part we will have (remote) guests from various projects, giving food for thought and visions for the IaC landscape in the months ahead.

This event will happen in-person, kindly sponsored by HeyJobs! as well as remotely. Please only RSVP if you intend to participate in person in Kreuzberg. The event is published in both Meetup groups - Berlin DevOps and AWS User Group - please RSVP in only one of them if you happen to be a member of both groups.

Details for remote access will be shared in time with the community.

Please note, we will take pictures during the event as well as record the sessions.

=====================================================================

The evening

18:30 - Warming up and networking
18:45 - Welcome talk by HeyJobs!
19:00 - 20:00 - Fishbowl discussion
Practitioners share their opinionated experiences with landscapes and tooling while the audience is invited to join the discussions. An introduction to the format will be given on the evening; pre-read here: https://en.wikipedia.org/wiki/Fishbowl_(conversation)

20:00 - 20:30 - Break with snacks and drinks

20:30 - 21:15 - Food for thought
We invited Adam Jacob from System Initiative (https://www.systeminit.com/), as well as people from HashiCorp, Winglang.io, and the OpenTF Foundation to address the crowd and broaden our perspectives on IaC.
21:15 - 21:30 - Closing and networking

=====================================================================

How to get to HeyJobs
Address: Paul-Lincke-Ufer 39, 10999 Berlin, first courtyard, 4th floor

Closest station is Kottbusser Tor (U8, U1 and U3). Follow the direction of Paul-Lincke-Ufer, then walk on Paul-Lincke-Ufer street for about 3 minutes. Then turn left into the courtyard where Concierge coffee is, just before Zola pizzeria. Enter the courtyard and to your right you will see a sign for HeyJobs. Take the elevator or stairs to the 4th floor and follow the signs inside to get to the room (called Warehouse).

Special session “The State of Infrastructure as Code”
Ryan Janssen – guest @ Zenlytic , Paul Blankley – guest @ Zenlytic , Tobias Macey – host

Summary

Business intelligence has been chasing the promise of self-serve data for decades. As the capabilities of these systems have improved and become more accessible, the target of what self-serve means has changed. With the availability of AI powered by large language models combined with the evolution of semantic layers, the team at Zenlytic have taken aim at this problem again. In this episode Paul Blankley and Ryan Janssen explore the power of natural language driven data exploration combined with semantic modeling that enables an intuitive way for everyone in the business to access the data that they need to succeed in their work.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enable you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack
Your host is Tobias Macey and today I'm interviewing Paul Blankley and Ryan Janssen about Zenlytic, a no-code business intelligence tool focused on emerging commerce brands

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Zenlytic is and the story behind it?
Business intelligence is a crowded market. What was your process for defining the problem you are focused on solving and the method to achieve that outcome?
Self-serve data exploration has been attempted in myriad ways over successive generations of BI and data platforms. What are the barriers that have been the most challenging to overcome in that effort?
What are the elements that are coming together now that give you confidence in being able to deliver on that?
Can you describe how Zenlytic is implemented?
What are the evolutions in the understanding and implementation of semantic layers that provide a sufficient substrate for operating on?
How have the recent breakthroughs in large language models (LLMs) improved your ability to build features in Zenlytic?
What is your process for adding domain semantics to the operational aspect of your LLM?
For someone using Zenlytic, what is the process for getting it set up and integrated with their data?
Once it is operational, can you describe some typical workflows for using Zenlytic in a business context?
Who are the target users?
What are the collaboration options available?
What are the most complex engineering/data challenges that you have had to address in building Zenlytic?
What are the most interesting, innovative, or unexpected ways that you have seen Zenlytic used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Zenlytic?
When is Zenlytic the wrong choice?
What do you have planned for the future of Zenlytic?

Contact Info

Paul Blankley (LinkedIn)

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers

Links

Zenlytic
OLAP Cube
Large Language Model
Starburst Pr
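The notes describe the general pattern of an LLM sitting on top of a semantic layer rather than writing raw SQL. The sketch below is a toy illustration of that pattern only; it is not Zenlytic's implementation, and every metric, table, and column name in it is hypothetical:

```python
# Toy illustration of the "semantic layer" pattern discussed in the episode:
# metrics are defined once, and a structured request (which an LLM could be
# prompted to emit instead of raw SQL) is rendered against those definitions.
# All metric, table, and column names are hypothetical.
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    sql: str     # aggregate expression
    table: str   # physical table the metric is measured on

SEMANTIC_LAYER = {
    "total_revenue": Metric("total_revenue", "SUM(order_amount)", "orders"),
    "order_count": Metric("order_count", "COUNT(*)", "orders"),
}

def render_sql(metric_name: str, group_by: str | None = None,
               where: str | None = None) -> str:
    """Turn a structured request into SQL using the metric definitions."""
    metric = SEMANTIC_LAYER[metric_name]
    select = [f"{metric.sql} AS {metric.name}"]
    if group_by:
        select.insert(0, group_by)
    sql = f"SELECT {', '.join(select)} FROM {metric.table}"
    if where:
        sql += f" WHERE {where}"
    if group_by:
        sql += f" GROUP BY {group_by}"
    return sql

# e.g. "revenue by region since May" might be parsed into:
print(render_sql("total_revenue", group_by="region",
                 where="order_date >= DATE '2023-05-01'"))
```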

AI/ML BI CDP Data Engineering Data Lake Data Management LLM Python Data Streaming
Data Engineering Podcast
Charles Custer – author , Jim Walker – author , Paul Modderman – author

Globally available resources have become the status quo. They're accessible, distributed, and resilient. Our traditional SQL database options haven't kept up. Centralized SQL databases, even those with read replicas in the cloud, put all the transactional load on a central system. The further away that a transaction happens from the user, the more the user experience suffers. If the transactional data powering the application is greatly slowed down, fast-loading web pages mean nothing.

In this report, Paul Modderman, Jim Walker, and Charles Custer explain how distributed SQL fits all applications and eliminates complex challenges like sharding from traditional RDBMS systems. You'll learn how distributed SQL databases can reach global scale without introducing the consistency trade-offs found in NoSQL solutions. These databases come to life through cloud computing, while legacy databases simply can't rise to meet the elastic and ubiquitous new paradigm.

You'll learn:
Key concepts driving this new technology, including the CAP theorem, the Raft consensus algorithm, multiversion concurrency control, and Google Spanner
How distributed SQL databases meet enterprise requirements, including management, security, integration, and Everything as a Service (XaaS)
The impact that distributed SQL has already made in the telecom, retail, and gaming industries
Why serverless computing is an ideal fit for distributed SQL
How distributed SQL can help you expand your company's strategic plan
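One of the listed key concepts, the Raft consensus algorithm, comes with a simple piece of arithmetic that is worth making concrete: writes commit once a majority quorum acknowledges them, so the replica count determines fault tolerance. This is a generic sketch, not code from the report:

```python
# Majority-quorum arithmetic behind Raft-style consensus:
# a write commits once a majority of replicas acknowledge it, so a cluster
# of n replicas tolerates floor((n - 1) / 2) replica failures.

def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that constitutes a majority."""
    return replicas // 2 + 1

def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority remains reachable."""
    return (replicas - 1) // 2

for n in (3, 5, 7):
    print(f"{n} replicas -> quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```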

data data-engineering nosql-databases Cloud Computing ELK NoSQL RDBMS Cyber Security SQL
O'Reilly Data Engineering Books

Printed in full color! Unlock the groundbreaking advances of deep learning with this extensively revised new edition of the bestselling original. Learn directly from the creator of Keras and master practical Python deep learning techniques that are easy to apply in the real world.

In Deep Learning with Python, Second Edition you will learn:
Deep learning from first principles
Image classification and image segmentation
Timeseries forecasting
Text classification and machine translation
Text generation, neural style transfer, and image generation
Printed in full color throughout

Deep Learning with Python has taught thousands of readers how to put the full capabilities of deep learning into action. This extensively revised full color second edition introduces deep learning using Python and Keras, and is loaded with insights for both novice and experienced ML practitioners. You’ll learn practical techniques that are easy to apply in the real world, and important theory for perfecting neural networks.

About the Technology
Recent innovations in deep learning unlock exciting new software capabilities like automated language translation, image recognition, and more. Deep learning is quickly becoming essential knowledge for every software developer, and modern tools like Keras and TensorFlow put it within your reach—even if you have no background in mathematics or data science. This book shows you how to get started.

About the Book
Deep Learning with Python, Second Edition introduces the field of deep learning using Python and the powerful Keras library. In this revised and expanded new edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. As you move through this book, you’ll build your understanding through intuitive explanations, crisp color illustrations, and clear examples. You’ll quickly pick up the skills you need to start developing deep-learning applications.

What's Inside
Deep learning from first principles
Image classification and image segmentation
Time series forecasting
Text classification and machine translation
Text generation, neural style transfer, and image generation
Printed in full color throughout

About the Reader
For readers with intermediate Python skills. No previous experience with Keras, TensorFlow, or machine learning is required.

About the Author
François Chollet is a software engineer at Google and creator of the Keras deep-learning library.

Quotes
Chollet is a master of pedagogy and explains complex concepts with minimal fuss, cutting through the math with practical Python code. He is also an experienced ML researcher and his insights on various model architectures or training tips are a joy to read. - Martin Görner, Google
Immerse yourself into this exciting introduction to the topic with lots of real-world examples. A must-read for every deep learning practitioner. - Sayak Paul, Carted
The modern classic just got better. - Edmon Begoli, Oak Ridge National Laboratory
Truly the bible of deep learning. - Yiannis Paraskevopoulos, University of West Attica
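As a flavor of the workflow the book teaches (a minimal sketch under the assumption that TensorFlow/Keras is installed, not an excerpt from the book), here is a small dense classifier trained on the MNIST digits that ship with Keras:

```python
# Minimal Keras workflow in the spirit of the book (not an excerpt from it):
# load MNIST, build a small dense network, train, and evaluate.
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28 * 28).astype("float32") / 255.0

model = keras.Sequential([
    keras.Input(shape=(28 * 28,)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.1)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```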

software-development programming-languages Python AI/ML Data Science Keras TensorFlow
O'Reilly AI & ML Books
Thomas Dietterich – Distinguished Professor Emeritus @ Oregon State University , Vishal – host

Thomas Dietterich ( @tdietterich ) on Understanding the Depth of AI #FutureOfData #Leadership #Podcast

In this podcast Thomas Dietterich (@tdietterich), Distinguished Professor Emeritus at Oregon State University, sat with Vishal of AnalyticsWeek to discuss the depth of AI. In this session Tom shared the current state, limitations, and future of AI. He covered areas where AI is relevant and which areas still need more testing before AI adoption. He also shared some of the pitfalls of current AI frameworks, such as selection bias and knowing context. This is a great session for anyone seeking to learn about the world of AI.

Thomas's Recommended Read: Army of None: Autonomous Weapons and the Future of War by Paul Scharre https://amzn.to/2CnoA94

Podcast Link: iTunes: http://math.im/itunes Youtube: http://math.im/youtube

Thomas's BIO: Thomas Dietterich has devoted his career to research in machine learning starting from the very first machine learning workshop in 1980. Along the way, he has been involved in four startup companies: Arris Pharmaceutical, MusicStrands, Smart Desktop, and (currently) BigML. He has made important contributions to learning with weak labels, ensemble methods, hierarchical reinforcement learning, and robust artificial intelligence. He was founding President of the International Machine Learning Society (which runs the International Conference on Machine Learning) and President of the Association for the Advancement of Artificial Intelligence. He has served on numerous government advisory bodies and currently is a member of the steering committee of the DARPA ISAT group. Dietterich earned his bachelor's degree from Oberlin College, his M.S. from the University of Illinois, and his PhD from Stanford University. He is a Fellow of the ACM, AAAI, and AAAS.

About #Podcast:

The FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners on the show to discuss their journey in creating the data-driven future.

Wanna join? If you or anyone you know wants to join in, register your interest by mailing us @ [email protected]

Want to sponsor? Email us @ [email protected]

Keywords: FutureOfData, DataAnalytics, Leadership, Futurist, Podcast, BigData, Strategy

AI/ML Big Data
Nathan Furr – Professor of Strategy and Innovation @ INSEAD

In this podcast, Nathan Furr(@nathan_furr) talks about leading transformation. He shares some of the crucial ingredients of transformational leaders. He sheds some light on how businesses could improve their storytelling to get the transformation agenda across. He shares some cool tips and tricks that help leaders plan for a transformation across data-driven and disruptive times.

Timeline:
1:39 Nathan's journey.
4:49 Nathan's current role.
13:55 Transforming legacy old company.
21:52 The right moment for companies to think about data transformation.
26:38 Using comic books to share transformational stories.
34:32 Who's the most responsible person in an organization for transformation?
39:13 Qualities a leader must have for bringing in transformational change.
43:40 Nathan's success mantra.
47:57 Nathan's favorite reads.
50:29 Closing remarks.

Nathan's Recommended Read: East of Eden (Penguin Twentieth-Century Classics) by John Steinbeck, David Wyatt https://amzn.to/2S9MHA0

Nathan's Books:
The Innovator's Method: Bringing the Lean Start-up into Your Organization by Nathan Furr, Jeff Dyer, Clayton M. Christensen https://amzn.to/2TeadJE
Leading Transformation: How to Take Charge of Your Company's Future by Nathan Furr, Kyle Nel, Thomas Zoega Ramsey https://amzn.to/2CTw16z
Nail It then Scale It: The Entrepreneur's Guide to Creating and Managing Breakthrough Innovation: The lean startup book to help entrepreneurs launch a high-growth business by Nathan Furr, Paul Ahlstrom https://amzn.to/2UfTpSC

Podcast Link: https://futureofdata.org/leading-transformation-through-data-driven-times-nathan_furr-insead-futureofdata-podcast/

Nathan's BIO: Nathan Furr is a professor of strategy and innovation at INSEAD in Paris and a recognized expert in innovation and technology strategy. He has multiple books and articles published by outlets such as Harvard Business Review and MIT Sloan Management Review, including his most recent best-selling book, “The Innovator’s Method” (Harvard Business Review Press, September 2014), which won multiple awards from the business press. He has two forthcoming books from Harvard Business Review Press addressing 1) how companies lead transformation and 2) how innovators win support for their ideas.

Professor Furr has worked with leading companies to study and implement innovation strategies, including Google, Amazon, Citi, Deutsche Bank, Philips, Kimberly Clark, Solvay, and others. Professor Furr earned his Ph.D. from the Stanford Technology Ventures Program at Stanford University.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey in creating the data-driven future.

Wanna join? If you or anyone you know wants to join in, register your interest by emailing us @ [email protected]

Want to sponsor? Email us @ [email protected]

Keywords: FutureOfData, DataAnalytics, Leadership, Futurist, Podcast, BigData, Strategy

Big Data
Event Making Data Simple 2020-09-09
Paul Zikopoulos – IBM VP Big Data Cognitive Systems @ IBM , Al Martin – WW VP Technical Sales @ IBM

Send us a text Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next. 

Abstract

This week, Paul Zikopoulos, IBM VP Big Data Cognitive Systems, makes a highly anticipated return to Making Data Simple. Paul gives an update on what he's been working on, including his A.I. tracking app, which saw an interesting use case at a recent Luke Bryan concert. We are also given some insight into the state of data and the rest of the industry. Host Al Martin then finishes things off by discussing what it means to lead a team, and tips for growing your career. 

Connect with Paul

LinkedIn 

Twitter

IBM Blogs

Show Notes

07:02 - Read more about Watson Anywhere here. 

20:20 - Check out Auto AI here.

Connect with the Team

Producer Liam Seston - LinkedIn.

Producer Lana Cosic - LinkedIn.

Producer Meighann Helene - LinkedIn. 

Host Al Martin - LinkedIn and Twitter.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

AI/ML Big Data IBM
Paul Zikopoulos – IBM VP Big Data Cognitive Systems @ IBM , Al Martin – WW VP Technical Sales @ IBM

Send us a text Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next.

Abstract

This week, Paul Zikopoulos, IBM VP Big Data Cognitive Systems, makes a highly anticipated return to Making Data Simple. Paul gives an update on what he's been working on, including his A.I. tracking app, which saw an interesting use case at a recent Luke Bryan concert. We are also given some insight into the state of data and the rest of the industry. Host Al Martin then finishes things off by discussing what it means to lead a team, and tips for growing your career.

Connect with Paul

LinkedIn
Twitter
IBM Blogs

Show Notes

02:43 - Learn about the technology Paul employed to make his detection app here.
07:02 - Read more about Watson Anywhere here.
20:20 - Check out Auto AI here.

Connect with the Team

Producer Liam Seston - LinkedIn.
Producer Lana Cosic - LinkedIn.
Producer Meighann Helene - LinkedIn.
Host Al Martin - LinkedIn and Twitter.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

AI/ML Big Data IBM
Paul Brebner – guest @ Instaclustr , Tobias Macey – host

Summary

Anomaly detection is a capability that is useful in a variety of problem domains, including finance, internet of things, and systems monitoring. Scaling the volume of events that can be processed in real-time can be challenging, so Paul Brebner from Instaclustr set out to see how far he could push Kafka and Cassandra for this use case. In this interview he explains the system design that he tested, his findings for how these tools were able to work together, and how they behaved at different orders of scale. It was an interesting conversation about how he stress tested the Instaclustr managed service for benchmarking an application that has real-world utility.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
Integrating data across the enterprise has been around for decades – so have the techniques to do it. But, a new way of integrating data and improving streams has evolved. By integrating each silo independently – data is able to integrate without any direct relation. At CluedIn they call it “eventual connectivity”. If you want to learn more on how to deliver fast access to your data across the enterprise leveraging this new method, and the technologies that make it possible, get a demo or presentation of the CluedIn Data Hub by visiting dataengineeringpodcast.com/cluedin. And don’t forget to thank them for supporting the show!
You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to dataengineeringpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Your host is Tobias Macey and today I’m interviewing Paul Brebner about his experience designing and building a scalable, real-time anomaly detection system using Kafka and Cassandra

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing the problem that you were trying to solve and the requirements that you were aiming for?

What are some example cases where anomaly detection is useful or necessary?

Once you had established the requirements in terms of functionality and data volume, what was your approach for dete
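The interview text is cut off in this listing, but the underlying problem can be illustrated generically. The sketch below is a plain-Python rolling z-score detector; it is not Instaclustr's benchmark code, and in the architecture discussed the events would arrive from Kafka and per-key history would live in Cassandra:

```python
# Generic rolling z-score anomaly detector -- an illustration of the problem
# discussed in the episode, not the Kafka/Cassandra benchmark itself.
# In the real system, events would be consumed from Kafka and per-key history
# would be read from / written to Cassandra.
from collections import deque
from statistics import mean, pstdev

class RollingDetector:
    def __init__(self, window: int = 50, threshold: float = 3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Return True if `value` looks anomalous relative to recent history."""
        anomalous = False
        if len(self.window) >= 10:            # need some history first
            mu = mean(self.window)
            sigma = pstdev(self.window) or 1e-9
            anomalous = abs(value - mu) / sigma > self.threshold
        self.window.append(value)
        return anomalous

if __name__ == "__main__":
    detector = RollingDetector()
    stream = [10.0, 10.4, 9.6] * 20 + [10.2, 55.0, 9.9]   # one obvious spike
    for i, v in enumerate(stream):
        if detector.observe(v):
            print(f"event {i}: value {v} flagged as anomalous")
```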

AI/ML Big Data Cassandra Data Engineering Data Management Data Science Kafka Data Streaming
Data Engineering Podcast
Justin Borgman – CEO @ Starburst Data

In this podcast, Justin Borgman talks about his journey of starting a data science startup, doing an exit, and jumping on another one. The session is filled with insights for leaders looking for entrepreneurial wisdom to get on a data-driven journey.

Timeline:
0:28 Justin's journey.
3:22 Taking the plunge to start a new company.
5:49 Perception vs. reality of starting a data warehouse company.
8:15 Bringing in something new to the IT legacy.
13:20 Getting your first few customers.
16:16 Right moment for a data warehouse company to look for a new venture.
18:20 Right person to have as a co-founder.
20:29 Advantages of going seed vs. series A.
22:13 When is a company ready for seeding or series A?
24:40 Who's a good adviser?
26:35 Exiting Teradata.
28:54 Teradata to starting a new company.
31:24 Excitement of starting something from scratch.
32:24 What is Starburst?
37:15 Presto, a great engine for cloud platforms.
40:30 How can a company get started with Presto?
41:50 Health of enterprise data.
44:15 Where does Presto not fit in?
45:19 Future of enterprise data.
46:36 Drawing parallels between proprietary space and open source space.
49:02 Does aligning with open source give a company a better chance in seeding?
51:44 Justin's ingredients for success.
54:05 Justin's favorite reads.
55:01 Key takeaways.

Justin's Recommended Read: The Outsiders Paperback – S. E. Hinton amzn.to/2Ai84Gl

Podcast Link: https://futureofdata.org/running-a-data-science-startup-one-decision-at-a-time-futureofdata-podcast/

Justin's BIO: Justin has spent the better part of a decade in senior executive roles building new businesses in the data warehousing and analytics space. Before co-founding Starburst, Justin was Vice President and General Manager at Teradata (NYSE: TDC), where he was responsible for the company’s portfolio of Hadoop products. Prior to joining Teradata, Justin was co-founder and CEO of Hadapt, the pioneering "SQL-on-Hadoop" company that transformed Hadoop from file system to analytic database accessible to anyone with a BI tool. Teradata acquired Hadapt in 2014.

Justin earned a BS in Computer Science from the University of Massachusetts at Amherst and an MBA from the Yale School of Management.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords: #FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Analytics BI Big Data Cloud Computing Computer Science Data Science DWH Hadoop Presto SQL Teradata
Paul Ballew – Vice President and Global Chief Data and Analytics Officer @ Ford Motor Company

In this podcast, Paul Ballew (@Ford) talks about best practices when running a data science organization spanning multiple continents. He shares the importance of being Smart, Nice, and Inquisitive in creating tomorrow's workforce today. He sheds some light on the importance of appreciating culture when defining forward-looking policies. He also builds a case for a non-native group and discusses ways to implement data science as a central organization (with no hub-and-spoke model). This podcast is great for future data science leaders leading organizations with a broad consumer base and multiple geo-political silos.

Timeline:
0:29 Paul's journey.
5:10 Paul's current role.
8:10 Insurance and data analytics.
13:00 Who will own the insurance in the time of automation?
18:22 Recruiting models in technologies.
21:54 Embracing technological change.
25:03 Will we have more analytics in Ford cars?
28:25 How does Ford stay competitive from a technology perspective?
30:30 Challenges for the Analytics officer at Ford.
32:36 Ingredients of a good hire.
34:12 How is the data science team structured at Ford?
36:15 Dealing with shadow groups.
39:00 Successful KPIs.
40:33 Who owns data?
42:27 Who should own the security of data assets?
44:05 Examples of successful data science groups.
46:30 Practices for remaining bias-free.
48:55 Getting started running a global data science team.
52:45 How does Paul keep himself updated?
54:18 Paul's favorite read.
55:45 Closing remarks.

Paul's Recommended Read: The Outsiders Paperback – S. E. Hinton http://amzn.to/2Ai84Gl

Podcast Link: https://futureofdata.org/paul-ballewford-running-global-data-science-group-futureofdata-podcast/

Paul's BIO: Paul Ballew is vice president and Global Chief Data and Analytics officer, Ford Motor Company, effective June 1, 2017. At the same time, he also was elected a Ford Motor Company officer. In this role, he leads Ford’s global data and analytics teams for the enterprise. Previously, Ballew was Global Chief Data and Analytics Officer, a position to which he was named in December 2014. In this role, he has been responsible for establishing and growing the company’s industry-leading data and analytics operations that are driving significant business value throughout the enterprise. Prior to joining Ford, he was Chief Data, Insight & Analytics Officer at Dun & Bradstreet. In this capacity, he was responsible for the company’s global data and analytic activities along with the company’s strategic consulting practice. Previously, Ballew served as Nationwide’s senior vice president for Customer Insight and Analytics. He directed customer analytics, market research, and information and data management functions, and supported the company’s marketing strategy. His responsibilities included the development of Nationwide’s customer analytics, data operations, and strategy. Ballew joined Nationwide in November 2007 and established the company’s Customer Insights and Analytics capabilities.

Ballew sits on the boards of Neustar, Inc. and Hyatt Hotels Corporation. He was born in 1964 and has a bachelor’s and master’s degree in Economics from the University of Detroit.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey in creating the data-driven future.

Wanna join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords: #FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Analytics Big Data Data Analytics Data Management Data Science KPI Marketing Cyber Security

Easy to read and comprehensive, Survival Analysis Using SAS: A Practical Guide, Second Edition, by Paul D. Allison, is an accessible, data-based introduction to methods of survival analysis. Researchers who want to analyze survival data with SAS will find just what they need with this fully updated new edition that incorporates the many enhancements in SAS procedures for survival analysis in SAS 9. Although the book assumes only a minimal knowledge of SAS, more experienced users will learn new techniques of data input and manipulation. Numerous examples of SAS code and output make this an eminently practical book, ensuring that even the uninitiated become sophisticated users of survival analysis. The main topics presented include censoring, survival curves, Kaplan-Meier estimation, accelerated failure time models, Cox regression models, and discrete-time analysis. Also included are topics not usually covered in survival analysis books, such as time-dependent covariates, competing risks, and repeated events.

Survival Analysis Using SAS: A Practical Guide, Second Edition, has been thoroughly updated for SAS 9, and all figures are presented using ODS Graphics. This new edition also documents major enhancements to the STRATA statement in the LIFETEST procedure; includes a section on the PROBPLOT command, which offers graphical methods to evaluate the fit of each parametric regression model; introduces the new BAYES statement for both parametric and Cox models, which allows the user to do a Bayesian analysis using MCMC methods; demonstrates the use of the counting process syntax as an alternative method for handling time-dependent covariates; contains a section on cumulative incidence functions; and describes the use of the new GLIMMIX procedure to estimate random-effects models for discrete-time data.

This book is part of the SAS Press program.
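As a language-neutral illustration of one of the book's core topics (the book itself works in SAS, e.g. PROC LIFETEST), the sketch below computes the Kaplan-Meier product-limit estimate in plain Python from (time, event) pairs, where event = 0 marks a censored observation:

```python
# Kaplan-Meier product-limit estimator, illustrating one of the book's core
# topics in plain Python (the book's own examples use SAS procedures).
# Input: (time, event) pairs where event=1 is a failure and event=0 is censored.

def kaplan_meier(observations):
    """Return [(time, survival_probability)] at each observed failure time."""
    observations = sorted(observations)
    n_at_risk = len(observations)
    survival = 1.0
    curve = []
    i = 0
    while i < len(observations):
        t = observations[i][0]
        deaths = sum(1 for time, event in observations if time == t and event == 1)
        removed = sum(1 for time, _ in observations if time == t)
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
        i += removed
    return curve

# Invented example data: failures and censored follow-up times.
data = [(6, 1), (6, 1), (6, 1), (6, 0), (7, 1), (9, 0), (10, 1), (10, 0),
        (11, 0), (13, 1)]
for t, s in kaplan_meier(data):
    print(f"t={t}: S(t)={s:.3f}")
```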

data data-science analytics-platforms SAS
O'Reilly Data Science Books
MySQL® 2008-08-29
Paul DuBois – author

The Definitive Guide to Using, Programming, and Administering MySQL 5.0 and 5.1

MySQL is an open source relational database management system that has experienced a phenomenal growth in popularity and use. Known for its speed and ease of use, MySQL has proven itself to be particularly well-suited for developing database-backed websites and applications. In MySQL, Paul DuBois provides a comprehensive guide to using and administering MySQL effectively and productively. He describes everything from the basics of getting information into a database and formulating queries, to using MySQL with PHP or Perl to generate dynamic web pages, to writing your own programs that access MySQL databases, to administering MySQL servers. The fourth edition of this bestselling book has been meticulously revised and updated to thoroughly cover the latest features and capabilities of MySQL 5.0, as well as to add new coverage of features introduced with MySQL 5.1.

“One of the best technical books I have read on any subject.” –Gregory Haley, C Vu, The Association of C & C++ Users
“A top-notch user’s guide and reference manual, and in my opinion, the only book you’ll need for the daily operation and maintenance of MySQL databases.” –Eugene Kim, Web Techniques

Contents:
Introduction 1
Part I: General MySQL Use
Chapter 1: Getting Started with MySQL 13
Chapter 2: Using SQL to Manage Data 101
Chapter 3: Data Types 201
Chapter 4: Stored Programs 289
Chapter 5: Query Optimization 303
Part II: Using MySQL Programming Interfaces
Chapter 6: Introduction to MySQL Programming 341
Chapter 7: Writing MySQL Programs Using C 359
Chapter 8: Writing MySQL Programs Using Perl DBI 435
Chapter 9: Writing MySQL Programs Using PHP 527
Part III: MySQL Administration
Chapter 10: Introduction to MySQL Administration 579
Chapter 11: The MySQL Data Directory 585
Chapter 12: General MySQL Administration 609
Chapter 13: Access Control and Security 699
Chapter 14: Database Maintenance, Backups, and Replication 737
Part IV: Appendixes
Appendix A: Obtaining and Installing Software 777
Appendix B: Data Type Reference 797
Appendix C: Operator and Function Reference 813
Appendix D: System, Status, and User Variable Reference 889
Appendix E: SQL Syntax Reference 937
Appendix F: MySQL Program Reference 1037
Note: Appendixes G, H, and I are located online and are accessible either by registering this book at informit.com/register or by visiting www.kitebird.com/mysql-book.
Appendix G: C API Reference 1121
Appendix H: Perl DBI API Reference 1177
Appendix I: PHP API Reference 1207
Index 1225
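The book's programming chapters use C, Perl DBI, and PHP; as an analogous sketch in Python (assuming a locally running server, hypothetical credentials, and the mysql-connector-python package; the sample database and table here are invented), the basic connect-and-query cycle looks like this:

```python
# Analogous connect-and-query cycle in Python using mysql-connector-python
# (the book's own examples use C, Perl DBI, and PHP). Host, credentials,
# database, and table names here are hypothetical.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="sampadm",          # hypothetical account
    password="secret",
    database="sampdb",       # hypothetical database
)
try:
    cursor = conn.cursor()
    # Parameterized query: the driver handles quoting and escaping.
    cursor.execute(
        "SELECT name, birth FROM president WHERE state = %s",
        ("VA",),
    )
    for name, birth in cursor.fetchall():
        print(name, birth)
finally:
    conn.close()
```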

data data-engineering relational-databases MySQL API RDBMS Cyber Security SQL

“This book takes the somewhat daunting process of database design and breaks it into completely manageable and understandable components. Mike’s approach whilst simple is completely professional, and I can recommend this book to any novice database designer.” – Sandra Barker, Lecturer, University of South Australia, Australia “Databases are a critical infrastructure technology for information systems and today’s business. Mike Hernandez has written a literate explanation of database technology–a topic that is intricate and often obscure. If you design databases yourself, this book will educate you about pitfalls and show you what to do. If you purchase products that use a database, the book explains the technology so that you can understand what the vendor is doing and assess their products better.” – Michael Blaha, consultant and trainer, author of A Manager’s Guide to Database Technology “If you told me that Mike Hernandez could improve on the first edition of Database Design for Mere Mortals I wouldn’t have believed you, but he did! The second edition is packed with more real-world examples, detailed explanations, and even includes database-design tools on the CD-ROM! This is a must-read for anyone who is even remotely interested in relational database design, from the individual who is called upon occasionally to create a useful tool at work, to the seasoned professional who wants to brush up on the fundamentals. Simply put, if you want to do it right, read this book!” – Matt Greer, Process Control Development, The Dow Chemical Company “Mike’s approach to database design is totally common-sense based, yet he’s adhered to all the rules of good relational database design. I use Mike’s books in my starter database-design class, and I recommend his books to anyone who’s interested in learning how to design databases or how to write SQL queries.” – Michelle Poolet, President, MVDS, Inc. “Slapping together sophisticated applications with poorly designed data will hurt you just as much now as when Mike wrote his first edition, perhaps even more. Whether you’re just getting started developing with data or are a seasoned pro; whether you've read Mike’s previous book or this is your first; whether you're happier letting someone else design your data or you love doing it yourself–this is the book for you. Mike’s ability to explain these concepts in a way that’s not only clear, but fun, continues to amaze me.” –From the Foreword by Ken Getz, MCW Technologies, coauthor ASP.NET Developer's JumpStart “The first edition of Mike Hernandez’s book Database Design for Mere Mortals was one of the few books that survived the cut when I moved my office to smaller quarters. The second edition expands and improves on the original in so many ways. It is not only a good, clear read, but contains a remarkable quantity of clear, concise thinking on a very complex subject. It’s a must for anyone interested in the subject of database design.” – Malcolm C. Rubel, Performance Dynamics Associates “Mike’s excellent guide to relational database design deserves a second edition. His book is an essential tool for fledgling Microsoft Access and other desktop database developers, as well as for client/server pros. I recommend it highly to all my readers.” – Roger Jennings, author of Special Edition Using Access 2002 “There are no silver bullets! 
Database technology has advanced dramatically, the newest crop of database servers perform operations faster than anyone could have imagined six years ago, but none of these technological advances will help fix a bad database design, or capture data that you forgot to include! Database Design for Mere Mortals™, Second Edition, helps you design your database right in the first place!” – Matt Nunn, Product Manager, SQL Server, Microsoft Corporation “When my brother started his professional career as a developer, I gave him Mike’s book to help him understand database concepts and make real-world application of database technology. When I need a refresher on the finer points of database design, this is the book I pick up. I do not think that there is a better testimony to the value of a book than that it gets used. For this reason I have wholeheartedly recommended to my peers and students that they utilize this book in their day-to-day development tasks.” – Chris Kunicki, Senior Consultant, OfficeZealot.com “Mike has always had an incredible knack for taking the most complex topics, breaking them down, and explaining them so that anyone can ‘get it.’ He has honed and polished his first very, very good edition and made it even better. If you're just starting out building database applications, this book is a must-read cover to cover. Expert designers will find Mike’s approach fresh and enlightening and a source of great material for training others.” – John Viescas, President, Viescas Consulting, Inc., author of Running Microsoft Access 2000 and coauthor of SQL Queries for Mere Mortals “Whether you need to learn about relational database design in general, design a relational database, understand relational database terminology, or learn best practices for implementing a relational database, Database Design for Mere Mortals™, Second Edition, is an indispensable book that you’ll refer to often. With his many years of real-world experience designing relational databases, Michael shows you how to analyze and improve existing databases, implement keys, define table relationships and business rules, and create data views, resulting in data integrity, uniform access to data, and reduced data-entry errors.” – Paul Cornell, Site Editor, MSDN Office Developer Center

Sound database design can save hours of development time and ensure functionality and reliability. Database Design for Mere Mortals™, Second Edition, is a straightforward, platform-independent tutorial on the basic principles of relational database design. It provides a commonsense design methodology for developing databases that work. Database design expert Michael J. Hernandez has expanded his best-selling first edition, maintaining its hands-on approach and accessibility while updating its coverage and including even more examples and illustrations. This edition features a CD-ROM that includes diagrams of sample databases, as well as design guidelines, documentation forms, and examples of the database design process. This book will give you the knowledge and tools you need to create efficient and effective relational databases.
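To make the design principles the reviewers praise concrete (primary keys, foreign-key relationships, data integrity), here is a small self-contained sketch using Python's built-in sqlite3 module; the two-table schema is an invented example, not one from the book:

```python
# Tiny illustration of primary keys, a foreign-key relationship, and the
# integrity errors bad data triggers. Uses Python's built-in sqlite3; the
# customer/order schema is an invented example, not taken from the book.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only if asked

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL,
    total       REAL NOT NULL CHECK (total >= 0)
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com')")
conn.execute("INSERT INTO orders VALUES (100, 1, '2024-01-15', 49.99)")

try:
    # Violates the foreign key: customer 2 does not exist.
    conn.execute("INSERT INTO orders VALUES (101, 2, '2024-01-16', 10.0)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```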

data data-engineering relational-databases C#/.NET Microsoft RDBMS SQL