talk-data.com

Topic

Computer Science

Tags: programming, algorithms, data_structures

166 activities tagged

Activity Trend

Peak of 9 activities per quarter · 2020-Q1 to 2026-Q1

Activities

166 activities · Newest first

In this episode, Conor and Bryce chat with Jared Hoberock about the NVIDIA Thrust Parallel Algorithms Library.

Link to Episode 237 on Website · Discuss this episode, leave a comment, or ask a question (on GitHub)

Socials: ADSP: The Podcast: Twitter · Conor Hoekstra: Twitter | BlueSky | Mastodon · Bryce Adelstein Lelbach: Twitter

About the Guest

Jared Hoberock joined NVIDIA Research in October 2008. His interests include parallel programming models and physically-based rendering. Jared is the co-creator of Thrust, a high-performance parallel algorithms library. While at NVIDIA, Jared has contributed to the DirectX graphics driver; Gelato, a final-frame film renderer; and OptiX, a high-performance, programmable ray tracing engine. Jared received a Ph.D. in computer science from the University of Illinois at Urbana-Champaign. He is a two-time recipient of the NVIDIA Graduate Research Fellowship.

Show Notes

Date Generated: 2025-05-21 · Date Released: 2025-06-06

Links: Thrust · Thrust Docs · C++98 std::transform · thrust::reduce · MPI_reduce · NVIDIA MatX · CuPy · RAPIDS.ai · Thrust Summed Area Table Example · ADSP Episode 213: NumPy & Summed-Area Tables

Intro Song Info: Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic · Creative Commons — Attribution 3.0 Unported — CC BY 3.0 · Free Download / Stream: http://bit.ly/l-miss-you · Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
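The show notes pair Thrust's summed-area-table example with an earlier episode on doing the same thing in NumPy. As a rough illustration of the underlying algorithm only (not Thrust's actual implementation), a summed-area table is just an inclusive prefix sum taken along both axes of an image:

```python
# Sketch of a summed-area table (2D inclusive prefix sum) in NumPy.
# Thrust's C++ example computes the same structure with parallel scans;
# libraries like CuPy expose the same cumsum API on the GPU.
import numpy as np

def summed_area_table(img: np.ndarray) -> np.ndarray:
    # Each output cell holds the sum of all cells above and to its left, inclusive.
    return img.cumsum(axis=0).cumsum(axis=1)

img = np.arange(12, dtype=np.int64).reshape(3, 4)
sat = summed_area_table(img)
print(sat)  # the sum of any axis-aligned rectangle is then four lookups
```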

This session dives into the practical application of generative AI and Google Cloud tools for transforming computer science education. Discover how educators can integrate these technologies into core computer science courses to equip future developers with the in-demand skills needed for an AI-driven future. We'll explore the Google Tech Exchange curriculum, a model for bridging the gap between academia and industry through hands-on experience with cutting-edge AI and cloud technologies.

In the retail industry, data science is not just about crunching numbers—it's about driving business impact through well-designed experiments. A/B testing in a physical store setting presents unique challenges that require careful planning and execution. How do you balance the need for statistical rigor with the practicalities of store operations? What role does data science play in ensuring that test results lead to actionable insights? Philipp Paraguya is the Chapter Lead for Data Science at Aldi DX. Philipp studied applied mathematics and computer science and has worked as a BI and advanced analytics consultant in various industries and projects since graduating. Thanks to his background as a software developer, he has a strong connection to classic software engineering and the sensible use of data science solutions. In the episode, Adel and Philipp explore the intricacies of A/B testing in retail, the challenges of running experiments in brick-and-mortar settings, aligning stakeholders for successful experimentation, the evolving role of data scientists, the impact of genAI on data workflows, and much more.

Links Mentioned in the Show: Aldi DX · Connect with Philipp · Course: Customer Analytics and A/B Testing in Python · Related Episode: Can You Use AI-Driven Pricing Ethically? with Jose Mendoza, Academic Director & Clinical Associate Professor at NYU · Sign up to attend RADAR: Skills Edition · New to DataCamp? Learn on the go using the DataCamp mobile app · Empower your business with world-class data and AI skills with DataCamp for business
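As a minimal sketch of the statistical-rigor side discussed here (the numbers and group sizes below are invented for illustration), a store-level A/B test often reduces to comparing a metric across test and control stores:

```python
# Toy example: compare mean weekly sales between control and treatment stores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=1000, scale=120, size=30)    # 30 control stores
treatment = rng.normal(loc=1040, scale=120, size=30)  # 30 stores with the change

# Welch's t-test: is the difference in means larger than chance would explain?
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

Real retail experiments add the complications the episode covers, such as store matching, seasonality, and operational constraints on which stores can be randomized.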

Grokking Relational Database Design

A friendly illustrated guide to designing and implementing your first database. Grokking Relational Database Design makes the principles of designing relational databases approachable and engaging, and everything in the book is reinforced by hands-on exercises and examples. In Grokking Relational Database Design, you’ll learn how to: Query and create databases using Structured Query Language (SQL) · Design databases from scratch · Implement and optimize database designs · Take advantage of generative AI when designing databases. A well-constructed database is easy to understand, query, manage, and scale when your app needs to grow. You’ll learn the basics of relational database design, including how to name fields and tables, which data to store where, how to eliminate repetition, good practices for data collection and hygiene, and much more. You won’t need a computer science degree or in-depth knowledge of programming—the book’s practical examples and down-to-earth definitions are beginner-friendly.

About the Technology: Almost every business uses a relational database system. Whether you’re a software developer, an analyst creating reports and dashboards, or a business user just trying to pull the latest numbers, it pays to understand how a relational database operates. This friendly, easy-to-follow book guides you from square one through the basics of relational database design.

About the Book: Grokking Relational Database Design introduces the core skills you need to assemble and query tables using SQL. The clear explanations, intuitive illustrations, and hands-on projects make database theory come to life, even if you can’t tell a primary key from an inner join. As you go, you’ll design, implement, and optimize a database for an e-commerce application and explore how generative AI simplifies the mundane tasks of database design.

What's Inside: Define entities and their relationships · Minimize anomalies and redundancy · Use SQL to implement your designs · Security, scalability, and performance

About the Reader: For self-taught programmers, software engineers, data scientists, and business data users. No previous experience with relational databases assumed.

About the Authors: Dr. Qiang Hao and Dr. Michail Tsikerdekis are both professors of Computer Science at Western Washington University.

Quotes:
“If anyone is looking to improve their database design skills, they can’t go wrong with this book.” - Ben Brumm, DatabaseStar
“Goes beyond SQL syntax and explores the core principles. An invaluable resource!” - William Jamir Silva, Adjust
“Relational database design is best done right the first time. This book is a great help to achieve that!” - Maxim Volgin, KLM
“Provides necessary notions to design and build databases that can stand the data challenges we face.” - Orlando Méndez, Experian
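To make the book's core ideas concrete, here is a minimal, hypothetical sketch (not taken from the book; table and column names are invented) of two related entities joined by a primary-key/foreign-key link, using Python's built-in sqlite3 module:

```python
# Two entities (customers, orders) linked by a foreign key.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total_cents INTEGER NOT NULL
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'ada@example.com')")
conn.execute("INSERT INTO orders VALUES (1, 1, 2599)")

# A join reassembles the related rows without duplicating customer data.
row = conn.execute("""
    SELECT c.email, o.total_cents
    FROM orders o JOIN customers c ON c.customer_id = o.customer_id
""").fetchone()
print(row)  # ('ada@example.com', 2599)
```

Storing the customer's email once and referencing it by key is exactly the kind of repetition-elimination (normalization) the book teaches.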

In this episode, I uncover the nine biggest LIES about landing a data job. Maybe what's stopping you from pursuing a data career is just a big lie. No College Degree As A Data Analyst YT Playlist: https://www.youtube.com/playlist?list=PLo0oTKi2fPNjHi6iXT3Pu68kUmiT-xDWs Don’t Learn Python as a Data Analyst (Learn This Instead): https://www.youtube.com/watch?v=VVhURHXMSlA 💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter 🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training 👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa 👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com/interviewsimulator ⌚ TIMESTAMPS 00:00 Introduction 00:05 You Need a Computer Science or Math Degree 01:20 You Have to Be Good at Math and Statistics 03:00 You Must Know Everything About Data Analytics 04:27 Certifications Matter 05:35 Skills Are Enough 07:20 AI Will Take Your Job 09:24 You'll Spend 80% of Your Time Cleaning Data 10:08 Data Titles 11:44 There Are Lots of Remote Jobs 13:17 The "Self-Taught" Data Analyst 🔗 CONNECT WITH AVERY 🎥 YouTube Channel: https://www.youtube.com/@averysmith 🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith/ 📸 Instagram: https://instagram.com/datacareerjumpstart 🎵 TikTok: https://www.tiktok.com/@verydata 💻 Website: https://www.datacareerjumpstart.com/ Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

The rise of AI agents in the workplace is transforming how businesses operate, tackling repetitive tasks and freeing up human employees for more creative endeavors. But what does this mean for the future of work, and how can professionals leverage these tools effectively? As AI agents become more sophisticated, capable of reasoning and decision-making, how do you ensure they align with your business goals? What are the implications for data privacy and security, and how do you manage the transition to a more automated workforce while maintaining human oversight? Surojit Chatterjee is the founder and CEO of Ema. Previously, he guided Coinbase through a successful 2021 IPO as its Chief Product Officer and scaled Google Mobile Ads and Google Shopping into multi-billion dollar businesses as the VP and Head of Product. Surojit holds 40 US patents and has an MBA from MIT, an MS in Computer Science from SUNY at Buffalo, and a B.Tech from IIT Kharagpur. In the episode, Richie and Surojit explore the transformative role of AI agents in automating repetitive business tasks, enhancing creativity and innovation, improving customer support, and redefining workplace efficiency. They discuss the potential of AI employees, data privacy concerns, the future of AI-driven business processes, and much more.

Links Mentioned in the Show: Ema · Connect with Surojit · Skill Track: Artificial Intelligence (AI) Leadership · Related Episode: How Generative AI is Changing Leadership with Christie Smith, Founder of the Humanity Institute, and Kelly Monahan, Managing Director, Research Institute · Attend RADAR Skills Edition · New to DataCamp? Learn on the go using the DataCamp mobile app · Empower your business with world-class data and AI skills with DataCamp for business

Analytics the Right Way

CLEAR AND CONCISE TECHNIQUES FOR USING ANALYTICS TO DELIVER BUSINESS IMPACT AT ANY ORGANIZATION Organizations have more data at their fingertips than ever, and their ability to put that data to productive use should be a key source of sustainable competitive advantage. Yet, business leaders looking to tap into a steady and manageable stream of “actionable insights” often, instead, get blasted with a deluge of dashboards, chart-filled slide decks, and opaque machine learning jargon that leaves them asking, “So what?” Analytics the Right Way is a guide for these leaders. It provides a clear and practical approach to putting analytics to productive use with a three-part framework that brings together the realities of the modern business environment with the deep truths underpinning statistics, computer science, machine learning, and artificial intelligence. The result: a pragmatic and actionable guide for delivering clarity, order, and business impact to an organization’s use of data and analytics. The book uses a combination of real-world examples from the authors’ direct experiences—working inside organizations, as external consultants, and as educators—mixed with vivid hypotheticals and illustrations—little green aliens, petty criminals with an affinity for ice cream, skydiving without parachutes, and more—to empower the reader to put foundational analytical and statistical concepts to effective use in a business context.

In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems.

About the Speaker: Tamara works on ML explainability, interpretability, and fairness as an open source software engineer at Probabl. She is a maintainer of Fairlearn and a contributor to scikit-learn and skops. Tamara has a background in both computer science/software engineering and computational linguistics (NLP).

During the event, the guest discussed her career journey from software engineering to open-source contributions, focusing on explainability in AI through scikit-learn and Fairlearn. They explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emphasized the importance of tools, human judgment, and collaboration. The guest also shared her involvement with PyLadies and encouraged contributions to Fairlearn.

00:00 Introduction to the event and the community
01:51 Topic introduction: Linguistic fairness and socio-technical perspectives in AI
02:37 Guest introduction: Tamara’s background and career
03:18 Tamara’s career journey: Software engineering, music tech, and computational linguistics
09:53 Tamara’s background in language and computer science
14:52 Exploring fairness in AI and its impact on society
21:20 Fairness in AI models
26:21 Automating fairness analysis in models
32:32 Balancing technical and domain expertise in decision-making
37:13 The role of humans in the loop for fairness
40:02 Joining Probabl and working on open-source projects
46:20 The skops library and its integration with Hugging Face
50:48 PyLadies and community involvement
55:41 The ethos of scikit-learn and Fairlearn
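As a minimal sketch of the kind of fairness analysis discussed in the episode (toy data, and assuming Fairlearn's current MetricFrame API), you can compare a model's selection rate across groups:

```python
# Compare selection rates across two groups with Fairlearn's MetricFrame.
import pandas as pd
from fairlearn.metrics import MetricFrame, selection_rate

y_true = pd.Series([1, 0, 1, 1, 0, 0, 1, 0])  # toy labels
y_pred = pd.Series([1, 0, 1, 0, 0, 0, 1, 1])  # toy model decisions
group = pd.Series(["a", "a", "a", "a", "b", "b", "b", "b"])  # sensitive feature

mf = MetricFrame(metrics=selection_rate, y_true=y_true, y_pred=y_pred,
                 sensitive_features=group)
print(mf.by_group)      # selection rate per group
print(mf.difference())  # gap between groups, a demographic-parity-style check
```

As the conversation stresses, a metric like this is a starting point for human judgment, not a verdict.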

🔗 CONNECT WITH TAMARA ATANASOSKA Linkedin - https://www.linkedin.com/in/tamaraatanasoska GitHub- https://github.com/TamaraAtanasoska

🔗 CONNECT WITH DataTalksClub Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Datalike Substack: https://datalike.substack.com/ LinkedIn: /datatalks-club

AI is not just about writing code; it's about improving the entire software development process. From generating documentation to automating code reviews, AI tools are becoming indispensable. But how do you ensure the quality of AI-generated code? What strategies can you employ to maintain high standards while leveraging AI's capabilities? These are the questions developers must consider as they incorporate AI into their workflows. Eran Yahav is an associate professor in the Computer Science Department at the Technion – Israel Institute of Technology and co-founder and CTO of Tabnine (formerly Codota). Prior to that, he was a research staff member at the IBM T.J. Watson Research Center in New York (2004-2010). He received his Ph.D. from Tel Aviv University (2005) and his B.Sc. from the Technion in 1996. His research interests include program analysis, program synthesis, and program verification. Eran is a recipient of the prestigious Alon Fellowship for Outstanding Young Researchers, the Andre Deloro Career Advancement Chair in Engineering, the 2020 Robin Milner Young Researcher Award (POPL talk here), and the ERC Consolidator Grant, as well as multiple best paper awards at various conferences. In the episode, Richie and Eran explore AI's role in software development, the balance between AI assistance and manual coding, the impact of generative AI on code review and documentation, the evolution of developer tools, the future of AI-driven workflows, and much more.

Links Mentioned in the Show: Tabnine · Connect with Eran · Course: Working with the OpenAI API · Related Episode: Getting Generative AI Into Production with Lin Qiao, CEO and Co-Founder of Fireworks AI · Rewatch sessions from RADAR: Forward Edition · New to DataCamp? Learn on the go using the DataCamp mobile app · Empower your business with world-class data and AI skills with DataCamp for business

The Data Science Handbook, 2nd Edition

Practical, accessible guide to becoming a data scientist, updated to include the latest advances in data science and related fields. Becoming a data scientist is hard. The job focuses on mathematical tools, but also demands fluency with software engineering, understanding of a business situation, and deep understanding of the data itself. This book provides a crash course in data science, combining all the necessary skills into a unified discipline. The focus of The Data Science Handbook is on practical applications and the ability to solve real problems, rather than theoretical formalisms that are rarely needed in practice. Among its key points are: An emphasis on software engineering and coding skills, which play a significant role in most real data science problems. Extensive sample code, detailed discussions of important libraries, and a solid grounding in core concepts from computer science (computer architecture, runtime complexity, and programming paradigms). A broad overview of important mathematical tools, including classical techniques in statistics, stochastic modeling, regression, numerical optimization, and more. Extensive tips about the practical realities of working as a data scientist, including understanding related job functions, project life cycles, and the varying roles of data science in an organization. Exactly the right amount of theory. A solid conceptual foundation is required for fitting the right model to a business problem, understanding a tool’s limitations, and reasoning about discoveries. Data science is a quickly evolving field, and this 2nd edition has been updated to reflect the latest developments, including the revolution in AI that has come from Large Language Models and the growth of ML Engineering as its own discipline. Much of data science has become a skillset that anybody can have, making this book not only for aspiring data scientists, but also for professionals in other fields who want to use analytics as a force multiplier in their organization.

Apache Spark for Machine Learning

Dive into the power of Apache Spark as a tool for handling and processing the big data required for machine learning. With this book, you will explore how to configure, execute, and deploy machine learning algorithms using Spark's scalable architecture and learn best practices for implementing real-world big data solutions.

What this Book will help me do: Understand the integration of Apache Spark with large-scale infrastructures for machine learning applications. Employ data processing techniques for preprocessing and feature engineering efficiently with Spark. Master the implementation of advanced supervised and unsupervised learning algorithms using Spark. Learn to deploy machine learning models within Spark ecosystems for optimized performance. Discover methods for analyzing big data trends and machine learning model tuning for improved accuracy.

Author(s): The author, Deepak Gowda, is an experienced data scientist with over ten years of expertise in machine learning and big data. His career spans industries such as supply chain and cybersecurity, where he has utilized Apache Spark extensively. Deepak's teaching style is marked by clarity and practicality, making complex concepts approachable.

Who is it for? Apache Spark for Machine Learning is tailored for data engineers, machine learning practitioners, and computer science students looking to advance their ability to process, analyze, and model using large datasets. If you're already familiar with basic machine learning and want to scale your solutions using Spark, this book is ideal for your studies and professional growth.
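As a minimal sketch of the workflow the book covers (column names and data here are invented), a Spark ML pipeline typically assembles feature columns into a vector and fits an estimator:

```python
# Assemble features and fit a logistic regression with Spark ML.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("spark-ml-sketch").getOrCreate()
df = spark.createDataFrame(
    [(0.0, 1.2, 0.7, 0), (1.0, 0.3, 2.1, 1), (0.5, 1.8, 0.2, 0)],
    ["f1", "f2", "f3", "label"],
)
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(assembler.transform(df))
print(model.coefficients)
```

The same code runs on a laptop or a cluster, which is the point of using Spark for ML in the first place.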

Generative AI and data are more interconnected than ever. If you want quality in your AI product, you need to be connected to a database with high-quality data. But with so many database options and new AI tools emerging, how do you ensure you’re making the right choices for your organization? Whether it’s enhancing customer experiences or improving operational efficiency, understanding the role of your databases in powering AI is crucial. Andi Gutmans is the General Manager and Vice President for Databases at Google. Andi’s focus is on building, managing, and scaling the most innovative database services to deliver the industry’s leading data platform for businesses. Prior to joining Google, Andi was VP of Analytics at AWS, running services such as Amazon Redshift. Before his tenure at AWS, Andi served as CEO and co-founder of Zend Technologies, the commercial backer of open-source PHP. Andi has over 20 years of experience as an open source contributor and leader. He co-authored open source PHP. He is an emeritus member of the Apache Software Foundation and served on the Eclipse Foundation’s board of directors. He holds a bachelor’s degree in computer science from the Technion, Israel Institute of Technology. In the episode, Richie and Andi explore databases and their relationship with AI and GenAI, key features needed in databases for AI, GCP database services, AlloyDB, federated queries in Google Cloud, vector databases, graph databases, practical use cases of AI in databases, and much more.

Links Mentioned in the Show: GCP · Connect with Andi · AlloyDB for PostgreSQL · Course: Responsible AI Data Management · Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone · Sign up to RADAR: Forward Edition · New to DataCamp? Learn on the go using the DataCamp mobile app · Empower your business with world-class data and AI skills with DataCamp for business
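As a rough sketch of what vector search does underneath (the embeddings below are random stand-ins; production systems such as AlloyDB with the pgvector extension add indexing and SQL integration on top):

```python
# Rank stored embeddings by cosine similarity to a query embedding.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 64))   # 1000 fake document embeddings
query = rng.normal(size=64)          # fake query embedding

docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)
scores = docs_n @ query_n            # cosine similarity per document
top5 = np.argsort(scores)[::-1][:5]  # five nearest documents
print(top5, scores[top5])
```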

We talked about:

00:00 DataTalks.Club intro

08:06 Background and career journey of Katarzyna

09:06 Transition from linguistics to computational linguistics

11:38 Merging linguistics and computer science

15:25 Understanding phonetics and morpho-syntax

17:28 Exploring morpho-syntax and its relation to grammar

20:33 Connection between phonetics and speech disorders

24:41 Improvement of voice recognition systems

27:31 Overview of speech recognition technology

30:24 Challenges of ASR systems with atypical speech

30:53 Strategies for improving recognition of disordered speech

37:07 Data augmentation for training models

40:17 Transfer learning in speech recognition

42:18 Challenges of collecting data for various speech disorders

44:31 Stammering and its connection to fluency issues

45:16 Polish consonant combinations and pronunciation challenges

46:17 Use of Amazon Transcribe for generating podcast transcripts

47:28 Role of language models in speech recognition

49:19 Contextual understanding in speech recognition

51:27 How voice recognition systems analyze utterances

54:05 Personalization of ASR models for individuals

56:25 Language disorders and their impact on communication

58:00 Applications of speech recognition technology

1:00:34 Challenges of personalized and universal models

1:01:23 Voice recognition in automotive applications

1:03:27 Humorous voice recognition failures in cars

1:04:13 Closing remarks and reflections on the discussion

About the speaker:

Katarzyna is a computational linguist with over 10 years of experience in NLP and speech recognition. She has developed language models for automotive brands like Audi and Porsche and specializes in phonetics, morpho-syntax, and sentiment analysis.

Kasia also teaches at the University of Warsaw and is passionate about human-centered AI and multilingual NLP.

Join our slack: https://datatalks.club/slack.html

Brought to you by: • Paragon: Build native, customer-facing SaaS integrations 7x faster. • WorkOS: For B2B leaders building enterprise SaaS — On today’s episode of The Pragmatic Engineer, I’m joined by Quinn Slack, CEO and co-founder of Sourcegraph, a leading code search and intelligence platform. Quinn holds a degree in Computer Science from Stanford and is deeply passionate about coding: to the point that he still codes every day! He also serves on the board of Hack Club, a national nonprofit dedicated to bringing coding clubs to high schools nationwide. In this insightful conversation, we discuss: • How Sourcegraph's operations have evolved since 2021 • Why more software engineers should focus on delivering business value • Why Quinn continues to code every day, even as a CEO • Practical AI and LLM use cases and a phased approach to their adoption • The story behind Job Fairs at Sourcegraph and why it’s no longer in use • Quinn’s leadership style and his focus on customers and product excellence • The shift from location-independent pay to zone-based pay at Sourcegraph • And much more! — Where to find Quinn Slack: • X: https://x.com/sqs • LinkedIn: https://www.linkedin.com/in/quinnslack/ • Website: https://slack.org/ Where to find Gergely: • Newsletter: https://www.pragmaticengineer.com/ • YouTube: https://www.youtube.com/c/mrgergelyorosz • LinkedIn: https://www.linkedin.com/in/gergelyorosz/ • X: https://x.com/GergelyOrosz — In this episode, we cover: (01:35) How Sourcegraph started and how it has evolved over the past 11 years (04:14) How scale-ups have changed (08:27) Learnings from 2021 and how Sourcegraph’s operations have streamlined (15:22) Why Quinn is for gradual increases in automation and other thoughts on AI (18:10) The importance of changelogs (19:14) Keeping AI accountable and possible future use cases (22:29) Current limitations of AI (25:08) Why early adopters of AI coding tools have an advantage (27:38) Why AI is not yet capable of understanding existing codebases (31:53) Changes at Sourcegraph since the deep dive on The Pragmatic Engineer blog (40:14) The importance of transparency and understanding the different forms of compensation (40:22) Why Sourcegraph shifted to zone-based pay (47:15) The journey from engineer to CEO (53:28) A comparison of a typical week 11 years ago vs. now (59:20) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: • Inside Sourcegraph’s engineering culture: Part 1 https://newsletter.pragmaticengineer.com/p/inside-sourcegraphs-engineering-culture • Inside Sourcegraph’s engineering culture: Part 2 https://newsletter.pragmaticengineer.com/p/inside-sourcegraphs-engineering-culture-part-2 — References and Transcript: See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Computational Intelligence in Sustainable Computing and Optimization

Computational Intelligence in Sustainable Computing and Optimization: Trends and Applications focuses on developing and evolving advanced computational intelligence algorithms for the analysis of data involved in applications such as agriculture, biomedical systems, bioinformatics, business intelligence, economics, disaster management, e-learning, education management, financial management, and environmental policies. The book presents research in sustainable computing and optimization, combining methods from engineering, mathematics, artificial intelligence, and computer science to optimize environmental resources. Computational intelligence in the field of sustainable computing combines computer science and engineering in applications ranging from the Internet of Things (IoT), information security systems, smart storage, cloud computing, intelligent transport management, and cognitive and bio-inspired computing to management science. In addition, data intelligence techniques play a critical role in sustainable computing. Recent advances in data management, data modeling, data analysis, and artificial intelligence are finding applications in energy networks and thus making our environment more sustainable.

Presents computational intelligence–based data analysis for sustainable computing applications such as pattern recognition, biomedical imaging, sustainable cities, sustainable transport, sustainable agriculture, and sustainable financial management
Develops research in sustainable computing and optimization, combining methods from engineering, mathematics, and computer science to optimize environmental resources
Includes three foundational chapters dedicated to providing an overview of computational intelligence and optimization techniques and their applications for sustainable computing

With AI tools constantly evolving, the potential for innovation seems limitless. But with great potential come significant costs, and the question of efficiency and scalability becomes crucial. How can you ensure that your AI models are not only pushing boundaries but also delivering results in a cost-effective way? What strategies can help reduce the financial burden of training and deploying models, while still driving meaningful business outcomes? Natalia Vassilieva is the VP & Field CTO of ML at Cerebras Systems. Natalia has a wealth of experience in research and development in natural language processing, computer vision, machine learning, and information retrieval. As Field CTO, she helps drive product adoption and customer engagement for Cerebras Systems’ wafer-scale AI chips. Previously, Natalia was a Senior Research Manager at Hewlett Packard Labs, leading the Software and AI group. She also served as the head of HP Labs Russia, leading research teams focused on developing algorithms and applications for text, image, and time-series analysis and modeling. Natalia has an academic background, having been a part-time Associate Professor at St. Petersburg State University and a lecturer at the Computer Science Center in St. Petersburg, Russia. She holds a PhD in Computer Science from St. Petersburg State University. Andy Hock is the Senior VP, Product & Strategy at Cerebras Systems. Andy runs the product strategy and roadmap for Cerebras Systems, focusing on integrating AI research, hardware, and software to accelerate the development and deployment of AI models. He has 15 years of experience in product management, technical program management, and enterprise business development; over 20 years of experience in research, algorithm development, and data analysis for image processing; and 9 years of experience in applied machine learning and AI. Previously he was the Product Management lead for Data and Analytics for Terra Bella at Google, where he led the development of machine learning-powered data products from satellite imagery. Earlier, he was Senior Director for Advanced Technology Programs at Skybox Imaging (which became Terra Bella following its acquisition by Google in 2014), and before that was a Senior Program Manager and Senior Scientist at Arete Associates. He has a Ph.D. in Geophysics and Space Physics from the University of California, Los Angeles. In the episode, Richie, Natalia, and Andy explore the dramatic recent progress in generative AI, cost and infrastructure challenges in AI, Cerebras’ custom AI chips and other hardware innovations, quantization in AI models, mixture of experts, RLHF, relevant AI use cases, centralized vs decentralized AI compute, the future of AI, and much more.

Links Mentioned in the Show: Cerebras · Cerebras Launches the World’s Fastest AI Inference · Connect with Natalia and Andy · Course: Implementing AI Solutions in Business · Rewatch sessions from RADAR: AI Edition · New to DataCamp? Learn on the go using the DataCamp mobile app · Empower your business with world-class data and AI skills with DataCamp for business
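As a minimal sketch of the quantization idea touched on in the episode (this is the generic technique, not Cerebras' specific scheme):

```python
# Store float32 weights as int8 plus a scale factor; dequantize on use.
import numpy as np

w = np.random.default_rng(0).normal(scale=0.1, size=1024).astype(np.float32)

scale = np.abs(w).max() / 127.0            # map the weight range onto int8
w_q = np.round(w / scale).astype(np.int8)  # 4x smaller than float32
w_hat = w_q.astype(np.float32) * scale     # approximate reconstruction

print("max abs error:", float(np.abs(w - w_hat).max()))
```

Shrinking weights this way cuts memory and bandwidth, which is a large part of the cost of serving big models.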

Artificial Intelligence

Artificial Intelligence (AI) revolves around creating and utilizing intelligent machines through science and engineering. This book delves into the theory and practical applications of computer science methods that incorporate AI across many domains. It covers techniques such as Machine Learning (ML), Convolutional Neural Networks (CNN), Deep Learning (DL), and Large Language Models (LLM) to tackle complex issues and overcome various challenges.

Ben Shneiderman is a leading figure in the field of human-computer interaction (HCI). Having founded one of the oldest HCI research centers in the country at the University of Maryland in 1983, Shneiderman has been intently studying the design of computer technology and its use by humans. Currently, Ben is a Distinguished University Professor in the Department of Computer Science at the University of Maryland and is working on a new book on human-centered artificial intelligence.

I’m so excited to welcome this expert from the field of UX and design to today’s episode of Experiencing Data! Ben and I talked a lot about the complex intersection of human-centered design and AI systems.

In our chat, we covered:

Ben's career studying human-computer interaction and computer science. (0:30)
'Building a culture of safety': Creating and designing ‘safe, reliable and trustworthy’ AI systems. (3:55)
'Like zoning boards': Why Ben thinks we need independent oversight of privately created AI. (12:56)
'There’s no such thing as an autonomous device': Designing human control into AI systems. (18:16)
A/B testing, usability testing and controlled experiments: The power of research in designing good user experiences. (21:08)
Designing ‘comprehensible, predictable, and controllable’ user interfaces for explainable AI systems and why explainable AI (XAI) matters. (30:34)
Ben's upcoming book on human-centered AI. (35:55)

Resources and Links:
People-Centered Internet: https://peoplecentered.net/
Designing the User Interface (one of Ben’s earlier books): https://www.amazon.com/Designing-User-Interface-Human-Computer-Interaction/dp/013438038X
Bridging the Gap Between Ethics and Practice: https://doi.org/10.1145/3419764
Partnership on AI: https://www.partnershiponai.org/
AI incident database: https://www.partnershiponai.org/aiincidentdatabase/
University of Maryland Human-Computer Interaction Lab: https://hcil.umd.edu/
ACM Conference on Intelligent User Interfaces: https://iui.acm.org/2021/hcai_tutorial.html
Human-Computer Interaction Lab, University of Maryland, Annual Symposium: https://hcil.umd.edu/tutorial-human-centered-ai/
Ben on Twitter: https://twitter.com/benbendc

Quotes from Today’s Episode

The world of AI has certainly grown and blossomed — it’s the hot topic everywhere you go. It’s the hot topic among businesses around the world — governments are launching agencies to monitor AI and are also making regulatory moves and rules. … People want explainable AI; they want responsible AI; they want safe, reliable, and trustworthy AI. They want a lot of things, but they’re not always sure how to get them. The world of human-computer interaction has a long history of giving people what they want, and what they need. That blending seems like a natural way for AI to grow and to accommodate the needs of real people who have real problems. And not only the methods for studying the users, but the rules, the principles, the guidelines for making it happen. So, that’s where the action is. Of course, what we really want from AI is to make our world a better place, and that’s a tall order, but we start by talking about the things that matter — the human values: human rights, access to justice, and the dignity of every person. We want to support individual goals, a person’s sense of self-efficacy — they can do what they need to in the world, their creativity, their responsibility, and their social connections; they want to reach out to people. So, those are the sort of high aspirational goals that become the hard work of figuring out how to build it. And that’s where we want to go. - Ben (2:05)

The software engineering teams creating AI systems have got real work to do. They need the right kind of workflows, engineering patterns, and Agile development methods that will work for AI. The AI world is different because it’s not just programming, but it also involves the use of data that’s used for training. The key distinction is that the data that drives the AI has to be the appropriate data, it has to be unbiased, it has to be fair, it has to be appropriate to the task at hand. And many people and many companies are coming to grips with how to manage that. This has become controversial, let’s say, in issues like granting parole, or mortgages, or hiring people. There was a controversy that Amazon ran into when its hiring algorithm favored men rather than women. There’s been bias in facial recognition algorithms, which were less accurate with people of color. That’s led to some real problems in the real world. And that’s where we have to make sure we do a much better job and the tools of human-computer interaction are very effective in building these better systems in testing and evaluating. - Ben (6:10)

Every company will tell you, “We do a really good job in checking out our AI systems.” That’s great. We want every company to do a really good job. But we also want independent oversight of somebody who’s outside the company — someone who knows the field, who’s looked at systems at other companies, and who can bring ideas and bring understanding of the dangers as well. These systems operate in an adversarial environment — there are malicious actors out there who are causing trouble. You need to understand what the dangers and threats are to the use of your system. You need to understand where the biases come from, what dangers are there, and where the software has failed in other places. You may know what happens in your company, but you can benefit by learning what happens outside your company, and that’s where independent oversight from accounting companies, from governmental regulators, and from other independent groups is so valuable. - Ben (15:04)

There’s no such thing as an autonomous device. Someone owns it; somebody’s responsible for it; someone starts it; someone stops it; someone fixes it; someone notices when it’s performing poorly. … Responsibility is a pretty key factor here. So, if there’s something going on, if a manager is deciding to use some AI system, what they need is a control panel, let them know: what’s happening? What’s it doing? What’s going wrong and what’s going right? That kind of supervisory autonomy is what I talk about, not full machine autonomy that’s hidden away and you never see it because that’s just head-in-the-sand thinking. What you want to do is expose the operation of a system, and where possible, give the stakeholders who are responsible for performance the right kind of control panel and the right kind of data. … Feedback is the breakfast of champions. And companies know that. They want to be able to measure the success stories, and they want to know their failures, so they can reduce them. The continuous improvement mantra is alive and well. We do want to keep tracking what’s going on and make sure it gets better. Every quarter. - Ben (19:41)

Google has had some issues regarding hiring in the AI research area, and so has Facebook with elections and the way that algorithms tend to become echo chambers. These companies — and this is not through heavy research — probably have the heaviest investment of user experience professionals within data science organizations. They have UX, ML-UX people, UX for AI people, they’re at the cutting edge. I see a lot more generalist designers in most other companies. Most of them are rather unfamiliar with any of this or what the ramifications are on the design work that they’re doing. But even these largest companies that have, probably, the biggest penetration into the most number of people out there are getting some of this really important stuff wrong. - Brian (26:36)

Explainability is a competitive advantage for an AI system. People will gravitate towards systems that they understand, that they feel in control of, that are predictable. So, the big discussion about explainable AI focuses on what’s usually called post-hoc explanations, and Shapley values, LIME, and other methods are usually tied to the post-hoc approach. That is, you use an AI model, you get a result and you say, “What happened?” Why was I denied a parole, or a mortgage, or a job? At that point, you want to get an explanation. Now, that idea is appealing, but I’m afraid I haven’t seen too many success stories of that working. … I’ve been diving through this for years now, and I’ve been looking for examples of good user interfaces of post-hoc explanations. It took me a long time till I found one. The culture of AI model-building would be much bolstered by an infusion of thinking about what the user interface will be for these explanations. And even DARPA’s XAI (Explainable AI) project, which has 11 projects within it, has not really grappled with this in a good way about designing what it’s going to look like. Show it to me. … There is another way. And the strategy is basically prevention. Let’s prevent the user from getting confused so they don’t have to request an explanation. We walk them along, let the user walk through the steps—this is like Amazon’s seven-step checkout process—and you know what’s happened in each step, you can go back, you can explore, you can change things in each part of it. It’s also what TurboTax does so well, in really complicated situations, and walks you through it. … You want to have a comprehensible, predictable, and controllable user interface that makes sense as you walk through each step. - Ben (31:13)
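As a minimal sketch of the post-hoc style Ben describes (toy data and model, and assuming the shap library's generic Explainer API):

```python
# Fit a toy model, then ask for per-feature contributions after the fact.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

X = np.random.default_rng(0).normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy target driven by two features

model = RandomForestClassifier(random_state=0).fit(X, y)
explainer = shap.Explainer(model, X)     # dispatches to a tree explainer here
shap_values = explainer(X[:5])           # contributions for five predictions
print(shap_values.values.shape)
```

Ben's point stands apart from the library mechanics: computing attributions is the easy part, and presenting them in a user interface people actually understand is where most systems fall short.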