talk-data.com — Topic: GenAI (Generative AI), 1517 items tagged

Top Events
Join our FREE webinar, Generative AI Overview for Project Managers, and gain practical insights into integrating AI into your project management strategies. In today’s dynamic landscape, projects come with unique challenges, but with the right AI tools, you can extend your capabilities further. This session will provide a solid understanding of AI and Generative AI, showing you how to streamline processes and enhance decision-making. Don’t miss this opportunity to elevate your skills and stay ahead of industry trends.
Great data presentations tell a story. Learn how to organize, visualize, and present data using Python, generative AI, and the cutting-edge Altair data visualization toolkit. Take the fast track to amazing data presentations!

Data Storytelling with Altair and AI introduces a stack of useful tools and tried-and-tested methodologies that will rapidly increase your productivity, streamline the visualization process, and leave your audience inspired. In Data Storytelling with Altair and AI you’ll discover:

- Using Python Altair for data visualization
- Using generative AI tools for data storytelling
- The main concepts of data storytelling
- Building data stories with the DIKW pyramid approach
- Transforming raw data into a data story

Data Storytelling with Altair and AI teaches you how to turn raw data into effective, insightful data stories. You’ll learn exactly what goes into an effective data story, then combine your Python data skills with the Altair library and AI tools to rapidly create amazing visualizations. Your bosses and decision-makers will love your new presentations, and you’ll love how quick generative AI makes the whole process!

About the Technology
Every dataset tells a story. After you’ve cleaned, crunched, and organized the raw data, it’s your job to share its story in a way that connects with your audience. Python’s Altair data visualization library, combined with generative AI tools like Copilot and ChatGPT, provides an amazing toolbox for transforming numbers, code, text, and graphics into intuitive data presentations.

About the Book
Data Storytelling with Altair and AI teaches you how to build enhanced data visualizations using these tools. The book uses hands-on examples to build powerful narratives that can inform, inspire, and motivate. It covers the Altair data visualization library, along with AI techniques like generating text with ChatGPT, creating images with DALL-E, and Python coding with Copilot.
You’ll learn by practicing with each interesting data story, from tourist arrivals in Portugal to population growth in the USA to fake news, salmon aquaculture, and more.

What's Inside
- The Data-Information-Knowledge-Wisdom (DIKW) pyramid
- Publishing data stories using Streamlit, Tableau, and Comet
- Vega and Vega-Lite visualization grammar

About the Reader
For data analysts and data scientists experienced with Python. No previous knowledge of Altair or generative AI required.

About the Author
Angelica Lo Duca is a researcher at the Institute of Informatics and Telematics of the National Research Council, Italy. The technical editor on this book was Ninoslav Cerkez.

Quotes
"This book’s step-by-step approach, illustrated through real-world examples, makes complex data accessible and actionable." - Alexey Grigorev, DataTalks.Club
"A clear and concise guide to data storytelling. Highly recommended." - Andrew Madson, Insights x Design
"Data storytelling in a way that anyone can do! This book feels ahead of its time." - Avery Smith, Data Career Jumpstart
"Excellent hands-on exercises that combine two of my favorite tools: AI and the Altair library." - Jose Berengueres, Author of DataViz and Storytelling
Lots of AI use cases can start with big ideas and exciting possibilities, but turning those ideas into real results is where the challenge lies. How do you take a powerful model and make it work effectively in a specific business context? What steps are necessary to fine-tune and optimize your AI tools to deliver both performance and cost efficiency? And as AI continues to evolve, how do you stay ahead of the curve while ensuring that your solutions are scalable and sustainable?

Lin Qiao is the CEO and Co-Founder of Fireworks AI. She previously worked at Meta as a Senior Director of Engineering and as head of Meta's PyTorch, served as a Tech Lead at LinkedIn, and worked as a Researcher and Software Engineer at IBM.

In the episode, Richie and Lin explore generative AI use cases, getting AI into products, foundational models, the effort required and benefits of fine-tuning models, trade-offs between model sizes, use cases for smaller models, cost-effective AI deployment, the infrastructure and team required for AI product development, metrics for AI success, open vs closed-source models, excitement for the future of AI development and much more.

Links Mentioned in the Show:
- Fireworks.ai
- Hugging Face - Preference Tuning LLMs with Direct Preference Optimization Methods
- Connect with Lin
- Course - Artificial Intelligence (AI) Strategy
- Related Episode: Creating Custom LLMs with Vincent Granville, Founder, CEO & Chief AI Scientist at GenAItechLab.com
- Rewatch sessions from RADAR: AI Edition

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
The rapid rise of generative AI is changing how businesses operate, but with this change comes new challenges. How do you navigate the balance between innovation and risk, especially in a regulated industry? As organizations race to adopt AI, it’s crucial to ensure that these technologies are not only transformative but also responsible. What steps can you take to harness AI’s potential while maintaining control and transparency? And how can you build excitement and trust around AI within your organization, ensuring that everyone is ready to embrace this new era?

Steve Holden is the Senior Vice President and Head of Single-Family Analytics at Fannie Mae, leading a team of data science professionals supporting loan underwriting, pricing and acquisition, securitization, loss mitigation, and loan liquidation for the company’s multi-trillion-dollar Single-Family mortgage portfolio. He is also responsible for all generative AI initiatives across the enterprise. His team provides real-time analytic solutions that guide thousands of daily business decisions necessary to manage this extensive mortgage portfolio. The team comprises experts in econometric models, machine learning, data engineering, data visualization, software engineering, and analytic infrastructure design. Holden previously served as Vice President of Credit Portfolio Management Analytics at Fannie Mae. Before joining Fannie Mae in 1999, he held several analytic leadership roles and worked on economic issues at the Economic Strategy Institute and the U.S. Bureau of Labor Statistics.

In the episode, Adel and Steve explore opportunities in generative AI, building a GenAI program, use-case prioritization, driving excitement and engagement for an AI-first culture, skills transformation, governance as a competitive advantage, challenges of scaling AI, future trends in AI, and much more.
Links Mentioned in the Show:
- Fannie Mae
- Steve’s recent DataCamp Webinar: Bringing Generative AI to the Enterprise
- Video: Andrej Karpathy - [1hr Talk] Intro to Large Language Models
- Skill Track - AI Business Fundamentals
- Related Episode: Generative AI at EY with John Thompson, Head of AI at EY
- Rewatch sessions from RADAR: AI Edition

Join the DataFramed team!
- Data Evangelist
- Data & AI Video Creator

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
“Last week was a great year in GenAI,” jokes Mark Ramsey—and it’s a great philosophy to have as LLM tools especially continue to evolve at such a rapid rate. This week, you’ll get to hear my fun and insightful chat with Mark from Ramsey International about the world of large language models (LLMs) and how we make useful UXs out of them in the enterprise.
Mark shared some fascinating insights about using a company’s website information (data) as a place to pilot a LLM project, avoiding privacy landmines, and how re-ranking of models leads to better LLM response accuracy. We also talked about the importance of real human testing to ensure LLM chatbots and AI tools truly delight users. From amusing anecdotes about the spinning beach ball on macOS to envisioning a future where AI-driven chat interfaces outshine traditional BI tools, this episode is packed with forward-looking ideas and a touch of humor.
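Mark’s point about re-ranking improving LLM response accuracy is easier to see with a toy example. The sketch below re-orders retrieved passages before they are sent to a model; the keyword-overlap scorer is a crude stand-in for a real cross-encoder re-ranker, and all passages and names are invented for illustration.

```python
# Toy re-ranking sketch: score each candidate passage against the query,
# then keep only the best ones to send to the LLM. A production system
# would use a cross-encoder model instead of keyword overlap.

def overlap_score(query: str, passage: str) -> float:
    """Fraction of query words that also appear in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words)

def rerank(query: str, candidates: list[str], top_k: int = 2) -> list[str]:
    """Sort candidate passages by relevance score and keep the top_k."""
    ranked = sorted(candidates, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:top_k]

candidates = [
    "Our headquarters are open Monday through Friday.",
    "Returns are accepted within 30 days with a receipt.",
    "The return policy allows refunds within 30 days.",
]
best = rerank("what is the return policy", candidates)
print(best[0])  # The return policy allows refunds within 30 days.
```

The re-ranked passages, rather than the raw retrieval order, become the context the model answers from, which is where the accuracy gain Mark describes comes from.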
Highlights / Skip to:
(0:50) Why is the world of GenAI evolving so fast?
(4:20) How Mark thinks about UX in an LLM application
(8:11) How Mark defines "Specialized GenAI"
(12:42) Mark’s consulting work with GenAI / LLMs these days
(17:29) How GenAI can help the healthcare industry
(30:23) Uncovering users’ true feelings about LLM applications
(35:02) Are UIs moving backwards as models progress forward?
(40:53) How will GenAI impact data and analytics teams?
(44:51) Will LLMs be able to consistently leverage RAG and produce proper SQL?
(51:04) Where you can find more from Mark and Ramsey International
Quotes from Today’s Episode

"With [GenAI], we have a solution that we’ve built to try to help organizations, and build workflows. We have a workflow that we can run and ask the same question [to a variety of GenAI models] and see how similar the answers are. Depending on the complexity of the question, you can see a lot of variability between the models… [and] we can also run the same question against the different versions of the model and see how it’s improved. Folks want a human-like experience interacting with these models… [and] if the model can start responding in just a few seconds, that gives you much more of a conversational type of experience." - Mark Ramsey (2:38)

"[People] don’t understand when you interact [with GenAI tools] and it brings tokens back in that streaming fashion, you’re actually seeing inside the brain of the model. Every token it produces is then displayed on the screen, and it gives you that typewriter experience back in the day. If someone has to wait, and all you’re seeing is a logo spinning, from a UX experience standpoint… people feel like the model is much faster if it just starts to produce those results in that streaming fashion. I think in a design, it’s extremely important to take advantage of that [...] as opposed to waiting to the end and delivering the results. Some models support that, and other models don’t." - Mark Ramsey (4:35)

"All of the data that’s on the website is public information. We’ve done work with several organizations on quickly taking the data that’s on their website, packaging it up into a vector database, and making that be the source for questions that their customers can ask. [Organizations] publish a lot of information on their websites, but people really struggle to get to it. We’ve seen a lot of interest in vectorizing website data, making it available, and having a chat interface for the customer. The customer can ask questions, and it will take them directly to the answer, and then they can use the website as the source information." - Mark Ramsey (14:04)

"I’m not skeptical at all. I’ve changed much of my [AI chatbot searches] to Perplexity, and I think it’s doing a pretty fantastic job overall in terms of quality. It’s returning an answer with citations, so you have a sense of where it’s sourcing the information from. I think it’s important from a user experience perspective. This is a replacement for broken search, as I really don’t want to read all the web pages and PDFs you have that might be about my chiropractic care query to answer my actual [healthcare] question." - Brian O’Neill (19:22)

"We’ve all had great experiences with customer service, and we’ve all had situations where the customer service was quite poor, and we’re going to have that same thing as we begin to [release more] chatbots. We need to make sure we try to alleviate having those bad experiences, and have an exit. If someone is running into a situation where they’d rather talk to a live person, have that ability to route them to someone else. That’s why the robustness of the model is extremely important in the implementation… and right now, organizations like OpenAI and Anthropic are significantly better at that [human-like] experience." - Mark Ramsey (23:46)

"There are two aspects of these models: the training aspect and then using the model to answer questions. I recommend to organizations to always augment their content and don’t just use the training data. You’ll still get that human-like experience that’s built into the model, but you’ll eliminate the hallucinations. If you have a model that has been set up correctly, you shouldn’t have to ask questions in a funky way to get answers." - Mark Ramsey (39:11)

"People need to understand GenAI is not a predictive algorithm. It is not able to run predictions, and it struggles with some math, so that is not the focus for these models. What’s interesting is that you can use the model as a step to get you [the answers]. A lot of the models now support functions… when you ask a question about something that is in a database, it actually uses its knowledge about the schema of the database. It can build the query, run the query to get the data back, and then once it has the data, it can reformat the data into something that is a good response back." - Mark Ramsey (42:02)
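Mark’s last quote describes a query-as-a-tool pattern: the model sees the schema, emits SQL, the application executes it, and the rows go back to the model to be phrased as an answer. Here is a minimal sketch of the application side using only Python’s standard library; the SQL string stands in for what a model’s function call would produce, and the table and data are invented.

```python
# Sketch of executing a model-generated query against a database.
# In a real system the SQL would arrive via the LLM's function-calling
# response, and the resulting rows would be sent back to the model.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EMEA", 120.0), (2, "APAC", 80.0), (3, "EMEA", 50.0)])

# Stand-in for the SQL the model builds from its knowledge of the schema.
model_generated_sql = (
    "SELECT region, SUM(total) FROM orders GROUP BY region ORDER BY region"
)

rows = conn.execute(model_generated_sql).fetchall()
# These rows would normally go back to the model for a natural-language
# reply; here we just format them directly.
answer = ", ".join(f"{region}: {total:.2f}" for region, total in rows)
print(answer)  # APAC: 80.00, EMEA: 170.00
```

The key point from the quote is that the model never needs to "know" the answer; it only needs to know the schema well enough to write the query, and the database supplies the facts.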
Links
- Mark on LinkedIn
- Ramsey International
- Email: mark [at] ramsey.international
- Ramsey International's YouTube Channel
Large language models (LLMs) and generative AI are rapidly changing the healthcare industry. These technologies have the potential to revolutionize healthcare by improving the efficiency, accuracy, and personalization of care. This practical book shows healthcare leaders, researchers, data scientists, and AI engineers the potential of LLMs and generative AI today and in the future, using storytelling and illustrative use cases in healthcare.

Authors Kerrie Holley and Manish Mathur, former Google healthcare professionals, guide you through the transformative potential of large language models (LLMs) and generative AI in healthcare. From personalized patient care and clinical decision support to drug discovery and public health applications, this comprehensive exploration covers real-world uses and future possibilities of LLMs and generative AI in healthcare.

With this book, you will:
- Understand the promise and challenges of LLMs in healthcare
- Learn the inner workings of LLMs and generative AI
- Explore automation of healthcare use cases for improved operations and patient care using LLMs
- Dive into patient experiences and clinical decision-making using generative AI
- Review future applications in pharmaceutical R&D, public health, and genomics
- Understand ethical considerations and responsible development of LLMs in healthcare

"The authors illustrate generative AI's impact on drug development, presenting real-world examples of its ability to accelerate processes and improve outcomes across the pharmaceutical industry." --Harsh Pandey, VP, Data Analytics & Business Insights, Medidata-Dassault

Kerrie Holley is a retired Google tech executive, IBM Fellow, and VP/CTO at Cisco. Holley's extensive experience includes serving as the first Technology Fellow at United Health Group (UHG), Optum, where he focused on advancing and applying AI, deep learning, and natural language processing in healthcare.

Manish Mathur brings over two decades of expertise at the crossroads of healthcare and technology. A former executive at Google and Johnson & Johnson, he now serves as an independent consultant and advisor. He guides payers, providers, and life sciences companies in crafting cutting-edge healthcare solutions.
Today, we’re joined by Joseph Ruscio, General Partner at Heavybit Industries, the leading investor in developer-first startups. We talk about:
- The most efficient go-to-market approach
- Why developer velocity is the single most important thing
- Giving humans superpowers, not replacing them
- Why it’s laughable to claim generative AI will replace nearly all developers
- Helping technologists build go-to-market plans & why, even though they think salespeople are icky, they must employ a lot of them
One of the big use cases of generative AI is having small applications to solve specific tasks. These are known as AI agents or AI assistants. Since they’re small and narrow in scope, you probably want to create and use lots of them, which means you need to be able to create them cheaply and easily. I’m curious as to how you go about doing this from an organizational point of view. Who needs to be involved? What’s the workflow and what technology do you need?

Dmitry Shapiro is the CEO of MindStudio. He was previously the CTO at MySpace and a product manager at Google. Dmitry is also a serial entrepreneur, having founded the web-app development platform Koji, acquired by Linktree, and Veoh Networks, an early YouTube competitor. He has extensive experience in building and managing engineering, product, and AI teams.

In the episode, Richie and Dmitry explore generative AI applications, AI in SaaS, approaches to AI implementation, selecting processes for automation, changes in sales and marketing roles, MindStudio, AI governance and privacy concerns, cost management, the limitations and future of AI assistants, and much more.

Links Mentioned in the Show:
- MindStudio
- Connect with Dmitry
- [Webinar] Dmitry at RADAR: From Learning to Earning: Navigating the AI Job Landscape
- Related Episode: Designing AI Applications with Robb Wilson, Co-Founder & CEO at Onereach.ai
- Rewatch sessions from RADAR: AI Edition

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
Perhaps the biggest complaint about generative AI is hallucination. If the text you want to generate involves facts, as with a chatbot that answers questions, then hallucination is a problem. The solution to this is to make use of a technique called retrieval augmented generation, where you store facts in a vector database and retrieve the most appropriate ones to send to the large language model to help it give accurate responses. So, what goes into building vector databases and how do they improve LLM performance so much?

Ram Sriharsha is currently the CTO at Pinecone. Before this role, he was the Director of Engineering at Pinecone and previously served as Vice President of Engineering at Splunk. He also worked as a Product Manager at Databricks. With a long history in the software development industry, Ram has held positions as an architect, lead product developer, and senior software engineer at various companies. Ram is also a long-time contributor to Apache Spark.

In the episode, Richie and Ram explore common use-cases for vector databases, RAG in chatbots, steps to create a chatbot, static vs dynamic data, testing chatbot success, handling dynamic data, choosing language models, knowledge graphs, implementing vector databases, innovations in vector databases, the future of LLMs and much more.

Links Mentioned in the Show:
- Pinecone
- Webinar - Charting the Path: What the Future Holds for Generative AI
- Course - Vector Databases for Embeddings with Pinecone
- Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
- Rewatch sessions from RADAR: AI Edition

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
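The retrieval step described above, finding the most appropriate stored facts for a query, can be sketched in a few lines. The hand-written toy embeddings below stand in for a real embedding model and a vector database such as Pinecone; all vectors and facts are invented for illustration.

```python
# Minimal RAG retrieval sketch: rank stored facts by cosine similarity
# to a query embedding, then paste the best fact into the LLM prompt.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# (embedding, fact) pairs standing in for rows of a vector database.
index = [
    ([0.9, 0.1, 0.0], "The support line is open 9am-5pm on weekdays."),
    ([0.1, 0.8, 0.2], "Premium plans include priority ticket handling."),
    ([0.0, 0.2, 0.9], "Refunds are issued within 14 business days."),
]

def retrieve(query_embedding, k=1):
    """Return the k facts whose embeddings are closest to the query."""
    ranked = sorted(index, key=lambda row: cosine(query_embedding, row[0]),
                    reverse=True)
    return [fact for _, fact in ranked[:k]]

# A query about refunds should land nearest the third stored vector.
context = retrieve([0.05, 0.1, 0.95])
prompt = f"Answer using only this context: {context[0]}"
print(context[0])
```

Grounding the model in retrieved facts this way, rather than relying on what it memorized during training, is what reduces hallucination in the chatbot scenario the episode discusses.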
By now, many of us are convinced that generative AI chatbots like ChatGPT are useful at work. However, many executives are rightfully worried about the risks from having business and customer conversations recorded by AI chatbot platforms. Some privacy- and security-conscious organizations are going so far as to block these AI platforms completely. For organizations such as EY, a company that derives value from its intellectual property, leaders need to strike a balance between privacy and productivity.

John Thompson runs the department for the ideation, design, development, implementation, and use of innovative Generative AI, Traditional AI, and Causal AI solutions across all of EY's service lines, operating functions, and geographies, and for EY's clients. His team has built the world's largest secure, private LLM-based chat environment. John also runs the Marketing Sciences consultancy, advising clients on monetization strategies for data. He is the author of four books on data, including "Data for All" and "Causal Artificial Intelligence". Previously, he was the Global Head of AI at CSL Behring, an Adjunct Professor at Lake Forest Graduate School of Management, and an Executive Partner at Gartner.

In the episode, Richie and John explore the adoption of GenAI at EY, data privacy and security, GenAI use cases and productivity improvements, GenAI for decision making, causal AI and synthetic data, industry trends and predictions and much more.

Links Mentioned in the Show:
- Azure OpenAI
- Causality by Judea Pearl
- [Course] AI Ethics
- Related Episode: Data & AI at Tesco with Venkat Raghavan, Director of Analytics and Science at Tesco
- Catch John talking about AI Maturity this September
- Rewatch sessions from RADAR: AI Edition

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
Dive into this AWS session with Rick Sears, General Manager of Amazon Athena, EMR, and Lake Formation at AWS, and Martin Holste, CTO Cloud and AI at Trellix. Explore how Trellix uses Amazon Bedrock and AWS EMR to revolutionize security operations. Learn how generative AI and comprehensive data strategies enhance threat detection and automate security processes, driving a new era of efficiency and protection. Discover practical AI applications and real-world examples, and get ready to accelerate your AI journey with AWS.
Speakers:
- Martin Holste, CTO Cloud and AI, Trellix
- Rick Sears, General Manager of Amazon Athena, EMR, and Lake Formation, Amazon Web Services
Learn more: https://go.aws/3x2mha0
Learn more about AWS events: https://go.aws/3kss9CP

Subscribe:
More AWS videos: http://bit.ly/2O3zS75
More AWS events videos: http://bit.ly/316g9t4
ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.
#AWSEvents #awsaidataconference #awsgenai
Summary Data contracts are both an enforcement mechanism for data quality, and a promise to downstream consumers. In this episode Tom Baeyens returns to discuss the purpose and scope of data contracts, emphasizing their importance in achieving reliable analytical data and preventing issues before they arise. He explains how data contracts can be used to enforce guarantees and requirements, and how they fit into the broader context of data observability and quality monitoring. The discussion also covers the challenges and benefits of implementing data contracts, the organizational impact, and the potential for standardization in the field.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

At Outshift, the incubation engine from Cisco, they are driving innovation in AI, cloud, and quantum technologies with the powerful combination of enterprise strength and startup agility. Their latest innovation for the AI ecosystem is Motific, addressing a critical gap in going from prototype to production with generative AI. Motific is your vendor- and model-agnostic platform for building safe, trustworthy, and cost-effective generative AI solutions in days instead of months. Motific provides easy integration with your organizational data, combined with advanced, customizable policy controls and observability to help ensure compliance throughout the entire process. Move beyond the constraints of traditional AI implementation and ensure your projects are launched quickly and with a firm foundation of trust and efficiency. Go to motific.ai today to learn more!

Your host is Tobias Macey and today I'm interviewing Tom Baeyens about using data contracts to build a clearer API for your data.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe the scope and purpose of data contracts in the context of this conversation?
- In what way(s) do they differ from data quality/data observability?
- Data contracts are also known as the API for data, can you elaborate on this?
- What are the types of guarantees and requirements that you can enforce with these data contracts?
- What are some examples of constraints or guarantees that cannot be represented in these contracts?
- Are data contracts related to the shift-left?
- The obvious application of data contracts is in the context of pipeline execution flows, to prevent failing checks from propagating further in the data flow. What are some of the other ways that these contracts can be integrated into an organization's data ecosystem?
- How did you approach the design of the syntax and implementation for Soda's data contracts?
- Guarantees and constraints around data in different contexts have been implemented in numerous tools and systems. What are the areas of overlap in e.g. dbt, great expectations?
- Are there any emerging standards or design patterns around data contracts/guarantees that will help encourage portability and integration across tooling/platform contexts?
- What are the most interesting, innovative, or unexpected ways that you have seen data contracts used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data contracts at Soda?
- When are data contracts the wrong choice?
- What do you have planned for the future of data contracts?

Contact Info
- LinkedIn

Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- Soda
- Podcast Episode
- JBoss
- Data Contract
- Airflow
- Unit Testing
- Integration Testing
- OpenAPI
- GraphQL
- Circuit Breaker Pattern
- SodaCL
- Soda Data Contracts
- Data Mesh
- Great Expectations
- dbt Unit Tests
- Open Data Contracts
- ODCS == Open Data Contract Standard
- ODPS == Open Data Product Specification

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
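The kinds of guarantees discussed in the episode, such as schema and not-null checks enforced before data flows downstream, can be illustrated with a generic toy checker. This is not Soda's contract syntax or implementation, just a sketch of the idea; the contract fields and sample rows are invented.

```python
# Toy data-contract enforcement: a contract declares column types and
# not-null guarantees, and rows are validated before being published
# to downstream consumers. Real tools (Soda, Great Expectations, dbt)
# express this declaratively; this sketch only shows the concept.

contract = {
    "columns": {"id": int, "email": str},
    "not_null": ["id", "email"],
}

def check_contract(rows: list[dict], contract: dict) -> list[str]:
    """Return a list of violation messages; empty means the contract holds."""
    violations = []
    for i, row in enumerate(rows):
        for col, expected_type in contract["columns"].items():
            if col not in row:
                violations.append(f"row {i}: missing column {col}")
            elif row[col] is not None and not isinstance(row[col], expected_type):
                violations.append(f"row {i}: {col} is not {expected_type.__name__}")
        for col in contract["not_null"]:
            if row.get(col) is None:
                violations.append(f"row {i}: {col} is null")
    return violations

good = [{"id": 1, "email": "a@example.com"}]
bad = [{"id": "x", "email": None}]
print(check_contract(good, contract))  # []
print(check_contract(bad, contract))
```

Wired into a pipeline, a non-empty violation list would stop the run, which is the circuit-breaker behavior the interview describes for preventing bad data from propagating.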
It is possible to implement AI-based solutions without coding. In this workshop, I will reshape an old application of mine for fraud detection. First I will read documents from a dataset of contracts, then find the frauds, and finally use GenAI to send the customer an alert email in their preferred language. It takes just three nodes to generate the email text: Authenticate – Connect – Prompt. With a few more nodes, using RAG, I can also generate the alert email from a context document.
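As a rough illustration of what the final Prompt node in such a flow assembles before calling the model, here is a minimal sketch. The template, field names, and sample values are assumptions for illustration and are not taken from the workshop materials.

```python
# Hypothetical sketch of the Prompt step: build the instruction that a
# GenAI node would send to the model to draft the fraud-alert email in
# the customer's preferred language. No model call is made here.

def build_alert_prompt(customer_name: str, language: str, contract_id: str) -> str:
    """Assemble a model prompt requesting a localized fraud-alert email."""
    return (
        f"Write a short, polite fraud-alert email in {language}, "
        f"addressed to {customer_name}, explaining that suspicious "
        f"activity was detected on contract {contract_id} and asking "
        f"them to contact support."
    )

prompt = build_alert_prompt("Maria Rossi", "Italian", "C-1042")
print(prompt)
```

With RAG added, as the workshop mentions, retrieved passages from a context document would simply be appended to this prompt before the model call.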