talk-data.com

Topic: RAG (Retrieval Augmented Generation)
Tags: ai, machine_learning, llm
Activity trend: 83 peak/quarter, 2020-Q1 to 2026-Q1

Activities: 369 · Newest first

For today’s Gen-AI apps, fast performance, instant scalability, and cost-effectiveness are more critical than ever. This session will delve into the importance of these factors when building RAG-pattern apps while maintaining low costs. We will explore the capabilities of Azure Cosmos DB and its new vector database capabilities using DiskANN, a technology developed by Microsoft Research. With DiskANN, users can achieve low-latency, high-recall vector search at any scale. Combined with Azure Cosmos DB’s unique scale-out architecture and instant autoscale, it provides enormous value with a cost profile unmatched by any vector database on the market today. This allows for the development of large-scale applications that are not only powerful and reliable but also economical. Join us and discover how to architect high-accuracy, low-latency, and cost-effective RAG-pattern applications at any scale using Azure Cosmos DB and DiskANN. Regardless of your role, this session will provide valuable insights into bringing this new generation of applications to your business.
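To make the low-latency claim concrete: graph-based indexes like DiskANN answer queries with a greedy best-first walk over a proximity graph instead of comparing the query against every vector. The sketch below illustrates only that core search loop, as a toy in-memory version; the real DiskANN operates on SSD-resident Vamana graphs, and nothing here is Microsoft's implementation.

```python
import heapq
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def greedy_graph_search(vectors, neighbors, entry, query, k=2, beam=4):
    """Best-first search over a proximity graph: the core idea behind
    graph-based ANN indexes such as DiskANN. `vectors` maps node -> vector,
    `neighbors` maps node -> adjacent nodes, `entry` is the start node."""
    visited = {entry}
    # Min-heap of (distance, node) candidates still to expand.
    candidates = [(cosine_distance(query, vectors[entry]), entry)]
    results = []  # best nodes seen, kept as a max-heap via negated distance
    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest unexpanded candidate cannot improve the beam.
        if len(results) >= beam and dist > -results[0][0]:
            break
        heapq.heappush(results, (-dist, node))
        if len(results) > beam:
            heapq.heappop(results)
        for nxt in neighbors[node]:
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(
                    candidates, (cosine_distance(query, vectors[nxt]), nxt))
    # Return the k closest nodes found, nearest first.
    return [n for _, n in sorted((-d, n) for d, n in results)][:k]
```

Because each step only touches a node's neighbors, the number of distance computations grows far more slowly than a brute-force scan as the collection grows, which is what makes high-recall search affordable at scale.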

Perhaps the biggest complaint about generative AI is hallucination. If the text you want to generate involves facts (for example, a chatbot that answers questions), then hallucination is a problem. The solution is a technique called retrieval-augmented generation, where you store facts in a vector database and retrieve the most appropriate ones to send to the large language model to help it give accurate responses. So what goes into building vector databases, and how do they improve LLM performance so much?

Ram Sriharsha is currently the CTO at Pinecone. Before this role, he was Director of Engineering at Pinecone and previously served as Vice President of Engineering at Splunk. He also worked as a Product Manager at Databricks. With a long history in the software development industry, Ram has held positions as an architect, lead product developer, and senior software engineer at various companies. Ram is also a long-time contributor to Apache Spark.

In the episode, Richie and Ram explore common use cases for vector databases, RAG in chatbots, steps to create a chatbot, static vs. dynamic data, testing chatbot success, handling dynamic data, choosing language models, knowledge graphs, implementing vector databases, innovations in vector databases, the future of LLMs, and much more.

Links Mentioned in the Show:
- Pinecone
- Webinar - Charting the Path: What the Future Holds for Generative AI
- Course - Vector Databases for Embeddings with Pinecone
- Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
- Rewatch sessions from RADAR: AI Edition

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business
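For readers curious what "store facts and retrieve the most appropriate ones" looks like in code, here is a minimal, self-contained sketch. It uses a toy bag-of-words similarity in place of a real embedding model, and an in-memory list in place of a vector database such as Pinecone; a production system would swap in both.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse token-count 'vectors'."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, facts, k=2):
    """Return the k stored facts most similar to the question."""
    q = embed(question)
    ranked = sorted(facts, key=lambda f: cosine(q, embed(f)), reverse=True)
    return ranked[:k]

def build_prompt(question, facts, k=2):
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(f"- {f}" for f in retrieve(question, facts, k))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"
```

The key design point is that the model never has to "know" the facts: grounding happens at prompt-construction time, so updating the knowledge base requires no retraining.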

It is possible to implement AI-based solutions without coding. In this workshop, I will reshape an old application of mine for fraud detection. First I will read documents from a dataset of contracts, then find the frauds, and finally use GenAI to send the customer an alert email in their preferred language. It takes just three nodes to generate the email text: Authenticate – Connect – Prompt. With a few more nodes, using RAG, I can also generate the alert email from a context document.

Retrieval-Augmented Generation (RAG) has become a popular method to address the limitations of LLMs, augmenting them with an external knowledge base. However, implementing RAG introduces distinct challenges. In this presentation, Joanna will share practical insights into the challenges encountered while implementing RAG systems, alongside strategies for overcoming them. You'll be equipped with the tools and methodologies needed to navigate these challenges successfully.

Summary: Generative AI has rapidly gained adoption for numerous use cases. To support those applications, organizational data platforms need to add new features and data teams have increased responsibility. In this episode Lior Gavish, co-founder of Monte Carlo, discusses the various ways that data teams are evolving to support AI-powered features and how they are incorporating AI into their work.

Announcements:
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data lakes are notoriously complex. For data engineers who battle to build and scale high-quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
- Your host is Tobias Macey and today I'm interviewing Lior Gavish about the impact of AI on data engineers

Interview:
- Introduction
- How did you get involved in the area of data management?
- Can you start by clarifying what we are discussing when we say "AI"?
- Previous generations of machine learning (e.g. deep learning, reinforcement learning, etc.) required new features in the data platform. What new demands is the current generation of AI introducing?
- Generative AI also has the potential to be incorporated in the creation/execution of data pipelines. What are the risk/reward tradeoffs that you have seen in practice?
- What are the areas where LLMs have proven useful/effective in data engineering?
- Vector embeddings have rapidly become a ubiquitous data format as a result of the growth in retrieval augmented generation (RAG) for AI applications. What are the end-to-end operational requirements to support this use case effectively?
- As with all data, the reliability and quality of the vectors will impact the viability of the AI application. What are the different failure modes/quality metrics/error conditions that they are subject to?
- As much as vectors, vector databases, RAG, etc. seem exotic and new, it is all ultimately shades of the same work that we have been doing for years. What are the areas of overlap in the work required for running the current generation of AI, and what are the areas where it diverges?
- What new skills do data teams need to acquire to be effective in supporting AI applications?
- What are the most interesting, innovative, or unexpected ways that you have seen AI impact data engineering teams?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working with the current generation of AI?
- When is AI the wrong choice?
- What are your predictions for the future impact of AI on data engineering teams?

Contact Info:
- LinkedIn

Parting Question:
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements:
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected]

Links:
- Monte Carlo (Podcast Episode)
- NLP == Natural Language Processing
- Large Language Models
- Generative AI
- MLOps
- ML Engineer
- Feature Store
- Retrieval Augmented Generation (RAG)
- Langchain

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Everyone has been crazy about Gen AI lately, and yet few companies can say they derive measurable business value from Gen-AI-based solutions. I will go over two use cases and the journey to take them from an out-of-the-box RAG pipeline that kind of works in a demo to production-quality systems that create value. The first use case is search over technical content such as documentation and developer forums. The second is automating ticket-based customer support.

Your generative AI applications can deliver better responses by incorporating organization-specific data. In this session, we will talk about how you can use your organization’s data with generative AI and how you can simplify the process using Knowledge Bases for Amazon Bedrock. This session is suitable for both business and technical audiences wanting to achieve the best outcomes from their generative AI applications.

Laurel provides an AI-driven timekeeping solution tailored for accounting and legal firms, automating timesheet creation by capturing digital work activities. This session highlights two notable AI projects:
- UTBMS Code Prediction: Leveraging small language models, this system builds new embeddings to predict work codes for legal bills with high accuracy. More details are available in our case study: https://www.laurel.ai/resources-post/enhancing-legal-and-accounting-workflows-with-ai-insights-into-work-code-prediction
- Bill Creation and Narrative Generation: Utilizing Retrieval-Augmented Generation (RAG), this approach transforms users’ digital activities into fully billable entries.

Additionally, we will discuss how we use Airflow for model management in these AI projects:
- Daily Model Retraining: We retrain our models daily
- Model (Re)deployment: Our Airflow DAG evaluates model performance, redeploying it if improvements are detected
- Cost Management: To avoid high costs associated with querying large language models frequently, our DAG utilizes RAG to efficiently summarize daily activities into a billable timesheet at day’s end
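The "redeploy if improvements are detected" gate can be sketched as a small pure function; inside an Airflow DAG this kind of check typically sits behind a branching task such as BranchPythonOperator. The metric and threshold below are hypothetical illustrations, not Laurel's actual criteria.

```python
def should_redeploy(new_score, deployed_score, min_improvement=0.01):
    """Decide whether a freshly retrained model replaces the deployed one.

    Requiring a minimum improvement (a hypothetical 1-point margin here)
    avoids churning deployments on noise-level metric fluctuations.
    """
    return new_score >= deployed_score + min_improvement

def pick_branch(new_score, deployed_score):
    """Return the task id the DAG should follow next (illustrative names)."""
    return "redeploy_model" if should_redeploy(new_score, deployed_score) else "keep_current_model"
```

A daily schedule then becomes: retrain, evaluate on a held-out set, call the gate, and only push the artifact to serving when the branch says so.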

Gen AI has taken the computing world by storm. As enterprises and startups have started to experiment with LLM applications, it has become clear that providing the right context to these applications is critical. This process, known as retrieval-augmented generation (RAG), relies on adding custom data to the large language model so that the efficacy of the response can be improved. Processing custom data and integrating with enterprise applications is a strength of Apache Airflow. This talk goes into detail about a vision to enhance Apache Airflow to more intuitively support RAG, with additional capabilities and patterns. Specifically, these include:
- Support for unstructured data sources such as text, extending to image, audio, video, and custom sensor data
- LLM model invocation, including both external model services through APIs and local models using container invocation
- Automatic index refreshing, with a focus on unstructured data lifecycle management to avoid cumbersome and expensive index creation on vector databases
- Templates for hallucination reduction via testing and scoping strategies
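The automatic index refreshing idea can be illustrated with a content-hash diff: re-embed only documents whose text changed, and delete index entries for documents that disappeared, rather than rebuilding the whole vector index. This is a sketch of the pattern only, not proposed Airflow code; the function and field names are invented.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_index_refresh(documents, indexed_hashes):
    """Compute the minimal work needed to bring a vector index up to date.

    `documents` maps doc_id -> current text; `indexed_hashes` maps doc_id ->
    the hash recorded when that doc was last embedded. Returns (upserts,
    deletes): docs to (re)embed and stale index entries to remove.
    """
    upserts = [doc_id for doc_id, text in documents.items()
               if indexed_hashes.get(doc_id) != content_hash(text)]
    deletes = [doc_id for doc_id in indexed_hashes if doc_id not in documents]
    return sorted(upserts), sorted(deletes)
```

In a scheduled pipeline, this plan would run first, and only the `upserts` list would be sent through the (expensive) embedding and vector-database write steps.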

Every data team out there is being asked by their business stakeholders about generative AI. Taking LLM-centric workloads to production is not a trivial task. At the foundational level, there are a set of challenges around data delivery, data quality, and data ingestion that mirror traditional data engineering problems. Once you’re past those, there’s a set of challenges related to the underlying use case you’re trying to solve. Thankfully, because of how Airflow was already being used at these companies for data engineering and MLOps use cases, it has become the de facto orchestration layer behind many GenAI use cases for startups and Fortune 500s. This talk will be a tour of various methods, best practices, and considerations used in the Airflow community when taking GenAI use cases to production. We’ll focus on four primary use cases: RAG, fine-tuning, resource management, and batch inference, and walk through patterns different members of the community have used to productionize this new, exciting technology.

This episode features the second part of an engaging discussion between Raja Iqbal, Founder and CEO of Data Science Dojo, and Bob van Luijt, Co-founder and CEO of Weaviate, a prominent open-source vector database in the industry. Raja and Bob trace the evolution of AI over the years, the current LLM landscape, and its outlook for the future. They further dive deep into various LLM concepts such as RAG, fine-tuning, challenges in enterprise adoption, vector search, context windows, the potential of SLMs, generative feedback loops, and more. Lastly, Raja and Bob explore Artificial General Intelligence (AGI) and whether it could be a reality in the near future. This episode is a must-watch for anyone interested in a comprehensive outlook on the current state and future trajectory of AI.

AWS re:Inforce 2024 - Safeguarding sensitive data used in generative AI with RAG (DAP223)

As an increasing number of organizations leverage internal data for optimizing outputs in generative AI through Retrieval Augmented Generation (RAG), concerns about potential internal data leaks have grown. This talk delves into strategies for securely transmitting and storing the internal data used in RAG. Additionally, explore methods for identifying sensitive data and learn about best practices for subsequent measures to address these concerns.

Learn more about AWS re:Inforce at https://go.aws/reinforce.

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#reInforce2024 #CloudSecurity #AWS #AmazonWebServices #CloudComputing

AWS re:Inforce 2024 - Building a secure end-to-end generative AI application in the cloud (NIS321)

The security and privacy of data during the training, fine-tuning, and inferencing phases of generative AI are paramount. This lightning talk introduces a reference architecture designed to use the security of AWS PrivateLink with generative AI applications. Explore the importance of protecting proprietary data in applications that leverage both AWS native LLMs and ISV-supplied external data stores. Learn about the secure movement and usage of data, particularly for RAG processes, across various data sources like Amazon S3, vector databases, and Snowflake. Learn how this reference architecture not only meets today’s security demands but also sets the stage for the future of secure generative AI development.

AWS re:Inforce 2024 - Persona-based access to enterprise data for generative AI apps (GAI325)

Enterprises face challenges controlling document access within generative AI applications. This lightning talk presents a solution using Retrieval Augmented Generation (RAG) and metadata filtering in Knowledge Bases for Amazon Bedrock, enhancing data protection and security management. Learn how this approach secures role-specific information retrieval and improves knowledge management in enterprise chatbot systems, illustrated with a demo.
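Conceptually, persona-based access means each indexed chunk carries role metadata and retrieval only considers chunks the caller's roles may see. The sketch below shows that idea in plain Python; the `allowed_roles` metadata schema is invented for illustration, and in Knowledge Bases for Amazon Bedrock the equivalent is expressed as a retrieval-time metadata filter rather than post-hoc Python.

```python
def filter_by_persona(chunks, user_roles):
    """Keep only chunks whose metadata grants at least one of the user's roles.

    Applying this before (or as part of) vector retrieval ensures the LLM
    never sees context the requesting persona is not entitled to.
    """
    allowed = set(user_roles)
    return [c for c in chunks
            if allowed & set(c["metadata"].get("allowed_roles", []))]
```

The important property is that access control happens at retrieval time, so one shared index can serve many personas without leaking role-restricted documents into prompts.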

AWS re:Inforce 2024 - Mind your business: Secure your generative AI application on AWS (GAI322)

Discover techniques to enhance the security of your generative AI application through secure prompt engineering and effective Retrieval Augmented Generation (RAG) strategies. This lightning talk guides you through proactive approaches to mitigate bias, protect privacy, and prevent toxicity in your language model using Amazon Bedrock and other services. Learn how to develop guardrails that align your generative AI solution with responsible AI principles while prioritizing security and privacy considerations.
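As a rough illustration of what a guardrail does, here is a toy pre-flight check that blocks prompts touching denied topics or containing an SSN-like pattern. The topic list and regex are invented examples; Guardrails for Amazon Bedrock implements this class of check as a configurable managed policy rather than hand-written rules.

```python
import re

DENIED_TOPICS = ("credit card", "password")  # hypothetical policy
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US SSN-shaped strings

def check_prompt(prompt):
    """Return (allowed, reason) before the prompt is sent to the model.

    A real guardrail layer would also screen model *outputs* and use far
    richer detectors (toxicity, bias, broader PII) than this sketch.
    """
    lowered = prompt.lower()
    for topic in DENIED_TOPICS:
        if topic in lowered:
            return False, f"denied topic: {topic}"
    if SSN_PATTERN.search(prompt):
        return False, "possible PII detected"
    return True, "ok"
```

Running such checks on both the user prompt and the retrieved RAG context keeps sensitive material out of the model call in the first place, which is cheaper and safer than filtering responses after the fact.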

Enhancing search on AWS with AI, RAG, and vector databases (L300) | AWS Events

As AI continues to transform industries, the applications of generative AI and Large Language Models (LLMs) are becoming increasingly significant. This session delves into the utility of these models across various sectors. Gain an understanding of how to use LLMs, embeddings, vector datastores, and their indexing techniques to create search solutions for enhanced user experiences and improved outcomes on AWS using Amazon Bedrock, Aurora, and LangChain. By the end of this session, participants will be equipped with the knowledge to harness the power of LLMs and vector databases, paving the way for the development of innovative search solutions on AWS.

Learn more: https://go.aws/3x2mha0 Learn more about AWS events: https://go.aws/3kss9CP

In this talk, I will present some of the latest advances in retrieval-augmented generation (RAG) techniques, which combine the strengths of both retrieval-based and generative approaches for chatbot development. Retrieval-based methods can leverage existing text documents to provide informative and coherent responses, while generative methods can produce novel and engaging conversations personalized to the user.