talk-data.com

Topic: Vector DB · tagged: ai · 80 activities

Activity Trend: 10 peak/qtr (2020-Q1 to 2026-Q1)

Activities

80 activities · Newest first

Simplify your GenAI journey and unlock the hidden power within your databases. Businesses often feel pressured to adopt new, specialized technologies to stay ahead. However, the power to revolutionize your applications with GenAI may already reside within your current database infrastructure. 

We’ll build an understanding of vector capabilities, ease of use and ROI, and how PostgreSQL, enhanced with the pgvector extension, can address 80% of common GenAI use cases, providing a streamlined and cost-effective path to AI-driven solutions.

Join us to demystify the hype around dedicated vector databases and explore how the built-in vector capabilities of existing databases can efficiently support your GenAI workloads without extra overhead.
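To make the pgvector idea concrete, here is a minimal sketch of what a nearest-neighbor query does; the data and names are illustrative, and in PostgreSQL the same ranking would be expressed with `ORDER BY embedding <-> $1 LIMIT k` on a `vector` column:

```python
import math

def l2_distance(a, b):
    """Euclidean distance, the metric behind pgvector's `<->` operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query, rows, k=2):
    """Brute-force k-nearest-neighbor search over (id, embedding) rows,
    the plain-Python equivalent of:
        SELECT id FROM items ORDER BY embedding <-> $1 LIMIT 2;
    """
    ranked = sorted(rows, key=lambda row: l2_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:k]]

# Toy 3-dimensional embeddings standing in for real model output.
rows = [("doc-a", [1.0, 0.0, 0.0]),
        ("doc-b", [0.0, 1.0, 0.0]),
        ("doc-c", [0.9, 0.1, 0.0])]

print(nearest([1.0, 0.0, 0.0], rows))  # doc-a is an exact match, doc-c is close behind
```

At scale, pgvector avoids this brute-force scan with an IVFFlat or HNSW index, which is part of the ease-of-use argument the talk makes.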

The massive interest in AI solutions has sparked a huge and pervasive wave of AI projects. We are now entering a second phase, in which the AI projects that have proven their value are looking for operational landing places in enterprise environments. This is visible in the hype around AI data systems such as vector databases and feature stores. This phase of AI operationalization is the hour of databases, which have long proven to be the battle-tested bedrock of data management in enterprise environments.

Postgres is naturally a front runner in this space. AI workloads are entirely tied to data: they start with data, they run on data, and they produce data. Join this talk for a walkthrough of popular AI application flows and their strong ties to data, a look at Postgres's strong operational qualities, and a demonstration of how these form the perfect environment for mission-critical AI solutions in an enterprise.

“Last week was a great year in GenAI,” jokes Mark Ramsey—and it’s a great philosophy to have as LLM tools especially continue to evolve at such a rapid rate. This week, you’ll get to hear my fun and insightful chat with Mark from Ramsey International about the world of large language models (LLMs) and how we make useful UXs out of them in the enterprise. 

Mark shared some fascinating insights about using a company’s website information (data) as a place to pilot a LLM project, avoiding privacy landmines, and how re-ranking of models leads to better LLM response accuracy. We also talked about the importance of real human testing to ensure LLM chatbots and AI tools truly delight users. From amusing anecdotes about the spinning beach ball on macOS to envisioning a future where AI-driven chat interfaces outshine traditional BI tools, this episode is packed with forward-looking ideas and a touch of humor.

Highlights / Skip to:

(0:50) Why is the world of GenAI evolving so fast?
(4:20) How Mark thinks about UX in an LLM application
(8:11) How Mark defines “Specialized GenAI”
(12:42) Mark’s consulting work with GenAI / LLMs these days
(17:29) How GenAI can help the healthcare industry
(30:23) Uncovering users’ true feelings about LLM applications
(35:02) Are UIs moving backwards as models progress forward?
(40:53) How will GenAI impact data and analytics teams?
(44:51) Will LLMs be able to consistently leverage RAG and produce proper SQL?
(51:04) Where you can find more from Mark and Ramsey International

Quotes from Today’s Episode

“With [GenAI], we have a solution that we’ve built to try to help organizations and build workflows. We have a workflow that we can run and ask the same question [to a variety of GenAI models] and see how similar the answers are. Depending on the complexity of the question, you can see a lot of variability between the models… [and] we can also run the same question against the different versions of the model and see how it’s improved. Folks want a human-like experience interacting with these models… [and] if the model can start responding in just a few seconds, that gives you much more of a conversational type of experience.” - Mark Ramsey (2:38)

“[People] don’t understand that when you interact [with GenAI tools] and it brings tokens back in that streaming fashion, you’re actually seeing inside the brain of the model. Every token it produces is then displayed on the screen, and it gives you that typewriter experience back in the day. If someone has to wait, and all you’re seeing is a logo spinning, from a UX experience standpoint… people feel like the model is much faster if it just starts to produce those results in that streaming fashion. I think in a design, it’s extremely important to take advantage of that [...] as opposed to waiting to the end and delivering the results. Some models support that, and other models don’t.” - Mark Ramsey (4:35)

“All of the data that’s on the website is public information. We’ve done work with several organizations on quickly taking the data that’s on their website, packaging it up into a vector database, and making that be the source for questions that their customers can ask. [Organizations] publish a lot of information on their websites, but people really struggle to get to it. We’ve seen a lot of interest in vectorizing website data, making it available, and having a chat interface for the customer. The customer can ask questions, and it will take them directly to the answer, and then they can use the website as the source information.” - Mark Ramsey (14:04)

“I’m not skeptical at all. I’ve changed much of my [AI chatbot searches] to Perplexity, and I think it’s doing a pretty fantastic job overall in terms of quality. It’s returning an answer with citations, so you have a sense of where it’s sourcing the information from. I think it’s important from a user experience perspective. This is a replacement for broken search, as I really don’t want to read all the web pages and PDFs you have that might be about my chiropractic care query to answer my actual [healthcare] question.” - Brian O’Neill (19:22)

“We’ve all had great experiences with customer service, and we’ve all had situations where the customer service was quite poor, and we’re going to have that same thing as we begin to [release more] chatbots. We need to make sure we try to alleviate having those bad experiences, and have an exit. If someone is running into a situation where they’d rather talk to a live person, have that ability to route them to someone else. That’s why the robustness of the model is extremely important in the implementation… and right now, organizations like OpenAI and Anthropic are significantly better at that [human-like] experience.” - Mark Ramsey (23:46)

“There’s two aspects of these models: the training aspect and then using the model to answer questions. I recommend to organizations to always augment their content and don’t just use the training data. You’ll still get that human-like experience that’s built into the model, but you’ll eliminate the hallucinations. If you have a model that has been set up correctly, you shouldn’t have to ask questions in a funky way to get answers.” - Mark Ramsey (39:11)

“People need to understand GenAI is not a predictive algorithm. It is not able to run predictions, and it struggles with some math, so that is not the focus for these models. What’s interesting is that you can use the model as a step to get you [the answers]. A lot of the models now support functions… when you ask a question about something that is in a database, it actually uses its knowledge about the schema of the database. It can build the query, run the query to get the data back, and then once it has the data, it can reformat the data into something that is a good response back.” - Mark Ramsey (42:02)
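The function-calling flow Ramsey describes in that last quote (the model sees the schema, writes a query, runs it, and reformats the rows) can be sketched with a stubbed model and SQLite standing in for the enterprise database; `fake_llm_write_sql` is a hypothetical placeholder, not a real LLM API:

```python
import sqlite3

def fake_llm_write_sql(question, schema):
    """Stand-in for a function-calling LLM: a real model would generate
    SQL from the question plus the schema description it is given."""
    assert "orders" in schema  # the "model" relies on knowing the schema
    return "SELECT COUNT(*) FROM orders WHERE status = 'shipped'"

def answer_with_database(question, conn):
    # 1. Hand the model the schema, 2. run the SQL it writes,
    # 3. reformat the raw rows into a human-readable answer.
    schema = "orders(id INTEGER, status TEXT)"
    sql = fake_llm_write_sql(question, schema)
    (count,) = conn.execute(sql).fetchone()
    return f"{count} orders have shipped."

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "shipped"), (2, "pending"), (3, "shipped")])

print(answer_with_database("How many orders have shipped?", conn))
```

The point of the quote survives the stub: the model never does the arithmetic itself; the database does, and the model only plans the query and phrases the result.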

Links:
Mark on LinkedIn
Ramsey International
Email: mark [at] ramsey.international
Ramsey International's YouTube Channel

Utilizing a Hailo AI Acceleration Module with a Raspberry Pi 5 device, we will process real-time video streams from an edge camera, store the real-time results in a vector database, and send messages to Slack channels. We will show you how to build Edge AI applications that can stream unstructured data to the cloud or store it in a local vector database. We will run a live demo of a neural network inference accelerator capable of 13 tera-operations per second (TOPS).

Perhaps the biggest complaint about generative AI is hallucination. If the text you want to generate involves facts, for example, a chatbot that answers questions, then hallucination is a problem. The solution to this is a technique called retrieval augmented generation (RAG), where you store facts in a vector database and retrieve the most appropriate ones to send to the large language model to help it give accurate responses. So, what goes into building vector databases, and how do they improve LLM performance so much? Ram Sriharsha is currently the CTO at Pinecone. Before this role, he was the Director of Engineering at Pinecone and previously served as Vice President of Engineering at Splunk. He also worked as a Product Manager at Databricks. With a long history in the software development industry, Ram has held positions as an architect, lead product developer, and senior software engineer at various companies. Ram is also a long-time contributor to Apache Spark. In the episode, Richie and Ram explore common use cases for vector databases, RAG in chatbots, steps to create a chatbot, static vs dynamic data, testing chatbot success, handling dynamic data, choosing language models, knowledge graphs, implementing vector databases, innovations in vector databases, the future of LLMs, and much more.

Links Mentioned in the Show:
Pinecone
Webinar - Charting the Path: What the Future Holds for Generative AI
Course - Vector Databases for Embeddings with Pinecone
Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
Rewatch sessions from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
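The RAG loop this episode covers (store facts as vectors, retrieve the closest one, prepend it to the prompt) fits in a few lines; the `embed` function below is a toy keyword counter standing in for a real embedding model:

```python
def embed(text):
    """Hypothetical stand-in for an embedding model: counts a few
    keywords so related texts land near each other in vector space."""
    keywords = ["postgres", "pinecone", "llm"]
    return [text.lower().count(k) for k in keywords]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# 1. Index: store (fact, vector) pairs, the job of the vector database.
facts = ["Pinecone is a managed vector database.",
         "Postgres supports vectors via the pgvector extension."]
index = [(fact, embed(fact)) for fact in facts]

# 2. Retrieve: rank stored facts by similarity to the question.
question = "Which extension gives Postgres vector support?"
best_fact, _ = max(index, key=lambda pair: dot(embed(question), pair[1]))

# 3. Augment: the retrieved fact is prepended to the LLM prompt, so the
# model answers from provided context instead of hallucinating.
prompt = f"Context: {best_fact}\nQuestion: {question}"
print(prompt.splitlines()[0])
```

A production system swaps in a real embedding model and a vector database for steps 1 and 2; the augmentation step stays essentially this simple.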

Summary

Generative AI has rapidly gained adoption for numerous use cases. To support those applications, organizational data platforms need to add new features and data teams have increased responsibility. In this episode Lior Gavish, co-founder of Monte Carlo, discusses the various ways that data teams are evolving to support AI powered features and how they are incorporating AI into their work.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
Your host is Tobias Macey and today I'm interviewing Lior Gavish about the impact of AI on data engineers.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by clarifying what we are discussing when we say "AI"?
Previous generations of machine learning (e.g. deep learning, reinforcement learning, etc.) required new features in the data platform. What new demands is the current generation of AI introducing?
Generative AI also has the potential to be incorporated in the creation/execution of data pipelines. What are the risk/reward tradeoffs that you have seen in practice?
What are the areas where LLMs have proven useful/effective in data engineering?
Vector embeddings have rapidly become a ubiquitous data format as a result of the growth in retrieval augmented generation (RAG) for AI applications. What are the end-to-end operational requirements to support this use case effectively?
As with all data, the reliability and quality of the vectors will impact the viability of the AI application. What are the different failure modes/quality metrics/error conditions that they are subject to?
As much as vectors, vector databases, RAG, etc. seem exotic and new, it is all ultimately shades of the same work that we have been doing for years. What are the areas of overlap in the work required for running the current generation of AI, and what are the areas where it diverges?
What new skills do data teams need to acquire to be effective in supporting AI applications?
What are the most interesting, innovative, or unexpected ways that you have seen AI impact data engineering teams?
What are the most interesting, unexpected, or challenging lessons that you have learned while working with the current generation of AI?
When is AI the wrong choice?
What are your predictions for the future impact of AI on data engineering teams?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

Monte Carlo
Podcast Episode
NLP == Natural Language Processing
Large Language Models
Generative AI
MLOps
ML Engineer
Feature Store
Retrieval Augmented Generation (RAG)
Langchain

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Despite GPT, Claude, Gemini, Llama, and the other host of LLMs that we have access to, a variety of organizations are still exploring their options when it comes to custom LLMs. Logging in to ChatGPT is easy enough, and so is creating a 'custom' OpenAI GPT, but what does it take to create a truly custom LLM? When and why might this be useful, and will it be worth the effort? Vincent Granville is a pioneer in the AI and machine learning space. He is Co-Founder of Data Science Central, Founder of MLTechniques.com, former VC-funded executive, author, and patent owner. Vincent’s corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences. Vincent has published in the Journal of Number Theory, Journal of the Royal Statistical Society, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the author of multiple books, including “Synthetic Data and Generative AI”. In the episode, Richie and Vincent explore why you might want to create a custom LLM, including issues with standard LLMs and benefits of custom LLMs, the development and features of custom LLMs, architecture and technical details, corporate use cases, technical innovations, ethics and legal considerations, and much more.

Links Mentioned in the Show:
Read Articles by Vincent
Synthetic Data and Generative AI by Vincent Granville
Connect with Vincent on Linkedin
[Course] Developing LLM Applications with LangChain
Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
Rewatch sessions from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

Gen AI has taken the computing world by storm. As enterprises and startups have started to experiment with LLM applications, it has become clear that providing the right context to these LLM applications is critical. This process, known as retrieval augmented generation (RAG), relies on adding custom data to the large language model so that the efficacy of the response can be improved. Processing custom data and integrating with enterprise applications is a strength of Apache Airflow. This talk goes into details about a vision to enhance Apache Airflow to more intuitively support RAG, with additional capabilities and patterns. Specifically, these include the following:

- Support for unstructured data sources such as text, but also extending to image, audio, video, and custom sensor data
- LLM model invocation, including both external model services through APIs and local models using container invocation
- Automatic index refreshing, with a focus on unstructured data lifecycle management to avoid cumbersome and expensive index creation on vector databases
- Templates for hallucination reduction via testing and scoping strategies
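The ingest-embed-refresh pattern behind the capabilities above can be sketched as plain Python functions; in a real deployment each stage would be an Airflow task in a DAG, and the embedding stub and all names here are hypothetical:

```python
# Sketch of the RAG ingestion pattern. In Airflow each function would be
# a task; here they are plain functions so the flow is easy to follow.

def extract_chunks(document):
    """Split an unstructured source into embeddable chunks."""
    return [line for line in document.splitlines() if line.strip()]

def embed_chunks(chunks):
    """Invoke an embedding model per chunk (stubbed as a length feature)."""
    return [(chunk, [float(len(chunk))]) for chunk in chunks]

def refresh_index(index, embedded, source_id):
    """Replace a source's stale vectors instead of rebuilding the whole
    index: the 'automatic index refreshing' idea from the talk."""
    index = [row for row in index if row[0] != source_id]
    index.extend((source_id, chunk, vec) for chunk, vec in embedded)
    return index

index = []
index = refresh_index(index, embed_chunks(extract_chunks("a\nbb")), "doc1")
# The source changes; re-ingesting replaces its vectors instead of piling up.
index = refresh_index(index, embed_chunks(extract_chunks("ccc")), "doc1")
print(len(index))  # doc1 has exactly one current chunk
```

The refresh step is the part the talk argues Airflow should own, since scheduling re-ingestion when sources change is exactly a workflow-orchestration problem.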

This episode features the second part of an engaging discussion between Raja Iqbal, Founder and CEO of Data Science Dojo, and Bob van Luijt, Co-founder and CEO of Weaviate, a prominent open-source vector database in the industry. Raja and Bob trace the evolution of AI over the years, the current LLM landscape, and its outlook for the future. They further dive deep into various LLM concepts such as RAG, fine-tuning, challenges in enterprise adoption, vector search, context windows, the potential of SLMs, generative feedback loops, and more. Lastly, Raja and Bob explore Artificial General Intelligence (AGI) and whether it could be a reality in the near future. This episode is a must-watch for anyone interested in a comprehensive outlook on the current state and future trajectory of AI.

Arguably, one of the verticals that is at once most ripe for disruption by AI and the hardest to disrupt is search. We've seen many attempts at reimagining search using AI, and many are trying to usurp Google from its throne as the top search engine on the planet, but I think no one is laying out the case for AI-assisted search better than Perplexity AI. Perplexity doesn't need an introduction. It is an AI-powered search engine that lets you get the information you need as fast as possible. Denis Yarats is the Co-Founder and Chief Technology Officer of Perplexity AI. He previously worked at Facebook as an AI Research Scientist. Denis attended New York University. His previous research interests broadly involved reinforcement learning, deep learning, NLP, robotics, and investigating ways of semi-supervising hierarchical reinforcement learning using natural language. In the episode, Adel and Denis explore Denis’ role at Perplexity.ai, key differentiators of Perplexity.ai when compared to other chatbot-powered tools, culture at Perplexity, competition in the AI space, building GenAI products, the future of AI and search, open-source vs closed-source AI, and much more.

Links Mentioned in the Show:
Perplexity.ai
NeurIPS Conference
[Course] Artificial Intelligence (AI) Strategy
Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
Sign up to RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

AWS re:Inforce 2024 - Building a secure end-to-end generative AI application in the cloud (NIS321)

The security and privacy of data during the training, fine-tuning, and inferencing phases of generative AI are paramount. This lightning talk introduces a reference architecture designed to use the security of AWS PrivateLink with generative AI applications. Explore the importance of protecting proprietary data in applications that leverage both AWS native LLMs and ISV-supplied external data stores. Learn about the secure movement and usage of data, particularly for RAG processes, across various data sources like Amazon S3, vector databases, and Snowflake. Learn how this reference architecture not only meets today’s security demands but also sets the stage for the future of secure generative AI development.

Learn more about AWS re:Inforce at https://go.aws/reinforce.

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#reInforce2024 #CloudSecurity #AWS #AmazonWebServices #CloudComputing

Enhancing search on AWS with AI, RAG, and vector databases (L300) | AWS Events

As AI continues to transform industries, the applications of generative AI and Large Language Models (LLMs) are becoming increasingly significant. This session delves into the utility of these models across various sectors. Gain an understanding of how to use LLMs, embeddings, vector datastores, and their indexing techniques to create search solutions for enhanced user experiences and improved outcomes on AWS using Amazon Bedrock, Aurora, and LangChain. By the end of this session, participants will be equipped with the knowledge to harness the power of LLMs and vector databases, paving the way for the development of innovative search solutions on AWS.
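One detail any such pipeline must get right before embeddings ever reach the vector datastore is chunking the source documents; a minimal sketch follows, with arbitrary chunk and overlap sizes (these are tuning knobs, not recommendations):

```python
def chunk(text, size=40, overlap=10):
    """Split text into overlapping windows before embedding. The overlap
    keeps content that straddles a boundary retrievable from either chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "LLMs answer best when retrieval hands them the right passage of context."
pieces = chunk(doc)
print(len(pieces))  # 3 overlapping chunks
```

Each chunk would then be embedded and stored with its source metadata, so that a match can be traced back to the document it came from.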

Learn more: https://go.aws/3x2mha0 Learn more about AWS events: https://go.aws/3kss9CP


#AWSEvents #GenerativeAI #AI #Cloud #AWSAIandDataConference

As generative AI applications mature, retrieval-augmented generation (RAG) has become popular for improving large language model-based apps. We expect teams to move beyond basic RAG to autonomous agents and generative loops. We'll set up a Weaviate vector database on Google Kubernetes Engine (GKE) and Gemini to showcase generative feedback loops.

After this session, a Google Cloud GKE user should be able to:

- Deploy Weaviate open source on GKE
- Set up a pipeline to ingest data from the Cloud Storage bucket
- Query, RAG, and enhance the responses
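The generative feedback loop the session showcases (model output is written back into the vector database so later queries can retrieve it) can be sketched with stand-ins; neither the real Weaviate client nor the Gemini API is used here, and all names are hypothetical:

```python
def fake_generate(prompt):
    """Stand-in for a Gemini call."""
    return f"summary of: {prompt}"

database = []  # stand-in for a Weaviate collection

def generative_feedback_loop(query, rounds=2):
    """Each round retrieves prior objects, generates from them, and
    writes the generation back so the next round can build on it."""
    for _ in range(rounds):
        context = " | ".join(database[-2:])  # naive retrieval: latest objects
        generated = fake_generate(f"{query} with context [{context}]")
        database.append(generated)  # feedback: store the generation
    return database

generative_feedback_loop("describe the cluster")
print(len(database))  # one stored generation per round
```

In the real setup, the append becomes an insert of an object plus its embedding into Weaviate, and retrieval becomes a vector search rather than "latest objects".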


Learn the strategies and tools for a successful migration from Oracle databases to CloudSQL/AlloyDB. This session covers everything from schema conversion to data replication, offering insights into leveraging Google Cloud technology for a smooth transition to open source database engines. Join us for this Mini Talk at 'Meet the Experts, hosted by Google Cloud Consulting' at Expo. Seating is limited and on a first-come, first-served basis; standing areas are available.


At Google I/O last year, we announced several new AI Firebase Extensions using the PaLM API. This year, we’ve added support for Google's latest Gemini models. Easily add a chatbot, text summarizer, content generator, vector database pipeline, and more to your app without learning new APIs. In this session, get an end-to-end view of how you can use Firebase and Gemini to create an enterprise-ready customer support app. Build many apps with the powerful combo of Gemini's multimodal features and Firebase's convenient suite of developer tools.


Generative AI is fantastic but has a major problem: sometimes it "hallucinates", meaning it makes things up. In a business product like a chatbot, this can be disastrous. Vector databases like Pinecone are one of the solutions for mitigating the problem. Vector databases are a key component of any AI application, as well as things like enterprise search and document search. They have become an essential tool for every business, and with the rise in interest in AI in the last couple of years, the space is moving quickly. In this episode, you'll find out how to make use of vector databases, and find out about the latest developments at Pinecone. Elan Dekel is the VP of Product at Pinecone, where he oversees the development of the Pinecone vector database. He was previously Product Lead for Core Data Serving at Google, where he led teams working on the indexing systems that serve data for Google search, YouTube search, and Google Maps. Before that, he was Founder and CEO of Medico, which was acquired by Everyday Health. In the episode, Richie and Elan explore LLMs, hallucination in generative models, vector databases and the best use cases for them, semantic search, business applications of vector databases and semantic search, the tech stack for AI applications, cost considerations when investing in AI projects, emerging roles within the AI space, the future of vector databases and AI, and much more.

Links Mentioned in the Show:
Pinecone Canopy
Pinecone Serverless
LlamaIndex
Langchain
[Code Along] Semantic Search with Pinecone
Related Episode: Expanding the Scope of Generative AI in the Enterprise with Bal Heroor, CEO and Principal at Mactores
Sign up to RADAR: The Analytics Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

Understanding Search, GenAI, RAG methodology, and vector databases with Nixon Cheaz, Engineering Lead at IBM's Experience Engine.

02:24 Meet Nixon Cheaz
04:32 Search without Google
06:35 Experience Engine
08:30 Elements of Good Search
12:46 Search Data Source
15:36 GenAI Use Cases and Vector DBs
19:40 Foundational Models?
22:07 Impact of Vector DBs
25:38 IBM Public Content DB
28:02 Use Cases
29:58 IBM Technologies
32:54 RAG
40:12 Health is Wealth

LinkedIn: linkedin.com/in/nixon-cheaz

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

We talked about:

- Atita’s background
- How NLP relates to search
- Atita’s experience with Lucidworks and OpenSource Connections
- Atita’s experience with Qdrant and vector databases
- Utilizing vector search
- Major changes to search Atita has noticed throughout her career
- RAG (Retrieval-Augmented Generation)
- Building a chatbot out of transcripts with LLMs
- Ingesting the data and evaluating the results
- Keeping humans in the loop
- Application of vector databases for machine learning
- Collaborative filtering
- Atita’s resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/atitaarora/
Twitter: https://x.com/atitaarora
Github: https://github.com/atarora
Human-in-the-Loop Machine Learning: https://www.manning.com/books/human-in-the-loop-machine-learning
Relevant Search: https://www.manning.com/books/relevant-search
Let's learn about Vectors: https://hub.superlinked.com/
Langchain: https://python.langchain.com/docs/get_started/introduction
Qdrant blog: https://blog.qdrant.tech/
OpenSource Connections Blog: https://opensourceconnections.com/blog/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Zain Hasan: Getting Started With Vector Databases

Join Zain Hasan to dive into Vector Databases and revolutionize your search capabilities with machine learning. 🚀🔍 Learn how to harness the power of cloud-based data storage in this insightful session. 💡💻 #VectorDatabases #MachineLearning #datastorage

✨ H I G H L I G H T S ✨

🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍

Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️

Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear