talk-data.com

Activities & events

Data Talks Club – host @ DataTalks.Club, Sebastian Ayala Ruano – bioinformatics researcher and software engineer

In this talk, Sebastian, a bioinformatics researcher and software engineer, shares his inspiring journey from wet lab biotechnology to computational bioinformatics. Hosted by Data Talks Club, this session explores how data science, AI, and open-source tools are transforming modern biological research — from DNA sequencing to metagenomics and protein structure prediction.

You’ll learn about:

  • The difference between wet lab and dry lab workflows in biotechnology
  • How bioinformatics enables faster insights through data-driven modeling
  • The MicW2Graph project and its role in studying wastewater microbiomes
  • Using co-abundance networks and the CCLasso algorithm to map microbial interactions
  • How AlphaFold revolutionized protein structure prediction
  • Building scientific knowledge graphs to integrate biological metadata
  • Open-source tools like VueGen and VueCore for automating reports and visualizations
  • The growing impact of AI and large language models (LLMs) in research and documentation
  • Key differences between R (Bioconductor) and Python ecosystems for bioinformatics

This talk is ideal for data scientists, bioinformaticians, biotech researchers, and AI enthusiasts who want to understand how data science, AI, and biology intersect. Whether you work in genomics, computational biology, or scientific software, you’ll gain insights into real-world tools and workflows shaping the future of bioinformatics.

Links:

  • MicW2Graph: https://zenodo.org/records/12507444
  • VueGen: https://github.com/Multiomics-Analytics-Group/vuegen
  • Awesome-Bioinformatics: https://github.com/danielecook/Awesome-Bioinformatics

TIMECODES

00:00 Sebastian’s Journey into Bioinformatics
06:02 From Wet Lab to Computational Biology
08:23 Wet Lab vs Dry Lab Explained
12:35 Bioinformatics as Data Science for Biology
15:30 How DNA Sequencing Works
19:29 MicW2Graph and Wastewater Microbiomes
23:10 Building Microbial Networks with CCLasso
26:54 Protein–Ligand Simulation Basics
29:58 Predicting Protein Folding in 3D
33:30 AlphaFold Revolution in Protein Prediction
36:45 Inside the MicW2Graph Knowledge Graph
39:54 VueGen: Automating Scientific Reports
43:56 VueCore: Visualizing Omics Data
47:50 Using AI and LLMs in Bioinformatics
50:25 R vs Python in Bioinformatics Tools
53:17 Closing Thoughts from Ecuador

Connect with Sebastian:

  • Twitter - https://twitter.com/sayalaruano
  • LinkedIn - https://linkedin.com/in/sayalaruano
  • GitHub - https://github.com/sayalaruano
  • Website - https://sayalaruano.github.io/

Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub - https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/

AI/ML Analytics Data Science GitHub LLM Python
DataTalks.Club
HH DS + ML Meetup at cronn 2025-08-28 · 16:15

Register: https://www.meetup.com/de-DE/hamburg-data-science-meetup/events/310344244/

Welcome to the August version of our Data Science + Machine Learning Meetup Hamburg!

Neo4j Session:

Martin Preusse, Partner & Founder, Kaiser & Preusse
Neurosymbolic AI with LLMs and Knowledge Graphs

This talk outlines a neurosymbolic approach—combining neural LLMs with symbolic knowledge graphs and ontologies—for practical data-science workflows. We’ll introduce GraphRAG: building a graph from your corpus and using graph structure and summaries to retrieve context, contrasted with text-only RAG. We’ll then use ontologies and SHACL constraints as machine-checkable contracts to scope prompts and validate model outputs. Finally, we’ll discuss using LLMs for entity harmonization and for bootstrapping vocabularies/ontologies, with notes on accuracy, cost, and human oversight. For context, we’ll situate these patterns in recent work on neurosymbolic AI.
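As a rough illustration of the "machine-checkable contract" idea, the sketch below validates a structured LLM output against a small set of property rules. It is a plain-Python stand-in and not SHACL itself: the `PropertyRule` class, the `validate` function, the `PERSON_SHAPE` rules, and the sample output are all hypothetical names invented for this example. A real setup would express the rules as SHACL shapes and run them through a SHACL engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyRule:
    """One constraint on a property (loosely modelled on a SHACL property shape)."""
    path: str                        # property name in the LLM output
    datatype: type                   # required Python type of each value
    min_count: int = 0               # minimum number of values
    max_count: Optional[int] = None  # maximum number of values, None = unbounded


def validate(node: dict, rules: list[PropertyRule]) -> list[str]:
    """Return a list of violations; an empty list means the node conforms."""
    violations = []
    for rule in rules:
        values = node.get(rule.path, [])
        if not isinstance(values, list):
            values = [values]  # treat a single value as a one-element list
        if len(values) < rule.min_count:
            violations.append(f"{rule.path}: needs at least {rule.min_count} value(s)")
        if rule.max_count is not None and len(values) > rule.max_count:
            violations.append(f"{rule.path}: allows at most {rule.max_count} value(s)")
        for value in values:
            if not isinstance(value, rule.datatype):
                violations.append(f"{rule.path}: {value!r} is not a {rule.datatype.__name__}")
    return violations


# Hypothetical contract for a "person" node.
PERSON_SHAPE = [
    PropertyRule("name", str, min_count=1, max_count=1),
    PropertyRule("employer", str),
    PropertyRule("birth_year", int, max_count=1),
]

# Hypothetical model output: the name is missing and the year is a string.
llm_output = {"employer": ["Example GmbH"], "birth_year": "1990"}

for problem in validate(llm_output, PERSON_SHAPE):
    print(problem)
```

The list of violations can be fed back into a follow-up prompt or used to reject the output before it reaches the graph, which is one way to keep a human or a rule set in the loop.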

HH DS + ML Meetup at cronn

Dive into building applications that combine the power of Large Language Models (LLMs) with Neo4j knowledge graphs, Haystack, and Spring AI to deliver intelligent, data-driven recommendations and search outcomes. This book provides actionable insights and techniques to create scalable, robust solutions by leveraging the best-in-class frameworks and a real-world project-oriented approach.

What this Book will help me do

  • Understand how to use Neo4j to build knowledge graphs integrated with LLMs for enhanced data insights.
  • Develop skills in creating intelligent search functionalities by combining Haystack and vector-based graph techniques.
  • Learn to design and implement recommendation systems using LangChain4j and Spring AI frameworks.
  • Acquire the ability to optimize graph data architectures for LLM-driven applications.
  • Gain proficiency in deploying and managing applications on platforms like Google Cloud for scalability.

Author(s)

Ravindranatha Anthapu, a Principal Consultant at Neo4j, and Siddhant Agarwal, a Google Developer Expert in Generative AI, bring together their vast experience to offer practical implementations and cutting-edge techniques in this book. Their combined expertise in Neo4j, graph technology, and real-world AI applications makes them authoritative voices in the field.

Who is it for?

Designed for database developers and data scientists, this book caters to professionals aiming to leverage the transformational capabilities of knowledge graphs alongside LLMs. Readers should have a working knowledge of Python and Java as well as familiarity with Neo4j and the Cypher query language. If you're looking to enhance search or recommendation functionalities through state-of-the-art AI integrations, this book is for you.

data data-engineering graph-databases Neo4j AI/ML Cloud Computing GCP GenAI Java LLM Python
O'Reilly Data Engineering Books

Join us to explore the process of building knowledge graphs using Large Language Models (LLMs). This online session focuses on using LLMs to automatically extract knowledge graphs from unstructured data.

You'll learn how an LLM can identify key entities and their relationships, creating a structured representation of information. We will then demonstrate how to integrate this LLM-generated knowledge graph into a RAG pipeline, transforming it into a GraphRAG architecture for improved accuracy and contextual understanding in question answering.

This online session is tailored for developers interested in advancing their RAG implementations or other AI projects with the power of LLM-driven knowledge graphs.
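To make the extraction step more concrete, here is a minimal sketch of turning LLM-emitted triples into a graph structure. It assumes the model was prompted to answer with a JSON list of subject/predicate/object records. The response text, the `normalize` and `build_graph` helpers, and the triple format are assumptions made for this example, not the format used in the session. No model is called here: the response is a hard-coded string.

```python
import json
from collections import defaultdict

# Hypothetical raw LLM response; the last record is incomplete on purpose.
llm_response = """
[
  {"subject": "Marie Curie", "predicate": "discovered", "object": "Polonium"},
  {"subject": "marie  curie", "predicate": "born_in", "object": "Warsaw"},
  {"subject": "Polonium", "predicate": "named_after", "object": "Poland"},
  {"subject": "Marie Curie", "predicate": "discovered"}
]
"""


def normalize(name: str) -> str:
    """Collapse whitespace and case so spelling variants map to one node."""
    return " ".join(name.split()).casefold()


def build_graph(raw: str) -> dict:
    """Parse triples and return {subject: {(PREDICATE, object), ...}}."""
    graph = defaultdict(set)
    for item in json.loads(raw):
        fields = [item.get(key) for key in ("subject", "predicate", "object")]
        # Skip records with missing or empty fields instead of guessing.
        if not all(isinstance(f, str) and f.strip() for f in fields):
            continue
        subject, predicate, obj = fields
        graph[normalize(subject)].add((predicate.strip().upper(), normalize(obj)))
    return dict(graph)


graph = build_graph(llm_response)
for node, edges in sorted(graph.items()):
    for predicate, target in sorted(edges):
        print(f"({node}) -[{predicate}]-> ({target})")
```

Normalizing names before insertion is a deliberately simple form of entity resolution. Real pipelines usually need something stronger, since LLMs often emit aliases and abbreviations that a case fold will not merge.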

Agenda

  • Introduction / Meet the Community - 10 minutes
  • Building Knowledge Graphs using LLMs - 40 mins
  • Q&A - 10 mins

👉 New to SurrealDB? Get started here.

🗣️ Speaker opportunities - submit your talk! Working on an interesting project that you would like to share with the community? Submit your talk here.

FAQs

Am I guaranteed a ticket at this event? Our events are tech-focused and in the interest of keeping our events relevant and meaningful for those attending, tickets are issued at our discretion. We therefore reserve the right to refund ticket orders before the event and to request proof of identity and/or professional background upon entry.

Is this event for me? SurrealDB events are for software engineers, developers, architects, data scientists, data engineers, or any tech professionals keen to discover more about SurrealDB: a scalable multi-model database that allows users and developers to focus on building their applications with ease and speed.

Are there any House Rules? At SurrealDB, we are committed to providing live and online events that are safe and enjoyable for all attending. Please review our Code of Conduct and Privacy Policy for more information. It is compulsory for all attendees to be registered with a first and last name in order to attend. Any attendees who do not adhere to these requirements will be refused a ticket.

Building Knowledge Graphs using LLMs

RSVP HERE: An RSVP on Meetup.com alone will not give you workshop access.

✔️ Pre-Workshop Checklist

  • RSVP HERE: Required to enter workshop
  • If you are not already a Discord member, join our community here: Neo4j Discord (you can also log in and see our community)
  • Python 3.10+ environment
  • Access to a Neo4j sandbox or local instance
  • Familiarity with Python and basic LLM concepts

Beyond GenAI: Building Intelligent AI with Neo4j & RAG

This 2-hour workshop is for anyone ready to move beyond GenAI demos and start building real, grounded AI applications.

You’ll learn how to use Neo4j as a retrieval engine for large language models—bringing structure, context, and reasoning into your AI workflows.

Using Python (please have Python installed) and the `neo4j-graphrag` library, we’ll walk through how to build Retrieval-Augmented Generation (RAG) pipelines that pull the right data from the graph at the right time. You'll explore concrete retrieval strategies, such as vector search, hybrid search, and graph-native patterns, that reduce hallucinations and improve relevance.

We’ll close by adding an agentic layer, where AI agents use Neo4j not just to look up facts, but to reason over connected data. By the end, you’ll know how to combine LLMs and knowledge graphs into intelligent, explainable systems you can actually trust.
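For readers new to the term, hybrid search can be pictured as blending a vector-similarity score with a keyword score. The sketch below is a toy, standard-library illustration of that idea only. It does not use or imitate the `neo4j-graphrag` API, and the documents, the three-dimensional vectors, the `alpha` weight, and all function names are made up for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_search(query, query_vec, docs, alpha=0.5, k=2):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    scored = [
        (alpha * cosine(query_vec, d["vec"]) + (1 - alpha) * keyword_score(query, d["text"]), d["id"])
        for d in docs
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical corpus with hand-written "embeddings".
DOCS = [
    {"id": "a", "text": "graph databases store nodes and relationships", "vec": [0.9, 0.1, 0.0]},
    {"id": "b", "text": "vector indexes support similarity search", "vec": [0.1, 0.9, 0.0]},
    {"id": "c", "text": "baking bread at home", "vec": [0.0, 0.1, 0.9]},
]

print(hybrid_search("graph relationships", [0.8, 0.2, 0.0], DOCS))
```

The `alpha` weight is the main design knob: values near 1 favour semantic closeness, values near 0 favour exact term matches.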

🎮 Discord Hosted Graph Workshop : Building Intelligent AI with Neo4j & RAG

Trey Grainger – author

Apply cutting-edge machine learning techniques, from crowdsourced relevance and knowledge graph learning to Large Language Models (LLMs), to enhance the accuracy and relevance of your search results.

Delivering effective search is one of the biggest challenges you can face as an engineer. AI-Powered Search is an in-depth guide to building intelligent search systems you can be proud of. It covers the critical tools you need to automate ongoing relevance improvements within your search applications.

Inside you’ll learn modern, data-science-driven search techniques like:

  • Semantic search using dense vector embeddings from foundation models
  • Retrieval augmented generation (RAG)
  • Question answering and summarization combining search and LLMs
  • Fine-tuning transformer-based LLMs
  • Personalized search based on user signals and vector embeddings
  • Collecting user behavioral signals and building signals boosting models
  • Semantic knowledge graphs for domain-specific learning
  • Semantic query parsing, query-sense disambiguation, and query intent classification
  • Implementing machine-learned ranking models (Learning to Rank)
  • Building click models to automate machine-learned ranking
  • Generative search, hybrid search, multimodal search, and the search frontier

AI-Powered Search will help you build the kind of highly intelligent search applications demanded by modern users. Whether you’re enhancing your existing search engine or building from scratch, you’ll learn how to deliver an AI-powered service that can continuously learn from every content update, user interaction, and the hidden semantic relationships in your content. You’ll learn both how to enhance your AI systems with search and how to integrate large language models (LLMs) and other foundation models to massively accelerate the capabilities of your search technology.

About the Technology

Modern search is more than keyword matching. Much, much more. Search that learns from user interactions, interprets intent, and takes advantage of AI tools like large language models (LLMs) can deliver highly targeted and relevant results. This book shows you how to up your search game using state-of-the-art AI algorithms, techniques, and tools.

About the Book

AI-Powered Search teaches you to create a search that understands natural language and improves automatically the more it is used. As you work through dozens of interesting and relevant examples, you’ll learn powerful AI-based techniques like semantic search on embeddings, question answering powered by LLMs, real-time personalization, and Retrieval Augmented Generation (RAG).

What's Inside

  • Sparse lexical and embedding-based semantic search
  • Question answering, RAG, and summarization using LLMs
  • Personalized search and signals boosting models
  • Learning to Rank, multimodal, and hybrid search

About the Reader

For software developers and data scientists familiar with the basics of search engine technology.

About the Author

Trey Grainger is the Founder of Searchkernel and former Chief Algorithms Officer and SVP of Engineering at Lucidworks. Doug Turnbull is a Principal Engineer at Reddit and former Staff Relevance Engineer at Spotify. Max Irwin is the Founder of Max.io and former Managing Consultant at OpenSource Connections.

Quotes

  • Belongs on the shelf of every search practitioner! - Khalifeh AlJadda, Google
  • A treasure map! Now you have decades of semantic search knowledge at your fingertips. - Mark Moyou, NVIDIA
  • Modern and comprehensive! Everything you need to build world-class search experiences. - Kelvin Tan, SearchStax
  • Kick starts your ability to implement AI search with easy to understand examples. - David Meza, NASA

data data-engineering search AI/ML LLM RAG
O'Reilly AI & ML Books

For the final Neo4j Meetup of 2024, we have invited a few Neo4j friends, community members, and Neo4j staff to present a range of topics in 10-minute lightning talks.

Agenda:

Session 1: Graph Visualization and Diagramming: The why and the how
Sebastian Müller, CTO of yWorks
Learn how to create interactive graph visualizations from Neo4j databases using Jupyter Notebooks and free tools. This lightning talk will demonstrate quick setup, data-bound visualizations, and advanced techniques to transform your queries into insightful diagrams.

Sebastian brings over two decades of expertise in graph visualization and diagramming to this presentation. With a diploma in Computer Science from the University of Tübingen, Sebastian has collaborated with global clients across various industries, helping them visualize complex connected data.

Session 2: Autoscaling LLMs in Production
Yann Leger, Koyeb
Building AI-driven applications is only the first step; scaling them effectively is where the real challenges emerge. Learn how to automatically scale LLMs in production, optimize your resource usage, and improve performance for your AI-driven applications.

Yann Leger is co-founder of Koyeb, a serverless platform for AI workloads, and spent the last 12 years building large-scale cloud service providers from scratch. Passionate about cloud computing, he has a deep understanding of the underlying infrastructure, from data centers to the software stack running on hypervisors. After building Scaleway, originally with bare metal ARM servers, Yann decided to go serverless with alternative chips for AI.

Session 3: Graph-Powered Open Ownership: Data, Challenges, and Queries
Stephen Abbott Pugh, Open Ownership
Join Stephen from Open Ownership and Christophe from GraphAware to explore the Beneficial Ownership Data Standard (BODS). We will explain the data structure, introducing LEI and other identifiers, and discuss the challenges of building and modeling the open ownership dataset. Learn about the ingestion process and see practical query examples and recent Cypher features.

Stephen Abbott Pugh is Open Ownership’s Head of Technology. He looks after the overall technical roadmap for the organisation, providing technical assistance to governments with implementing technology reforms to advance beneficial ownership transparency. He is the product owner of the Beneficial Ownership Data Standard and the Open Ownership Register, along with a range of other technical products.

Session 4: The Power of GraphRAG
Nyah Macklin, Neo4j

Session 5: open - could be you! Let us know!

On Knowledge Graphs, GraphRAG, Graph Visualisation, AI and more

Join us for part two of Vector search and Graph use cases in SurrealDB! Learn how you can leverage this functionality in your own projects through informative talks with practical examples.

The meetup will highlight:

  • The power of knowledge graphs in providing structured and semantically rich context to LLMs, leading to more informed and coherent responses.
  • How vector embeddings, numerical representations of text that capture semantic meaning, enable semantic search within the knowledge graph, allowing the system to retrieve the most relevant information for a given query.
  • A comparative demonstration of the difference in LLM responses when using:
      • A standard prompt referencing source material alone.
      • A prompt augmented by the knowledge graph and vector embeddings.

Attendees will gain practical insights into:

  • The process of querying a graph-based RAG system for question answering.
  • How to leverage the combined capabilities of SurrealDB, a multi-model database:
      • Graph Capabilities: Representing relationships between entities within the knowledge graph.
      • Vector Capabilities: Enabling semantic search to pinpoint relevant information within the knowledge graph.
  • How this approach, utilizing SurrealDB's graph and vector features, enhances LLM responses by providing contextually relevant information retrieved through the knowledge graph.
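The comparison described above (a standard prompt versus one augmented by graph and vector retrieval) can be sketched in a few lines of plain Python. This is not SurrealDB code or SurrealQL: the in-memory entity vectors, the edge list, and the helper names are hypothetical, and the two-dimensional vectors are hand-written stand-ins for real embeddings.

```python
import math

# Hypothetical entity embeddings and graph edges held in memory.
ENTITY_VECTORS = {"database": [0.9, 0.1], "meetup": [0.1, 0.9]}
EDGES = [
    ("database", "SUPPORTS", "graph relations"),
    ("database", "SUPPORTS", "vector search"),
    ("meetup", "HELD_IN", "New York"),
]


def nearest_entity(query_vec: list[float]) -> str:
    """Vector step: pick the entity whose embedding is closest to the query."""
    return min(ENTITY_VECTORS, key=lambda name: math.dist(query_vec, ENTITY_VECTORS[name]))


def facts_about(entity: str) -> list[str]:
    """Graph step: collect every edge that touches the entity."""
    return [f"{s} {p} {o}" for s, p, o in EDGES if entity in (s, o)]


def standard_prompt(question: str, source_text: str) -> str:
    """Baseline prompt: source material plus the question."""
    return f"Source:\n{source_text}\n\nQuestion: {question}"


def augmented_prompt(question: str, source_text: str, query_vec: list[float]) -> str:
    """Prepend graph facts found via vector search to the baseline prompt."""
    facts = "\n".join(f"- {fact}" for fact in facts_about(nearest_entity(query_vec)))
    return f"Known facts:\n{facts}\n\n" + standard_prompt(question, source_text)


question = "What can the database do?"
print(augmented_prompt(question, "Release notes...", [0.8, 0.2]))
```

Sending both prompts to the same model and comparing the answers is the kind of side-by-side check the meetup describes. The sketch stops at prompt construction, so no model call is shown.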

This meetup is ideal for individuals who attended Part 1 or possess a basic understanding of knowledge graph extraction and are eager to learn advanced techniques for improving LLM outputs using graph-based RAG systems.

🗣️ Speaker opportunity - submit your talk! Working on an interesting project that you would like to share with the community? Submit your talk here.

⏰ Date/time: December 10, 6:30 - 9:00PM

📍 Location: The Yard: Columbus Circle Coworking Office Space NYC

Agenda

18:30 - 19:00 Welcome drinks, pizza & networking Attendees arrive – grab a drink, explore the space and meet the SurrealDB team.

19:00 - 19:30 Improving LLM Responses with Knowledge Graphs and Semantic Vector Search
Sandro Pireno, Director Solutions Engineering at SurrealDB
Building on the foundational concepts of knowledge graph construction from our last meetup, in which we extracted knowledge graphs using a large language model (LLM), this meetup explores advanced techniques for enhancing LLM outputs using graph-based Retrieval-Augmented Generation (RAG) systems. The session will showcase how integrating structured knowledge from a knowledge graph, coupled with semantic search powered by vector embeddings, can significantly improve the quality and relevance of LLM-generated responses.

19:30 - 20:00 Refreshments & networking Connect with others in the tech community. Grab a slice of pizza & a drink and chat with other attendees and members of the SurrealDB team.

20:00 - 20:30 How Index Uses SurrealDB with Decentralized Autonomous Agents Description: Explore how Index, a decentralized protocol for peer-to-peer discovery, integrates SurrealDB to enhance its network of autonomous agents. Discover how SurrealDB enables dynamic schemas, context-aware indexing, and seamless collaboration between agents.

20:30 - 21:00 Refreshments and networking

21:00 End of event

--

Host: Alessandro Pireno | LinkedIn
Alessandro is a seasoned product development and solutions leader with a proven track record of building and scaling data-driven solutions across diverse industries. He has led product strategy and development at companies like HUMAN and Omnicom Media Group, optimized data collection and distribution at GroupM, and was an early leader of success at Snowflake. With a deep understanding of the challenges and opportunities facing today’s tech landscape, Alessandro is passionate about empowering organizations to unlock the full potential of their data through innovative database solutions.

Guest speaker: Seref Yarar | LinkedIn
Seref Yarar is the co-founder of Index Network, with 15 years of experience across media, journalism, e-commerce, and ad-tech. His work is shaped by a focus on the semantic web, distributed systems, and decentralized technologies, which influence his approach to information discovery challenges.

--

👉 New to SurrealDB? Get started here.

FAQs

Is the venue accessible? The Yard is located on the 2nd floor. When you arrive, just let security know that you're heading up to The Yard.

Am I guaranteed a ticket at this event? Our events are tech-focused and in the interest of keeping our events relevant and meaningful for those attending, tickets are issued at our discretion. We therefore reserve the right to refund ticket orders before the event and to request proof of identity and/or professional background upon entry.

Is this event for me? SurrealDB events are for software engineers, developers, architects, data scientists, data engineers, or any tech professionals keen to discover more about SurrealDB: a scalable multi-model database that allows users and developers to focus on building their applications with ease and speed.

Are there any House Rules? At SurrealDB, we are committed to providing live and online events that are safe and enjoyable for all attending. Please review our Code of Conduct and Privacy Policy for more information. It is compulsory for all attendees to be registered with a first and last name in order to attend. Any attendees who do not adhere to these requirements will be refused a ticket.

Graphs and Vectors in SurrealDB: Part 2

Please rsvp for the event here: https://www.meetup.com/nlp_london/events/304258590/?utm_medium=referral&utm_campaign=share-btn_savedevents_share_modal&utm_source=link

Details

Overview: If you are using LLMs in applications for the Legal sector, then you may come across the challenges of connecting siloed information, building Assistants to automate tasks, and reliably making sense of the semantics of your data. We’re delighted to have three fantastic presentations sharing actionable insights:

  1. Topic: Understanding Embeddings for the Legal Sector: How AI Distinguishes Among Concepts

Speaker: Jocelyn Matthews from Pinecone

If you had a collection of every kind of animal on earth, from mules to narwhals to goldfish, how would we pick out just the housepets? And how does AI distinguish between a pack mule and a Moscow Mule? Lawyers often need to identify underlying themes or concepts to build arguments, which is akin to how embeddings distil high-dimensional data into lower-dimensional, meaningful representations. Legal professionals are skilled at recognizing fact patterns that may apply across different cases. This is analogous to how embeddings recognize conceptual or contextual similarities in data. Embeddings are numerical representations that capture the essential features and relationships of objects, like words or images, in a continuous vector space, enabling tasks such as semantic search, clustering, and recommendations.

We'll explore the core concepts of embeddings, using relatable examples to make advanced ideas accessible. Legal professionals may find embeddings particularly relevant due to their ability to distinguish between different entities and concepts with nuance. Such capability is crucial for addressing legal issues like precedent analysis, data privacy, bias mitigation, intellectual property, contract review and compliance, legal research, risk management in mergers and acquisitions, and automated redaction of sensitive information. By understanding how embeddings distinguish between concepts, attendees can draw parallels to their own legal reasoning processes, gaining insights into how these AI mechanisms intersect with legal frameworks. This session encourages legal professionals to explore the intellectual and professional possibilities that embeddings present, deepening an understanding of AI’s role in law.

  2. Topic: TrustGraph: AI Powered Knowledge Graphs Meet Scalable Data Engineering

Speaker: Mark Adams from TrustGraph

Heavily regulated industries often see regulations as a barrier to innovation. When regulations are coupled with disconnected data silos, innovation grinds to a halt. Critical information is buried in thousands of pages - tens of thousands of pages - of technical designs that must couple to regulatory requirements. Whether it’s legal texts, compliance docs, or even industry best practices, connecting these data silos is essential to enabling technical innovation. In this technical presentation, Mark will provide a brief overview of the TrustGraph open source framework and how it can break down these information barriers and connect the most complex of data silos. The focus will be on demonstrating how TrustGraph deploys reliable, scalable, and accurate AI agents through its modular design and innovative features. Live demo of course!

  3. Topic: What are AI assistants and how can they help me and how can I put AI in production?

Speakers: Christoffer Noring and Liam Hampton from Microsoft

Their talk and demo will cover LLMs, Tool calling, Assistants and also showcase some practical examples where assistants shine. Prepare to be inspired and hopefully get started to create your own AI Assistant. We'll also look into IaC, infrastructure as code, some tooling associated, and how to deploy your apps.
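The pack mule versus Moscow Mule question from the first talk can be made concrete with a toy example. The vectors below are hand-written and purely illustrative: the three axes (roughly "animal", "drink", "lives in a home") and every number are assumptions for this sketch, whereas real embedding models learn hundreds of dimensions from data.

```python
import math

# Made-up 3-d "embeddings": [animal-ness, drink-ness, household-ness].
EMBEDDINGS = {
    "pack mule": [0.9, 0.0, 0.1],
    "Moscow Mule": [0.0, 0.9, 0.2],
    "narwhal": [0.9, 0.0, 0.0],
    "goldfish": [0.8, 0.0, 0.9],
    "house cat": [0.9, 0.0, 0.9],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: near 1.0 for similar directions, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def most_similar(term: str, k: int = 1) -> list[str]:
    """Return the k other terms whose vectors point most nearly the same way."""
    others = [t for t in EMBEDDINGS if t != term]
    others.sort(key=lambda t: cosine(EMBEDDINGS[term], EMBEDDINGS[t]), reverse=True)
    return others[:k]


print(most_similar("house cat"))
print(round(cosine(EMBEDDINGS["pack mule"], EMBEDDINGS["Moscow Mule"]), 3))
```

Although the two "mule" terms share a word, their vectors point in nearly orthogonal directions, so a similarity search keeps them apart. The house cat and the goldfish sit close together because they share the household axis.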

By the end of the meetup we will have gained practical, actionable insights and takeaways that can accelerate development and deployment of NLP in production for Legal applications.

Schedule

  • 18:00: Doors open
  • 18:00-18:30: Networking (food, drink)
  • 18:30-20:00: Talks
  • 20:00-21:00: More networking

Extra special thanks to our sponsors and speakers:

About the speakers

  • Jocelyn Matthews is Head of Community at Pinecone.
  • Mark Adams is Co-Founder and Developer at TrustGraph.
  • Christoffer Noring and Liam Hampton are Senior Cloud Advocates at Microsoft.

Unlocking information in the Legal sector using Assistants, VectorDBs, GraphDBs

Event Description: Biology and medicine are complex topics, and it is only natural to look to breakthroughs in AI to help us navigate them. However, modern generative AI models have a complexity that rivals that of biological systems. We will discuss approaches to understanding current state-of-the-art models from the lens of traditional biomedical science and ask whether we can achieve “molecular causality” in our current age of foundation models.

Building on these advances, we will discover two open-source tools for increasing accessibility in biomedical research:

  • BioCypher simplifies the organization of complex biological data into unified, accessible knowledge graphs. By streamlining data curation and fostering collaboration, it accelerates scientific discovery and makes handling vast biomedical information more manageable. (Read more about BioCypher here.)
  • BioChatter takes this a step further by integrating advanced AI language models. It allows researchers to interact with these knowledge graphs using natural, conversational language, making data exploration intuitive and accessible—even for those without extensive technical expertise. (Read more about BioChatter here.)

Together, BioCypher and BioChatter empower scientists to explore complex biological phenomena, facilitate personalized medicine in areas like cancer research, and prevent reinventing many wheels via an open-source philosophy.

Join us for an exciting session of learning, exploration, and networking. Come see the potential of AI and knowledge graphs to transform data into scientific knowledge!

Using Knowledge Graphs & LLMs to Represent Scientific Knowledge

We’re excited to announce our upcoming meetup in collaboration with Datenna, a pioneering scale-up based in Eindhoven. This event promises to be a deep dive into the innovative use of data and technology, showcasing cutting-edge applications that are shaping the future of open-source intelligence. All this is brought to you by Edward Brinkmann, CTO and Founder of Datenna. In addition, Shu and Remi will share what they learned from building a tool to annotate LLM outputs. Sounds cool and interesting, right? Datenna is our host this time and will open the doors of their office on October 29th. See you then!

How Datenna built a digital twin of China using graphs and GenAI

How do you create a detailed and reliable digital twin of one of the largest economies in the world? How do you ensure that the data being collected from open sources is trustworthy? How do you handle conflicting pieces of information, merge entities across data sources, and ensure every conclusion is explainable and traceable back to the source? These are some of the challenges Datenna tackles daily in its mission to provide the best open-source intelligence to governments worldwide for economic and national security purposes. Discover how Datenna leverages graph databases and GenAI technology to build an open-source intelligence engine that continuously collects information on over 100 million entities in China, mapping all these entities and their relationships. Learn how Datenna, a scale-up founded in Eindhoven, has used these novel technologies to gain a competitive edge globally and become a world leader in techno-economic intelligence on China.

What we've learnt from building a tool to annotate LLM outputs

LLMs can take files, audio, and video as input and generate summaries, answer questions, and extract information. With the Gemini family of models capable of supporting up to 1 million tokens in their context window, users can feed a PDF of hundreds of pages into these models and get back only the results they care about. However, the outputs may contain errors. In this talk, the presenters will share what they learned from building a tool that enables manual annotation and evaluation of these models' outputs across a collection of models chosen by the users. They find the comparison results interesting and would like to share them with the audience.

Program

  • 17:00 – 18:00 🍕 Food
  • 18:00 – 18:10 🎤 Welcome
  • 18:10 – 19:00 🎤 Edward Brinkmann: How Datenna built a digital twin of China using graphs and GenAI
  • 19:00 – 19:15 ⏸️ Break
  • 19:15 – 20:00 🎤 Shu Zhao & Remi Baar: What we've learnt from building a tool to annotate LLM outputs
  • 20:00 – 21:00 🥤 Drinks

About: Edward Brinkmann

As the CTO and co-founder of Datenna, Edward has guided the company through various stages of growth, transforming it from a technology start-up to a thriving scale-up. As CTO, he first served as founding engineer, implementing the first versions of the intelligence platform, and later as engineering manager and lead architect while expanding the development team. With a background in software engineering, data engineering, and systems architecture, Edward has a broad interest in technology, especially in translating business needs into the most suitable technical solutions. Before co-founding Datenna, Edward enjoyed working on end-to-end projects in the capacity of lead developer and full-stack engineer, gaining experience across various business domains, use cases, and technologies.

About: Shu Zhao

Shu has an MSc in Artificial Intelligence; part of her thesis, on computer-vision-based analysis of artistic poses, was published at ECCV 2022. She also took part in the AI Song Contest 2021, producing a song by training an RNN-based language model.

Before joining Xebia Data, Shu worked in a variety of roles at large banks, smaller fintechs, and e-commerce companies, where she built up a broad set of technical skills for delivering sustainable solutions.

About: Remi Baar

Fifteen years ago, at just 17, Remi launched his own software development company, quickly focusing on the exciting fields of artificial intelligence and data science. Since then, he has held various data science roles across a diverse range of organizations—from startups to multinational corporations, and from government agencies to airlines. His unique blend of software engineering expertise and data science has garnered him recognition and appreciation in each of these positions.

Currently, Remi is a valued member of the Xebia team, collaborating with fellow experts to enhance their collective skills and push the limits of AI. With a passion for knowledge sharing, Remi eagerly shares his latest insights.

Eindhoven Data Community meetup 18 - Datenna

Register for the Zoom:

https://voxel51.com/computer-vision-events/ai-machine-learning-computer-vision-meetup-sept-12-2024/

Reducing Hallucinations in ChatGPT and Similar AI Systems

LLMs are prone to producing hallucinations, largely due to their limited content and knowledge base. One of the most widely used techniques to reduce hallucinations is incorporating external knowledge sources. Among these, using knowledge graphs has shown particularly impressive results in enhancing the accuracy and reliability of the results produced by LLMs. In this talk, we will explore what knowledge graphs are, why they are important, and how to utilize the Neo4j graph database to improve the reliability of LLMs.
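To make the idea of grounding an LLM in a knowledge graph a little more concrete, here is a minimal sketch. It is not the speaker's method and does not use Neo4j: the graph is a hypothetical in-memory list of triples, the entity matching is deliberately naive, and the function names (`build_index`, `facts_for_question`, `grounded_prompt`) are made up for illustration. It only shows the general shape of the technique: look up facts related to the question, then put them in the prompt.

```python
from collections import defaultdict

# Hypothetical toy facts; a real system would query a graph database instead.
TRIPLES = [
    ("Neo4j", "IS_A", "graph database"),
    ("Neo4j", "QUERIED_WITH", "Cypher"),
    ("Cypher", "IS_A", "query language"),
]


def build_index(triples):
    """Index triples by lowercased subject so facts about an entity are easy to find."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject.lower()].append((subject, relation, obj))
    return index


def facts_for_question(question, index):
    """Return triples whose subject appears as a word in the question (naive matching)."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    facts = []
    for subject in sorted(index):
        if subject in words:
            facts.extend(index[subject])
    return facts


def grounded_prompt(question, index):
    """Build a prompt that asks the model to answer only from the retrieved facts."""
    facts = facts_for_question(question, index)
    lines = [f"- {s} {r} {o}" for s, r, o in facts] or ["- (no facts found)"]
    return (
        "Answer using only these facts:\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )


index = build_index(TRIPLES)
prompt = grounded_prompt("What is Neo4j queried with?", index)
print(prompt)
```

The resulting prompt string would then be sent to whichever LLM is in use; that call is left out here because it depends on the provider.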

About the Speaker

Abhimanyu Aryan started in the VR industry, then worked as an ML Engineer (Vision) for the Indian Air Force and contributed to Julia’s open-source web ecosystem (mostly Genie). He is currently building a stealth AI startup.

Update: Data-Centric AI Competition on Hugging Face Spaces

Are you ready to challenge the status quo in AI development? Then join Voxel51’s Harpreet Sahota for the latest updates, plus tips and tricks on the first-ever Data-Centric AI competition on Hugging Face Spaces, focusing on the often-overlooked yet crucial aspect of AI: data curation. Learn more about the competition, rules and prizes.

About the Speaker

Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG, Agents, and Multimodal AI.

It's in the Air Tonight. Sensor Data in RAG

I will do a quick overview of the basics of Vector Databases and Milvus and then dive into a practical example of how to use one as part of an application. I will demonstrate how to consume air quality data and ingest it into Milvus as vectors and scalars. We will then use our vector database of Air Quality readings to feed our LLM and get proper answers to Air Quality questions. I will show you all the steps to build a RAG application with Milvus, LangChain, Ollama, Python and Air Quality Reports. Preview the demo on Medium.

About the Speaker

Tim Spann is a Principal Developer Advocate for Zilliz and Milvus. He works with Milvus, Generative AI, HuggingFace, Python, Big Data, IoT, and Edge AI. Tim has over twelve years of experience with the IoT, big data, distributed computing, messaging, machine learning and streaming technologies.

Sept 12 - AI, ML and Computer Vision Meetup

Register: https://lu.ma/sakz1lmv

If you're passionate about AI, machine learning, data science, or linguistics, this event is for you. Connect with like-minded professionals, share insights, and learn from industry experts as they dive into the real-world applications of LLMs.

Speakers & Topics:

Lena Nahorna, Analytical Linguist at Grammarly
Topic: Building Frameworks for Evaluation of LLM Output at Grammarly
LLMs have opened up new avenues in NLP with their possible applications, but evaluating their output introduces a new set of challenges. In this talk, we discuss how the evaluation of LLMs differs from the evaluation of classic ML-based solutions and how we tackle the challenges.

Halyna Oliinyk, Senior Data Engineer at Delivery Hero
Topic: Data Engineering Workflow Before, After, and For LLMs
Halyna will take you through the journey of deploying LLMs into production, focusing on the creation and management of modern data pipelines. She'll cover essential topics like system design, data sources, observability, and monitoring, all backed by real-world examples and common mistakes to avoid.

Djordje Benn-Maksimovic, Senior Data Scientist at Eviden
Topic: Cypher Query Building with Open-Source LLMs
Djordje will discuss creating knowledge graphs from news articles using small transformers for entity and relation extraction, and automating Cypher queries with open-source LLMs.
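As a rough illustration of the step between relation extraction and the graph, the sketch below turns an already extracted (head, relation, tail) triple into a Cypher `MERGE` statement. It is a plain template, not the LLM-driven query generation the talk covers, and the `Entity` label, the `name` property, and both function names are assumptions made for this example.

```python
import re


def safe_label(name):
    """Keep only characters that are valid in an unquoted Cypher identifier."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def triple_to_cypher(head, relation, tail):
    """Turn one (head, relation, tail) triple into a parameterized MERGE statement.

    Entity names are passed as query parameters. The relationship type is
    sanitized and inlined, since ordinary $parameters stand for values rather
    than relationship types.
    """
    rel_type = safe_label(relation).upper()
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple, as an extraction model might produce from a news article.
query, params = triple_to_cypher("Acme Corp", "acquired", "Widget Ltd")
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` means the same entity mentioned in several articles maps to a single node, which is usually what a news-derived knowledge graph wants.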

LLM Meetup: Practical Use Cases
Richie – host @ DataCamp , Ram Sriharsha – CTO @ Pinecone

Perhaps the biggest complaint about generative AI is hallucination. If the text you want to generate involves facts, for example in a chatbot that answers questions, then hallucination is a problem. One solution is a technique called retrieval augmented generation (RAG), where you store facts in a vector database and retrieve the most appropriate ones to send to the large language model to help it give accurate responses. So, what goes into building vector databases, and how do they improve LLM performance so much?

Ram Sriharsha is currently the CTO at Pinecone. Before this role, he was the Director of Engineering at Pinecone and previously served as Vice President of Engineering at Splunk. He also worked as a Product Manager at Databricks. With a long history in the software development industry, Ram has held positions as an architect, lead product developer, and senior software engineer at various companies. Ram is also a long-time contributor to Apache Spark.

In the episode, Richie and Ram explore common use cases for vector databases, RAG in chatbots, steps to create a chatbot, static vs dynamic data, testing chatbot success, handling dynamic data, choosing language models, knowledge graphs, implementing vector databases, innovations in vector databases, the future of LLMs, and much more.

Links Mentioned in the Show:

  • Pinecone
  • Webinar - Charting the Path: What the Future Holds for Generative AI
  • Course - Vector Databases for Embeddings with Pinecone
  • Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
  • Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app.

Empower your business with world-class data and AI skills with DataCamp for business.
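The retrieval step of retrieval augmented generation, as described in this episode summary, can be sketched in a few lines. This is a toy under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, `TinyVectorStore` is a hypothetical in-memory class rather than Pinecone or any real vector database, and the stored facts are invented.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and ranks by similarity."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, store, k=2):
    """Retrieve the k most similar facts and place them in the prompt as context."""
    context = "\n".join(f"- {fact}" for fact in store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."


store = TinyVectorStore()
store.add("The support desk is open from 9:00 to 17:00 on weekdays.")
store.add("Refunds are processed within 14 days.")
store.add("The office is located in Eindhoven.")

print(build_prompt("When is the support desk open?", store, k=1))
```

In a production setup the brute-force `sorted` call is what a vector database replaces with an approximate nearest-neighbor index, and the prompt is passed on to the language model.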

AI/ML Databricks GenAI LLM Pinecone RAG Spark Splunk Vector DB
DataFramed