talk-data.com

Topic: llms (102 tagged)

Activity Trend

Peak of 19 activities per quarter, 2020-Q1 to 2026-Q1

Activities

102 activities · Newest first

Mildly annoyed by the big green owl's limitations, I decided to build Translamore, an app that lets you turn whatever you're reading into your own language exercises. What started as a small weekend project quickly turned into a full-blown side quest. I'll kick things off with a quick demo of Translamore and then share some of the lessons from building it in my spare time:

- Project Management: staying organized when no one's watching, figuring out what's worth your time, keeping motivation alive, and the few tools that saved my sanity
- LLMs & Prompt Engineering: what actually worked for me, using unit tests to wrangle prompts, a bit of templating magic, and my Prompt Resolver contraption
- Server-Side Dart: why you really shouldn't ship your LLM API keys, how I structured packages and dependencies, used sealed classes for the API, and yes, called Python from Dart in the least elegant way possible

Expect some lessons, a few confessions, and probably one or two "don't do what I did" moments.
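The "unit tests to wrangle prompts" idea is worth making concrete. A minimal sketch below, in Python rather than Dart for brevity: the function and prompt names are hypothetical, not taken from Translamore. The point is that once a prompt template is a function, tests can pin down its invariants, so template edits that silently drop required context fail fast.

```python
def build_exercise_prompt(text: str, target_lang: str, level: str = "B1") -> str:
    """Render the prompt sent to the LLM for generating a language exercise."""
    if not text.strip():
        raise ValueError("source text must not be empty")
    return (
        f"You are a {target_lang} language tutor.\n"
        f"Create a fill-in-the-blank exercise at CEFR level {level} "
        f"from the following text:\n---\n{text}\n---"
    )

def test_prompt_contains_required_context():
    # Invariants: the source text, target language, and level must all
    # survive any template refactor.
    prompt = build_exercise_prompt("Der Hund schläft.", "German")
    assert "Der Hund schläft." in prompt
    assert "German" in prompt
    assert "B1" in prompt
```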

Join Kostia Omelianchuk and Lukas Beisteiner as they unpack the full scope of Grammatical Error Correction (GEC) from task framing, evaluation, and training to inference optimization and serving high-performance production systems at Grammarly. They will discuss: The modern GEC recipe (shift from heavily human-annotated corpora to semi-synthetic data generation), LLM-as-a-judge techniques for scalable evaluation, and techniques to make deployment fast and affordable, including Speculative Decoding.
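To make the LLM-as-a-judge idea concrete, here is a minimal, hedged sketch of the mechanical part of such a pipeline: rendering a judge prompt and parsing its score. The template wording and the 1-5 scale are illustrative assumptions, not Grammarly's actual setup.

```python
import re

# Illustrative judge prompt; the real rubric would be far more detailed.
JUDGE_TEMPLATE = (
    "You are grading a grammatical error correction.\n"
    "Original: {source}\n"
    "Correction: {hypothesis}\n"
    "Reply with 'Score: N' where N is an integer from 1 to 5."
)

def build_judge_prompt(source: str, hypothesis: str) -> str:
    return JUDGE_TEMPLATE.format(source=source, hypothesis=hypothesis)

def parse_judge_score(reply: str) -> int:
    """Extract the 1-5 score; raise if the judge reply is malformed."""
    m = re.search(r"Score:\s*([1-5])", reply)
    if m is None:
        raise ValueError(f"unparseable judge reply: {reply!r}")
    return int(m.group(1))
```

Strict parsing matters at scale: silently defaulting malformed judge replies to a score would bias the evaluation.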

Adam Amara, cofounder & CEO at Turing Biosystems, discusses network-based modelling and multiomics integration on metabolic networks, multilayer networks that integrate multiomics and clinical data with knowledge graphs, and the use of TuringDB.ai to analyse large biomedical knowledge graphs and build digital twins.

Talk: "Programs that write prompts with DSPy" by Martín Quesada (delivered in Spanish). Systems that interact with generative models and rely on hand-written prompts are often fragile and hard to maintain: any change to the metrics or model requires updating text blobs through trial and error. DSPy is an open-source framework that tackles this issue by allowing developers to build complete AI workflows in pure Python code. In this talk, we will learn how to use it to replace handcrafted prompts with compact modules that are automatically optimized. Martín Quesada is Senior Data Scientist at Datamaran. You will be free to ask questions in Spanish or English during the Q&A. After the talk, we will enjoy some networking with appetizers, courtesy of Datamaran.
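The core pattern DSPy automates can be sketched in plain Python (this is deliberately not DSPy's API, just the idea): instead of hand-editing a prompt string, you declare candidate instructions and let an optimizer pick the one that scores best on dev examples under your metric.

```python
def make_prompt(instruction: str, question: str) -> str:
    # The template is fixed; only the instruction text is searched over.
    return f"{instruction}\nQuestion: {question}\nAnswer:"

def optimize_instruction(candidates, devset, metric, run_model):
    """Pick the candidate instruction scoring best on the dev examples.

    run_model is any callable that maps a prompt string to a model answer;
    metric maps (gold, prediction) to a number.
    """
    def score(instr):
        return sum(
            metric(ex["answer"], run_model(make_prompt(instr, ex["question"])))
            for ex in devset
        )
    return max(candidates, key=score)
```

DSPy's real optimizers go much further (generating and mutating instructions, bootstrapping few-shot demos), but the contract is the same: prompts become a compiled artifact of code plus data, not hand-maintained text.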

Panel discussion with four engineering leaders on how AI agents and LLMs are affecting engineering teams in practice. Topics include agent development tools, changes in hiring and team structure, embedding LLMs into products, what's working and what isn't.

Building a price comparison platform requires solving multiple ML challenges at scale. This talk covers a year-long production project combining LLMs, graph algorithms, and computer vision.

We'll explore:

- Orchestrating complex ML workflows with Vertex AI Pipelines
- Using Gemini to classify products, extract attributes, and generate titles/descriptions
- Connecting product variants across retailers with graph algorithms
- Deduplicating images using computer vision

You'll learn practical lessons from deploying these systems in production, including trade-offs and challenges encountered along the way.
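The "connecting product variants with graph algorithms" step typically reduces to connected components: pairwise match edges (from an LLM or embedding matcher) link listings, and each component becomes one canonical product. A stdlib sketch with union-find, assuming the match edges are already given:

```python
def connected_components(nodes, edges):
    """Group nodes into components; each component = one canonical product."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in edges:
        union(a, b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())
```

A practical caveat the transitivity implies: one bad match edge merges two whole product groups, which is why edge-level precision matters more than recall here.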

Daniel will take a hands-on journey into building AI analyst agents from scratch. Using dbt metadata to provide large language models with the right context, he’ll show how to connect LLMs to your data effectively. Expect a deep dive into the challenges of query generation, practical frameworks for success, and lessons learned from real-world implementations.
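The "dbt metadata as LLM context" step can be sketched as rendering model and column metadata into a schema snippet for the prompt. Field names below loosely mirror dbt's manifest but are assumptions, not its exact schema:

```python
def render_schema_context(models: list[dict]) -> str:
    """Turn dbt-style model metadata into schema context for SQL generation."""
    lines = []
    for model in models:
        lines.append(f"Table {model['name']}: {model.get('description', '')}".rstrip())
        for col in model["columns"]:
            lines.append(
                f"  - {col['name']} ({col['data_type']}): "
                f"{col.get('description', '')}".rstrip()
            )
    return "\n".join(lines)
```

The descriptions are doing the heavy lifting: an LLM given only table and column names has to guess semantics, which is where most query-generation failures start.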

Search engines are at the heart of the user experience. But how do you move from a "classic" keyword-based search to a semantic search that truly understands user intent? In this session, The Fork team will share their journey toward evolving into an AI-augmented search. Building on their existing OpenSearch stack, they added a semantic layer powered by LLMs. The goal: analyze user queries, extract the key elements, and translate them into a much more relevant semantic search. You'll discover the challenges they faced, the implementation choices, and the real-world results achieved in a high-impact use case.
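The "translate extracted elements into a structured search" step can be sketched like this. The extraction itself would be done by the LLM; here the extracted intent arrives as a dict, and we build an OpenSearch-style bool query from it. The field names (dish, cuisine, city) are illustrative assumptions, not The Fork's schema:

```python
def build_search_query(intent: dict) -> dict:
    """Map LLM-extracted query elements to an OpenSearch bool query."""
    must, filters = [], []
    if "dish" in intent:
        # Free-text relevance scoring on the menu field.
        must.append({"match": {"menu": intent["dish"]}})
    if "cuisine" in intent:
        # Exact constraints go into the filter context (no scoring, cacheable).
        filters.append({"term": {"cuisine": intent["cuisine"]}})
    if "city" in intent:
        filters.append({"term": {"city": intent["city"]}})
    return {"query": {"bool": {"must": must, "filter": filters}}}
```

Splitting scored clauses (`must`) from constraints (`filter`) is the standard OpenSearch pattern: filters are cheaper and don't distort relevance ranking.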

This talk presents a formal methodology for constructing a Multi-Modal Knowledge Graph for a smart city, addressing data privacy and heterogeneity by using entirely synthetic data. We demonstrate a Python pipeline that leverages Large Language Models for text generation and knowledge extraction, Pandas for sensor data simulation, and rdflib for graph construction. The result is a robust, privacy-preserving foundation for a Cognitive Digital Twin, enabling advanced urban analytics.
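The shape of that pipeline can be sketched with the stdlib alone: simulate synthetic sensor readings, then emit subject-predicate-object triples. The talk uses Pandas for simulation and rdflib for the graph; plain tuples stand in for rdflib terms here, and the URIs are invented examples.

```python
import random

BASE = "http://example.org/city/"  # illustrative namespace

def simulate_readings(sensor_ids, n=3, seed=0):
    """Synthetic sensor data: (sensor_id, timestep, temperature) rows."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [(s, i, round(rng.uniform(10, 35), 1))
            for s in sensor_ids for i in range(n)]

def readings_to_triples(readings):
    """Lift each reading into two triples: provenance and value."""
    triples = []
    for sensor, step, value in readings:
        obs = f"{BASE}obs/{sensor}/{step}"
        triples.append((obs, f"{BASE}observedBy", f"{BASE}sensor/{sensor}"))
        triples.append((obs, f"{BASE}hasValue", value))
    return triples
```

In the real pipeline these tuples would become `rdflib` `URIRef`/`Literal` terms added to a `Graph`, with the LLM-extracted entities linked into the same namespace.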

Abstract: There is great interest in scaling the number of tokens that LLMs can efficiently and effectively ingest, a problem that is notoriously difficult. Training LLMs on a smaller context and hoping that they generalize well to much longer contexts has largely proven to be ineffective. In this talk, I will go over our work that aims to understand the failure points in modern LLM architectures. In particular, I will discuss dispersion in the softmax layers, generalization issues related to positional encodings, and smoothing effects that occur in the representations. Understanding these issues has proven to be fruitful, with related ideas now already being part of frontier models such as LLaMa 4. The talk is intended to be broadly accessible, but a basic understanding of the Transformer architectures used in modern LLMs will be helpful.
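The softmax dispersion point has a simple toy illustration: when one query attends over n keys with similar scores, the weight on even the most relevant key shrinks as n grows, so attention flattens out at long context lengths.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

# One mildly relevant key (score 2.0) among n-1 distractors (score 1.0):
# as n grows, the winner's weight decays toward 1/n and attention disperses.
for n in (16, 256, 4096):
    scores = [1.0] * (n - 1) + [2.0]
    print(n, round(max(softmax(scores)), 4))
```

This is only a caricature of the failure mode the talk analyzes, but it shows why fixed score gaps cannot keep attention sharp as context length scales.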

In this talk, you will learn about GraphRAG, a technique that combines graph databases with generative AI to improve the quality of LLM-generated content. We will explore the terms Retrieval-Augmented Generation (RAG) and Context Engineering, and how GraphRAG can be used in both scenarios. The talk is aimed at Generative AI practitioners who are familiar with vector-based RAG and would like to understand how GraphRAG improves on it.
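A toy sketch of the retrieval step that distinguishes GraphRAG from chunk-based RAG: once a (vector) match pins down seed entities, the knowledge graph is expanded around them to pull in connected facts a chunk retriever would miss. The graph is a plain dict with invented entities here, standing in for a graph database.

```python
# (subject -> [(relation, object), ...]) with illustrative entities.
GRAPH = {
    "ACME Corp": [("acquired", "Widgets Ltd"), ("headquartered_in", "Berlin")],
    "Widgets Ltd": [("founded_by", "A. Smith")],
}

def expand(seeds, graph, hops=2):
    """Collect facts within `hops` relations of the seed entities."""
    facts, frontier = [], list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for rel, obj in graph.get(node, []):
                facts.append((node, rel, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return facts
```

The two-hop fact ("Widgets Ltd", "founded_by", "A. Smith") is exactly the kind of context a chunk-level retriever misses when the connecting document never mentions the seed entity.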

Models need up-to-date facts (data) to solve tasks. But data (retrieval) needs models, too: for semantic search and for ranking top candidates. At this meetup, we will go through the data/model interplay: you will learn how to transform problems into the numeric domain using tensors, and with this, work with text, image, and videos. We’ll do live demos from e-commerce and media. Whether it’s personalizing the shopping experience in real time or finding the next song to autoplay, this session will help you think beyond LLMs—and design retrieval-first GenAI systems that deliver real-world impact.
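The "transform problems into the numeric domain" step boils down to this: items and the query become vectors (embeddings for text, images, or video frames), and retrieval is nearest-neighbour ranking. A stdlib sketch with tiny hand-made vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank(query_vec, catalog):
    """Order (item_id, vector) pairs by similarity to the query."""
    return sorted(catalog, key=lambda item: cosine(query_vec, item[1]), reverse=True)
```

Production systems replace the linear scan with approximate nearest-neighbour indexes, but the contract (vectors in, ranked candidates out) is the same across text, image, and video.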


This isn’t a “what if” conversation — it’s a behind-the-scenes look at real GenAI deployments from women leaders across different domains. Each speaker will share the tools, workflows, and measurable results from their projects — plus the lessons learned along the way.

In the rapidly evolving field of AI, most of the time is spent optimising: you are either maximising your accuracy or minimising your latency. Join our live SurrealDB webinar where we'll be showing some LangChain components, testing some prompt engineering tricks, and identifying specific use-case challenges. We'll walk through an experiment: a chatbot answering questions over chat-style conversations, showing when vector retrieval wins, when lightweight graphs help, and how to handle tricky bits like time awareness. In this session, you'll learn how to:

- Set up SurrealDB as both a graph and vector store, with one connection and one system
- Use LangChain to ingest documents
- Use LLMs to infer keywords
- Tune retrieval (k, thresholds) and compare vector-only, graph-only, and intersected results
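The last comparison in that experiment (vector-only vs graph-only vs intersected) can be sketched independently of any database. Candidate lists are assumed precomputed by the two retrievers; the function names are illustrative:

```python
def top_k(scored, k, threshold=0.0):
    """Keep the k best (doc_id, score) pairs at or above a score threshold."""
    kept = [(d, s) for d, s in scored if s >= threshold]
    return [d for d, _ in sorted(kept, key=lambda p: p[1], reverse=True)[:k]]

def intersect(vector_hits, graph_hits):
    """Documents surfaced by both retrievers, kept in vector-score order."""
    graph_set = set(graph_hits)
    return [d for d in vector_hits if d in graph_set]
```

Intersection trades recall for precision: it only returns documents both signals agree on, which is why tuning k and the threshold per retriever matters before comparing the three modes.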