Accurately extracting information from documents has been a dream for decades. Important workflows, from automated back-office processing to enterprise RAG, depend on it. LLMs promise to fulfill this dream but currently fall short: they hallucinate information, struggle with long documents, and break down on complex layouts. The solution: LLMs specialized in information extraction. In this talk, I will present NuExtract, the first LLM specialized in extracting structured information (JSON output), and NuMarkdown, the first reasoning OCR LLM (RAG-ready Markdown output). These low-hallucination open-source models outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller, enabling private usage. I will demonstrate the abilities of these LLMs, show how to use them at scale, and discuss what's coming next in information extraction.
talk-data.com
Speaker
Etienne Bernard
Co-founder & CEO
NuMind
AI/ML expert; co-founder & CEO of NuMind; speaker at 100+ events.
Bio from: Outclassing Frontier LLMs at Extracting Information