In this talk, the speaker presents NuExtract, the first LLM specialized in extracting structured information (JSON output), and NuMarkdown, the first reasoning OCR LLM (RAG-ready Markdown output). The talk showcases low-hallucination open-source models that outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller, enabling private usage. It demonstrates the abilities of these LLMs, shows how to use them at scale, and discusses what’s coming next in information extraction.
Topic
ocr
Transform an LLM into a Vision Language Model (VLM). Process PDFs like a pro with OCR tools.
EasyLens is an iOS app that uses AI and computer vision to simplify daily tasks like waste sorting, understanding German documents, and identifying tourist spots. Users simply point their camera and select a feature; the app does the rest using Core ML and OCR.
Hands-on session exploring how to use Docling for data extraction and cleanup across PDFs, HTML, and DOCX. Includes getting started with Docling, extracting content from documents, handling table and image data, and extracting content from scanned PDF documents using OCR.
Hands-on workshop on using Docling to extract and clean data from documents, including PDFs, HTML, and OCR for scanned PDFs. Key activities: getting started with Docling; extracting content from PDFs/HTML; handling table and image data; extracting content from scanned PDFs using OCR.
The project aims to build a web scraping library that uses OCR, an LLM, and automation scripts to retrieve data from heavily protected websites that offer no APIs.