talk-data.com

Topic

API

Application Programming Interface (API)

integration software_development data_exchange

856 tagged activities

Activity Trend: peak of 65 activities per quarter, 2020-Q1 to 2026-Q1

Activities

856 activities · Newest first

Deploying AI agents in production raises three challenges: data security, infrastructure sovereignty, and reliability of results.

With the IONOS AI Model Hub, you access open-source models through an OpenAI-compatible, stateless API operated in Europe, guaranteeing compliance and sovereignty.
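
Because the endpoint is OpenAI-compatible, the standard OpenAI Python client can usually be pointed at it by overriding the base URL; a minimal sketch, where the base URL, model name, and environment variable are placeholders rather than IONOS-specific settings:

```python
import os
from openai import OpenAI

# Base URL, model name, and env var are placeholders, not official IONOS values.
client = OpenAI(
    base_url="https://<your-model-hub-endpoint>/v1",
    api_key=os.environ["MODEL_HUB_API_KEY"],
)

response = client.chat.completions.create(
    model="<open-source-model-name>",
    messages=[{"role": "user", "content": "Summarize our data-sovereignty requirements."}],
)
print(response.choices[0].message.content)
```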

Discover a concrete example of a multi-agent system that combines LLMs, access to varied sources, tool use, processing steps, and selection mechanisms to deliver reliable, contextualized answers.

The NORMA platform, available as open source, evaluates each step (extraction, classification, generation) to detect weaknesses or regressions and to guarantee safe behavior.

Its batch-testing and continuous-integration capabilities let you compare versions, track quality over time, and block any regression before it reaches production.

By combining IONOS's sovereign infrastructure with NORMA's continuous evaluation, you get a robust pipeline for turning your PoCs into reliable, secure AI solutions!

Discover how we transformed enterprise data interaction with SiemensGPT, which serves over 50,000 active users through Snowflake's Cortex Analyst API. Our plugin architecture, powered by the ReAct agent model, converts natural language into SQL queries and dynamic visualizations, orchestrating everything through a unified interface. Beyond productivity gains, this solution democratizes data access across Siemens, enabling employees at all levels to derive business insights through simple conversations.
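
As an illustration of the underlying pattern (not Siemens' implementation), a natural-language question can be posted to Cortex Analyst's REST endpoint and the returned SQL executed by the plugin; the endpoint path, payload fields, and semantic model reference below are assumptions to be checked against Snowflake's documentation.

```python
import requests

ACCOUNT_URL = "https://<account>.snowflakecomputing.com"  # placeholder
TOKEN = "<oauth-or-keypair-jwt>"                          # placeholder

# Endpoint path and payload shape are assumptions, not a documented excerpt.
payload = {
    "messages": [
        {"role": "user",
         "content": [{"type": "text", "text": "Top 5 plants by output last quarter?"}]}
    ],
    "semantic_model_file": "@ANALYTICS.PUBLIC.MODELS/manufacturing.yaml",
}

resp = requests.post(
    f"{ACCOUNT_URL}/api/v2/cortex/analyst/message",
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=60,
)
resp.raise_for_status()
# The response typically includes generated SQL, which a plugin would run and visualize.
print(resp.json())
```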

CoSApp: an open-source library to design complex systems

CoSApp, for Collaborative System Approach, is a Python library dedicated to the simulation and design of multi-disciplinary systems. It is primarily intended for engineers and system architects during the early stages of industrial product design. The API of CoSApp focuses on simplicity and the explicit declaration of design problems. Special attention is given to modularity: a very flexible solver-assembly mechanism allows users to construct complex, customized simulation workflows. This presentation introduces the key features of the framework.

https://cosapp.readthedocs.io https://gitlab.com/cosapp/cosapp
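
A minimal sketch of what a CoSApp model can look like, assuming the System/driver pattern described in the documentation linked above; attribute and driver names may differ in the current release.

```python
from cosapp.base import System
from cosapp.drivers import NonLinearSolver

class Resistor(System):
    """Toy component: Ohm's law expressed as a CoSApp System."""
    def setup(self):
        self.add_inward("R", 100.0, desc="Resistance [ohm]")
        self.add_inward("U", 1.0, desc="Voltage drop [V]")
        self.add_outward("I", 0.0, desc="Current [A]")

    def compute(self):
        self.I = self.U / self.R

r = Resistor("r")
solver = r.add_driver(NonLinearSolver("solver"))
# Declare a design problem: find the voltage that yields a 10 mA current.
solver.add_unknown("U")
solver.add_equation("I == 0.01")
r.run_drivers()
print(r.U, r.I)
```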

Modern data engineering leverages Python to build robust, scalable, end-to-end workflows. In this talk, we will cover how Snowflake offers a flexible development environment for developing Python data pipelines, performing transformations at scale, and orchestrating and deploying your pipelines. Topics we'll cover include:

Ingest: data source APIs, plus reading and ingesting files of any format as they arrive, including from sources outside Snowflake

Develop: packaging (artifact repository), Python runtimes, IDEs (Notebook, VS Code)

Transform: Snowpark pandas, UDFs, UDAFs

Deploy: Tasks, Notebook scheduling
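
As a flavour of the Transform step, here is a small Snowpark sketch, assuming an existing Snowflake account; the connection parameters, table, and column names are illustrative.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Connection parameters are placeholders for an existing account.
session = Session.builder.configs({
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}).create()

# Illustrative transformation: aggregate raw orders into a daily revenue table.
orders = session.table("RAW_ORDERS")
daily = (
    orders.group_by(col("ORDER_DATE"))
          .agg(sum_(col("AMOUNT")).alias("DAILY_REVENUE"))
)
daily.write.save_as_table("DAILY_REVENUE_BY_DAY", mode="overwrite")
```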

Documents Meet LLMs: Tales from the Trenches

Processing documents with LLMs comes with unexpected challenges: handling long inputs, enforcing structured outputs, catching hallucinations, and recovering from partial failures. In this talk, we'll cover why large context windows are not a silver bullet, why chunking is deceptively hard, and how to design inputs and outputs that allow for intelligent retries. We'll also share practical prompting strategies, discuss OCR and parsing tools, compare different LLMs (and their cloud APIs), and highlight real-world insights from our experience developing production GenAI applications with multiple document processing scenarios.
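
One common way to make outputs retriable, sketched here with Pydantic validation and a bounded retry loop; the schema and model name are illustrative choices, not the speakers' implementation.

```python
from openai import OpenAI
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    vendor: str
    total: float
    currency: str

client = OpenAI()  # assumes OPENAI_API_KEY is set; any OpenAI-compatible API works similarly

def extract_invoice(text: str, max_attempts: int = 3) -> Invoice:
    prompt = f"Extract vendor, total and currency as JSON from:\n{text}"
    for attempt in range(max_attempts):
        raw = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        ).choices[0].message.content
        try:
            return Invoice.model_validate_json(raw)
        except ValidationError as err:
            # Feed the validation error back so the next attempt can self-correct.
            prompt += f"\nPrevious answer was invalid: {err}. Return only valid JSON."
    raise RuntimeError("Extraction failed after retries")
```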

ActiveTigger: A Collaborative Text Annotation Research Tool for Computational Social Sciences

The exponential growth of textual data—ranging from social media posts and digital news archives to speech-to-text transcripts—has opened new frontiers for research in the social sciences. Tasks such as stance detection, topic classification, and information extraction have become increasingly common. At the same time, the rapid evolution of Natural Language Processing, especially pretrained language models and generative AI, has largely been led by the computer science community, often leaving a gap in accessibility for social scientists.

To address this, in 2023 we began developing ActiveTigger, a lightweight, open-source Python application (with a web frontend in React) designed to accelerate the annotation process and manage large-scale datasets through the integration of fine-tuned models. It aims to support computational social science for a broad audience both within and outside the social sciences. ActiveTigger is already used by an active community of social scientists, and the stable version is planned for early June 2025.

From a more technical perspective, the API is designed to manage the complete workflow: project creation, embedding computation, exploration of the text corpus, human annotation with active learning, fine-tuning of pre-trained (BERT-like) models, prediction on a larger corpus, and export. It also integrates LLM-as-a-service capabilities for prompt-based annotation and information extraction, offering a flexible approach to hybrid manual/automatic labeling. Accessible both through a web frontend and a Python client, ActiveTigger encourages customization and adaptation to specific research contexts and practices.
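
To give a flavour of such a workflow, here is a purely hypothetical sketch of driving the annotation loop over HTTP; every endpoint path and field name below is invented for illustration and is not ActiveTigger's actual API (see the repository below for the real interface).

```python
import requests

BASE = "http://localhost:8000"                  # hypothetical local ActiveTigger server
HEADERS = {"Authorization": "Bearer <token>"}   # hypothetical auth scheme

# All endpoint paths and payload fields below are invented for illustration only.
requests.post(f"{BASE}/projects", json={"name": "press-stances", "language": "fr"}, headers=HEADERS)
requests.post(f"{BASE}/projects/press-stances/embeddings", json={"model": "sentence-transformers"}, headers=HEADERS)

# Active-learning loop: fetch the most informative texts, annotate, retrain.
for _ in range(10):
    batch = requests.get(f"{BASE}/projects/press-stances/next", headers=HEADERS).json()
    labels = [{"id": item["id"], "label": "pro"} for item in batch]   # human labels in practice
    requests.post(f"{BASE}/projects/press-stances/annotations", json=labels, headers=HEADERS)

requests.post(f"{BASE}/projects/press-stances/train", json={"base_model": "camembert-base"}, headers=HEADERS)
```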

In this talk, we will delve into the motivations behind the creation of ActiveTigger, outline its technical architecture, and walk through its core functionalities. Drawing on several ongoing research projects within the Computational Social Science (CSS) group at CREST, we will illustrate concrete use cases where ActiveTigger has accelerated data annotation, enabled scalable workflows, and fostered collaborations. Beyond the technical demonstration, the talk will also open a broader reflection on the challenges and opportunities brought by generative AI in academic research—especially in terms of reliability, transparency, and methodological adaptation for qualitative and quantitative inquiries.

The repository of the project: https://github.com/emilienschultz/activetigger/

The development of this software is funded by the DRARI Ile-de-France and supported by Progédo.

A Hitchhiker's Guide to the Array API Standard Ecosystem

The array API standard is unifying the ecosystem of Python array computing, facilitating greater interoperability between code written for different array libraries, including NumPy, CuPy, PyTorch, JAX, and Dask.

But what are all of these "array-api-" libraries for? How can you use them to 'future-proof' your own libraries, and provide support for GPU and distributed arrays to your users? Find out in this talk, where I'll guide you through every corner of the array API standard ecosystem, explaining how SciPy and scikit-learn are using all of these tools to adopt the standard. I'll also be sharing progress updates from the past year, to give you a clear picture of where we are now, and what the future holds.
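
A minimal sketch of array-library-agnostic code using the array-api-compat helper, similar in spirit to how SciPy and scikit-learn adopt the standard.

```python
import numpy as np
from array_api_compat import array_namespace  # dispatches to the array's own namespace

def standardize(x):
    """Center and scale an array from any standard-compliant library."""
    xp = array_namespace(x)          # NumPy, CuPy, PyTorch, JAX, ... depending on the input
    mean = xp.mean(x, axis=0)
    std = xp.std(x, axis=0)
    return (x - mean) / std

print(standardize(np.asarray([[1.0, 2.0], [3.0, 4.0]])))

# The same function works unchanged on, e.g., a torch.Tensor if PyTorch is installed:
# import torch; standardize(torch.tensor([[1.0, 2.0], [3.0, 4.0]]))
```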

Investing for Programmers

Maximize your portfolio, analyze markets, and make data-driven investment decisions using Python and generative AI. Investing for Programmers shows you how you can turn your existing skills as a programmer into a knack for making sharper investment choices. You'll learn how to use the Python ecosystem, modern analytic methods, and cutting-edge AI tools to make better decisions and improve the odds of long-term financial success.

In Investing for Programmers you'll learn how to:

Build stock analysis tools and predictive models

Identify market-beating investment opportunities

Design and evaluate algorithmic trading strategies

Use AI to automate investment research

Analyze market sentiments with media data mining

In Investing for Programmers you'll learn the basics of financial investment as you conduct real market analysis, connect with trading APIs to automate buy-sell, and develop a systematic approach to risk management. Don't worry—there's no dodgy financial advice or flimsy get-rich-quick schemes. Real-life examples help you build your own intuition about financial markets, and make better decisions for retirement, financial independence, and getting more from your hard-earned money.

About the Technology

A programmer has a unique edge when it comes to investing. Using open-source Python libraries and AI tools, you can perform sophisticated analysis normally reserved for expensive financial professionals. This book guides you step-by-step through building your own stock analysis tools, forecasting models, and more so you can make smart, data-driven investment decisions.

About the Book

Investing for Programmers shows you how to analyze investment opportunities using Python and machine learning. In this easy-to-read handbook, experienced algorithmic investor Stefan Papp shows you how to use Pandas, NumPy, and Matplotlib to dissect stock market data, uncover patterns, and build your own trading models. You'll also discover how to use AI agents and LLMs to enhance your financial research and decision-making process.

What's Inside

Build stock analysis tools and predictive models

Design algorithmic trading strategies

Use AI to automate investment research

Analyze market sentiment with media data mining

About the Reader

For professional and hobbyist Python programmers with basic personal finance experience.

About the Author

Stefan Papp combines 20 years of investment experience in stocks, cryptocurrency, and bonds with decades of work as a data engineer, architect, and software consultant.

Quotes

Especially valuable for anyone looking to improve their investing. - Armen Kherlopian, Covenant Venture Capital

A great breadth of topics—from basic finance concepts to cutting-edge technology. - Ilya Kipnis, Quantstrat Trader

A top tip for people who want to leverage development skills to improve their investment possibilities. - Michael Zambiasi, Raiffeisen Digital Bank

Brilliantly bridges the worlds of coding and finance. - Thomas Wiecki, PyMC Labs
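
As a flavour of the kind of analysis the book describes (not an excerpt from it), a short pandas sketch computing moving averages; the ticker and the use of yfinance as a data source are illustrative choices.

```python
import pandas as pd
import yfinance as yf  # illustrative data source; any OHLC price feed works

# Fetch daily closes for an example ticker and compute two moving averages.
prices = yf.Ticker("AAPL").history(start="2022-01-01", end="2024-01-01")["Close"]
signals = pd.DataFrame({"close": prices})
signals["sma_50"] = signals["close"].rolling(50).mean()
signals["sma_200"] = signals["close"].rolling(200).mean()

# A naive "golden cross" flag: short-term average above the long-term one.
signals["long"] = (signals["sma_50"] > signals["sma_200"]).astype(int)
print(signals.dropna().tail())
```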

Minus Three Tier: Data Architecture Turned Upside Down

Every data architecture diagram out there makes it abundantly clear who's in charge: at the bottom sits the analyst, above that is an API server, and at the very top sits the mighty data warehouse. This pattern is so ingrained that we never question its necessity, despite problems like slow response times, multi-level scaling issues, and massive cost.

But there is another way: decoupling storage from compute lets query processing move closer to the people who use it, leading to much snappier responses, natural scaling through client-side query processing, and much lower cost.

This talk discusses how modern data engineering paradigms like decomposed storage, single-node query processing, and lakehouse formats enable a radical departure from the tired three-tier architecture. By inverting the architecture we can put users' needs first, relying on commoditised components like object stores to deliver fast, scalable, and cost-effective solutions.
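
To make the inversion concrete, one possible embodiment (my illustration, not necessarily the speaker's stack) is a single-node engine such as DuckDB querying Parquet files in object storage directly from the client; bucket, path, and credentials are placeholders.

```python
import duckdb

con = duckdb.connect()
# httpfs lets DuckDB read directly from S3-compatible object storage.
con.execute("INSTALL httpfs")
con.execute("LOAD httpfs")
con.execute("SET s3_region='eu-west-1'")  # placeholder region; credentials set similarly

# The "warehouse" is just files in a bucket; the query runs locally, next to the analyst.
result = con.execute("""
    SELECT customer_id, SUM(amount) AS revenue
    FROM read_parquet('s3://my-lakehouse/orders/*.parquet')
    GROUP BY customer_id
    ORDER BY revenue DESC
    LIMIT 10
""").fetch_df()
print(result)
```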

In this talk, you will learn about some of the common challenges that you might encounter while developing meaningful applications with large language models.

Using non-deterministic systems as the basis for important applications is certainly an 'interesting' new frontier for software development, but hope is not lost. In this session, we will explore some of the well-known (and less well-known) issues in building applications on the APIs provided by LLM providers, and on 'open' LLMs such as Mistral, Llama, or DeepSeek.

We will also (of course) dive into some of the approaches that you can take to address these challenges, and mitigate some of the inherent behaviors that are present within LLMs, enabling you to build more reliable and robust systems on top of LLMs, unlocking the potential of this new development paradigm.

Face To Face
by Roberto Flores (Magnum Ice Cream Company (a division of Unilever))

In this session, we will explore the world of small language models, focusing on their unique advantages and practical applications. We will cover the basics of language models, the benefits of using smaller models, and provide hands-on examples to help beginners get started. By the end of the session, attendees will have a solid understanding of how to leverage small language models in their projects. The session will highlight the efficiency, customization, and adaptability of small models, making them ideal for edge devices and real-time applications.

We will introduce attendees to two widely used Small Language Models: Qwen3 and SmolLM3. Specifically, we will cover:

1. Accessing Models: How to navigate HuggingFace to explore and select available models. How to view model documentation and determine its usefulness for specific tasks

2. Deployment: How to get started using

(a) Inference Provider - using HuggingFace inference API or Google CLI

(b) On-Tenant - using Databricks Model Serving

(c) Running the Model Locally - using Ollama and LM Studio (see the sketch after this list)

3. We also examine the tradeoffs of each route
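
As a companion to route (c), a minimal sketch of local inference with the Ollama Python client; the model tag is an assumption, and the model must be pulled locally first (e.g. ollama pull qwen3:0.6b).

```python
import ollama  # pip install ollama; assumes a local Ollama server is running

# Model tag is an assumption; pull it first with `ollama pull qwen3:0.6b`.
response = ollama.chat(
    model="qwen3:0.6b",
    messages=[{"role": "user", "content": "In one sentence, what is a small language model?"}],
)
print(response["message"]["content"])
```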

This session will provide a Maia demo with roadmap teasers. The demo will showcase Maia's core capabilities: authoring pipelines in business language, multiplying productivity by accelerating tasks, and enabling self-service. It demonstrates how Maia takes natural language prompts and translates them into YAML-based, human-readable Data Pipeline Language (DPL), generating graphical pipelines. Expect to see Maia interacting with Snowflake metadata to sample data and suggest transformations, as well as its ability to troubleshoot and debug pipelines in real time. The session will also cover how Maia can create custom connectors from REST API documentation in seconds, a task that traditionally takes days. Roadmap teasers will likely include the upcoming Semantic Layer, a Pipeline Reviewing Agent, and enhanced file type support for various legacy ETL tools and code conversions.

In this 20-minute session, you'll learn how to build a custom Fivetran connector using the Fivetran Connector SDK and the Anthropic Workbench (AI Assistant) to integrate data from a custom REST API into Snowflake. 
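
A rough sketch of the shape of such a connector, based on the Connector SDK's published examples; the REST endpoint and field names are placeholders, and the SDK surface should be verified against current Fivetran documentation.

```python
import requests
from fivetran_connector_sdk import Connector, Operations as op

# Placeholder REST API; in practice the URL and auth come from the connector configuration.
API_URL = "https://api.example.com/v1/orders"

def update(configuration: dict, state: dict):
    cursor = state.get("cursor")
    resp = requests.get(API_URL, params={"since": cursor}, timeout=30)
    resp.raise_for_status()
    for row in resp.json()["orders"]:
        # Each upsert lands as a row in the destination (here, Snowflake).
        yield op.upsert(table="orders", data=row)
        cursor = row["updated_at"]
    # Persist incremental progress so the next sync resumes where this one stopped.
    yield op.checkpoint({"cursor": cursor})

connector = Connector(update=update)

if __name__ == "__main__":
    connector.debug()  # local test run
```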

You'll then learn how to create a Streamlit in Snowflake data application powering metrics and Snowflake Cortex AI-driven applications.

In this session, Omni CEO Colin Zima and VP of Product Arielle Strong will share how early experiments led to AI features our customers actually use and love: from natural language chat, to embeddable AI products, to APIs and an MCP server. 

They’ll walk through what worked, what didn’t, and how AI has reshaped our product roadmap. Expect real-world examples of AI analytics in production, along with best practices for getting your data AI-ready.

As AI adoption accelerates across industries, many organisations are realising that building a model is only the beginning. Real-world deployment of AI demands robust infrastructure, clean and connected data, and secure, scalable MLOps pipelines. In this panel, experts from across the AI ecosystem share lessons from the frontlines of operationalising AI at scale.

We’ll dig into the tough questions:

• What are the biggest blockers to AI adoption in large enterprises — and how can we overcome them?

• Why does bad data still derail even the most advanced models, and how can we fix the data quality gap?

• Where does synthetic data fit into real-world AI pipelines — and how do we define “real” data?

• Is Agentic AI the next evolution, or just noise — and how should MLOps prepare?

• What does a modern, secure AI stack look like when using external partners and APIs?

Expect sharp perspectives on data integration, model lifecycle management, and the cyber-physical infrastructure needed to make AI more than just a POC.

Unlock the true potential of your data with the Qlik Open Lakehouse, a revolutionary approach to Iceberg integration designed for the enterprise. Many organizations face the pain points of managing multiple, costly data platforms and struggling with low-latency ingestion. While Apache Iceberg offers robust features like ACID transactions and schema evolution, achieving optimal performance isn't automatic; it requires sophisticated maintenance. Introducing the Qlik Open Lakehouse, a fully managed and optimized solution built on Apache Iceberg, powered by Qlik's Adaptive Iceberg Optimizer. Discover how you can do data differently and achieve 10x faster queries, a 33-42% reduction in file API overhead, and ultimately, a 50% reduction in costs through streamlined operations and compute savings.

Is your analytics workflow stuck in fragmented chaos? AlphaSights, the global leader in expert knowledge on demand, used to juggle queries, scripts, spreadsheets, and dashboards across different tools just to get one analysis out the door. Manual updates slowed their teams, stakeholders waited too long for insights, and opportunities slipped through the cracks. With Hex, AlphaSights built a fully integrated Research Hub that unifies data queries, API calls, ML-powered enrichment, and reporting — all in one place. They eliminated manual work, automated updates, and empowered business teams to act faster on opportunities. The result: faster reaction times, broader coverage, and measurable commercial impact. Join this session to see how AlphaSights turned fragmented workflows into a seamless, automated pipeline — and learn how your team can build faster, smarter insights too.

In the age of agentic AI, competitive advantage lies not only in AI models, but in the quality of the data agents reason on and the agility of the tools that feed them. To fully realize the ROI of agentic AI, organizations need a platform that enables high-quality data pipelines and provides scalable, enterprise-grade tools. In this session, discover how a unified platform for integration, data management, MCP server management, API management, and agent orchestration can help you to bring cohesion and control to how data and agents are used across your organization.

When we launched our new GraphQL API at Netflix, it felt perfect—destined to power hundreds of millions of devices. Yet, change is inevitable. Even if your schema seems flawless today (which it isn't), requirements will shift, new features will emerge, and regrets will follow. GraphQL promises evolvability, allowing us to move forward without multiple API versions. But how does this hold up in practice? We mark fields as @deprecated, but what happens next? How can we embrace experimentation without entombing technical debt in the API? Does federation complicate things? Evolving your schema without breaking clients is easy, right? Right??? Drawing from experience with the Netflix API, this talk explores techniques for evolving your schema safely and painlessly. We'll cover the schema lifecycle—from experimentation to design, deprecation, and deletion. Attendees will leave with:

- Schema design principles that facilitate change

- Practical techniques for evolving GraphQL schemas

- Strategies for managing a deprecation workflow

Join us as we learn to face the inevitability of change with confidence and serenity.