talk-data.com

Topic

ELK

Elasticsearch/ELK Stack

search_engine log_analysis elk_stack

168

tagged

Activity Trend: peak of 10 activities per quarter, 2020-Q1 to 2026-Q1

Activities

168 activities · Newest first

Elasticsearch Query Language the Definitive Guide

Streamline your workflow with ESQL, enhance data analysis with real-time insights, and speed up aggregations and visualizations.

Key Features
  • Apply ESQL efficiently in analytics, observability, and cybersecurity
  • Optimize performance and scalability for high-demand environments
  • Discover how to visualize and debug ESQL queries
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description
Built to simplify high-scale data analytics in Elasticsearch, this practical guide will take you from foundational concepts to advanced applications across search, observability, and security. It will help you overcome common challenges such as efficiently querying large datasets, applying advanced analytics without deep prior knowledge, and settling on a single, consolidated query language. Written by senior experts at Elastic with extensive field experience, this book delivers actionable guidance rooted in solving today’s data challenges at scale. After introducing ESQL and its architecture, the chapters explore real-world applications across various domains, including analytics, raw log analysis, observability, and cybersecurity. Advanced topics such as scaling, optimization, and future developments are also covered to help you maximize your ESQL capabilities. By the end of this book, you’ll be able to leverage ESQL for comprehensive data management and analysis, optimizing your workflows and enhancing your productivity with Elasticsearch.

What you will learn
  • Gain a solid understanding of ESQL and its architecture
  • Use ESQL for data analysis and performance monitoring
  • Apply ESQL in cybersecurity for threat detection and incident response
  • Find out how to perform advanced searches using ESQL
  • Prepare for future ESQL developments
  • Showcase ESQL in action through real-world, persona-driven use cases

Who this book is for
If you’re an Elasticsearch user, this book is essential for your growth. Whether you’re a data analyst looking to build analytics on top of Elasticsearch, an SRE monitoring the health of your IT system, or a cybersecurity analyst, this book will give you a complete understanding of how ESQL is built and used. Additionally, database administrators, business intelligence professionals, and operational intelligence professionals will find this book invaluable. Even with a beginner-level knowledge of Elasticsearch, you’ll be able to get started and make the most of this comprehensive guide.

AWS re:Invent 2025 - SageMaker HyperPod: Checkpointless & elastic training for AI models (AIM3338)

Transform your generative AI model development with checkpointless and elastic training on Amazon SageMaker HyperPod. Learn how checkpointless training eliminates costly downtime by automatically recovering from infrastructure faults in minutes instead of hours, using peer-to-peer state transfer without relying on restarting from checkpoints. Discover how elastic training can dynamically expand to claim idle accelerators or gracefully contract when higher-priority tasks need capacity, all without manual intervention. See how these innovations help you maintain forward training momentum despite infrastructure faults or fluctuations in resource availability, helping you scale and accelerate generative AI model development across hundreds to thousands of AI accelerators.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Balance cost, performance & reliability for AI at enterprise scale (AIM3304)

Deploying generative AI at enterprise scale requires balancing performance, cost, and reliability across diverse business purposes and use cases. Amazon Bedrock offers a complete portfolio of inference options, with on-demand cross-region inference for elastic scaling, on-demand service tiers for balancing performance and cost, including optimization options like prompt caching for improving latency while significantly reducing cost, and batch inference for cost-effective bulk processing. This interactive session covers the tools and approaches needed to architect hybrid inference strategies that enable enterprises to maximize price-performance ratios as AI workloads scale.


For Security Operations Center leaders, the daily reality is a battle against alert fatigue. Understand how you can build security operations to detect, investigate, and respond to threats at the speed and scale of cloud. Learn about AI-powered Attack Discovery, which automatically surfaces high-fidelity threats; Elastic Workflows, a native automation engine; and Elastic AI Agent Builder, which lets you investigate and monitor using out-of-the-box AI agents or build custom agents.

Multi-tenant apps need scalable platforms to meet growing workload and AI demands, but scaling PostgreSQL can be complex. Elastic Clusters in Azure Database for PostgreSQL offers a fix: simple horizontal scale out of PostgreSQL data with row and schema-based sharding built into the managed service. Join us to learn how this best-in-class capability is empowering ITOps, architects and developers to build and manage resilient, high-performing multi-tenant apps while optimizing operations and cost.

(A) Deploy AVS and manage the VMware vSphere environment as an Azure resource
(B) Connect the AVS private cloud securely with the Internet
(C) Migrate VMs from on-premises to AVS using VMware HCX technology
(D) Expand AVS private cloud storage with Azure NetApp Files or Azure Elastic SAN, scaling storage independently from compute
(E) Manage AVS workload VMs using Azure interfaces by Arc-enabling the AVS private cloud and its VMs
(F) Modernize workloads with Azure services like Azure SQL Managed Instance and AI capabilities

Note: this session will use demo environments instead of live environments due to complexity and time.

Please RSVP and arrive at least 5 minutes before the start time, at which point remaining spaces are open to standby attendees.

Discover how Elasticsearch’s powerful vector database capabilities and the robust Azure AI Foundry Agent Framework combine to power smarter agents. View how to synthesize information from diverse data sources and how to use A2A and MCP for orchestrating complex tasks. Learn how to design and apply intelligent agents for scalable information retrieval, task coordination, and integration across systems.


Abstract: We added on-the-fly gzip decompression to Elastic Filebeat and the Elastic Agent—our log collection tools—to enable the ingestion of gzip archives and rotated logs. A performance drop was expected, so we benchmarked the feature only to find that the performance didn't drop at all. This talk is the story of our hunt for a non-existent bottleneck and how a holistic view of application performance uncovered the surprising truth about where the real costs lie.
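As a rough illustration of what on-the-fly gzip decompression means for a log reader, the sketch below streams lines out of a compressed archive without first unpacking it to disk. It is not Filebeat's implementation; the function name `iter_log_lines` and the in-memory sample archive are assumptions made for the example.

```python
import gzip
import io


def iter_log_lines(stream):
    """Yield text lines from a gzip-compressed binary stream.

    Decompression happens incrementally while iterating, so the
    archive is never fully expanded in memory or on disk.
    """
    # gzip.open accepts an already-open binary file object as well as a path.
    with gzip.open(stream, mode="rt", encoding="utf-8") as text:
        for line in text:
            yield line.rstrip("\n")


# Hypothetical rotated log, built in memory to keep the example self-contained.
sample_archive = gzip.compress(b"line one\nline two\n")
lines = list(iter_log_lines(io.BytesIO(sample_archive)))
```

Because the generator pulls one line at a time, the consumer sets the pace of decompression, which is the property that matters when tailing large rotated logs.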

Summary In this episode of the AI Engineering Podcast Marc Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transforming database usage and infrastructure design. He discusses the evolving role of data in AI systems, from traditional models to more modern approaches like vectors, RAG, and relational databases. Marc explains why agents require serverless, elastic, and operationally simple databases, and how AWS solutions like Aurora and DSQL address these needs with features such as rapid provisioning, automated patching, geodistribution, and support for spiky usage. The conversation covers topics including tool calling, improved model capabilities, state in agents versus stateless LLM calls, and the role of Lambda and AgentCore for long-running, session-isolated agents. Marc also touches on the shift from local MCP tools to secure, remote endpoints, the rise of object storage as a durable backplane, and the need for better identity and authorization models. The episode highlights real-world patterns like agent-driven SQL fuzzing and plan analysis, while identifying gaps in simplifying data access, hardening ops for autonomous systems, and evolving serverless database ergonomics to keep pace with agentic development.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
  • Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
  • Your host is Tobias Macey and today I'm interviewing Marc Brooker about the impact of agentic workflows on database usage patterns and how they change the architectural requirements for databases.

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what the role of the database is in agentic workflows?
  • There are numerous types of databases, with relational being the most prevalent. How does the type and purpose of an agent inform the type of database that should be used?
  • Anecdotally I have heard about how agentic workloads have become the predominant "customers" of services like Neon and Fly.io. How would you characterize the different patterns of scale for agentic AI applications? (e.g. proliferation of agents, monolithic agents, multi-agent, etc.)
  • What are some of the most significant impacts on workload and access patterns for data storage and retrieval that agents introduce?
  • What are the categorical differences in that behavior as compared to programmatic/automated systems?
  • You have spent a substantial amount of time on Lambda at AWS. Given that LLMs are effectively stateless, how does the added ephemerality of serverless functions impact design and performance considerations around having to "re-hydrate" context when interacting with agents?
  • What are the most interesting, innovative, or unexpected ways that you have seen serverless and database systems used for agentic workloads?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on technologies that are supporting agentic applications?

Contact Info
  • Blog
  • LinkedIn

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
  • AWS Aurora DSQL
  • AWS Lambda
  • Three Tier Architecture
  • Vector Database
  • Graph Database
  • Relational Database
  • Vector Embedding
  • RAG == Retrieval Augmented Generation
  • AI Engineering Podcast Episode
  • GraphRAG
  • AI Engineering Podcast Episode
  • LLM Tool Calling
  • MCP == Model Context Protocol
  • A2A == Agent 2 Agent Protocol
  • AWS Bedrock AgentCore
  • Strands
  • LangChain
  • Kiro

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.

  • Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.
  • Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.
  • André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.
  • Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.
  • Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.
  • Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.

Together, their perspectives show a common thread: search is regaining center stage in AI—scaling, hybridization, multimodality, and domain-specific enrichment are shaping the next generation of retrieval systems.
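The hybrid search mentioned above (semantic plus keyword) needs some way to merge two ranked result lists into one. The speakers do not say which method they use; the sketch below uses reciprocal rank fusion, one widely used option, purely as an assumption. The document ids and the constant `k=60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Rank-based fusion avoids having to normalize keyword scores and vector similarities onto a common scale, which is the main reason it is a common default for hybrid retrieval.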

Kacper Łukawski
Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant’s support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps.
Connect: https://www.linkedin.com/in/kacperlukawski/

Manish Gill
Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager.
Connect: https://www.linkedin.com/in/manishgill/

André Charton
Head of Search at Kleinanzeigen, he discussed shaping the company’s search tech—moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering.
Connect: https://www.linkedin.com/in/andrecharton/

Filip Makraduli
Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters.
Connect: https://www.linkedin.com/in/filipmakraduli/

Brian Goldin
Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager’s 12 years of enterprise experience.
Connect: https://www.linkedin.com/in/brian-goldin-04170a1/

Atita Arora
Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both learning hub and professional reunion.
Connect: https://www.linkedin.com/in/atitaarora/

You see them everywhere: the jubilant success stories about AI. But is the path to AI really that flawless? Or do you quickly become the AI clown who mostly utters fine words? And does all of this really change the core of your business? At Springbok, we know that AI is more than glossy posts. It is toil, sweat, falling down, and getting back up. Yes, AI can change your core business, but only with guts and perseverance. Because behind every success story lies the raw reality of building hard.

Data Engineering for Cybersecurity

Security teams rely on telemetry: the continuous stream of logs, events, metrics, and signals that reveal what’s happening across systems, endpoints, and cloud services. But that data doesn’t organize itself. It has to be collected, normalized, enriched, and secured before it becomes useful. That’s where data engineering comes in. In this hands-on guide, cybersecurity engineer James Bonifield teaches you how to design and build scalable, secure data pipelines using free, open source tools such as Filebeat, Logstash, Redis, Kafka, Elasticsearch, and more. You’ll learn how to collect telemetry from Windows (including Sysmon and PowerShell events), Linux files and syslog, and streaming data from network and security appliances. You’ll then transform it into structured formats, secure it in transit, and automate your deployments using Ansible.

You’ll also learn how to:
  • Encrypt and secure data in transit using TLS and SSH
  • Centrally manage code and configuration files using Git
  • Transform messy logs into structured events
  • Enrich data with threat intelligence using Redis and Memcached
  • Stream and centralize data at scale with Kafka
  • Automate with Ansible for repeatable deployments

Whether you’re building a pipeline on a tight budget or deploying an enterprise-scale system, this book shows you how to centralize your security data, support real-time detection, and lay the groundwork for incident response and long-term forensics.
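To make "transform messy logs into structured events" concrete, here is a minimal sketch that parses a syslog-style line into a dictionary. The book works with tools such as Logstash; this plain-Python regex is only a stand-in, and the pattern, field names, and sample line are assumptions for illustration.

```python
import re

# Assumed layout: "<Mon DD HH:MM:SS> <host> <program>[<pid>]: <message>"
SYSLOG_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Turn one raw syslog-style line into a structured event dict."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        # Keep unparseable lines instead of dropping them silently.
        return {"message": line, "parse_error": True}
    event = match.groupdict()
    if event["pid"] is not None:
        event["pid"] = int(event["pid"])
    return event


# Hypothetical sample line.
raw = "Jan 12 10:15:32 web01 sshd[1234]: Failed password for root from 10.0.0.5 port 22 ssh2"
event = parse_syslog_line(raw)
```

Flagging lines that fail to parse, rather than discarding them, keeps the pipeline lossless so that malformed or unexpected telemetry can still be reviewed later.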

High energy particle (HEP) physics research is going through fundamental changes as we move to collect larger amounts of data from the Large Hadron Collider (LHC). Analysis facilities and distributed computing, through HTCs, have come together to create the next pythonic generation of analysis by utilizing htcdaskgateway, a Dask gateway extension, allowing users to spawn workers compatible with both their analysis and heterogeneous clusters in line with authentication requirements. This is enabling physicists to engage with scientific python in ways they had not before because of domain specific C++ tools. An example of htcdaskgateway’s use is Fermilab’s Elastic Analysis Facility.

In a world flooded with data, dashboards alone aren't enough—organisations need real-time answers that drive action. Auror, a leading retail crime intelligence platform, leverages Elastic’s AI-powered search to unify and analyse data at scale—accelerating investigations, enabling cross-organisational collaboration, and significantly reducing retail shrink. In this session, discover how search-native architecture empowers decision intelligence, operational resilience, and frontline impact—delivering measurable ROI and strategic business value.

Discover how Elastic Cloud Serverless and Google Vertex AI empower the creation of AI-driven search applications with effortless scalability. This session explores Elastic's intuitive serverless architecture and dynamic scaling, integrating with Google Vertex AI to create world class search experiences. Learn how this powerful partnership simplifies deployments and accelerates innovation for modern search, observability, and security workloads.

This Session is hosted by a Google Cloud Next Sponsor.