At Yahoo, we built a secure, scalable, and cost-efficient batch processing platform using Amazon MWAA to orchestrate Apache Flink jobs on EKS, managed by the Flink Kubernetes Operator. This setup enables dynamic job orchestration while meeting strict enterprise compliance standards. In this session, we’ll share how Airflow DAGs:
• Dynamically launch, monitor, and clean up isolated Flink clusters per batch job, improving resource efficiency.
• Securely fetch the EKS kubeconfig, submit FlinkDeployment CRDs using FlinkKubernetesOperator, and poll job status using Airflow sensors (a minimal sketch follows this abstract).
• Integrate IAM for access control and meet Yahoo’s security requirements, including mutual TLS (mTLS) with Athenz.
• Optimize for cost and resilience through automated cleanup of jobs and the operator, and handle job failures and retries.
Join us for practical strategies and lessons from Yahoo’s production-scale Flink workflows in a Kubernetes environment.
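Below is a minimal, hedged sketch (not Yahoo's actual DAG) of the submit-and-poll pattern described above, assuming the apache-airflow-providers-apache-flink package and a pre-configured Kubernetes connection whose kubeconfig has already been fetched for the EKS cluster; the manifest path, job name, namespace, and connection ID are placeholders.

```python
# Sketch: one Airflow DAG that submits a FlinkDeployment CRD to EKS and waits
# for the job to finish. The Flink Kubernetes Operator running on the cluster
# turns the CRD into an isolated per-job Flink cluster.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.flink.operators.flink_kubernetes import FlinkKubernetesOperator
from airflow.providers.apache.flink.sensors.flink_kubernetes import FlinkKubernetesSensor

with DAG(
    dag_id="flink_batch_job",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit the FlinkDeployment custom resource.
    submit = FlinkKubernetesOperator(
        task_id="submit_flink_job",
        application_file="flink_deployment.yaml",  # hypothetical manifest path
        namespace="flink-jobs",
        kubernetes_conn_id="eks_default",
    )

    # Poll the custom resource until the job reaches a terminal state;
    # a failed job fails the Airflow task and triggers retries/cleanup.
    wait = FlinkKubernetesSensor(
        task_id="wait_for_flink_job",
        application_name="my-batch-job",  # must match metadata.name in the CRD
        namespace="flink-jobs",
        kubernetes_conn_id="eks_default",
    )

    submit >> wait
```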
talk-data.com · Topic: Cyber Security (2078 tagged)
This talk will explore the key changes introduced by AIP-81, focusing on security enhancements and user experience improvements across the entire software development lifecycle. We will break down the technical advancements from both a security and usability perspective, addressing key questions for Apache Airflow users of all levels. Topics include, but are not limited to: isolating CLI communication to enhance security by leveraging Role-Based Access Control (RBAC) within the API for secure database interactions, clearly defining local vs. remote command execution, and future improvements.
Airflow v2 architecture has strong coupling between the Airflow core and the user code running in an Airflow task. This poses barriers in security, maintenance, and adoption. One such threat is that user code can access Airflow’s source of truth - the metadata DB - and run any query against it! From a scalability angle, ‘n’ tasks create ‘n’ DB connections, limiting Airflow’s ability to scale effectively. To address this we proposed AIP-72 – a client-server model for task execution. The new architecture addresses several long-standing issues, including DB isolation from workers, dependency conflicts between Airflow core and workers, and ‘n’ number of DB connections. The new architecture has two parts:
• Execution API Server: tasks no longer have direct DB access; they use this new slim, secure API.
• Task SDK: a lightweight toolkit that lets you write tasks without drowning in Airflow’s codebase (a short sketch follows this abstract).
Beyond isolation and security, the redesign unlocks native multi-language task authoring support and secure Remote Execution. Join us to explore how AIP-72 transforms Airflow task execution, paving the way for more secure, flexible, and future-proof task orchestration!
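As a rough illustration of the authoring model, here is a minimal sketch assuming Airflow 3's Task SDK (the airflow.sdk package); the DAG and task names are made up. The point is simply that task code imports only the lightweight SDK and never touches the metadata DB directly: runtime details flow through the Execution API instead.

```python
# Sketch: authoring a DAG with the Task SDK rather than Airflow internals.
from airflow.sdk import dag, task


@dag(schedule=None)
def isolated_etl():
    @task
    def extract() -> list[int]:
        # Plain Python; no DB session or Airflow core imports in task code.
        return [1, 2, 3]

    @task
    def load(rows: list[int]) -> None:
        print(f"loaded {len(rows)} rows")

    load(extract())


isolated_etl()
```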
Jonah will walk through his experience helping migrate a legacy enterprise system from stored procedures to a microservices architecture. Early in the project, they introduced SonarQube to improve code quality and surface security concerns during the POC phase, leading them to reevaluate some of their initial architectural choices. Jonah will share how they integrated SonarQube into their CI, how it shaped their dev process, and what he learned from digging through the codebase to clean up smells and reduce tech debt! Key Takeaways:
1. Practical lessons from cleaning up real-world code smells and vulnerabilities manually as part of a growing team.
2. How to integrate SonarQube into a CI/CD pipeline for a microservices project, and what impact it actually has on code quality.
3. How SonarQube benefits POC work as a design feedback tool, not just a cleanup tool.
Unlock the world of data science—no coding required. Curious about data science but not sure where to start? This book is a beginner-friendly guide to what data science is and how people use it. It walks you through the essential topics—what data analysis involves, which skills are useful, and how terms like “data analytics” and “machine learning” connect—without getting too technical too fast.
Data science isn’t just about crunching numbers, pulling data from a database, or running fancy algorithms. It’s about asking the right questions, understanding the process from start to finish, and knowing what’s possible (and what’s not). This book teaches you all of that, while also introducing important topics like ethics, privacy, and security—because working with data means thinking about people, too. Whether you're a student exploring new skills, a professional navigating data-driven decisions, or someone considering a career change, this book is your friendly gateway into the world of data science, one of today’s most exciting fields. No coding or programming experience? No problem. You'll build a solid foundation and gain the confidence to engage with data science concepts—just as AI and data become increasingly central to everyday life.
What You Will Learn
• Grasp foundational statistics and how it matters in data analysis and data science
• Understand the data science project life cycle and how to manage a data science project
• Examine the ethics of working with data and its use in data analysis and data science
• Understand the foundations of data security and privacy
• Collect, store, prepare, visualize, and present data
• Identify the many types of machine learning and know how to gauge performance
• Prepare for and find a career in data science
Who This Book Is For
A wide range of readers who are curious about data science and eager to build a strong foundation. Perfect for undergraduates in the early semesters of their data science degrees, as it assumes no prior programming or industry experience. Professionals will find particular value in the real-world insights shared through practitioner interviews. Business leaders can use it to better understand what data science can do for them and how their teams are applying it. And for career changers, this book offers a welcoming entry point into the field—helping them explore the landscape before committing to more intensive learning paths like degrees or boot camps.
Remote presentation by a Datadog engineer based in the USA.
Recognize and avoid these common PostgreSQL mistakes! The best mistakes to learn from are ones made by other people! In PostgreSQL Mistakes and How To Avoid Them you’ll explore dozens of common PostgreSQL errors so you can easily avoid them in your own projects, learning proactively why certain approaches fail and others succeed. In PostgreSQL Mistakes and How To Avoid Them you’ll learn how to:
• Avoid configuration and operation issues
• Maximize PostgreSQL utility and performance
• Fix bad SQL practices
• Solve common security and administration issues
• Ensure smooth migration and upgrades
• Diagnose and fix a bad database
As PostgreSQL continues its rise as a leading open source database, mastering its intricacies is crucial. PostgreSQL Mistakes and How To Avoid Them is full of tested best practices to ensure top performance, and future-proof your database systems for seamless change and growth. Each of the mistakes is carefully described and accompanied by a demo, along with an explanation that expands your knowledge of PostgreSQL internals and helps you to build a stronger mental model of how the database engine works.
About the Technology
Fixing mistakes in PostgreSQL databases can be time-consuming and risky—especially when you’re making live changes to an in-use system. Fortunately, you can learn from the mistakes other Postgres pros have already made! This incredibly practical book lays out how to find and avoid the most common, dangerous, and sneaky errors you’ll encounter using PostgreSQL.
About the Book
PostgreSQL Mistakes and How To Avoid Them identifies Postgres problems in key areas like data types, features, security, and high availability. For each mistake you’ll find a real-world narrative that illustrates the pattern and provides concrete recommendations for improvement. You’ll especially appreciate the illustrative code snippets, schema samples, mind maps, and tables that show the pros and cons of different approaches.
What's Inside
• Diagnose configuration and operation issues
• Fix bad SQL code
• Address security and administration issues
• Ensure smooth migration and upgrades
About the Reader
For PostgreSQL database administrators and application developers.
About the Author
Jimmy Angelakos is a systems and database architect and PostgreSQL Contributor. He works as a Senior Principal Engineer at Deriv.
Quotes
“I’ve run into many of these mistakes. Read up to get prepared!” - Milorad Imbra, FEVO
“Navigates PostgreSQL pitfalls with clarity. I highly recommend it.” - Manohar Sai Jasti, Workday
“A straightforward style and real-world examples make it an essential read.” - Potito Coluccelli, Econocom Italia
“Provides valuable tips to avoid common PostgreSQL pitfalls.” - Fernando Bugni, Grupo QuintoAndar
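As one illustrative example (not taken from the book) of the kind of bad SQL practice and security issue in the categories it covers, the snippet below contrasts string-built queries with parameterized queries using psycopg2; the DSN, table, and column names are placeholders.

```python
# Sketch: avoid building SQL with string interpolation; bind parameters instead.
import psycopg2


def find_user(conn, email: str):
    with conn.cursor() as cur:
        # Mistake: interpolating user input into SQL invites injection.
        # cur.execute(f"SELECT id, name FROM users WHERE email = '{email}'")

        # Fix: let the driver pass the value as a bound parameter.
        cur.execute("SELECT id, name FROM users WHERE email = %s", (email,))
        return cur.fetchone()


if __name__ == "__main__":
    conn = psycopg2.connect("dbname=app user=app")  # placeholder DSN
    print(find_user(conn, "alice@example.com"))
```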
Tired of one-size-fits-all security? In this advanced session, we'll venture past traditional role-based access controls and discover powerful techniques to protect your sensitive data with surgical precision. We will explore how you can implement access policies for documents and fields (DLS/FLS), anonymize fields on the fly (goodbye exposed PII!), and blend role- and attribute-based approaches for dramatically simpler role definitions.
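As a hedged illustration (not necessarily what the session will show), the sketch below defines a role with document-level security, field-level security, and field masking through the OpenSearch Security plugin's REST API; the index pattern, field names, endpoint URL, and credentials are placeholders.

```python
# Sketch: a role that sees only its department's documents, only whitelisted
# fields, and a masked (anonymized) email field.
import json

import requests

role = {
    "cluster_permissions": ["cluster_composite_ops_ro"],
    "index_permissions": [
        {
            "index_patterns": ["orders-*"],
            "allowed_actions": ["read"],
            # DLS: restrict visible documents with a query (passed as a string).
            "dls": json.dumps({"term": {"department": "emea"}}),
            # FLS: expose only these fields to the role.
            "fls": ["order_id", "amount", "department", "customer_email"],
            # Mask PII on the fly.
            "masked_fields": ["customer_email"],
        }
    ],
}

resp = requests.put(
    "https://localhost:9200/_plugins/_security/api/roles/emea_analyst",
    json=role,
    auth=("admin", "admin"),  # placeholder credentials
    verify=False,             # demo only; verify TLS in real deployments
)
resp.raise_for_status()
```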
One of the really cool aspects of OpenSearch is its ability to serve as a monitoring tool for itself. Let's have a look at how to piece together a monitoring solution for OpenSearch using components from the OpenSearch ecosystem such as Data Prepper and OpenSearch Alerting. We will focus on security aspects like authentication and authorization.
Any software dealing with confidential data needs a security solution providing authentication and authorization. And any such security solution should have proper monitoring. OpenSearch is no exception in this regard.
In this presentation, we will look at how to use components like Audit Logs, Data Prepper, and OpenSearch Alerting to create a small but effective security monitoring solution.
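As a hedged illustration of the kind of signal such a solution watches, the sketch below counts recent failed logins in the security plugin's audit log index; the index pattern and field names assume the plugin's default audit log settings, and the URL and credentials are placeholders. In practice you would wire an equivalent query into an OpenSearch Alerting monitor to get notified automatically.

```python
# Sketch: count FAILED_LOGIN audit events from the last hour.
import requests

query = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"term": {"audit_category": "FAILED_LOGIN"}},
                {"range": {"@timestamp": {"gte": "now-1h"}}},
            ]
        }
    },
}

resp = requests.post(
    "https://localhost:9200/security-auditlog-*/_search",
    json=query,
    auth=("admin", "admin"),  # placeholder credentials
    verify=False,             # demo only
)
resp.raise_for_status()
failed_logins = resp.json()["hits"]["total"]["value"]
print(f"failed logins in the last hour: {failed_logins}")
```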
Supported by Our Partners
• Statsig — The unified platform for flags, analytics, experiments, and more.
• Graphite — The AI developer productivity platform.
• Augment Code — AI coding assistant that pro engineering teams love
GitHub recently turned 17 years old—but how did it start, how has it evolved, and what does the future look like as AI reshapes developer workflows? In this episode of The Pragmatic Engineer, I’m joined by Thomas Dohmke, CEO of GitHub. Thomas has been a GitHub user for 16 years and an employee for 7. We talk about GitHub’s early architecture, its remote-first operating model, and how the company is navigating AI—from Copilot to agents. We also discuss why GitHub hires junior engineers, how the company handled product-market fit early on, and why being a beloved tool can make shipping harder at times. Other topics we discuss include:
• How GitHub’s architecture evolved beyond its original Rails monolith
• How GitHub runs as a remote-first company—and why they rarely use email
• GitHub’s rigorous approach to security
• Why GitHub hires junior engineers
• GitHub’s acquisition by Microsoft
• The launch of Copilot and how it’s reshaping software development
• Why GitHub sees AI agents as tools, not a replacement for engineers
• And much more!
Timestamps
(00:00) Intro
(02:25) GitHub’s modern tech stack
(08:11) From cloud-first to hybrid: How GitHub handles infrastructure
(13:08) How GitHub’s remote-first culture shapes its operations
(18:00) Former and current internal tools including Haystack
(21:12) GitHub’s approach to security
(24:30) The current size of GitHub, including security and engineering teams
(25:03) GitHub’s intern program, and why they are hiring junior engineers
(28:27) Why AI isn’t a replacement for junior engineers
(34:40) A mini-history of GitHub
(39:10) Why GitHub hit product market fit so quickly
(43:44) The invention of pull requests
(44:50) How GitHub enables offline work
(46:21) How monetization has changed at GitHub since the acquisition
(48:00) 2014 desktop application releases
(52:10) The Microsoft acquisition
(1:01:57) Behind the scenes of GitHub’s quiet period
(1:06:42) The release of Copilot and its impact
(1:14:14) Why GitHub decided to open-source Copilot extensions
(1:20:01) AI agents and the myth of disappearing engineering jobs
(1:26:36) Closing
The Pragmatic Engineer deepdives relevant for this episode:
• AI Engineering in the real world
• The AI Engineering stack
• How Linux is built with Greg Kroah-Hartman
• Stacked Diffs (and why you should know about them)
• 50 Years of Microsoft and developer tools
See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Data architects are increasingly tasked with provisioning quality unstructured data to support AI models. However, little has been done to manage unstructured data beyond data security and privacy requirements. This session will look at what it takes to improve the quality of unstructured data and the emerging best practices in this space.
Meridian Energy, New Zealand’s leader in 100% renewable generation, adopted Denodo as a unified semantic data layer to accelerate the delivery of diverse use cases across its lakehouse environment. From security risk modelling to incident management, ESG compliance and more, Denodo enables governed, real-time access to data without replication – reducing ETL overhead, empowering self-service, and ensuring consistent metrics. Business teams are continuing to explore and advance data-driven solutions, supporting Meridian’s shift to a governed lakehouse architecture.
This comprehensive guide equips you with the knowledge and confidence needed to prep for the exam and thrive as a Power Platform Solution Architect. The book starts with a foundation for successful solution architecture, emphasizing essential skills such as requirements gathering, governance, and security. You will learn to navigate customer discovery, translate business needs into technical requirements, and design solutions that address both functional and non-functional needs.
The second part of the book delves into the Microsoft Power Platform ecosystem, offering an in-depth look at its core components—Power Apps, Power Automate, Power BI, Microsoft Copilot, and Robotic Process Automation (RPA). Detailed insights into data modeling, security strategies, and AI integration will guide you in building scalable, secure solutions. Application life cycle management, which empowers solution architects to design, implement, and deploy Power Platform solutions effectively, is discussed next. You will then go through real-world scenarios, giving you a practical understanding of the challenges and considerations in managing Power Platform projects within a business context. The book concludes with strategies for continuous learning and resources for professional development, including practice questions to assess knowledge and readiness for the PL-600 exam. After reading the book, you will be ready to take the exam and become a successful Power Platform Solution Architect.
What You Will Learn
• Understand the Solution Architect's role, responsibilities, and strategic approaches to successfully navigate projects
• Master the basics of Power Platform Solution Architecture
• Understand governance, security, and integration concepts in real-world scenarios
• Design and deploy effective business solutions using Power Platform components
• Gain the skills necessary to prep for the PL-600 certification exam
Who This Book Is For
Professionals pursuing Microsoft PL-600 Solution Architect certification and IT consultants and developers transitioning to solution architect roles
Join award-winning broadcaster and national comedy champion Adam Spencer to learn how technologies like AI, cyber security, and ChatGPT are disrupting business and how you can win in this world. In this thought-provoking and funny presentation, Adam will share:
• How Artificial Intelligence will impact every industry and how to harness its potential
• The crucial role every worker plays in your cyber security
• The business potential of holding a supercomputer in your hands
• How to keep up if the pace of digital disruption feels overwhelming
Join this session to learn how Coinbase builds end-to-end ML workflows on top of Snowflake’s platform for optimal data security, governance and price performance. Using features such as Snowflake Feature Store and Snowflake Model Registry, Coinbase now automates batch and online inference on predictive ML models to quickly and accurately unban users who were initially incorrectly flagged as suspected fraud or bots, resulting in an improved user experience and increased revenue.
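As a rough sketch of the workflow described (not Coinbase's code), the example below logs a model to the Snowflake Model Registry and runs batch inference with Snowpark, assuming the snowflake-ml-python package; the connection parameters, table names, and toy model are placeholders.

```python
# Sketch: log a model to the Snowflake Model Registry, then score a governed
# table of flagged accounts in batch without data leaving Snowflake.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from snowflake.ml.registry import Registry
from snowflake.snowpark import Session

connection_parameters = {
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "ML",
    "schema": "REGISTRY",
}
session = Session.builder.configs(connection_parameters).create()

# Placeholder "unban" classifier trained on two toy features.
train = pd.DataFrame({"TX_VELOCITY": [0.1, 0.9], "ACCOUNT_AGE_DAYS": [400.0, 3.0]})
model = LogisticRegression().fit(train, [0, 1])

# Log the model; access control and lineage stay inside Snowflake's governance.
reg = Registry(session=session, database_name="ML", schema_name="REGISTRY")
mv = reg.log_model(
    model,
    model_name="unban_classifier",
    version_name="v1",
    sample_input_data=train,  # lets the registry infer the model signature
)

# Batch inference directly over a Snowflake table.
features = session.table("FLAGGED_USERS_FEATURES")
predictions = mv.run(features, function_name="predict")
predictions.write.save_as_table("UNBAN_PREDICTIONS", mode="overwrite")
```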
Learn how to streamline governance and security for data and AI with Snowflake's latest updates to Horizon Catalog. Join us for new product overviews and live demos covering Trust Center, sensitive data, data quality, lineage and more.
Embark on a transformative AI journey with our session focused on deploying AI agents that deliver immediate ROI while ensuring robust data security. We’ll delve into advanced AI orchestration techniques that not only enhance system efficiency but also improve employee productivity. By incorporating TRiSM principles, you’ll learn how to develop AI applications that are both trustworthy and risk-managed. Whether you are just beginning your AI journey or seeking to expand your existing framework, this session offers practical insights to transform AI potential into meaningful business outcomes.
Struggling to balance AI-driven innovation with security and compliance? Organisations are racing to leverage AI, advanced analytics, and the cloud for a competitive edge—but growing regulations and data protection requirements create significant challenges.
This session will provide practical strategies to maximise AI potential while safeguarding sensitive data, ensuring compliance, and mitigating risk.
Learn how to harness innovation without compromise and position your organisation for success in a rapidly evolving digital landscape.
Generative AI is revolutionizing businesses, but the secret behind its success is your data – sensitive, unstructured, and exposed. Companies are racing to deploy AI yet often overlook the risks of data compromise, exfiltration, and compliance violations. As organizations adopt new data tools, they must rethink their approach to data security or risk disastrous consequences.
If you’re struggling with balancing innovation and security, this session will give you the blueprint to securely scale data & AI applications.