In this presentation, we discuss how we built a fully managed workflow orchestration system at Salesforce using Apache Airflow to support dependable data lake infrastructure on the public cloud. We touch upon how we used Kubernetes for increased scalability and resilience, as well as the most effective approaches for managing and scaling data pipelines. We will also talk about how we addressed data security and privacy, multitenancy, and interoperability with other internal systems. We discuss how this system lets users build reliable pipelines that incorporate failure detection, alerting, and monitoring for deep insights, removing the undifferentiated heavy lifting of running and managing their own orchestration engines. Lastly, we elaborate on how we integrated our in-house CI/CD pipelines to enable effective DAG and dependency management, further enhancing the system’s capabilities.
talk-data.com
Topic: Cyber Security (2078 tagged)
Airflow’s KubernetesExecutor has supported multi_namespace_mode for a long time. This feature allows Airflow jobs to run in different namespaces on the same Kubernetes cluster for better isolation and easier management. However, it requires a cluster role for the Airflow scheduler, which can create security problems or be a blocker for some users. PR https://github.com/apache/airflow/pull/28047, which will become available in Airflow 2.6.0, resolves this issue by allowing Airflow users to specify multi_namespace_mode_namespace_list when using multi_namespace_mode, so that no cluster role is needed and users only need to ensure the scheduler has permissions in certain namespaces rather than in all namespaces on the Kubernetes cluster. This talk aims to help you better understand KubernetesExecutor and how to set it up in a more secure manner.
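As a rough illustration of the setup described above, the two options can be supplied as environment variables using Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention. The sketch below is not taken from the talk: it assumes the options live in the `[kubernetes_executor]` config section and that the namespace list is comma-separated, and the helper name `multi_namespace_env` and the namespaces `team-a` and `team-b` are hypothetical.

```python
"""Sketch: render KubernetesExecutor multi-namespace settings as Airflow env vars."""


def multi_namespace_env(namespaces):
    """Return env vars enabling multi_namespace_mode for an explicit namespace list.

    Assumption: the options sit in the [kubernetes_executor] section and the
    list is comma-separated. Verify both against your Airflow version.
    """
    cleaned = [ns.strip() for ns in namespaces if ns.strip()]
    if not cleaned:
        raise ValueError("at least one namespace is required")
    # Drop duplicates while keeping the order the caller supplied.
    unique = list(dict.fromkeys(cleaned))
    prefix = "AIRFLOW__KUBERNETES_EXECUTOR__"
    return {
        prefix + "MULTI_NAMESPACE_MODE": "True",
        prefix + "MULTI_NAMESPACE_MODE_NAMESPACE_LIST": ",".join(unique),
    }


if __name__ == "__main__":
    # Hypothetical namespaces, printed in KEY=value form for a scheduler deployment.
    for key, value in multi_namespace_env(["team-a", "team-b"]).items():
        print(f"{key}={value}")
```

With a list like this in place, the scheduler's service account would need a namespaced Role and RoleBinding in each listed namespace instead of a ClusterRole.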
Kiwi.com started using Airflow in June 2016 as an orchestrator for several people in the company. As demand grew, the monolithic instance came to serve 30+ teams with 500+ active DAGs, successfully finishing 3.5 million tasks per month. We started with a monolithic Airflow environment, but our needs quickly changed as we wanted to support a data mesh architecture within Kiwi.com. By leveraging Astronomer on GCP, we were able to move from a monolithic Airflow environment to many smaller instances of Airflow. This talk will go into how to handle things like DAG dependencies, observability, and stakeholder management. Furthermore, we’ll talk about security, particularly how GCP’s workload identity helped us achieve a passwordless Airflow experience.
We’ve heard a lot in the last few years about insecurity in the open source software ecosystem, whether it be vulnerabilities, supply chain attacks or malware. Has open source become suddenly fraught with security problems? Or is it maybe, possibly… actually doing great? Let’s delve into the collaborative nature of our open-source ecosystems, and explore how transparency, peer review, and community have created a robust security posture. We’ll examine real-world examples, dispel myths, and reveal the inherent strengths of open source in fostering a secure and resilient software ecosystem.
Data platform teams often find themselves having to provide Airflow as a service to downstream teams, as more users and use cases in their organization require an orchestrator. In these situations, giving each team its own Airflow environment can unlock velocity and actually be lower overhead to maintain than a monolithic environment. This talk covers things to keep in mind when building an Airflow service that supports several environments, user personas, and use cases. Namely, we’ll discuss principles for balancing centralized control over the data platform with decentralized teams using Airflow in the way they need. This will include observability, developer productivity, security, and infrastructure. We’ll also talk about day 2 concerns around overhead, infrastructure maintenance, and other tradeoffs to consider.
This book emphasizes understanding the motivation behind advanced circuit design in order to establish the AI interface and better mitigate security attacks for big data. It is for students, researchers, professionals, faculty members, and software developers who wish to carry out further research.
In "Building a Next-Gen SOC with IBM QRadar", you'll learn how to utilize IBM QRadar to create an efficient Security Operations Center (SOC). The book covers deploying QRadar in various environments, understanding its architecture, and leveraging its powerful features to detect and respond to real-time threats with confidence, ultimately enabling advanced security practices.
What this book will help me do:
Understand and deploy IBM QRadar in different environments, including on-premises and cloud.
Leverage QRadar's features to analyze network traffic, detect threats, and enhance security monitoring.
Effectively use QRadar rules and searches to identify, correlate, and respond to security events.
Integrate AI technologies with QRadar to automate and improve threat management processes.
Maintain, troubleshoot, and scale the QRadar environment to meet evolving security needs.
Author(s):
Ashish Kothekar is an experienced cybersecurity specialist with a deep understanding of IBM QRadar and SOC operations. He has dedicated his career to helping organizations implement effective security practices. Through his accessible writing and detailed examples, he aims to empower security professionals to maximize their use of QRadar.
Who is it for?
This book is perfect for SOC analysts, security engineers, and cybersecurity enthusiasts who want to enhance their security skills. Readers should have a basic knowledge of networking and cybersecurity principles. If you're looking to deepen your understanding of IBM QRadar and build a next-gen SOC, this book is for you.
Roya is a research scientist who is passionate about advancing artificial intelligence technologies. She is particularly interested in computer vision and pattern recognition and has developed machine learning solutions for various applications, including healthcare, assistive technology, and security. Roya is also an advocate for women's rights. She is a Google Women's Techmaker Ambassador…
Follow FC as he steals from the world’s most secure banks and government facilities without breaking a single law.
In How I Rob Banks: And Other Such Places, renowned ethical hacker and social engineer FC delivers a gripping and often hilarious discussion of his work: testing the limits of physical bank security by trying to “steal” money, data, and anything else he can get his hands on. In the book, you’ll explore the secretive world of physical assessments and follow FC as he breaks into banks and secure government locations to identify security flaws and loopholes. The author explains how banks and other secure facilities operate, both digitally and physically, and shows you the tools and techniques he uses to gain access to some of the world’s most locked-down buildings.
You’ll also find:
Strategies you can implement immediately to better secure your own company, home, and data against malicious actors
Detailed photos, maps, and drawings to bring to life the unbelievable true stories contained inside
An inside and candid look at a rarely examined industry through the eyes of one of its most respected penetration testers
A can’t-miss account of real-life security exploits perfect for infosec pros, including red and blue teamers, pentesters, CIOs, CISSPs, and social engineers, How I Rob Banks also belongs in the hands of anyone who loves a great Ocean’s 11-style story pulled straight from the real world.
Today, AI and machine/deep learning have become the hottest areas in information technology. This book aims to provide a complete picture of the challenges and solutions to the security issues in various applications. It explains how different attacks can occur in advanced AI tools and the challenges of overcoming those attacks.
Ben Harris, former Assistant Secretary for Economic Policy at the US Treasury, summarizes the latest proposal for raising the debt ceiling. An acceptable deal seems to be within reach but remains politically uncertain even as the x-date draws near. What are the alternatives if no deal is reached? And what could be the consequences for bondholders, Social Security recipients, and other stakeholders? Ben joins Mark and Cris in a round of the Inside Economics Statistics Game and provides his views on the future of retirement financing. Follow Mark Zandi @MarkZandi, Cris deRitis @MiddleWayEcon, and Marisa DiNatale on LinkedIn for additional insight.
Questions or Comments, please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.
Will AI replace data analysts? Is it going to happen? Has it already happened?
In this episode, I dig into whether I actually think that's going to be the case.
📊 Come to my next free “How to Land Your First Data Job” training
🏫 Check out my 10-week data analytics bootcamp
Timestamps:
(01:00) - Brace yourself: AI has ARRIVED! 🤖
(01:44) - Is your data analyst job in jeopardy? AI vs. Human Skills 💼
(02:54) - The art of data analysis using AI 🚀
(03:29) - AI is just a tool 🧮
(04:01) - AI is just like Google 🔓
(05:55) - Job Security? Get good at AI 🌟
(07:18) - Humans make decisions ✨
(08:00) - Join the Ultimate Data Bootcamp! 📚💪
Connect with Avery:
📺 Subscribe on YouTube
🎙Listen to My Podcast
👔 Connect with me on LinkedIn
🎵 TikTok
Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!
To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more
If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.
👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa
This textbook surveys new and emergent methods for doing research in critical security studies, filling a gap in the literature. The 2nd edition has been revised and updated.
ABOUT THE TALK: It's been a decade since the Harvard Business Review (HBR) article "Data Scientist: The Sexiest Job of the 21st Century." It's one of the 100 most downloaded articles in the history of HBR and shows how far we've come in a decade. From building companies to the White House, to leading the COVID response, DJ Patil shares key lessons he wishes he knew a decade ago.
ABOUT THE SPEAKER: DJ Patil is an entrepreneur, investor, scientist, and leader in public policy. He is the former U.S. Chief Data Scientist. He has held senior roles in industry, academia, and government, and his work has been featured in two Michael Lewis books (The Fifth Risk and The Premonition). As a General Partner at GreatPoint Ventures, he focuses on building companies in healthcare, enterprise technologies, and national security.
ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.
Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data related topics including data infrastructure, data engineering, ML systems, analytics and AI from top startups and tech companies.
FOLLOW DATA COUNCIL: Twitter: https://twitter.com/DataCouncilAI LinkedIn: https://www.linkedin.com/company/datacouncil-ai/
ABOUT THE TALK: One in five kids has a mental or behavioral disorder, but only 15% have access to care, and the current supply of trained therapists barely covers that demand. Happypillar is a digital therapeutic app that provides evidence-proven behavioral intervention to all at scale. Learn how we combine ML, ASR, NLP, and other technologies with the expertise of our founding clinical play therapist to offer accurate and real-time personalized feedback, all with compliant security processes and the strictest privacy controls.
ABOUT THE SPEAKER: Mady Mantha is a Product and ML Engineering Leader and the Co-Founder & CTO at Happypillar. As a Director of Conversational AI at Sirius, Mady led the team that built Walmart’s conversational AI.
ABOUT THE TALK: Data exchange is vital for business partnerships, but current practices are manual, prone to leaks, and hard to validate, monitor, and audit.
Tune in to this talk for an overview of data sharing methods and how they compare on security, simplicity, and speed. Discover best practices and solutions to overcome challenges.
ABOUT THE SPEAKER: Pardis Noorzad is CEO at General Folders. She led a data team at Twitter, covering a variety of consumer products. Pardis has also built products in growth stage fintech and digital health and early stage AI platform companies.
This episode was recorded during the Miller Center’s 2023 William and Carol Stevenson Conference, U.S.-China Tech Competition: Has Democracy Met Its Match?
For more info on this conference, as well as to watch the video versions, follow this link: https://millercenter.org/news-events/events/us-china-tech-competition-has-democracy-met-its-match
This episode features the first panel discussion from the conference entitled:
Apps, platforms, and surveillance
How might apps and other technology platforms play a role in Chinese government data-gathering efforts? What are potential policy responses to the increasingly complex data flows between the United States and China? This panel addresses the long-term stability of U.S. technology infrastructure and related concerns for U.S. national security.
Josh Chin, Kara Frederick, Shanthi Kalathil, Aynne Kokas (moderator)
In this episode, we're thrilled to have Dr. Aynne Kokas, a C.K. Yen Professor at the Miller Center and an associate professor of media studies at the University of Virginia. Kokas’ research examines Sino-U.S. media and technology relations. Dr. Kokas is also the author of the critically acclaimed book "Trafficking Data: How China Is Winning the Battle for Digital Sovereignty," which we will be referring to frequently throughout this conversation. We will also touch on a few topics that were discussed in her recent conference at the Miller Center titled "U.S.-China Tech Competition: Has Democracy Met Its Match?"
During the event, Dr. Kokas and other experts discussed a variety of issues related to the ongoing tech competition between the US and China. For example, they explored the ways in which apps and other technology platforms may be used by the Chinese government for data-gathering purposes, and examined potential policy responses to the increasingly complex data flows between the two countries. They also discussed the long-term stability of US technology infrastructure and its implications for national security. Other panels covered the digital economy, climate, tech infrastructure, and political influence between China and the US.
In this episode, we'll be discussing data policy for US-China technology, a topic that has become increasingly relevant in recent years as the two countries continue to compete for dominance in the tech industry. We'll delve into the differences in approach to data policy between China and the United States, the implications of these differences, and how China's Digital Silk Road initiative is expanding its influence over the global digital economy.
We'll also discuss the challenges of balancing economic benefits against concerns about national security and human rights, and the future of the technology industry in light of these trends.
Links:
U.S.–China tech competition: Has democracy met its match?
Aynne Kokas website: https://www.aynnekokas.com/
Trafficking Data: How China Is Winning the Battle for Digital Sovereignty
Prepare for Microsoft Exam PL-900. Demonstrate your real-world knowledge of the fundamentals of Microsoft Power Platform, including its business value, core components, and the capabilities and advantages of Power BI, Power Apps, Power Automate, and Power Virtual Agents. Designed for business users, functional consultants, and other professionals, this Exam Ref focuses on the critical thinking and decision-making acumen needed for success at the Microsoft Certified: Power Platform Fundamentals level.
Focus on the expertise measured by these objectives:
Describe the business value of Power Platform
Identify the core components of Power Platform
Demonstrate the capabilities of Power BI
Demonstrate the capabilities of Power Apps
Demonstrate the capabilities of Power Automate
Demonstrate the capabilities of Power Virtual Agents
This Microsoft Exam Ref:
Organizes its coverage by exam objectives
Features strategic, what-if scenarios to challenge you
Assumes you are a business user, functional consultant, or other professional who wants to improve productivity by automating business processes, analyzing data, creating simple app experiences, or developing business enhancements to Microsoft cloud solutions.
About the Exam
Exam PL-900 focuses on knowledge needed to describe the value of Power Platform services and of extending solutions; describe Power Platform administration and security; describe Common Data Service, Connectors, and AI Builder; identify common Power BI components; connect to and consume data; build basic dashboards with Power BI; identify common Power Apps components; build basic canvas and model-driven apps; describe Power Apps portals; identify common Power Automate components; build basic flows; describe Power Virtual Agents capabilities; and build and publish basic chatbots.
About Microsoft Certification
Passing this exam fulfills your requirements for the Microsoft Certified: Power Platform Fundamentals certification, demonstrating your understanding of Power Platform's core capabilities, from business value and core product capabilities to building simple apps, connecting data sources, automating basic business processes, creating dashboards, and creating chatbots. With this certification, you can move on to earn specialist certifications covering more advanced aspects of Power Apps and Power BI, including Microsoft Certified: Power Platform App Maker Associate and Power Platform Data Analyst Associate. See full details at: microsoft.com/learn
This IBM® Redpaper Product Guide describes the IBM FlashSystem® 7300 solution, which is a next-generation IBM FlashSystem control enclosure. It combines the performance of flash and a Non-Volatile Memory Express (NVMe)-optimized architecture with the reliability and innovation of IBM FlashCore® technology and the rich feature set and high availability (HA) of IBM Spectrum® Virtualize. To take advantage of artificial intelligence (AI)-enhanced applications, real-time big data analytics, and cloud architectures that require higher levels of system performance and storage capacity, enterprises around the globe are rapidly moving to modernize established IT infrastructures. However, for many organizations, staff resources and expertise are limited, and cost-efficiency is a top priority. These organizations have important investments in existing infrastructure that they want to maximize. They need enterprise-grade solutions that optimize cost-efficiency while simplifying the pathway to modernization. IBM FlashSystem 7300 is designed specifically for these requirements and use cases. It also delivers cyber resilience without compromising application performance.
IBM FlashSystem 7300 provides a rich set of software-defined storage (SDS) features that are delivered by IBM Spectrum Virtualize, including the following examples:
Data reduction and deduplication
Dynamic tiering
Thin-provisioning
Snapshots
Cloning
Replication and data copy services
Cyber resilience
Transparent Cloud Tiering (TCT)
IBM HyperSwap® including 3-site replication for high availability
Scale-out and scale-up configurations further enhance capacity and throughput for better availability.
With the release of IBM Spectrum Virtualize V8.5, extra functions and features are available, including support for new third-generation IBM FlashCore Modules Non-Volatile Memory Express (NVMe) type drives within the control enclosure, and 100 Gbps Ethernet adapters that provide NVMe Remote Direct Memory Access (RDMA) options. New software features include GUI enhancements, security enhancements including multifactor authentication and single sign-on, and Fibre Channel (FC) portsets.