talk-data.com

Speaker: Parnab Basak
Solutions Architect, Amazon Web Services
3 talks


Talks & appearances


In the last few years, Large Language Models (LLMs) have risen to prominence as outstanding tools capable of transforming businesses. However, bringing such solutions and models into business-as-usual operations is not an easy task. In this session, we delve into the operationalization of generative AI applications using MLOps principles, leading to the introduction of foundation model operations (FMOps), or LLM operations, using Apache Airflow. We then zoom in on the expected people and process mindsets, new techniques for model selection and evaluation, data privacy, and model deployment. Additionally, learn how you can use the prescriptive features of Apache Airflow to aid your operational journey. Whether you are building with out-of-the-box models (open source or proprietary), creating new foundation models from scratch, or fine-tuning an existing model, the structured approaches described here help you effectively integrate LLMs into your operations, enhancing efficiency and productivity without causing disruptions in the cloud or on premises.
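The abstract does not include code, but a minimal sketch may help ground the idea of an FMOps pipeline expressed as an Airflow DAG. The task bodies below (fine_tune, evaluate, gate, deploy, reject) and the 0.9 quality threshold are illustrative placeholders, not part of the talk:

```python
# Hypothetical FMOps pipeline as an Airflow DAG: fine-tune a foundation
# model, evaluate the candidate, and promote it only if it clears a bar.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False, tags=["fmops"])
def fmops_pipeline():
    @task
    def fine_tune() -> str:
        # Kick off fine-tuning (e.g., a managed training job) and
        # return the resulting model artifact URI.
        return "s3://models/llm-candidate"  # placeholder URI

    @task
    def evaluate(model_uri: str) -> float:
        # Score the candidate model against a held-out evaluation set.
        return 0.92  # placeholder score

    @task.branch
    def gate(score: float) -> str:
        # Branch to deployment only when evaluation passes the threshold.
        return "deploy" if score >= 0.9 else "reject"

    @task
    def deploy():
        print("Promoting candidate model to the serving endpoint")

    @task
    def reject():
        print("Candidate rejected; keeping the current model")

    score = evaluate(fine_tune())
    gate(score) >> [deploy(), reject()]


fmops_pipeline()
```

The branch task is one example of the "prescriptive features" mentioned above: the deployment gate lives in the orchestrator itself rather than in ad hoc scripts.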

Today, all major cloud service providers and third-party providers include Apache Airflow as a managed service offering in their portfolios. While these cloud-based solutions help with the undifferentiated heavy lifting of environment management, some data teams also want to operate self-managed Airflow instances to satisfy specific differentiated capabilities. In this session, we will talk about:

- Why you might need to run self-managed Airflow
- The available deployment options, with emphasis on Airflow on Kubernetes (see the sketch after this list)
- How to deploy Airflow on Kubernetes using automation (Helm charts and Terraform)
- Developer experience (syncing DAGs using automation)
- Operator experience (observability)
- Owned responsibilities and trade-offs

A thorough understanding of these topics gives you an end-to-end perspective on operating a highly available and scalable self-managed Airflow environment to meet your ever-growing workflow needs.
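The Helm and Terraform installation steps are configuration rather than Python, but here is a complementary sketch of the developer-side experience on such a deployment: running a task in its own Kubernetes pod via the KubernetesPodOperator (requires the apache-airflow-providers-cncf-kubernetes package; the image and namespace below are assumed placeholders):

```python
# Illustrative DAG for an Airflow-on-Kubernetes deployment: each task
# executes in its own pod, isolating dependencies per task.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="k8s_pod_example",
    schedule=None,
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    transform = KubernetesPodOperator(
        task_id="transform",
        name="transform-pod",
        namespace="airflow",        # placeholder namespace
        image="python:3.11-slim",   # placeholder image
        cmds=["python", "-c"],
        arguments=["print('running inside a Kubernetes pod')"],
        get_logs=True,              # stream pod logs back to Airflow
    )
```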

Organizations need to effectively manage large volumes of complex, business-critical workloads across multiple applications and platforms. Choosing the right workflow orchestration tool is important, as it can help teams automate the configuration, coordination, integration, and data management processes across several applications and systems. There are currently many tools (both open source and proprietary) available for orchestrating tasks and data workflows with automation features. Each of them claims to deliver centralized, repeatable, reproducible, and efficient workflow coordination. Choosing among them is an arduous task, as it requires an in-depth understanding of how the capabilities these tools offer translate to your specific engineering needs. Apache Airflow is a powerful and widely used open-source workflow management system (WMS) designed to programmatically author, schedule, orchestrate, and monitor data pipelines and workflows. In this talk, understand how Apache Airflow compares with other popular orchestration tools in terms of architecture, scalability, management, observability, automation, native features, cost, available integrations, and more. Get a head-to-head comparison of what's possible as we dissect the capabilities of each tool against the others. This comparative analysis will help you in your decision-making process, whether you are planning to migrate an existing system or evaluating your first enterprise orchestration platform.
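For readers unfamiliar with Airflow, a minimal sketch of what "programmatically authoring" a scheduled workflow means in practice; the task names, schedule, and payload are illustrative only:

```python
# Minimal example of programmatic workflow authoring in Airflow:
# a daily two-step pipeline expressed as ordinary Python code.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def minimal_pipeline():
    @task
    def extract() -> list:
        return [1, 2, 3]  # placeholder payload

    @task
    def load(rows: list):
        print(f"loaded {len(rows)} rows")

    load(extract())


minimal_pipeline()
```

Because the pipeline is plain Python, it can be versioned, tested, and generated dynamically, which is one axis on which the talk compares Airflow against declarative or UI-driven orchestrators.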