The integration between dbt and Airflow is a popular topic in the community, both in previous editions of Airflow Summit, in Coalesce and the #airflow-dbt Slack channel. Astronomer Cosmos ( https://github.com/astronomer/astronomer-cosmos/ ) stands out as one of the libraries that strives to enhance this integration, having over 300k downloads per month. During its development, we’ve encountered various performance challenges in terms of scheduling and task execution. While we’ve managed to address some, others remain to be resolved. This talk describes how Cosmos works, the improvements made over the last 1.5 years, and the roadmap. It also aims to collect feedback from the community on how we can further improve the experience of running dbt in Airflow.
talk-data.com
Topic
GitHub
661
tagged
Activity Trend
Top Events
Jupyter Notebooks are widely used by data scientists and engineers to prototype and experiment with data. However these engineers are often required to work with other data or platform engineers to productionize these experiments due to the complexity in navigating infrastructure and systems. In this talk, we will deep dive into this PR https://github.com/apache/airflow/pull/34840 and share how airflow can be leveraged as a platform to execute notebook pipelines (python, scala or spark) in dynamic environments like Kubernetes for various heterogeneous use cases. We will demonstrate how data scientists can use a Jupyter extension to easily build and manage such pipelines which are executed using Airflow streamlining data science workflow development and supercharging productivity
In this episode, Conor and Bryce chat about how to implement the std::merge in parallel. Link to Episode 188 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-06-11 Date Released: 2024-06-28 C++ std::mergethrust::copy_ifthrust::permutation_iteratorIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
In this episode, Conor and Bryce chat about how to implement the Top N (std::nth_element) algorithm in parallel. Link to Episode 187 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-06-11 Date Released: 2024-06-21 Craft Conf 2024C++ std::nth_elementC++ std::priority_queueIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Send us a text Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don’t), where industry insights meet laid-back banter. Whether you’re a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let’s get into the heart of data, unplugged style! In this episode, join us along with guests Vitale and David as we explore: Euro 2024 Predictions with AI: Using Snowflake's machine learning models for data-driven predictions and sharing our own predictions. Can animals predict wins better than ML models?Tech in football: From VAR to connected ball technology, is it all a good idea?Nvidia overtaking Apple and Microsoft as the biggest tech corporation? Discussing Nvidia's leap to surpass Apple and Microsoft, and the implications for the GPU market and AI development.Unity Catalog vs. Polaris: Comparing Unity+Delta with Polaris+Iceberg and their roles in data cataloging and management. Explore the details on GitHub Unity Catalog, YouTube, and insights on LinkedIn. Databricks Data and AI Summit recap: Discussing the biggest announcements from the summit, including Mosaic AI integration, serverless options, and the open-source unity catalog.Exploring BM25: Discussing the BM25 algorithm and its advancements over traditional TF-IDF for document classification.
In this episode, Conor and Bryce chat about how to get started in programming. Link to Episode 186 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-06-07 & 2024-06-12 Date Released: 2024-06-14 Swift Programming LanguageBoost C++ LibrariesBoost SpiritNDC Oslo ConferenceCraft Conf 2024The Power of Function Composition - NDC Oslo - Conor HoekstraCityStrides.comcity-strides-hacking GitHub RepoHookStar Scrabble TrainerBeautiful Python Refactoring II - Conor Hoekstra - code::dive 2022 (Scrabble Talk)Intro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Turn the avalanche of raw data from Azure Data Explorer, Azure Monitor, Microsoft Sentinel, and other Microsoft data platforms into actionable intelligence with KQL (Kusto Query Language). Experts in information security and analysis guide you through what it takes to automate your approach to risk assessment and remediation, speeding up detection time while reducing manual work using KQL. This accessible and practical guidedesigned for a broad range of people with varying experience in KQLwill quickly make KQL second nature for information security. Solve real problems with Kusto Query Language and build your competitive advantage: Learn the fundamentals of KQLwhat it is and where it is used Examine the anatomy of a KQL query Understand why data summation and aggregation is important See examples of data summation, including count, countif, and dcount Learn the benefits of moving from raw data ingestion to a more automated approach for security operations Unlock how to write efficient and effective queries Work with advanced KQL operators, advanced data strings, and multivalued strings Explore KQL for day-to-day admin tasks, performance, and troubleshooting Use KQL across Azure, including app services and function apps Delve into defending and threat hunting using KQL Recognize indicators of compromise and anomaly detection Learn to access and contribute to hunting queries via GitHub and workbooks via Microsoft Entra ID
In this episode, Conor and Bryce catch up via phone tag and chat about an algorithm. Link to Episode 185 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-06-04 to 2024-06-06 Date Released: 2024-06-07 Craft Conf 2024ADSP Episode 157: The Roc Programming Language with Richard Feldman'Declarative Thinking, Declarative Practice' - Kevlin Henney [ ACCU 2016 ]Dave Thomas YOW! Vector Talk (can't find the link)C++Now 2019: Conor Hoekstra “Algorithm Intuition”Better Algorithm Intuition - Conor Hoekstra @code_report - Meeting C++ 2019C++ std::nth_elementC++ std::partial_sort_copythrust::partitionIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
In the fast-paced work environments we are used to, the ability to quickly find and understand data is essential. Data professionals can often spend more time searching for data than analyzing it, which can hinder business progress. Innovations like data catalogs and automated lineage systems are transforming data management, making it easier to ensure data quality, trust, and compliance. By creating a strong metadata foundation and integrating these tools into existing workflows, organizations can enhance decision-making and operational efficiency. But how did this all come to be, who is driving better access and collaboration through data? Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools and more. A pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally. Prukalpa previously co-founded SocialCops, world leading data for good company (New York Times Global Visionary, World Economic Forum Tech Pioneer). SocialCops is behind landmark data projects including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations. She was awarded Economic Times Emerging Entrepreneur for the Year, Forbes 30u30, Fortune 40u40, Top 10 CNBC Young Business Women 2016, and a TED Speaker. In the episode, Richie and Prukalpa explore challenges within data discoverability, the inception of Atlan, the importance of a data catalog, personalization in data catalogs, data lineage, building data lineage, implementing data governance, human collaboration in data governance, skills for effective data governance, product design for diverse audiences, regulatory compliance, the future of data management and much more. Links Mentioned in the Show: AtlanConnect with Prukalpa[Course] Artificial Intelligence (AI) StrategyRelated Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at SnowflakeSign up to RADAR: AI Edition New to DataCamp? Learn on the go using the DataCamp mobile app Empower your business with world-class data and AI skills with DataCamp for business
Overview slides on GitHub developer productivity, how AI benefits developers, and how those benefits apply to security, teeing up a demo on Learning, Remediation, and Prevention (10 minutes) followed by a 20-minute demo.
In this episode, Conor and Bryce chat with Doug Gregor from Apple about the Swift programming language! Link to Episode 184 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachAbout the Guest: Douglas Gregor is is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute. Show Notes Date Recorded: 2024-04-29 Date Released: 2024-05-31 Swift Programming LanguageSwift ActorsD Programming LanguageRust Programming LanguageFearless Concurrency? Understanding Concurrent Programming Safety in Real-World Rust SoftwareSwift Protocols2022 LLVM Dev Mtg: Implementing Language Support for ABI-Stable Software Evolution in Swift and LLVMOxide Episode - Discovering the XZ Backdoor with Andres FreundSwift Algorithms LibraryIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
In this episode, Conor and Bryce chat with Doug Gregor from Apple about the Swift programming language! Link to Episode 183 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachAbout the Guest: Douglas Gregor is is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute.
Show Notes
Date Recorded: 2024-04-29 Date Released: 2024-05-24 Swift Programming LanguageWWDC 2014 Swift AnnouncementSwift on LanguishIntro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
In this episode, Conor and Bryce chat with Doug Gregor from Apple about C++11 Variadic Templates, C++11 std::tuple, C++17 std::variant, Swift and more! Link to Episode 182 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachAbout the Guest: Douglas Gregor is is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute.
Show Notes
Date Recorded: 2024-04-29 Date Released: 2024-05-17 C++11 Variadic Templates / Parameter Packs / ExpansionC++26 Pack IndexingC++11 std::tupleC++17 std::variantC++11 Digit SeparatorsSwift Programming LanguageHPX (High Performance ParalleX)Intro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
We talked about:
Erum's Background Omdena Academy and Erum’s Role There Omdena’s Community and Projects Course Development and Structure at Omdena Academy Student and Instructor Engagement Engagement and Motivation The Role of Teaching in Community Building The Importance of Communities for Career Building Advice for Aspiring Instructors and Freelancers DS and ML Talent Market Saturation Resources for Learning AI and Community Building Erum’s Resource Recommendations
Links:
LinkedIn: https://www.linkedin.com/in/erum-afzal-64827b24/
Twitter: https://twitter.com/Erum55449739
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
In this episode, Conor and Bryce chat with Doug Gregor from Apple about the history of C++0x Concepts (part 2). Link to Episode 181 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachAbout the Guest: Douglas Gregor is is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute.
Show Notes
Date Recorded: 2024-04-29 Date Released: 2024-05-10 C++20 ConceptsSwift Programming LanguageElements of ProgrammingTecton: A Language for Manipulating Generic ObjectsGeneric Programming by David Musser and Alexander StepanovOriginal paper on concepts for C++0x (Stroustrup and Dos Reis)C++ Concepts vs Rust Traits vs Haskell Typeclasses vs Swift Protocols - Conor Hoekstra - ACCU 2021Paper on the implementation of concepts in ConceptGCC (Gregor, Siek)C++0x Concepts proposal that explains the model (Gregor, Stroustrup)Language wording for concepts that went into C++0xDoug’s last-ditch effort to bring back a simpler C++0x Concepts model using archetypes for type checkingJeremy Siek’s extensive C++0x Concepts writeupType-Soundness and Optimization in the Concepts ProposalIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
We talked about:
Vincent’s Background SciKit Learn’s History and Company Formation Maintaining and Transitioning Open Source Projects Teaching and Learning Through Open Source Role of Developer Relations and Content Creation Teaching Through Calm Code and The Importance of Content Creation Current Projects and Future Plans for Calm Code Data Processing Tricks and The Importance of Innovation Learning the Fundamentals and Changing the Way You See a Problem Dev Rel and Core Dev in One Why :probabl. Needs a Dev Rel Exploration of Skrub and Advanced Data Processing Personal Insights on SciKit Learn and Industry Trends Vincent’s Upcoming Projects
Links:
probabl. YouTube channel: https://www.youtube.com/@UCIat2Cdg661wF5DQDWTQAmg Calmcode website: https://calmcode.io/ probabl. website: https://probabl.ai/
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
In this episode, Conor and Bryce chat with Doug Gregor from Apple about the history of C++0x Concepts. Link to Episode 180 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachAbout the Guest: Douglas Gregor is is a Distinguished Engineer at Apple working on the Swift programming language, compiler, and related libraries and tools. He is code owner emeritus of the Clang compiler (part of the LLVM project), a former member of the ISO C++ committee, and a co-author on the second edition of C++ Templates: The Complete Guide. He holds a Ph.D. in computer science from Rensselaer Polytechnic Institute.
Show Notes
Date Recorded: 2024-04-29 Date Released: 2024-05-03 C++20 ConceptsSwift Programming LanguageElements of ProgrammingTecton: A Language for Manipulating Generic ObjectsGeneric Programming by David Musser and Alexander StepanovOriginal paper on concepts for C++0x (Stroustrup and Dos Reis)C++ Concepts vs Rust Traits vs Haskell Typeclasses vs Swift Protocols - Conor Hoekstra - ACCU 2021Paper on the implementation of concepts in ConceptGCC (Gregor, Siek)C++0x Concepts proposal that explains the model (Gregor, Stroustrup)Language wording for concepts that went into C++0xDoug’s last-ditch effort to bring back a simpler C++0x Concepts model using archetypes for type checkingJeremy Siek’s extensive C++0x Concepts writeupIntro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Send us a text Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style!
In this episode, we're thrilled to have special guest Mehdi Ouazza diving into a plethora of hot tech topics: Mehdi Ouazza's Insights into his career, online community and working with DuckDB and MotherDuck.Demystifying DevRel: Definitions and distinctions in the realm of tech influence (dive deeper here).Terraform's Licensing Shift: Reactions to HashiCorp's recent changes and its new IBM collaboration, more details here.Github Copilot Workspace: Exploring the latest in AI-powered coding assistance, comparing with devin.ai and CodySnowflake's Arctic LLM: Discussing the latest enterprise AI capabilities and their real-world applications. Read more about Arctic - what it excels at, and how its performance was measuredMore legal kerfuffle in the GenAI realm: The ongoing legal debates around AI's use in creative industries, highlighted by a dispute over Drake’s use of late rapper Tupac’s AI-generated voice in diss track & the licensing deal between Financial Times and OpenAIFuture of Data Engineering: Examining the integration of LLMs into data engineering tools. Insights on prompt-based feature engineering and Databricks' English SDKAI in Music Creation: A little bonus with an AI generated song about Murilo, created with Suno
Links:
Biodiversity and Artificial Intelligence pdf: https://www.gpai.ai/projects/responsible-ai/environment/biodiversity-and-AI-opportunities-recommendations-for-action.pdf
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
In this episode, Conor and Bryce chat about the CheckGrade problem, ACCU, CppNorth and have a guest appearance from Bryce’s mom! Link to Episode 179 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)Twitter ADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow Notes Date Recorded: 2024-04-17 Date Released: 2024-04-26 CheckGrade TweetRuff (Python Tool)Software Unscripted Episode 69: Making Parsing I/O Bound with Daniel LemireThe simdjson libraryCppNorthComposition Intuition - Conor Hoekstra - CppNorth 2023Intro Song Info Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/l-miss-you Music promoted by Audio Library https://youtu.be/iYYxnasvfx8