For three years we at LOVOO, a market-leading dating app, have been using the Google Cloud managed version of Airflow, a product we’ve been familiar with since its Alpha release. We took a calculated risk and integrated the Alpha into our product, and, luckily, it was a match. Since then, we have been leveraging this software not only to build out our data pipeline, but also to boost the way we do analytics and BI. We will present an overview of the software’s usability for Pipeline Error Alerting through BashOperators that communicate with Slack, and will touch upon how we built our Analytics Pipeline (deployment and growth) and how we currently batch large amounts of data from different sources effectively using Airflow. We will also showcase our PythonOperator-driven Redshift to BigQuery data migration process, and offer a guide for creating fully dynamic tasks inside a DAG.
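As a rough illustration of the alerting idea in this abstract, the sketch below builds the shell command that a BashOperator could run to post a failure message to a Slack incoming webhook, and generates one such command per task from a plain list. This is not LOVOO's implementation: the function names, the message wording, and the webhook URL are hypothetical, and only the standard library is used so the snippet runs without Airflow installed.

```python
import json
import shlex
from typing import Dict, Iterable


def slack_alert_command(webhook_url: str, dag_id: str, task_id: str) -> str:
    """Return a curl command that posts a failure message to a Slack webhook.

    The message format is an assumption made for this sketch.
    """
    payload = json.dumps(
        {"text": f"Pipeline error: task {task_id} in DAG {dag_id} failed"}
    )
    # shlex.quote keeps the JSON payload intact when the shell parses it.
    return (
        "curl -sS -X POST -H 'Content-type: application/json' "
        f"--data {shlex.quote(payload)} {shlex.quote(webhook_url)}"
    )


def alert_commands(
    webhook_url: str, dag_id: str, task_ids: Iterable[str]
) -> Dict[str, str]:
    """Build one alert command per task, keyed by a generated alert task id.

    Looping over a list like this is the usual shape of dynamic task creation.
    """
    return {
        f"alert_{task_id}": slack_alert_command(webhook_url, dag_id, task_id)
        for task_id in task_ids
    }


if __name__ == "__main__":
    # Hypothetical webhook URL and task names, for illustration only.
    commands = alert_commands(
        "https://hooks.example.invalid/services/T000/B000/XXX",
        "analytics",
        ["load_events", "load_users"],
    )
    for alert_id, command in commands.items():
        print(alert_id, "->", command)
```

Inside a DAG file, each entry could then become a task, for example `BashOperator(task_id=alert_id, bash_command=command)`, so that adding a name to the list adds an alert task without further code changes.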
talk-data.com
Topic
BI
Business Intelligence (BI)
Financial Times is increasing its digital revenue by allowing business people to make data-driven decisions. Providing an Airflow-based platform where data engineers, data scientists, BI experts and others can run language-agnostic jobs was a huge swing. One of the most successful steps in the platform’s development was building our own execution environment, allowing stakeholders to self-deploy jobs without cross-team dependencies on top of the unlimited scale of Kubernetes. In this talk we share how we have integrated and extended Airflow at Financial Times. The main topics we will cover include: providing team-level security isolation; removing cross-team dependencies; creating an execution environment for independently creating and deploying R, Python, Java, Spark, and other jobs; reducing latency when sharing data between task instances; and integrating all these features on top of Kubernetes.
This is an audio blog on BI on the Cloud Data Lake and how to improve the productivity of data engineers. We'll dive deeper into the question: what’s the best measure of success for data pipeline efficiency? This is part 2 of a two-part blog.
Originally published at: https://www.eckerson.com/articles/business-intelligence-on-the-cloud-data-lake-part-2-improving-the-productivity-of-data-engineers
Free Data Storytelling Training: Register before it sells out again! New dates for our BI Data Storytelling Mastery Accelerator 3-Day Live Workshop are finally available. Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of the workshop, you'll leave with a clear BI delivery action plan. Register today! In this episode, you'll learn: [00:40] Transformational Stories and Lessons Learned: 3 ways to use data storytelling in data science. [19:50] Storytelling as a means of visualization. [26:52] How to think about answering the questions with data storytelling. For full show notes and the links mentioned, visit: https://bibrainz.com/podcast/56
Enjoyed the Show? Please leave us a review on iTunes.
"Learn Grafana 7.0" is the ultimate beginner's guide to leveraging Grafana's capabilities for analytics and interactive dashboards. You'll master real-time data monitoring, visualization, and learn how to query and explore metrics with a hands-on approach to Grafana 7.0's new features. What this Book will help me do Learn to install and configure Grafana from scratch, preparing you for real-world data analysis tasks. Navigate and utilize the Graph panel in Grafana effectively, ensuring clear and actionable visual insights. Incorporate advanced dashboard features such as annotations, templates, and links to enhance data monitoring. Integrate Grafana with major cloud providers like AWS and Azure for robust monitoring solutions. Implement secure user authentication and fine-tuned permissions for managing teams and sharing insights safely. Author(s) Salituro, the author of "Learn Grafana 7.0," is a data visualization expert with years of experience in software development and analytics. Salituro focuses on creating understandable and accessible resources for developers and analysts of all skill levels, bringing a hands-on practical approach to technical learning. Who is it for? This book is perfect for data analysts, business intelligence developers, and administrators looking to build skills in data visualization and monitoring with Grafana 7.0. If you're eager to create interactive dashboards and learn practical applications of Grafana's features, this book is for you. Beginners to Grafana are fully accommodated, though familiarity with data visualization principles is beneficial. For those seeking to monitor cloud services like AWS with Grafana, this book is indispensable.
Originally from the sunny, sprawling suburbs of San Diego, California, Diana Gremore graduated from the University of California, San Diego, then packed up all of her things and moved to New York City, sleeping on a friend’s couch and giving herself just two months to make it in the Big Apple. And she did. For the past three and a half years, Diana has worked at Paradigm (not including the year and a half she was a part of AM Only, which was acquired by Paradigm in 2017), one of the entertainment industry’s most important and highly regarded talent agencies. While she now holds the illustrious title of Business Intelligence Analyst, it’s a role she created for herself, working her way up from receptionist and office manager. But “start small and don’t skip steps” isn’t just an axiom embodied by Diana’s own career; it’s something she encourages artists and their teams to think about when approaching their own growth trajectories, especially during the uncertainty of live music in a post-COVID world. Connect With Diana: https://www.instagram.com/dhgremore/ https://www.linkedin.com/in/diana-gremore-494b6625/ Connect With Us: http://podcast.chartmetric.com/ http://chartmetric.com/ https://blog.chartmetric.com https://smarturl.it/chartmetric_social
Summary The majority of analytics platforms are focused on use internal to an organization by business stakeholders. As the availability of data increases and overall literacy in how to interpret it and take action improves, there is a growing need to bring business intelligence use cases to a broader audience. GoodData is a platform focused on simplifying the work of bringing data to employees and end users. In this episode Sheila Jung and Philip Farr discuss how the GoodData platform is being used, how it is architected to provide scalable and performant analytics, and how it integrates into customers’ data platforms. This was an interesting conversation about a different approach to business intelligence and the importance of expanded access to data.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management. What are the pieces of advice that you wish you had received early in your career in data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! GoodData is revolutionizing the way in which companies provide analytics to their customers and partners. Start now with GoodData Free, which makes our self-service analytics platform available to you at no cost. Register today at dataengineeringpodcast.com/gooddata. Your host is Tobias Macey and today I’m interviewing Sheila Jung and Philip Farr about how GoodData is building a platform that lets you share your analytics outside the boundaries of your organization.
Interview
Introduction How did you get involved in the area of data management? Can you start by describing what you are building at GoodData and some of its origin story? The business intelligence market has been around for decades now and there are dozens of options with different areas of focus. What are the factors that might motivate me to choose GoodData over the other contenders in the space? What are the use cases and industries that you focus on supporting with GoodData? How has the market of business intelligence tools evolved in recent years?
What are the contributing trends in technology and business use cases that are driving that change?
What are some of the ways that your customers are embedding analytics into their own products? What are the differences in processing and serving capabilities between an internally used business intelligence tool, and one that is used for embedding into externally used systems?
What unique challenges are posed by the embedded analytics use case? How do you approach topics such as security, access control, and latency in a multitenant analytics platform?
What guidelines have you found to be most useful when addressing the concerns of accuracy and interpretability of the data being presented? How is the GoodData platform architected?
What are the complexities that you have had to design around in order to provide performant access to your customers’ data sources in an interactive use case? What are the off-the-shelf components that you have been able to integrate into the platform,
This audio blog is about business intelligence on the cloud data lake: why it arose and how to architect for it. This is Part 1 of a two-part blog series.
Originally published at: https://www.eckerson.com/articles/business-intelligence-on-the-cloud-data-lake-part-1-why-it-arose-and-how-to-architect-for-it
Free Data Storytelling Training: Register before it sells out again! New dates for our BI Data Storytelling Mastery Accelerator 3-Day Live Workshop are finally available. Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of the workshop, you'll leave with a clear BI delivery action plan. Register today! In this episode, you'll learn: [01:50] Transformational Stories and Lessons Learned: Yves's 3 things you can do to become irreplaceable. [06:39] User Expectations: The importance of making yourself stick in your role during the pandemic and lockdowns. [22:47] How consulting allows Yves to bypass age discrimination. For full show notes and the links mentioned, visit: https://bibrainz.com/podcast/54
Enjoyed the Show? Please leave us a review on iTunes.
Deliver eye-catching and insightful business intelligence with Microsoft Power BI Desktop. This new edition has been updated to cover all the latest features of Microsoft’s continually evolving visualization product. New in this edition is help with storytelling—adapted to PCs, tablets, and smartphones—and the building of a data narrative. You will find coverage of templates and JSON style sheets, data model annotations, and the use of composite data sources. Also provided is an introduction to incorporating Python visuals and the much awaited Decomposition Tree visual. Pro Power BI Desktop shows you how to use source data to produce stunning dashboards and compelling reports that you mold into a data narrative to seize your audience’s attention. Slice and dice the data with remarkable ease and then add metrics and KPIs to project the insights that create your competitive advantage. Convert raw data into clear, accurate, and interactive information with Microsoft’s free self-service BI tool. This book shows you how to choose from a wide range of built-in and third-party visualization types so that your message is always enhanced. You will be able to deliver those results on PCs, tablets, and smartphones, as well as share results via the cloud. The book helps you save time by preparing the underlying data correctly without needing an IT department to prepare it for you. 
What You Will Learn: Deliver attention-grabbing information, turning data into insight. Find new insights as you chop and tweak your data as never before. Build a data narrative through interactive reports with drill-through and cross-page slicing. Mash up data from multiple sources into a cleansed and coherent data model. Build interdependent charts, maps, and tables to deliver visually stunning information. Create dashboards that help in monitoring key performance indicators of your business. Adapt delivery to mobile devices such as phones and tablets. Who This Book Is For: Power users who are ready to step up to the big leagues by going beyond what Microsoft Excel by itself can offer. The book also is for line-of-business managers who are starved for actionable data needed to make decisions about their business. And the book is for BI analysts looking for an easy-to-use tool to analyze data and share results with C-suite colleagues they support.
Organizations can make data science a repeatable, predictable tool, which business professionals use to get more value from their data. Enterprise data and AI projects are often scattershot, underbaked, siloed, and not adaptable to predictable business changes. As a result, the vast majority fail. These expensive quagmires can be avoided, and this book explains precisely how. Data science is emerging as a hands-on tool for not just data scientists, but business professionals as well. Managers, directors, IT leaders, and analysts must expand their use of data science capabilities for the organization to stay competitive. Smarter Data Science helps them achieve their enterprise-grade data projects and AI goals. It serves as a guide to building a robust and comprehensive information architecture program that enables sustainable and scalable AI deployments. When an organization manages its data effectively, its data science program becomes a fully scalable function that’s both prescriptive and repeatable. With an understanding of data science principles, practitioners are also empowered to lead their organizations in establishing and deploying viable AI. They employ the tools of machine learning, deep learning, and AI to extract greater value from data for the benefit of the enterprise. By following a ladder framework that promotes prescriptive capabilities, organizations can make data science accessible to a range of team members, democratizing data science throughout the organization. Companies that collect, organize, and analyze data can move forward to additional data science achievements: improving time-to-value with infused AI models for common use cases; optimizing knowledge work and business processes; utilizing AI-based business intelligence and data visualization; establishing a data topology to support general or highly specialized needs; successfully completing AI projects in a predictable manner; and coordinating the use of AI from any compute node, from inner edges to outer edges (cloud, fog, and mist computing).
When they climb the ladder presented in this book, businesspeople and data scientists alike will be able to improve and foster repeatable capabilities. They will have the knowledge to maximize their AI and data assets for the benefit of their organizations.
Free Data Storytelling Training Note: Upcoming BI Data Storytelling Mastery Accelerator 3-Day Live Workshops have been canceled due to the COVID-19 crisis. However, 2-Day Online Workshops (Livestream recordings) will be available. Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of the workshop, you'll leave with a clear BI delivery action plan. Register today! In this episode, you'll learn: [08:07] User Expectations: Why are they important to set? [18:05] Key Quote: For me, it was obviously a needed skill set to add. - Brent Warren [18:45] Swiss Knife: Unappreciated, underpaid, and non-existent type that rebuilt metrics and reports without help. For full show notes and the links mentioned, visit: https://bibrainz.com/podcast/53
Enjoyed the Show? Please leave us a review on iTunes.
Use this guide to one of SQL Server 2019’s most impactful features: Big Data Clusters. You will learn about data virtualization and data lakes for this complete artificial intelligence (AI) and machine learning (ML) platform within the SQL Server database engine. You will know how to use Big Data Clusters to combine large volumes of streaming data for analysis along with data stored in a traditional database. For example, you can stream large volumes of data from Apache Spark in real time while executing Transact-SQL queries to bring in relevant additional data from your corporate SQL Server database. Filled with clear examples and use cases, this book provides everything necessary to get started working with Big Data Clusters in SQL Server 2019. You will learn about the architectural foundations that are made up of Kubernetes, Spark, HDFS, and SQL Server on Linux. You then are shown how to configure and deploy Big Data Clusters in on-premises environments or in the cloud. Next, you are taught about querying. You will learn to write queries in Transact-SQL, taking advantage of skills you have honed for years, and with those queries you will be able to examine and analyze data from a wide variety of sources such as Apache Spark. Through the theoretical foundation provided in this book and easy-to-follow example scripts and notebooks, you will be ready to use and unveil the full potential of SQL Server 2019: combining different types of data spread across widely disparate sources into a single view that is useful for business intelligence and machine learning analysis.
What You Will Learn: Install, manage, and troubleshoot Big Data Clusters in cloud or on-premises environments. Analyze large volumes of data directly from SQL Server and/or Apache Spark. Manage data stored in HDFS from SQL Server as if it were relational data. Implement advanced analytics solutions through machine learning and AI. Expose different data sources as a single logical source using data virtualization. Who This Book Is For: Data engineers, data scientists, data architects, and database administrators who want to employ data virtualization and big data analytics in their environments.
Become the forensic analytics expert in your organization using effective and efficient data analysis tests to find anomalies, biases, and potential fraud—the updated new edition Forensic Analytics reviews the methods and techniques that forensic accountants can use to detect intentional and unintentional errors, fraud, and biases. This updated second edition shows accountants and auditors how analyzing their corporate or public sector data can highlight transactions, balances, or subsets of transactions or balances in need of attention. These tests are made up of a set of initial high-level overview tests followed by a series of more focused tests. These focused tests use a variety of quantitative methods including Benford’s Law, outlier detection, the detection of duplicates, a comparison to benchmarks, time-series methods, risk-scoring, and sometimes simply statistical logic. The tests in the new edition include the newly developed vector variation score that quantifies the change in an array of data from one period to the next. The goals of the tests are to either produce a small sample of suspicious transactions, a small set of transaction groups, or a risk score related to individual transactions or a group of items. The new edition includes over two hundred figures. Each chapter, where applicable, includes one or more cases showing how the tests under discussion could have detected the fraud or anomalies. The new edition also includes two chapters each describing multi-million-dollar fraud schemes and the insights that can be learned from those examples. These interesting real-world examples help to make the text accessible and understandable for accounting professionals and accounting students without rigorous backgrounds in mathematics and statistics. Emphasizing practical applications, the new edition shows how to use either Excel or Access to run these analytics tests. 
The book also has some coverage on using Minitab, IDEA, R, and Tableau to run forensic-focused tests. The use of SAS and Power BI rounds out the software coverage. The software screenshots use the latest versions of the software available at the time of writing. This authoritative book: describes the use of statistically-based techniques including Benford’s Law, descriptive statistics, and the vector variation score to detect errors and anomalies; shows how to run most of the tests in Access and Excel, and other data analysis software packages for a small sample of the tests; applies the tests under review in each chapter to the same purchasing card data from a government entity; includes interesting case studies throughout that are linked to the tests being reviewed; includes two comprehensive case studies where data analytics could have detected the frauds before they reached multi-million-dollar levels; and includes a continually-updated companion website with the data sets used in the chapters, the queries used in the chapters, extra coverage of some topics or cases, end of chapter questions, and end of chapter cases. Written by a prominent educator and researcher in forensic accounting and auditing, the new edition of Forensic Analytics: Methods and Techniques for Forensic Accounting Investigations is an essential resource for forensic accountants, auditors, comptrollers, fraud investigators, and graduate students.
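For readers unfamiliar with Benford’s Law, which the description above names among the book's tests, the sketch below shows the basic idea: in many naturally occurring data sets the leading digit d appears with probability log10(1 + 1/d), so large gaps between observed and expected first-digit frequencies can flag data worth a closer look. This is a minimal illustration and not the book's own procedure; the function names and the sample amounts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def benford_expected(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def first_digit(value: float) -> int:
    """Leading non-zero digit of a non-zero number, ignoring its sign."""
    # Scientific notation always starts with the leading digit.
    return int(f"{abs(value):.15e}"[0])


def first_digit_deviation(amounts: Iterable[float]) -> Dict[int, float]:
    """Observed minus expected first-digit share for each digit 1 to 9."""
    digits = [first_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("need at least one non-zero amount")
    counts = Counter(digits)
    return {
        d: counts.get(d, 0) / len(digits) - benford_expected(d)
        for d in range(1, 10)
    }


if __name__ == "__main__":
    # Hypothetical transaction amounts, for illustration only.
    sample = [120.50, 1890.00, 23.75, 310.10, 1450.00, 97.20, 1020.00]
    for digit, gap in first_digit_deviation(sample).items():
        print(digit, round(gap, 3))
```

A positive value means a digit leads more often than Benford’s Law predicts. With a sample this small the gaps carry no weight; the comparison only becomes meaningful on large transaction sets.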
Introducing Microsoft SQL Server 2019 is the must-have guide for database professionals eager to leverage the latest advancements in SQL Server 2019. This book covers the features and capabilities that make SQL Server 2019 a powerful tool for managing and analyzing data both on-premises and in the cloud. What this Book will help me do Understand the new features introduced in SQL Server 2019 and their practical applications. Confidently manage and analyze relational, NoSQL, and big data within SQL Server 2019. Implement containerization for SQL Server using Docker and Kubernetes. Migrate and integrate your databases effectively to use Power BI Report Server. Query data from Hadoop Distributed File System with Azure Data Studio. Author(s) The authors of 'Introducing Microsoft SQL Server 2019' are subject matter experts including Kellyn Gorman, Allan Hirt, and others. With years of professional experience in database management and SQL Server, they bring a wealth of practical insight and knowledge to the book. Their experience spans roles as administrators, architects, and educators in the field. Who is it for? This book is aimed at database professionals such as DBAs, architects, and big data engineers who are currently using earlier versions of SQL Server or other database platforms. It is particularly well-suited for professionals aiming to understand and implement SQL Server 2019's new features. Readers should have basic familiarity with SQL Server and RDBMS concepts. If you're looking to explore SQL Server 2019 to improve data management and analytics in your organization, this book is for you.
Artificial intelligence can yield powerful results when applied to business intelligence. Whether it’s pattern recognition in words, numbers, and big datasets or optimizing processes and expediting outcomes, AI is becoming a critical business component. In this report, Michael Norris from IBM explains how to drive AI adoption in your company. What does it mean to infuse AI into BI? It means business users can discover actionable, easy-to-understand insights on their own, independently from IT, even while remaining within the organization’s secure and governed IT architecture. Explore how AI in BI helps you to "get to the why" when analyzing and optimizing the insights you discover. Learn how AI-infused business intelligence: enables line-of-business users to easily discover data-driven insights without requiring specialized data science expertise; allows users to ask questions in plain language with intuitive exploration tools to gain deeper insight into their data; provides recommended visualizations and dashboards to present compelling, concise, and explainable data; and prepares datasets for analysis to free up IT analysts and line-of-business users.
Free Data Storytelling Training: Register for three live trainings at webinars.bidatastorytelling.com and download our FREE 50-page Analytics Design Guide!
In this episode, you'll learn:
[09:10] Mary Ann's BI Master Class—Three fundamental storytelling lessons [10:46] What is AliMed? Manufacturer and distributor of medical supplies. [13:13] Mary Ann's Inspiration: How do you get into statistics as a career? For full show notes, and the links mentioned visit: https://bibrainz.com/podcast/46
Enjoyed the Show?
Please leave us a review on iTunes.
Free Live Training: Register for our upcoming live training at webinars.bidatastorytelling.com and download our FREE 50-page Analytics Design Guide! In this episode, you'll learn: Design is important. Just take a look at some of the coronavirus visuals out there. In this fun, short episode I discuss how bad design can affect you in ways you may not know!
[12:09] "Design is one of those things that keeps the cash flow coming when it comes to BI projects." [15:46] "You cannot afford to ignore it, no matter how good your data is, if it's ugly, it doesn't matter." For full show notes and the links mentioned, visit: https://bibrainz.com/podcast/45
Enjoyed the Show?
Please leave us a review on iTunes.
"DAX Cookbook: Over 120 recipes to enhance your business with analytics, reporting, and business intelligence" is the ultimate guidebook for mastering DAX (Data Analysis Expressions) in business intelligence, Power BI, and SQL Server Analysis Services. With hands-on examples and extensive recipes, it enables professionals to solve real-world data challenges effectively. What this Book will help me do Understand how to create tailored calculations for dates, time, and duration to enhance data insights. Develop key performance indicators (KPIs) and advanced business metrics for strategic decision-making. Master text and numerical data transformations to construct dynamic dashboards and reports. Optimize data models and DAX queries for improved performance and analytics accuracy. Learn to handle and debug calculations, and implement complex statistical and mathematical measures. Author(s) Greg Deckler is a seasoned business intelligence professional with extensive experience in using DAX and Power BI to provide actionable insights. As a recognized expert in the field, Greg brings practical knowledge of developing scalable BI solutions. His teaching approach is rooted in clarity and real-world application, making complex topics accessible to learners of all levels. Who is it for? This book is perfect for business professionals, BI developers, and data analysts with basic knowledge of the DAX language and associated tools. If you are looking to enhance your DAX skills and solve tough analytical challenges, this book is tailored for you. It's highly relevant for those aiming to optimize business intelligence workflows and improve data-driven decisions.
You know me: I love community! Being a part of the BI community has changed my life, and it can change yours for the better too if you choose the right community and understand how to use it to your advantage. Listen and learn.
Today's guest is Allen Hillery, editor of Nightingale, the journal of the Data Visualization Society. Allen describes why community is important and what you can do to give and take within the community. Recently, he interviewed me and wrote a very popular article on Medium titled "Mico Yuk on the Importance of Community and the Paradigm Shift in Business Intelligence."
In this episode, you'll learn: [09:25] Allen's Background: Writer, editor, and adjunct professor passionate about storytelling with data. [10:40] Data Business Communities: First there were not enough; now there are too many to choose from. [11:03] Priorities Put in Place: Passing of family members led to self-discovery and fulfillment through a data storytelling journey. For full show notes and the links mentioned, visit: bibrainz.com/podcast/44 Sponsor: The next BI Data Storytelling Mastery Accelerator 3-Day Live workshop is live! Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of three days, you'll leave with a clear BI delivery action plan. Register today! Enjoyed the Show? Please leave us a review on iTunes.