talk-data.com

Topic

Analytics

data_analysis insights metrics

4552 tagged

Activity Trend: 398 peak/qtr (2020-Q1 to 2026-Q1)

Activities

4552 activities · Newest first

Summary

The PostgreSQL database is massively popular due to its flexibility and extensive ecosystem of extensions, but it is still not the first choice for high-performance analytics. Swarm64 aims to change that by adding support for advanced hardware capabilities like FPGAs and optimized use of modern SSDs. In this episode CEO and co-founder Thomas Richter discusses his motivation for creating an extension that optimizes how Postgres uses hardware, the benefits of running your analytics on the same platform as your application, and how it works under the hood. If you are trying to get more performance out of your database, then this episode is for you!
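As a rough, hands-on illustration of the parallelism this episode discusses, here is a minimal Python sketch, assuming a local PostgreSQL instance and the psycopg2 driver; the DSN and the orders table are invented placeholders. It raises the planner's per-query worker budget and uses EXPLAIN to check whether a parallel plan is chosen:

```python
# Minimal sketch: inspect PostgreSQL's parallel query planning.
# Assumptions: a local Postgres instance, psycopg2 installed, and a
# hypothetical 'orders' table; adjust the DSN for your environment.
import psycopg2

conn = psycopg2.connect("dbname=analytics user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

# Allow the planner to use up to four parallel workers per Gather node.
cur.execute("SET max_parallel_workers_per_gather = 4")

# EXPLAIN reveals whether Postgres chose a parallel plan for the aggregate.
cur.execute("EXPLAIN SELECT count(*) FROM orders")
for (line,) in cur.fetchall():
    print(line)  # look for 'Gather' and 'Parallel Seq Scan' nodes

cur.close()
conn.close()
```

An extension like Swarm64 works within this same planner and executor framework, which is why the interview questions below focus on where the stock engine runs into bottlenecks.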

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!

You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. With real-time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, PagerDuty, and custom webhooks, you can fix the errors before they become a problem. Go to dataengineeringpodcast.com/tidydata today and get started for free with no credit card required.

Your host is Tobias Macey and today I’m interviewing Thomas Richter about Swarm64, a PostgreSQL extension to improve parallelism and add support for FPGAs.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by explaining what Swarm64 is?

How did the business get started and what keeps you motivated?

What are some of the common bottlenecks that users of Postgres run into?
What are the use cases and workloads that gain the most benefit from increased parallelism in the database engine?
By increasing the processing throughput of the database, how does that impact disk I/O, and what are some options for avoiding bottlenecks in the persistence layer?
Can you describe how Swarm64 is implemented?

How has the product evolved since you first began working on it?

How has the evolution of postgres impacted your product direction?

What are some of the notable challenges that you have dealt with as a result of upstream changes in postgres?

How has the hardware landscape evolved, and how does that affect your prioritization of features and improvements?
What are some of the other extensions in the Postgres ecosystem that are most commonly used alongside Swarm64?

Which extensions conflict with yours and how does that impact potential adoption?

In addition to your work to optimize performance of the postgres engine, you also provide support for using an FPGA as a co-processor. What are the benefits that an FPGA provides over and above a CPU or GPU architecture?

What are the available options for provisioning hardware in a datacenter or the cloud that has access to an FPGA?
Most people are familiar with the relevant attributes for selecting a CPU or GPU; what are the specifications that they should be looking at when selecting an FPGA?

For users who are adopting Swarm64, how does it impact the way they should be thinking of their data models?
What is involved in migrating an existing database to use Swarm64?
What are some of the most interesting, unexpected, or challenging lessons that you have learned while building Swarm64?

This audio blog is about how CHOP’s data and analytics (DnA) team uses near real-time data and information to decide how to marshal its resources to contain the pandemic. The culmination of all of this work has been an enterprise COVID-19 dashboard that is distributed to enterprise leadership daily. Originally published at: https://www.eckerson.com/articles/chop-harnesses-the-power-of-data-analytics-to-address-the-covid-19-pandemic

podcast_episode
by Mico Yuk (Data Storytelling Academy), Kristen Kehrer (Data Moves Me, LLC)

Free Data Storytelling Training

Attend our FREE 'How to be the Chief Data Storyteller in your Org - Part 2 using our Analytics Design Guide' training at webinars.bidatastorytelling.com and download the FREE 50-page Guide!

In this episode, you'll learn:
[02:30] Extra Step: Why quality assurance (QA) and substantive storytelling matter.
[03:16] Key Quote: "Data scientists are brilliant, but I see a lot of struggle with how to communicate that brilliance." - Mico Yuk
[07:50] Kristen and Company: Working together to communicate and transform ML marketing and storytelling.
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/51

Enjoyed the Show? Please leave us a review on iTunes.

Forensic Analytics, 2nd Edition

Become the forensic analytics expert in your organization, using effective and efficient data analysis tests to find anomalies, biases, and potential fraud, with the updated new edition. Forensic Analytics reviews the methods and techniques that forensic accountants can use to detect intentional and unintentional errors, fraud, and biases. This updated second edition shows accountants and auditors how analyzing their corporate or public sector data can highlight transactions, balances, or subsets of transactions or balances in need of attention. These tests are made up of a set of initial high-level overview tests followed by a series of more focused tests. These focused tests use a variety of quantitative methods including Benford’s Law, outlier detection, the detection of duplicates, a comparison to benchmarks, time-series methods, risk scoring, and sometimes simply statistical logic. The tests in the new edition include the newly developed vector variation score, which quantifies the change in an array of data from one period to the next. The goal of each test is to produce a small sample of suspicious transactions, a small set of transaction groups, or a risk score related to individual transactions or a group of items.

The new edition includes over two hundred figures. Each chapter, where applicable, includes one or more cases showing how the tests under discussion could have detected the fraud or anomalies. The new edition also includes two chapters, each describing a multi-million-dollar fraud scheme and the insights that can be learned from those examples. These interesting real-world examples help to make the text accessible and understandable for accounting professionals and accounting students without rigorous backgrounds in mathematics and statistics. Emphasizing practical applications, the new edition shows how to use either Excel or Access to run these analytics tests. The book also has some coverage of using Minitab, IDEA, R, and Tableau to run forensic-focused tests. The use of SAS and Power BI rounds out the software coverage. The software screenshots use the latest versions of the software available at the time of writing.

This authoritative book:
Describes the use of statistically based techniques, including Benford’s Law, descriptive statistics, and the vector variation score, to detect errors and anomalies
Shows how to run most of the tests in Access and Excel, and a small sample of the tests in other data analysis software packages
Applies the tests under review in each chapter to the same purchasing card data from a government entity
Includes interesting case studies throughout that are linked to the tests being reviewed
Includes two comprehensive case studies where data analytics could have detected the frauds before they reached multi-million-dollar levels
Includes a continually updated companion website with the data sets used in the chapters, the queries used in the chapters, extra coverage of some topics or cases, end-of-chapter questions, and end-of-chapter cases

Written by a prominent educator and researcher in forensic accounting and auditing, the new edition of Forensic Analytics: Methods and Techniques for Forensic Accounting Investigations is an essential resource for forensic accountants, auditors, comptrollers, fraud investigators, and graduate students.
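To make one of those tests concrete: Benford's Law predicts that the leading digit d of naturally occurring amounts appears with probability log10(1 + 1/d), so digit 1 should lead about 30.1% of values. A minimal Python sketch (the invoice amounts are invented for illustration) compares observed leading-digit frequencies against that expectation:

```python
# Benford's Law: P(leading digit = d) = log10(1 + 1/d) for d in 1..9.
import math
from collections import Counter

def leading_digit(x: float) -> int:
    """Return the first significant digit of a nonzero number."""
    s = str(abs(x)).lstrip("0.")
    return int(s[0])

def benford_deviation(amounts):
    """Compare observed leading-digit frequencies with Benford's expectation."""
    counts = Counter(leading_digit(a) for a in amounts if a != 0)
    n = sum(counts.values())
    for d in range(1, 10):
        observed = counts.get(d, 0) / n
        expected = math.log10(1 + 1 / d)
        yield d, observed, expected, observed - expected

# Invented sample of invoice amounts; in practice you would feed in a full
# ledger, and large positive deviations flag digits worth a closer look.
for d, obs, exp, diff in benford_deviation([1200.50, 1890.00, 310.25, 175.80, 990.10]):
    print(f"digit {d}: observed {obs:.3f}  expected {exp:.3f}  diff {diff:+.3f}")
```

On a handful of values the deviations are meaningless; the test only becomes informative on datasets the size of the book's purchasing card examples.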

Free Data Storytelling Training

Attend our FREE 'How to be the Chief Data Storyteller in your Org - Part 2 using our Analytics Design Guide' training at webinars.bidatastorytelling.com and download the FREE 50-page Guide!

In this episode, you'll learn:
[03:21] New Language, New Book for Reporting: Music to your ears!
[05:24] Keys to Success: Visual consistency, certified tools, and management support.
[17:55] Hichert's impact and influence on how dashboards and reports are created.
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/50

Enjoyed the Show? Please leave us a review on iTunes.

Free Data Storytelling Training

Attend our FREE 'How to be the Chief Data Storyteller in your Org - Part 2 using our Analytics Design Guide' training at webinars.bidatastorytelling.com and download the FREE 50-page Guide!

In this episode, you'll learn:
[12:10] Programming Powers: Liza got into tech to be a winner and take things apart.
[24:45] Why did Liza decide to sign up for the BIDF course? Learn from mistakes made when projects failed.
[41:28] Step 3 - KPIs: Clearly define and develop skills to capture data metrics.
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/49

Enjoyed the Show? Please leave us a review on iTunes.

Summary

Data is critical to every role in an organization, which is also what makes managing it so challenging. With so many different opinions about which pieces of information are most important, how it needs to be accessed, and what to do with it, many data projects are doomed to failure. In this episode Chris Bergh explains how taking an agile approach to delivering value can drive down the complexity that grows out of the varied needs of the business. Building a DataOps workflow that incorporates fast delivery of well-defined projects, continuous testing, and open lines of communication is a proven path to success.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!

If DataOps sounds like the perfect antidote to your pipeline woes, DataKitchen is here to help. DataKitchen’s DataOps Platform automates and coordinates all the people, tools, and environments in your entire data analytics organization – everything from orchestration, testing and monitoring to development and deployment. In no time, you’ll reclaim control of your data pipelines so you can start delivering business value instantly, without errors. Go to dataengineeringpodcast.com/datakitchen today to learn more and thank them for supporting the show!

Your host is Tobias Macey and today I’m welcoming back Chris Bergh to talk about ways that DataOps principles can help to reduce organizational complexity.

Interview

Introduction
How did you get involved in the area of data management?
How are typical data and analytics teams organized? What are their roles and structure?
Can you start by giving an outline of the ways that complexity can manifest in a data organization?

What are some of the contributing factors that generate this complexity?
How does the size or scale of an organization and its data needs impact the segmentation of responsibilities and roles?

How does this organizational complexity play out within a single team, for example between data engineers, data scientists, and production/operations?
How do you approach the definition of useful interfaces between different roles or groups within an organization?

What are your thoughts on the relationship between the multivariate complexities of data and analytics workflows and the software trend toward microservices as a means of addressing the challenges of organizational communication patterns in the software lifecycle?

How does this organizational complexity play out between multiple teams, for example between a centralized data team and line-of-business self-service teams?
Isn’t organizational complexity just ‘the way it is’? Is there any hope of getting out of meetings and inter-team conflict?
What are some of the technical elements that are most impactful in reducing the time to delivery for different roles?
What are some strategies that you have found to be useful for maintaining a connection to the business need throughout the different stages of the data lifecycle?
What are some of the signs or symptoms of problematic complexity that individuals and organizations should keep an eye out for?
What role can automated testing play in improving this process? (See the sketch after this list.)
How does the current set of tools contribute to the fragmentation of data work?
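A hedged sketch of what that automated testing can look like, assuming pandas and using invented table and column names (a generic data-quality pattern, not DataKitchen's product):

```python
# Minimal DataOps-style data test: assertions that run on every pipeline
# execution, not just during development. Column names are illustrative.
import pandas as pd

def check_orders(df: pd.DataFrame) -> list:
    """Return a list of data-quality failures; an empty list means the batch passes."""
    failures = []
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values")
    if (df["amount"] < 0).any():
        failures.append("negative amounts")
    if df["customer_id"].isna().any():
        failures.append("missing customer_id")
    return failures

batch = pd.DataFrame({
    "order_id": [1, 2, 2],
    "amount": [10.0, -5.0, 7.5],
    "customer_id": ["a", None, "c"],
})
print(check_orders(batch))  # this toy batch trips all three checks
```

Wiring checks like these into orchestration, so a failing batch stops downstream jobs and alerts the owning team, is one of the cheapest ways to cut the rework that feeds organizational complexity.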

Introducing Microsoft SQL Server 2019

Introducing Microsoft SQL Server 2019 is the must-have guide for database professionals eager to leverage the latest advancements in SQL Server 2019. This book covers the features and capabilities that make SQL Server 2019 a powerful tool for managing and analyzing data both on-premises and in the cloud.

What this Book will help me do

Understand the new features introduced in SQL Server 2019 and their practical applications.
Confidently manage and analyze relational, NoSQL, and big data within SQL Server 2019.
Implement containerization for SQL Server using Docker and Kubernetes.
Migrate and integrate your databases effectively to use Power BI Report Server.
Query data from the Hadoop Distributed File System with Azure Data Studio.

Author(s)

The authors of 'Introducing Microsoft SQL Server 2019' are subject matter experts including Kellyn Gorman, Allan Hirt, and others. With years of professional experience in database management and SQL Server, they bring a wealth of practical insight and knowledge to the book. Their experience spans roles as administrators, architects, and educators in the field.

Who is it for?

This book is aimed at database professionals such as DBAs, architects, and big data engineers who are currently using earlier versions of SQL Server or other database platforms. It is particularly well suited for professionals aiming to understand and implement SQL Server 2019's new features. Readers should have basic familiarity with SQL Server and RDBMS concepts. If you're looking to explore SQL Server 2019 to improve data management and analytics in your organization, this book is for you.

Building a Unified Data Infrastructure

The vast majority of businesses today already have a documented data strategy. But only a third of these forward-thinking companies have evolved into data-driven organizations or even begun to move toward a data culture. Most have yet to treat data as a business asset, much less use data and analytics to compete in the marketplace. What’s the solution? This insightful report demonstrates the importance of creating a holistic data infrastructure approach. You’ll learn how data virtualization (DV), master data management (MDM), and metadata-management capabilities can help your organization meet business objectives. Chief data officers, enterprise architects, analytics leaders, and line-of-business executives will understand the benefits of combining these capabilities into a unified data platform.

Explore three separate business contexts that depend on data: operations, analytics, and governance
Learn a pragmatic and holistic approach to building a unified data infrastructure
Understand the critical capabilities of this approach, including the ability to work with existing technology
Apply six best practices for combining data management capabilities
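As a toy illustration of the master data management piece (the fields and survivorship rule here are invented, not from the report), merging duplicate records from two systems into a single "golden record" might look like:

```python
# Toy MDM survivorship: field-by-field merge where the most recently
# updated non-empty value wins. Records and fields are invented.
from datetime import date

crm_record = {"name": "ACME Corp", "phone": "", "updated": date(2020, 1, 5)}
erp_record = {"name": "Acme Corporation", "phone": "555-0100", "updated": date(2020, 3, 2)}

def golden_record(*records):
    """Merge records oldest-to-newest so newer non-empty values overwrite older ones."""
    merged = {}
    for rec in sorted(records, key=lambda r: r["updated"]):
        for key, value in rec.items():
            if value not in ("", None):
                merged[key] = value
    return merged

print(golden_record(crm_record, erp_record))
# {'name': 'Acme Corporation', 'phone': '555-0100', 'updated': datetime.date(2020, 3, 2)}
```

Real MDM also needs fuzzy matching to decide which records refer to the same entity in the first place; the merge step above is only the last mile.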

ML Ops: Operationalizing Data Science

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren’t truly operational, these models can’t possibly do what you’ve trained them to do. This report introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach—Build, Manage, Deploy and Integrate, and Monitor—for creating ML-infused applications within your organization.

You’ll learn how to:
Fulfill data science value by reducing friction throughout ML pipelines and workflows
Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
Design the ML Ops lifecycle to ensure that people-facing models are unbiased, fair, and explainable
Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action
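To make the Monitor step concrete, here is a hedged sketch of the kind of drift check that triggers retraining; the metric and threshold are illustrative, not taken from the report:

```python
# Illustrative "Monitor" step: flag a model for retraining when live
# accuracy drifts too far below the accuracy measured at deployment time.
def should_retrain(baseline_accuracy: float, live_accuracy: float,
                   tolerance: float = 0.05) -> bool:
    """Return True when the live metric has degraded beyond the tolerance."""
    return (baseline_accuracy - live_accuracy) > tolerance

# Example: a model deployed at 91% accuracy now scores 84% on recent labels.
if should_retrain(baseline_accuracy=0.91, live_accuracy=0.84):
    print("Accuracy drift exceeds tolerance; schedule retraining.")
```

In production the same idea is applied to input-distribution drift and fairness metrics, not just accuracy, and the trigger feeds an automated retraining pipeline rather than a print statement.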

Free Data Storytelling Training

Attend our FREE 'How to be the Chief Data Storyteller in your Org - Part 2 using our Analytics Design Guide' training at webinars.bidatastorytelling.com and download the FREE 50-page Guide!

In this episode, you'll learn:
[02:18] Three key takeaways and five free headlines to get a response from LinkedIn recruiters.
[08:14] Benefits of Lockdown: No distractions, increased productivity.
[01:09:10] How to prepare for Zoom interviews.
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/48

Enjoyed the Show? Please leave us a review on iTunes.

As of this writing, billions of consumers live in quarantine. They buy what they need online, comforting themselves with food, TV, and toilet paper. Nobody is splurging at the mall.

To say the least, it is an interesting time to analyze discretionary consumer behavior. As Director of the Voice of Consumer Analytics at Adidas, Tiankai helps measure and manage the perception of a consumer brand that is mentioned on social media an average of 260,000 times per day. An amateur musician, Tiankai himself recently went viral with his series of “Quarantunes,” songs such as “Self Quarantine” and “Parent in Quarantine” that poke fun at our homebound predicament.

Tiankai recently spoke with Eckerson Group about the art and science of consumer analytics, the COVID-19 conundrum, and (of course) the role of creativity in modern data analysis.

Strategic Analytics: The Insights You Need from Harvard Business Review

Is your company ready for the next wave of analytics? Data analytics offer the opportunity to predict the future, use advanced technologies, and gain valuable insights about your business. But unless you're staying on top of the latest developments, your company is wasting that potential--and your competitors will be gaining speed while you fall behind. Strategic Analytics: The Insights You Need from Harvard Business Review will provide you with today's essential thinking about what data analytics are capable of, what critical talents your company needs to reap their benefits, and how to adopt analytics throughout your organization--before it's too late. Business is changing. Will you adapt or be left behind? Get up to speed and deepen your understanding of the topics that are shaping your company's future with the Insights You Need from Harvard Business Review series. Featuring HBR's smartest thinking on fast-moving issues--blockchain, cybersecurity, AI, and more--each book provides the foundational introduction and practical case studies your organization needs to compete today and collects the best research, interviews, and analysis to get it ready for tomorrow. You can't afford to ignore how these issues will transform the landscape of business and society. The Insights You Need series will help you grasp these critical ideas--and prepare you and your company for the future.

Free Data Storytelling Training

Attend our FREE 'How to be the Chief Data Storyteller in your Org - Part 2 using our Analytics Design Guide' training at webinars.bidatastorytelling.com and download the FREE 50-page Guide!

In this episode, you'll learn:
[04:00] Are you literate or illiterate? The terminology is insulting and degrading.
[09:25] Natural Rebel: Think differently about established ideas to get a new perspective.
[47:44] "If you want your team to be data literate, the number one thing you can do is communicate with them using data." - Donald Farmer
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/47

Enjoyed the Show? Please leave us a review on iTunes.

Building an Anonymization Pipeline

How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.

Create anonymization solutions diverse enough to cover a spectrum of use cases
Match your solutions to the data you use, the people you share it with, and your analysis goals
Build anonymization pipelines around various data collection models to cover different business needs
Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs
Examine the ethical issues around the use of anonymized data
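As a small illustration of one early step in such a pipeline (a generic technique, not the authors' method; note that salted hashing alone yields pseudonymization, which is weaker than the anonymization the book targets):

```python
# Pseudonymize a direct identifier with a salted hash. This hides the raw
# value, but anyone holding the salt can re-derive the token for a known
# identifier, so it is only one building block of a full pipeline.
import hashlib

SALT = b"rotate-me-per-release"  # hypothetical secret salt

def pseudonymize(identifier: str) -> str:
    """Map an identifier to a stable, salt-dependent token."""
    return hashlib.sha256(SALT + identifier.encode("utf-8")).hexdigest()[:12]

record = {"patient_id": "MRN-00123", "reading": 98.6}
record["patient_id"] = pseudonymize(record["patient_id"])
print(record)  # the identifier is now an opaque 12-hex-character token
```

The book's focus is on the harder parts this sketch ignores: quasi-identifiers like dates and postal codes, and measuring re-identification risk on the result.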

IBM z15 Technical Introduction

This IBM® Redbooks® publication introduces the latest member of the IBM Z® platform, the IBM z15™. It includes information about the Z environment and how it helps integrate data and transactions more securely. It also provides insight for faster and more accurate business decisions. The z15 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to any digital transformation. The z15 is designed for enhanced modularity and occupies an industry-standard footprint. It is offered as a single air-cooled 19-inch frame called the z15 T02, or as a multi-frame (1 to 4 19-inch frames) called the z15 T01. Both z15 models excel at the following tasks:

Using hybrid multicloud integration services
Securing and protecting data with encryption everywhere
Providing resilience with key to zero downtime
Transforming a transactional platform into a data powerhouse
Getting more out of the platform with IT Operational Analytics
Accelerating digital transformation with agile service delivery
Revolutionizing business processes
Blending open source and IBM Z technologies

This book explains how this system uses innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and open source technologies. With the z15 as the base, applications can run in a trusted, reliable, and secure environment that improves operations and lessens business risk.

Free Data Storytelling Training

Register for three live trainings at webinars.bidatastorytelling.com and download our FREE 50-page Analytics Design Guide!

In this episode, you'll learn:
[09:10] Mary Ann's BI Master Class: Three fundamental storytelling lessons.
[10:46] What is AliMed? A manufacturer and distributor of medical supplies.
[13:13] Mary Ann's Inspiration: How do you get into statistics as a career?
For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/46

Enjoyed the Show? Please leave us a review on iTunes.

Summary

Knowledge graphs are a data resource that can answer questions beyond the scope of traditional data analytics. By organizing and storing data to emphasize the relationships between entities, we can discover the complex connections between multiple sources of information. In this episode John Maiden talks about how Cherre builds knowledge graphs that provide powerful insights for their customers, and the engineering challenges of building a scalable graph. If you’re wondering how to extract additional business value from existing data, this episode will provide a way to expand your data resources.
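As a toy sketch of the idea (using networkx with invented entities; this is not Cherre's production stack, which the episode discusses), a knowledge graph turns multi-hop ownership questions into graph traversals:

```python
# Toy real-estate knowledge graph: nodes are typed entities, edges are
# named relationships, and multi-hop questions become traversals.
import networkx as nx

g = nx.MultiDiGraph()
g.add_node("Jane Doe", type="person")
g.add_node("LLC-42", type="owner")
g.add_node("350 Fifth Ave", type="property")
g.add_edge("Jane Doe", "LLC-42", rel="controls")
g.add_edge("LLC-42", "350 Fifth Ave", rel="owns")

# "Which properties does Jane Doe ultimately reach?" is a reachability query.
reachable = nx.descendants(g, "Jane Doe")
properties = [n for n in reachable if g.nodes[n]["type"] == "property"]
print(properties)  # ['350 Fifth Ave']
```

The engineering challenges covered in the interview start where this toy ends: resolving entities across messy sources and keeping traversals fast as the graph scales.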

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!

You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers, you don’t want to miss out on great conferences. We have partnered with organizations such as ODSC and Data Council. Upcoming events include ODSC East, which has gone virtual, starting April 16th. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.

Your host is Tobias Macey and today I’m interviewing John Maiden about how Cherre is building and using a knowledge graph of commercial real estate information.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing what Cherre is and the role that data plays in the business?
What are the benefits of a knowledge graph for making real estate investment decisions?
What are the main ways that you and your customers are using the knowledge graph?

What are some of the challenges that you face in providing a usable interface for end-users to query the graph?

What technology are you using for storing and processing the graph?

What challenges do you face in scaling the complexity and analysis of the graph?

What are the main sources of data for the knowledge graph?
What are some of the ways that messiness manifests in the data that you are using to populate the graph?

How are you managing cleaning of the data, and how do you identify and process records that can’t be coerced into the desired structure?
How do you handle missing attributes or extra attributes in a given record?

How did you approach the process of determining an effective taxonomy for records in the graph?
What is involved in performing entity extraction on your data?
What are some of the most interesting or unexpected questions that you have been able to ask and answer with the graph?
What are some of the most interesting/unexpected/challenging lessons that you have learned in the process of working with this data?
What are some of the near and medium term improvements that you have planned for your knowledge graph?
What advice do you have for anyone who is interested in building a knowledge graph of their own?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening.

End-to-end Data Analytics for Product Development

An interactive guide to the statistical tools used to solve problems during product and process innovation. End to End Data Analytics for Product Development is an accessible guide designed for practitioners in the industrial field. It offers an introduction to data analytics and the design of experiments (DoE) whilst covering the basic statistical concepts useful to an understanding of DoE. The text supports product innovation and development across a range of consumer goods and pharmaceutical organizations in order to improve the quality and speed of implementation through data analytics, statistical design, and data prediction. The book reviews information on feasibility screening, formulation and packaging development, sensory tests, and more. The authors, noted experts in the field, explore relevant techniques for data analytics and present guidelines for data interpretation. In addition, the book contains information on process development and product validation that can be optimized through data understanding, analysis, and validation. The authors present an accessible, hands-on approach that uses MINITAB and JMP software.

The book:
Presents a guide to innovation feasibility and formulation and process development
Contains the statistical tools used to solve challenges faced during product innovation and feasibility
Offers information on stability studies, which are common especially in chemical and pharmaceutical fields
Includes a companion website which contains videos summarizing main concepts

Written for undergraduate students and practitioners in industry, End to End Data Analytics for Product Development offers resources for planning, conducting, analyzing, and interpreting controlled tests in order to develop effective products and processes.
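As a minimal taste of the design-of-experiments material (the factor names are invented, and the book builds such designs in MINITAB and JMP rather than Python), a two-level full factorial design enumerates every combination of low and high settings:

```python
# Two-level full factorial design: with k factors there are 2**k runs,
# one per combination of low (-1) and high (+1) settings.
from itertools import product

factors = {
    "temperature": (-1, +1),
    "concentration": (-1, +1),
    "mix_time": (-1, +1),
}

design = list(product(*factors.values()))  # 2**3 = 8 runs
print(f"{len(design)} runs")
for run, levels in enumerate(design, start=1):
    print(run, dict(zip(factors, levels)))
```

Fitting a linear model to the measured responses of those eight runs then estimates each factor's main effect and the two-way interactions, the kind of analysis the DoE chapters cover.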