talk-data.com

Topic: Internet of Things (IoT)

Tags: connected_devices · sensors · data_collection

112 tagged activities

Activity Trend: peak of 11 activities per quarter, 2020-Q1 to 2026-Q1

Activities

112 activities · Newest first

Creating captivating generative AI analytics demos is easy, but building a product that consistently delivers value and handles real-life data complexity is challenging. In fact, only 3%-10% of companies effectively use LLMs in production. Learn how Cox 2M, the commercial IoT division of Cox Communications, has been able to make smarter, faster business decisions using one of the few production-ready implementations of generative AI. Please note: seating is limited and available on a first-come, first-served basis; standing areas are available.


Healthcare Big Data Analytics

This book highlights how optimized big data applications can be used for patient monitoring and clinical diagnosis. In fact, IoT-based applications are data-driven and mostly employ modern optimization techniques. The book also explores challenges, opportunities, and future research directions, discussing the stages of data collection and pre-processing, as well as the associated challenges and issues in data handling and setup.

Practical MongoDB Aggregations

Dive into the capabilities of the MongoDB aggregation framework with this official guide, "Practical MongoDB Aggregations". You'll learn how to design and optimize efficient aggregation pipelines for MongoDB 7.0, empowering you to handle complex data analysis and processing tasks directly within the database.

What this book will help you do:

  • Gain expertise in crafting advanced MongoDB aggregation pipelines for custom data workflows.
  • Learn to perform time series analysis for financial datasets and IoT applications.
  • Discover optimization techniques for working with sharded clusters and large datasets.
  • Master array manipulation and other operations essential for MongoDB data models.
  • Build pipelines that ensure data security and distribution while maintaining performance.

Author(s): Paul Done, a recognized expert in MongoDB, brings his extensive experience in database technologies to this book. With years of practice helping companies leverage MongoDB for big data solutions, Paul shares his deep knowledge in an accessible and logical manner. His approach is hands-on, focusing on practical insights and clear explanations.

Who is it for? This book is tailored for intermediate-level developers, database architects, data analysts, engineers, and scientists who use MongoDB. If you are familiar with MongoDB and looking to deepen your understanding of its aggregation capabilities, this guide is for you. Whether you're analyzing time series data or need to optimize pipelines for performance, you'll find actionable tips and examples to suit your needs.

We all know that data, like wine and cheese, becomes more valuable when combined. And, just like wine and cheese, it can lead to serious headaches. Whether you are emailing Excel files around, capturing data from thousands of IoT devices, or just joining your Google Analytics and sales data, you can benefit from following a structured process to minimize your headaches. After debugging yet another failed pipeline, I have distilled my experience of building data ingestion pipelines into 8 simple (though not necessarily easy) steps, from setting up triggers to archiving and retention.
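The post's 8 steps are not enumerated here, but the kind of staged process it describes can be sketched in plain Python. Everything below is illustrative: the stage names, fields, and validation rule are hypothetical, and a real pipeline would replace the in-memory lists with a warehouse and an archive.

```python
# Hypothetical sketch of a staged ingestion pipeline: each record flows through
# explicit stages (trigger -> ingest -> validate -> load or quarantine), so a
# failure can be isolated to one stage instead of taking down the whole run.
from datetime import datetime, timezone

def validate(record):
    # Reject records missing the fields downstream consumers require.
    return record.get("device_id") is not None and record.get("value") is not None

def run_pipeline(raw_records):
    loaded, quarantined = [], []
    for record in raw_records:          # the "trigger" would normally be a schedule or event
        if validate(record):
            record["ingested_at"] = datetime.now(timezone.utc).isoformat()
            loaded.append(record)       # stand-in for loading into the warehouse
        else:
            quarantined.append(record)  # stand-in for archiving/retention of bad input
    return loaded, quarantined

loaded, bad = run_pipeline([
    {"device_id": "sensor-1", "value": 21.5},
    {"device_id": None, "value": 3.2},
])
```

Separating validation from loading is the point: the bad record is retained for inspection instead of silently failing the whole batch.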

Building a Fast Universal Data Access Platform

Your company relies on data to succeed. Traditionally that data comes from a business's transactional processes, pulled from the transaction systems through an extract-transform-load (ETL) process into a warehouse for reporting purposes. But this data flow is no longer sufficient given the growth of the Internet of Things (IoT), web commerce, and cybersecurity. How can your company keep up with today's increasing magnitude of data and insights? Organizations that can no longer rely on data generated by business processes are looking outside their workflow for information on customer behavior, retail patterns, and industry trends. In this report, author Christopher Gardner examines the challenges of building a framework that provides universal access to data.

You will:

  • Learn the advantages and challenges of universal data access, including data diversity, data volume, and the speed of analytic operations
  • Discover how to build a framework for data diversity and universal access
  • Learn common methods for improving database performance and SLAs
  • Examine the organizational requirements that a fast universal data access platform must meet
  • Explore a case study that demonstrates how components work together to form a multiaccess, high-volume, high-performance interface

About the author: Christopher Gardner is the campus Tableau application administrator at the University of Michigan, controlling security, updates, and performance maintenance.

Event-Driven Real-Time Supply Chain Ecosystem Powered by Lakehouse

As the backbone of Australia’s supply chain, the Australian Rail Track Corporation (ARTC) plays a vital role in managing and monitoring the transportation of goods across its 8,500 km rail network throughout Australia. ARTC provides weighbridges along its track which read train weights as trains pass at speeds of up to 60 kilometers an hour. This information is highly valuable and is required by both ARTC and its customers to provide accurate haulage weight details, analyze technical equipment, and help ensure wagons have been loaded correctly.

A total of 750 trains run across the 8,500 km network each day, generating real-time data at approximately 50 sensor platforms. With the help of Structured Streaming and Delta Lake, ARTC was able to analyze and store:

  • Precise train location
  • Weight of the train in real-time
  • Train crossing times, down to the second
  • Train speed, temperature, sound frequency, and friction
  • Train schedule lookups

Once all the IoT data has been pulled together from an IoT event hub, it is processed in real time using Structured Streaming and stored in Delta Lake. To track train GPS locations, API calls are made from the Lakehouse once per minute per train. API calls are also made in real time to another scheduling system to look up customer information. Once the processed and enriched data is stored in Delta Lake, an API layer on top of it exposes this data to all consumers.
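The enrichment flow described above can be sketched in plain Python. This is only a simulation of the logic: the real pipeline uses Spark Structured Streaming writing to Delta Lake, and the field names, train ID, and schedule lookup below are made up for illustration.

```python
# Illustrative sketch: a weighbridge event arrives, is enriched with a
# schedule-system lookup, and is appended to a (simulated) Delta table.
schedule_lookup = {"T123": {"customer": "AcmeFreight", "destination": "Sydney"}}

delta_table = []  # stands in for a Delta Lake table

def process_event(event):
    enriched = dict(event)
    # Real pipeline: a per-train API call to the scheduling system.
    enriched.update(schedule_lookup.get(event["train_id"], {}))
    # Real pipeline: writeStream into Delta; here we just append.
    delta_table.append(enriched)
    return enriched

row = process_event({"train_id": "T123", "weight_t": 1250.0, "speed_kmh": 58.0})
```

The enriched row now carries both the sensor reading and the customer context, which is what the API layer on top of Delta Lake would serve to consumers.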

The outcome: increased transparency on weight data, which is now made available to customers; a digital data ecosystem that ARTC’s customers now use to meet their KPIs and planning targets; and the ability to determine temporary speed restrictions across the network to improve train scheduling accuracy and to schedule network maintenance based on train schedules and speed.

Talk by: Deepak Sekar and Harsh Mishra

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored: AWS-Real Time Stream Data & Vis Using Databricks DLT, Amazon Kinesis, & Amazon QuickSight

Amazon Kinesis Data Analytics is a managed service that can capture streaming data from IoT devices. The Databricks Lakehouse Platform makes it easy to process streaming and batch data using Delta Live Tables. Amazon QuickSight provides advanced visualization capabilities with direct integration with Databricks. Combining these services, customers can capture, process, and visualize data from hundreds of thousands of IoT sensors with ease.

Talk by: Venkat Viswanathan

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI


Leveraging IoT Data at Scale to Mitigate Global Water Risks Using Apache Spark™ Streaming and Delta

Every year, billions of dollars are lost due to water risks from storms, floods, and droughts. Water data scarcity and excess are issues that risk models cannot overcome, creating a world of uncertainty. Divirod is building a platform of water data by normalizing diverse data sources of varying velocity into one unified data asset. In addition to publicly available third-party datasets, we are rapidly deploying our own IoT sensors. These sensors ingest signals at a rate of about 100,000 messages per hour into preprocessing, signal-processing, analytics, and postprocessing workloads in one Spark Streaming pipeline to enable critical real-time decision-making processes. By leveraging streaming architecture, we were able to reduce end-to-end latency from tens of minutes to just a few seconds.

We are leveraging Delta Lake to provide a single query interface across multiple tables of this continuously changing data. This enables data science and analytics workloads to always use the most current and comprehensive information available. In addition to the obvious schema transformations, we implement data quality metrics and datum conversions to provide a trustworthy unified dataset.
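The "data quality metrics and datum conversions" step can be illustrated with a small pure-Python sketch. The sensor names, datum offsets, and range check below are entirely hypothetical; the point is only the shape of the step: normalize each reading to a common datum and attach a quality flag before it enters the unified dataset.

```python
# Illustrative sketch: normalize water-level readings from different gauges to
# a common vertical datum and attach a crude quality flag. All values made up.
SENSOR_OFFSETS_M = {"gauge-A": 0.15, "gauge-B": -0.08}  # hypothetical datum offsets

def normalize(reading):
    offset = SENSOR_OFFSETS_M.get(reading["sensor"], 0.0)
    level = reading["level_m"] + offset
    return {
        "sensor": reading["sensor"],
        "level_m": round(level, 3),
        # Simple plausibility range stands in for real quality metrics.
        "quality": "ok" if 0.0 <= level <= 50.0 else "suspect",
    }

out = normalize({"sensor": "gauge-A", "level_m": 2.40})
```

Downstream analytics can then filter on the quality flag instead of re-deriving trust per query, which is what makes the unified table "trustworthy" in practice.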

Talk by: Adam Wilson and Heiko Udluft



Apache Spark™ Streaming and Delta Live Tables Accelerates KPMG Clients For Real Time IoT Insights

Unplanned downtime in manufacturing costs firms up to a trillion dollars annually. Time that materials spend sitting on a production line is lost revenue. Even just 15 hours of downtime a week adds up to over 800 hours of downtime yearly. Internet of Things (IoT) devices can cut this time down by providing detailed machine metrics. However, IoT predictive maintenance is challenged by the lack of effective, scalable infrastructure and machine learning solutions. IoT data can amount to multiple terabytes per day and can come in a variety of formats. Furthermore, without insights and analysis, this data becomes just another table.

The KPMG Databricks IoT Accelerator is a comprehensive solution that gives manufacturing plant operators a bird’s-eye view of their machines’ health and empowers proactive machine maintenance across their portfolio of IoT devices. The accelerator ingests IoT streaming data at scale and implements the Databricks medallion architecture, leveraging Delta Live Tables to clean and process data. Real-time machine learning models are developed from IoT machine measurements and managed in MLflow. The AI predictions and IoT device readings are compiled in the gold table, powering downstream dashboards such as Tableau. Dashboards inform machine operators not only of machines’ ailments, but also of actions they can take to mitigate issues before they arise. Operators can see fault history to aid in understanding failure trends, and can filter dashboards by fault type, machine, or specific sensor reading.
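The medallion flow mentioned above (raw "bronze" data cleaned into "silver", then aggregated into "gold" for dashboards) can be sketched in a few lines of plain Python. The real accelerator does this with Delta Live Tables; the machine names, fields, and cleaning rule here are illustrative only.

```python
# Illustrative medallion sketch: bronze holds raw readings as ingested,
# silver drops invalid rows, gold holds per-machine aggregates for dashboards.
bronze = [
    {"machine": "press-1", "temp_c": 71.0},
    {"machine": "press-1", "temp_c": None},   # bad sensor reading
    {"machine": "press-1", "temp_c": 75.0},
]

# Silver: cleaning step removes rows that fail basic expectations.
silver = [r for r in bronze if r["temp_c"] is not None]

# Gold: per-machine average temperature, the kind of aggregate a
# downstream dashboard would read.
machines = {r["machine"] for r in silver}
gold = {
    m: sum(r["temp_c"] for r in silver if r["machine"] == m)
       / sum(1 for r in silver if r["machine"] == m)
    for m in machines
}
```

Each layer is derived from the one before it, so a bad reading is dropped once in silver rather than being re-filtered by every dashboard query.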

Talk by: MacGregor Winegard


The industrial environment offers a lot of interesting use cases for data enthusiasts. There are myriad interesting challenges that can be solved by data scientists. However, collecting industrial data in general, and industrial IoT (IIoT) data in particular, is cumbersome and not really appealing for anyone who just wants to work with data. Apache StreamPipes addresses this pitfall and allows anyone to extract data from IIoT data sources without messing around with (old-fashioned) protocols. In addition, StreamPipes’ newly developed Python client now gives Pythonistas the ability to programmatically access these data sources and work with them in a Pythonic way.

This talk will provide a basic introduction to the functionality of Apache StreamPipes itself, followed by a deeper discussion of the Python client. Finally, a live demo will show how IIoT data can easily be retrieved in Python and used directly for visualization and ML model training.

Datatopics is a podcast presented by Kevin Missoorten that talks about the fuzzy and misunderstood concepts in the world of data, analytics, and AI and gets to the bottom of things.

In today's episode, the second in our series on collaborative data ecosystems, we're diving into the world of collaborative intelligence, covering topics like federated learning, swarm learning, Edge AI, and more groundbreaking approaches that are transforming the landscape of machine learning.

Join our expert guests Thomas Huybrechts and Virginie Marelli as we explore the inner workings of this innovative approach. We'll delve into the core concepts of federated learning, including how it enables organizations to leverage the collective knowledge of distributed data while maintaining data privacy and security. We'll also discuss the practical applications of federated learning in various domains, such as healthcare, finance, and IoT, and how it is being used to address real-world challenges.

Datatopics is brought to you by Dataroots. Music: The Gentlemen by DivKid. The thumbnail was generated by Midjourney.

On today’s episode, we’re talking to Dylan Barrell, Chief Technology Officer at Deque Systems, Inc, a web accessibility software and services company aimed at giving everyone, regardless of ability, equal access to information, services and applications on the web.

We talk about:

  • Dylan’s background and what Deque does.
  • The importance of accessibility in software.
  • Dylan’s book, “Agile Accessibility Handbook,” and why he wrote it.
  • Tools to identify accessibility issues in software.
  • Countries that are leading the way around SaaS accessibility.
  • Advice for smaller, newer SaaS companies to prioritize accessibility.
  • How tech trends like AI, the IoT and algorithms have impacted accessibility.

Dylan Barrell - https://www.linkedin.com/in/dylanbarrell/ Deque Systems - https://www.linkedin.com/company/deque-systems-inc/

This episode is brought to you by Qrvey

The tools you need to take action with your data, on a platform built for maximum scalability, security, and cost efficiencies. If you’re ready to reduce complexity and dramatically lower costs, contact us today at qrvey.com.

Qrvey, the modern no-code analytics solution for SaaS companies on AWS.

#saas #analytics #AWS #BI

Detecting Data Anomalies via an Inspection Layer

Let's face it, we can't get enough data these days, and we often ingest it from various sources like vendors, IoT devices, and more. Unfortunately, you've likely encountered times when the data just isn't what you're expecting. For instance, when the data has nulls or duplicates, or is arranged differently than the schema specification, this can be a weak point for many data pipelines. We'll showcase a way to handle this using dbt-native methods to implement an inspection layer, ensuring erroneous data sets can be flagged and quarantined while the rest load uninterrupted.
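The inspection-layer idea (the talk implements it with dbt models and tests) can be expressed as a small pure-Python sketch: check each row for nulls and duplicate keys, quarantine the failures, and let the clean rows load. The field names and rules below are illustrative assumptions, not taken from the talk.

```python
# Illustrative inspection layer: rows failing null/duplicate checks are
# quarantined for review; clean rows continue to the load step uninterrupted.
def inspect(rows, key="id", required=("id", "value")):
    seen, clean, quarantine = set(), [], []
    for row in rows:
        bad = any(row.get(f) is None for f in required) or row.get(key) in seen
        if bad:
            quarantine.append(row)
        else:
            seen.add(row[key])
            clean.append(row)
    return clean, quarantine

clean, bad = inspect([
    {"id": 1, "value": 10},
    {"id": 1, "value": 10},    # duplicate key -> quarantined
    {"id": 2, "value": None},  # null value -> quarantined
])
```

In dbt the same split is typically done with a staging model that routes failing rows to a quarantine table while the passing rows feed downstream models.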

Check the slides here: https://docs.google.com/presentation/d/11Q9wwMfyz6xuxMXCPizFg4DKSY_zOIHPNOrsNI8oBn8/edit?usp=sharing

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.

In today’s episode, we’re talking to Andy Serwatuk, Director of Solutions Architecture at Onix Networking Corp., a  Google Cloud Premier Partner enabling companies to effectively leverage the Google Cloud Platform across industries and use cases.

We discuss:

  • Andy’s background and how he started at Onix.
  • The differences between SaaS and non-SaaS companies.
  • Is Google Cloud a no-brainer for SaaS companies today?
  • The value of outsourcing tasks to citizens.
  • How can SaaS companies learn more about IoT and other emerging trends?
  • …and much more.


Today I’m chatting with Katy Pusch, Senior Director of Product and Integration for Cox2M. Katy describes the lessons she’s learned around making sure that the “juice is always worth the squeeze” for new users to adopt data solutions into their workflow. She also explains the methodologies she’d recommend to data & analytics professionals to ensure their IoT and data products are widely adopted. Listen in to find out why this former analyst turned data product leader feels it’s crucial to focus on more than just delivering data or AI solutions, and how spending more time upfront performing qualitative research on users can wind up being more efficient in the long run than jumping straight into development.

Highlights/ Skip to:

  • What Katy does at Cox2M, and why the data product manager role is so hard to define (01:07)
  • Defining the value of the data in workflows and how that’s approached at Cox2M (03:13)
  • Who buys from Cox2M and the customer problems that Katy’s product solves (05:57)
  • How Katy approaches the zero-to-one process of taking IoT sensor data and turning it into a customer experience that provides a valuable solution (08:00)
  • What Katy feels best motivates the adoption of a new solution for users (13:21)
  • How Katy spends more time upfront before development to ensure she’s solving the right problems for users (16:13)
  • Katy’s views on the importance of data science & analytics pros being able to communicate in the language of their audience (20:47)
  • The differences Katy sees between designing data products for sophisticated data users vs. a broader audience (24:13)
  • The methods Katy uses to effectively perform qualitative research and her triangulation method to surface the real needs of end users (27:29)
  • Katy’s views on the most valuable skills for future data product managers (35:24)

Quotes from Today’s Episode

“I’ve had the opportunity to get a little bit closer to our customers than I was in the beginning parts of my tenure here at Cox2M. And it’s just like a SaaS product in the sense that the quality of your data is still dependent on your customers’ workflows and their ability to engage in workflows that supply accurate data. And it’s been a little bit enlightening to realize that the same is true for IoT.” – Katy Pusch (02:11)

“Providing insights to executives that are [simply] interesting is not really very impactful. You want to provide things that are actionable and that drive the business forward.” – Katy Pusch (4:43)

“So, there’s one side of it, which is [the] happy path: figure out a way to embed your product in the customer’s existing workflow. That’s where the most success happens. But in the situation we find ourselves in right now with [this IoT solution], we do have to ask them to change their workflow.”-- Katy Pusch (12:46)

“And the way to communicate [the insight to other stakeholders] is not with being more precise with your numbers [or adding] statistics. It’s just to communicate the output of your analysis more clearly to the person who needs to be able to make a decision.” -- Katy Pusch (23:15)

“You have to define ‘What decision is my user making on a repeated basis that is worth building something that it does automatically?’ And so, you say, ‘What are the questions that my user needs answers to on a repeated basis?’ … At its essence, you’re answering three or four questions for that user [that] have to be the most important [...] questions for your user to add value. And that can be a difficult thing to derive with confidence.” – Katy Pusch (25:55)

“The piece of workflow [on the IOT side] that’s really impactful there is we’re asking for an even higher degree of change management in that case because we’re asking them to attach this device to their vehicle, and then detach it at a different point in time and there’s a procedure in the solution to allow for that, but someone at the dealership has to engage in that process. So, there’s a change management in the workflow that the juice has to be worth the squeeze to encourage a customer to embark in that journey with you.” – Katy Pusch (12:08)

“Finding people in your organization who have the appetite to be cross-functionally educated, particularly in a data arena, is very important [to] help close some of those communication gaps.” – Katy Pusch (37:03)

Nixtla: Deep Learning for Time Series Forecasting

Time series forecasting has a wide range of applications: finance, retail, healthcare, IoT, etc. Recently, deep learning models such as ESRNN or N-BEATS have proven to deliver state-of-the-art performance on these tasks. nixtlats is a Python library that we have developed to make these state-of-the-art models easy for data scientists and developers to use in production environments. Written in PyTorch, its design is focused on usability and reproducibility of experiments. For this purpose, nixtlats has several modules:

  • Data: contains datasets from various time series competitions.
  • Models: includes state-of-the-art models.
  • Evaluation: provides various loss functions and evaluation metrics.

Objective:

  • Introduce attendees to the challenges of time series forecasting with deep learning.
  • Present commercial applications of time series forecasting.
  • Describe nixtlats, its components, and best practices for training and deploying state-of-the-art models in production.
  • Reproduce state-of-the-art results using nixtlats with the winning model of the M4 time series competition (ESRNN).

Project repository: https://github.com/Nixtla/nixtlats.
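To give a flavor of the evaluation metrics the library's Evaluation module is concerned with, here is the symmetric MAPE, the headline accuracy metric of the M4 competition mentioned above, written in plain Python from the standard definition rather than from nixtlats' own API.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error (in %), as used in M4."""
    assert len(actual) == len(forecast)
    n = len(actual)
    return (200.0 / n) * sum(
        # Guard against 0/0 when both values are zero.
        abs(a - f) / (abs(a) + abs(f)) if (a or f) else 0.0
        for a, f in zip(actual, forecast)
    )

# A perfect first forecast and a 25% miss on the second point.
score = smape([10.0, 20.0], [10.0, 25.0])
```

sMAPE is bounded (0 to 200%) and scale-free, which is why competitions like M4 use it to average accuracy across thousands of series with very different magnitudes.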


Revolutionizing agriculture with AI: Delivering smart industrial solutions built upon a Lakehouse

John Deere is leveraging big data and AI to deliver ‘smart’ industrial solutions that are revolutionizing agriculture and construction, driving sustainability, and ultimately helping to feed the world. The John Deere Data Factory, built upon the Databricks Lakehouse Platform, is at the core of this innovation. It ingests petabytes of data and trillions of records to give data teams fast, reliable access to standardized data sets supporting hundreds of ML and analytics use cases across the organization. From IoT sensor-enabled equipment driving proactive alerts that prevent failures, to precision agriculture that maximizes field output, to optimizing operations in the supply chain, finance, and marketing, John Deere is providing advanced products, technology, and services for customers who cultivate, harvest, transform, enrich, and build upon the land.


Big Data Analytics and Machine Intelligence in Biomedical and Health Informatics

BIG DATA ANALYTICS AND MACHINE INTELLIGENCE IN BIOMEDICAL AND HEALTH INFORMATICS provides coverage of developments and state-of-the-art methods in the broad and diversified data analytics field and applicable areas such as big data analytics, data mining, and machine intelligence in biomedical and health informatics. The novel application of big data analytics and machine intelligence in the biomedical and healthcare sector is an emerging field comprising computer science, medicine, biology, natural environmental engineering, and pattern recognition. Biomedical and health informatics is a new era that brings tremendous opportunities and challenges due to the plentifully available biomedical data, and the aim is to ensure high-quality, efficient healthcare by analyzing that data. The 12 chapters in Big Data Analytics and Machine Intelligence in Biomedical and Health Informatics cover the latest advances and developments in health informatics, data mining, machine learning, and artificial intelligence. They have been organized with respect to the similarity of topics addressed, ranging from issues pertaining to the Internet of Things (IoT) for biomedical engineering and health informatics, to computational intelligence for medical data processing, and the Internet of Medical Things (IoMT). New researchers and practitioners working in the field will benefit from reading the book, as they can quickly ascertain the best-performing methods and compare the different approaches.

Audience: Researchers and practitioners working in the fields of biomedicine, health informatics, big data analytics, Internet of Things, and machine learning.