talk-data.com

Topic: Tableau

Tags: data_visualization, bi, analytics

Activity Trend: peak of 11 activities per quarter, 2020-Q1 through 2026-Q1

Activities: 162 · Newest first

Tableau 2019.x Cookbook

Discover the ultimate guide to Tableau 2019.x that offers over 115 practical recipes to tackle business intelligence and data analysis challenges. This book takes you from the basics to advanced techniques, empowering you to create insightful dashboards, leverage powerful analytics, and seamlessly integrate with modern cloud data platforms. What this Book will help me do Master both basic and advanced functionalities of Tableau Desktop to effectively analyze and visualize data. Understand how to create impactful dashboards and compelling data stories to drive decision-making. Deploy advanced analytical tools including R-based forecasting and statistical techniques with Tableau. Set up and utilize Tableau Server in multi-node environments on Linux and Windows. Utilize Tableau Prep to efficiently clean, shape, and transform data for seamless integration into Tableau workflows. Author(s) The authors of the Tableau 2019.x Cookbook are recognized industry professionals with rich expertise in business intelligence, data analytics, and Tableau's ecosystem. Dmitry Anoshin and his co-authors bring hands-on experience from various industries to provide actionable insights. They focus on delivering practical solutions through structured learning paths. Who is it for? This book is tailored for data analysts, BI developers, and professionals with some knowledge of Tableau who want to enhance their skills. If you're aiming to solve complex analytics challenges or want to fully utilize the capabilities of Tableau products, this book offers the guidance and knowledge you need.

Summary

The past year has been an active one for the timeseries market. New products have been launched, more businesses have moved to streaming analytics, and the team at Timescale has been keeping busy. In this episode the TimescaleDB CEO Ajay Kulkarni and CTO Michael Freedman stop by to talk about their 1.0 release, how the use cases for timeseries data have proliferated, and how they are continuing to simplify the task of processing your time-oriented events.
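Since the conversation centers on how TimescaleDB layers timeseries features on top of vanilla PostgreSQL, a minimal sketch of what that looks like from client code may help. The database name and credentials below are hypothetical; create_hypertable and time_bucket are standard TimescaleDB SQL functions.

```python
# Minimal TimescaleDB sketch, assuming a local PostgreSQL instance with the
# timescaledb extension installed and a hypothetical "metrics" database.
import psycopg2

conn = psycopg2.connect("dbname=metrics user=postgres password=secret host=localhost")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS conditions (
        time        TIMESTAMPTZ NOT NULL,
        location    TEXT        NOT NULL,
        temperature DOUBLE PRECISION
    );
""")
# create_hypertable() is TimescaleDB's entry point: it partitions the table
# into time-based chunks while it still behaves like a regular table.
cur.execute("SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE);")

cur.execute("INSERT INTO conditions VALUES (now(), 'office', 21.4);")
# time_bucket() groups rows into fixed intervals for aggregation.
cur.execute("""
    SELECT time_bucket('15 minutes', time) AS bucket, avg(temperature)
    FROM conditions GROUP BY bucket ORDER BY bucket;
""")
print(cur.fetchall())
conn.commit()
```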

Introduction

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Your host is Tobias Macey and today I’m welcoming Ajay Kulkarni and Mike Freedman back to talk about how TimescaleDB has grown and changed over the past year.

Interview

Introduction
How did you get involved in the area of data management?
Can you refresh our memory about what TimescaleDB is?
How has the market for timeseries databases changed since we last spoke?
What has changed in the focus and features of the TimescaleDB project and company?
Toward the end of 2018 you launched the 1.0 release of Timescale. What were your criteria for establishing that milestone?

What were the most challenging aspects of reaching that goal?

In terms of timeseries workloads, what are some of the factors that differ across varying use cases?

How do those differences impact the ways in which Timescale is used by the end user, and built by your team?

What are some of the initial assumptions that you made while first launching Timescale that have held true, and which have been disproven?
How have the improvements and new features in the recent releases of PostgreSQL impacted the Timescale product?

Have you been able to leverage some of the native improvements to simplify your implementation?
Are there any use cases for Timescale that would have been previously impractical in vanilla Postgres that would now be reasonable without the help of Timescale?

What is in store for the future of the Timescale product and organization?

Contact Info

Ajay

@acoustik on Twitter LinkedIn

Mike

LinkedIn Website @michaelfreedman on Twitter

Timescale

Website Documentation Careers timescaledb on GitHub @timescaledb on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

TimescaleDB Original Appearance on the Data Engineering Podcast 1.0 Release Blog Post PostgreSQL

Podcast Interview

RDS DB-Engines MongoDB IOT (Internet Of Things) AWS Timestream Kafka Pulsar

Podcast Episode

Spark

Podcast Episode

Flink

Podcast Episode

Hadoop DevOps PipelineDB

Podcast Interview

Grafana Tableau Prometheus OLTP (Online Transaction Processing) Oracle DB Data Lake

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Good Charts Workbook

Talk. Sketch. Prototype. Repeat. You know right away when you see an effective chart or graphic. It hits you with an immediate sense of its meaning and impact. But what actually makes it clearer, sharper, and more effective? If you're ready to create your own "good charts"--data visualizations that powerfully communicate your ideas and research and that advance your career—the Good Charts Workbook is the hands-on guide you've been looking for. The original Good Charts changed the landscape by helping readers understand how to think visually and by laying out a process for creating powerful data visualizations. Now, the Good Charts Workbook provides tools, exercises, and practical insights to help people in all kinds of enterprises gain the skills they need to get started. Harvard Business Review Senior Editor and dataviz expert Scott Berinato leads you, step-by-step, through the key challenges in creating good charts—controlling color, crafting for clarity, choosing chart types, practicing persuasion, capturing concepts—with warm-up exercises and mini-challenges for each. The Workbook includes helpful prompts and reminders throughout, as well as white space for users to practice the Good Charts talk-sketch-prototype process. Good Charts Workbook is the must-have manual for better understanding the dataviz around you and for creating better charts to make your case more effectively.

Tableau 10 Complete Reference

Explore and understand data with the powerful data visualization techniques of Tableau, and then communicate insights in powerful ways Key Features Apply best practices in data visualization and chart types exploration Explore the latest version of Tableau Desktop with hands-on examples Understand the fundamentals of Tableau storytelling Book Description Graphical presentation of data enables us to easily understand complex data sets. Tableau 10 Complete Reference provides easy-to-follow recipes with several use cases and real-world business scenarios to get you up and running with Tableau 10. This Learning Path begins with the history of data visualization and its importance in today's businesses. You'll also be introduced to Tableau - how to connect, clean, and analyze data in this visual analytics software. Then, you'll learn how to apply what you've learned by creating some simple calculations in Tableau and using Table Calculations to help drive greater analysis from your data. Next, you'll explore different advanced chart types in Tableau. These chart types require you to have some understanding of the Tableau interface and understand basic calculations. You'll study in detail all dashboard techniques and best practices. A number of recipes specifically for geospatial visualization, analytics, and data preparation are also covered. Last but not least, you'll learn about the power of storytelling through the creation of interactive dashboards in Tableau. Through this Learning Path, you will gain confidence and competence to analyze and communicate data and insights more efficiently and effectively by creating compelling interactive charts, dashboards, and stories in Tableau. This Learning Path includes content from the following Packt products: Learning Tableau 10 - Second Edition by Joshua N. Milligan Getting Started with Tableau 2018.x by Tristan Guillevin What you will learn Build effective visualizations, dashboards, and story points Build basic to more advanced charts with step-by-step recipes Become familiar with row-level, aggregate, and table calculations Dig deep into data with clustering and distribution models Prepare and transform data for analysis Leverage Tableau's mapping capabilities to visualize data Use data storytelling techniques to aid decision making strategy Who this book is for Tableau 10 Complete Reference is designed for anyone who wants to understand their data better and represent it in an effective manner. It also serves BI professionals and data analysts who want to do better at their jobs. Downloading the example code for this book You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Summary

When your data lives in multiple locations, belonging to at least as many applications, it is exceedingly difficult to ask complex questions of it. The default way to manage this situation is by crafting pipelines that will extract the data from source systems and load it into a data lake or data warehouse. In order to make this situation more manageable and allow everyone in the business to gain value from the data, the folks at Dremio built a self-service data platform. In this episode Tomer Shiran, CEO and co-founder of Dremio, explains how it fits into the modern data landscape, how it works under the hood, and how you can start using it today to make your life easier.
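For a feel of what self-service access means in practice, here is a rough sketch of submitting a query to Dremio over its REST API from Python. The host and credentials are hypothetical, and the endpoint paths follow Dremio's documented v3 API, but treat them as assumptions to verify against your version.

```python
# Hedged sketch of Dremio's REST workflow: authenticate, submit SQL as an
# asynchronous job, poll for completion, then fetch the result rows.
import time
import requests

BASE = "http://dremio.example.com:9047"  # hypothetical coordinator address

# Login returns a token that Dremio expects in a "_dremio<token>" header.
token = requests.post(f"{BASE}/apiv2/login",
                      json={"userName": "admin", "password": "secret"}).json()["token"]
headers = {"Authorization": f"_dremio{token}"}

# Submit a SQL statement; Dremio runs it as a job and returns the job id.
job_id = requests.post(f"{BASE}/api/v3/sql", headers=headers,
                       json={"sql": 'SELECT * FROM Samples."zips.json" LIMIT 5'}).json()["id"]

# Poll until the job reaches a terminal state.
while True:
    state = requests.get(f"{BASE}/api/v3/job/{job_id}", headers=headers).json()["jobState"]
    if state in ("COMPLETED", "FAILED", "CANCELED"):
        break
    time.sleep(1)

rows = requests.get(f"{BASE}/api/v3/job/{job_id}/results", headers=headers).json()["rows"]
print(rows)
```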

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Your host is Tobias Macey and today I’m interviewing Tomer Shiran about Dremio, the open source data as a service platform.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by explaining what Dremio is and how the project and business got started?

What was the motivation for keeping your primary product open source?
What is the governance model for the project?

How does Dremio fit in the current landscape of data tools?

What are some use cases that Dremio is uniquely equipped to support?
Do you think that Dremio obviates the need for a data warehouse or large scale data lake?

How is Dremio architected internally?

How has that architecture evolved from when it was first built?

There is a large array of components (e.g. governance, lineage, catalog) built into Dremio that are often found in dedicated products. What are some of the strategies that you have as a business and development team to manage and integrate the complexity of the product?

What are the benefits of integrating all of those capabilities into a single system? What are the drawbacks?

One of the useful features of Dremio is the granular access controls. Can you discuss how those are implemented and controlled?
For someone who is interested in deploying Dremio to their environment what is involved in getting it installed?

What are the scaling factors?

What are some of the most exciting features that have been added in recent releases?
When is Dremio the wrong choice?
What have been some of the most challenging aspects of building, maintaining, and growing the technical and business platform of Dremio?
What do you have planned for the future of Dremio?

Contact Info

Tomer

@tshiran on Twitter LinkedIn

Dremio

Website @dremio on Twitter dremio on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Dremio MapR Presto Business Intelligence Arrow Tableau Power BI Jupyter OLAP Cube Apache Foundation Hadoop Nikon DSLR Spark ETL (Extract, Transform, Load) Parquet Avro K8s Helm Yarn Gandiva Initiative for Apache Arrow LLVM TLS

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Summary

One of the most complex aspects of managing data for analytical workloads is moving it from a transactional database into the data warehouse. What if you didn’t have to do that at all? MemSQL is a distributed database built to support concurrent use by transactional, application-oriented workloads and analytical, high-volume workloads on the same hardware. In this episode the CEO of MemSQL describes how the company and database got started, how it is architected for scale and speed, and how it is being used in production. This was a deep dive on how to build a successful company around a powerful platform, and how that platform simplifies operations for enterprise grade data management.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
You work hard to make sure that your data is reliable and accurate, but can you say the same about the deployment of your machine learning models? The Skafos platform from Metis Machine was built to give your data scientists the end-to-end support that they need throughout the machine learning lifecycle. Skafos maximizes interoperability with your existing tools and platforms, and offers real-time insights and the ability to be up and running with cloud-based production scale infrastructure instantaneously. Request a demo at dataengineeringpodcast.com/metis-machine to learn more about how Metis Machine is operationalizing data science.
And the team at Metis Machine has shipped a proof-of-concept integration between the Skafos machine learning platform and the Tableau business intelligence tool, meaning that your BI team can now run the machine learning models custom built by your data science team. If you think that sounds awesome (and it is) then join the free webinar with Metis Machine on October 11th at 2 PM ET (11 AM PT). Metis Machine will walk through the architecture of the extension, demonstrate its capabilities in real time, and illustrate the use case for empowering your BI team to modify and run machine learning models directly from Tableau. Go to metismachine.com/webinars now to register.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Your host is Tobias Macey and today I’m interviewing Nikita Shamgunov about MemSQL, a NewSQL database built for simultaneous transactional and analytic workloads.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing what MemSQL is and how the product and business first got started?
What are the typical use cases for customers running MemSQL?
What are the benefits of integrating the ingestion pipeline with the database engine?
What are some typical ways that the ingest capability is leveraged by customers?
How is MemSQL architected and how has the internal design evolved from when you first started working on it?
Where does it fall on the axes of the CAP theorem?
How much processing overhead is involved in the conversion from the column-oriented data stored on disk to the row-oriented data stored in memory?
Can you describe the lifecycle of a write transaction?
Can you discuss the techniques that are used in MemSQL to optimize for speed and overall system performance?
How do you mitigate the impact of network latency throughout the cluster during query planning and execution?
How much of the implementation of MemSQL is custom built code vs. open source projects?
What are some of the common difficulties that your customers encounter when building on top of or migrating to MemSQL?
What have been some of the most challenging aspects of building and growing the technical and business implementation of MemSQL?
When is MemSQL the wrong choice for a data platform?
What do you have planned for the future of MemSQL?

Contact Info

@nikitashamgunov on Twitter LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

MemSQL NewSQL Microsoft SQL Server St. Petersburg University of Fine Mechanics And Optics C C++ In-Memory Database RAM (Random Access Memory) Flash Storage Oracle DB PostgreSQL Podcast Episode Kafka Kinesis Wealth Management Data Warehouse ODBC S3 HDFS Avro Parquet Data Serialization Podcast Episode Broadcast Join Shuffle Join CAP Theorem Apache Arrow LZ4 S2 Geospatial Library Sybase SAP Hana Kubernetes

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
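Because MemSQL speaks the MySQL wire protocol, any stock MySQL client can exercise the mixed-workload pattern discussed in the episode. The sketch below uses hypothetical connection details to create a columnstore table, run a transactional insert, and issue an analytical aggregate against the same engine.

```python
# Hedged sketch: MemSQL is MySQL-wire-compatible, so pymysql works as-is.
# Host, credentials, and schema below are illustrative assumptions.
import pymysql

conn = pymysql.connect(host="memsql-master.example.com", port=3306,
                       user="root", password="secret", database="metrics")

with conn.cursor() as cur:
    # A clustered columnstore table for analytical scans.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS events (
            ts DATETIME NOT NULL,
            device_id INT NOT NULL,
            reading DOUBLE,
            KEY (ts) USING CLUSTERED COLUMNSTORE
        )
    """)
    # Transactional write and analytical read against the same engine.
    cur.execute("INSERT INTO events VALUES (NOW(), 42, 98.6)")
    conn.commit()
    cur.execute("SELECT device_id, AVG(reading) FROM events GROUP BY device_id")
    for row in cur.fetchall():
        print(row)
```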

Getting Started with Tableau 2018.x

Dive into the world of data visualization with "Getting Started with Tableau 2018.x." This comprehensive guide introduces you to both the fundamental and advanced functionalities of Tableau 2018.x, making it easier to create impactful data visualizations. Learn to unlock Tableau's full potential through practical examples and clear explanations. What this Book will help me do Understand the new Tableau 2018.x features like density, extensions, and transparency and how to leverage them. Learn how to connect to data sources, perform transformations, and build efficient data models to support your analysis. Master visualization techniques to design effective and insightful dashboards tailored to business needs. Explore advanced concepts such as calculations, cross-database joins, and data blending to handle complex scenarios. Develop the confidence to publish and interact with content on Tableau Server and share your insights effectively. Author(s) Tristan Guillevin and co-author Pires are data visualization experts with extensive experience using Tableau. They aim to make data analysis accessible through hands-on examples and easy-to-follow explanations. Their writing balances clear instruction with practical application, making advanced concepts understandable for all readers. Who is it for? This book is ideal for beginners or experienced BI professionals who wish to gain expertise in Tableau 2018.x. It caters to aspiring analysts and business professionals looking to answer complex business-specific questions through data visualization. Regardless of prior experience in Tableau or other BI tools, this book provides value through a structured learning approach.

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), Taylor Udell (Heap), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Business Intelligence. It's a term that's been around for a few decades, but that is every bit as difficult to nail down as "data science," "big data," or a jellyfish. Think too hard about it, and you might actually find yourself struggling to define "analytics!" With the latest generation of BI tools, though, it's a topic that is making the rounds at cocktail parties the world over! (Cocktail parties just aren't what they used to be.) On this episode, the crew snags Taylor Udell from Heap to join in a discussion on the subject, and Moe (unsuccessfully) attempts to end the episode after six minutes. Possibly because neither Tableau nor Superset can definitively prove where avocado toast originated (but Wikipedia backs her up). But we all know Tim can't be shut up that quickly, right?! For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

In this episode, Wayne Eckerson and Jen Underwood explore a new era of analytics. Data volumes and complexity have exceeded the limits of current manual drag-and-drop analytics solutions. Data moves at the speed of light while speed-to-insight lags farther and farther behind. It is time to explore intelligent, next generation, machine-powered analytics to retain your competitive edge. It is time to combine the best of the human mind and machine.

Underwood is an analytics expert and founder of Impact Analytix. She is a former product manager at Microsoft who spearheaded the design and development of the reinvigorated version of Power BI, which has since become a market-leading BI tool. Underwood is an IBM Analytics Insider, SAS contributor, former Tableau Zen Master, Top 10 Women Influencer and active analytics community member. She is keenly interested in the intersection of data visualization and data science and writes and speaks persuasively about these topics.

Visual Data Storytelling with Tableau, First edition

Tell Insightful, Actionable Business Stories with Tableau, the World’s Leading Data Visualization Tool! Visual Data Storytelling with Tableau brings together knowledge, context, and hands-on skills for telling powerful, actionable data stories with Tableau. This full-color guide shows how to organize data and structure analysis with storytelling in mind, embrace exploration and visual discovery, and articulate findings with rich data, carefully curated visualizations, and skillfully crafted narrative. You don’t need any visualization experience. Each chapter illuminates key aspects of design practice and data visualization, and guides you step-by-step through applying them in Tableau. Through realistic examples and classroom-tested exercises, Professor Lindy Ryan helps you use Tableau to analyze data, visualize it, and help people connect more intuitively and emotionally with it. Whether you’re an analyst, executive, student, instructor, or journalist, you won’t just master the tools: you’ll learn to craft data stories that make an immediate impact--and inspire action. Learn how to: Craft more powerful stories by blending data science, genre, and visual design Ask the right questions upfront to plan data collection and analysis Build storyboards and choose charts based on your message and audience Direct audience attention to the points that matter most Showcase your data stories in high-impact presentations Integrate Tableau storytelling throughout your business communication Explore case studies that show what to do--and what not to do Discover visualization best practices, tricks, and hacks you can use with any tool Includes coverage up through Tableau 10

Summary

Business Intelligence software is often cumbersome and requires specialized knowledge of the tools and data to be able to ask and answer questions about the state of the organization. Metabase is a tool built with the goal of making the act of discovering information and asking questions of an organization's data easy and self-service for non-technical users. In this episode the CEO of Metabase, Sameer Al-Sakran, discusses how and why the project got started, the ways that it can be used to build and share useful reports, some of the useful features planned for future releases, and how to get it set up to start using it in your environment.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
Your host is Tobias Macey and today I’m interviewing Sameer Al-Sakran about Metabase, a free and open source tool for self service business intelligence.

Interview

Introduction
How did you get involved in the area of data management?
The current goal for most companies is to be “data driven”. How would you define that concept?

How does Metabase assist in that endeavor?

What is the ratio of users that take advantage of the GUI query builder as opposed to writing raw SQL?

What level of complexity is possible with the query builder?

What have you found to be the typical use cases for Metabase in the context of an organization?
How do you manage scaling for large or complex queries?
What was the motivation for using Clojure as the language for implementing Metabase?
What is involved in adding support for a new data source?
What are the differentiating features of Metabase that would lead someone to choose it for their organization?
What have been the most challenging aspects of building and growing Metabase, both from a technical and business perspective?
What do you have planned for the future of Metabase?
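Beyond the GUI query builder discussed above, Metabase also exposes an HTTP API. The following sketch, with hypothetical host, credentials, and database id, shows the session-token handshake and an ad-hoc native query; the endpoints mirror Metabase's documented API, but treat them as assumptions to check against your deployment.

```python
# Hedged sketch of Metabase's HTTP API using the requests library.
import requests

BASE = "http://metabase.example.com:3000"  # hypothetical host

# Exchange credentials for a session token.
session = requests.post(f"{BASE}/api/session",
                        json={"username": "analyst@example.com",
                              "password": "secret"}).json()["id"]
headers = {"X-Metabase-Session": session}

# Run an ad-hoc native SQL query against database id 1 (hypothetical).
payload = {
    "database": 1,
    "type": "native",
    "native": {"query": "SELECT count(*) FROM orders"},
}
result = requests.post(f"{BASE}/api/dataset", headers=headers, json=payload).json()
print(result["data"]["rows"])
```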

Contact Info

Sameer

salsakran on GitHub @sameer_alsakran on Twitter LinkedIn

Metabase

Website @metabase on Twitter metabase on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Expa Metabase Blackjet Hadoop Imeem Maslow’s Hierarchy of Data Needs 2 Sided Marketplace Honeycomb Interview Excel Tableau Go-JEK Clojure React Python Scala JVM Redash How To Lie With Data Stripe Braintree Payments

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Practical Tableau

Whether you have some experience with Tableau software or are just getting started, this manual goes beyond the basics to help you build compelling, interactive data visualization applications. Author Ryan Sleeper, one of the world’s most qualified Tableau consultants, complements his web posts and instructional videos with this guide to give you a firm understanding of how to use Tableau to find valuable insights in data. Over five sections, Sleeper (recognized as a Tableau Zen Master, Tableau Public Visualization of the Year author, and Tableau Iron Viz Champion) provides visualization tips, tutorials, and strategies to help you avoid the pitfalls and take your Tableau knowledge to the next level. Practical Tableau sections include:
Fundamentals: get started with Tableau from the beginning
Chart types: use step-by-step tutorials to build a variety of charts in Tableau
Tips and tricks: learn innovative uses of parameters, color theory, how to make your Tableau workbooks run efficiently, and more
Framework: explore the INSIGHT framework, a proprietary process for building Tableau dashboards
Storytelling: learn tangible tactics for storytelling with data, including specific and actionable tips you can implement immediately

Summary

As communications between machines become more commonplace the need to store the generated data in a time-oriented manner increases. The market for timeseries data stores has many contenders, but they are not all built to solve the same problems or to scale in the same manner. In this episode the founders of TimescaleDB, Ajay Kulkarni and Mike Freedman, discuss how Timescale was started, the problems that it solves, and how it works under the covers. They also explain how you can start using it in your infrastructure and their plans for the future.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure.
When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
You can help support the show by checking out the Patreon page which is linked from the site.
To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers.
Your host is Tobias Macey and today I’m interviewing Ajay Kulkarni and Mike Freedman about TimescaleDB, a scalable timeseries database built on top of PostgreSQL.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by explaining what Timescale is and how the project got started?
The landscape of time series databases is extensive and oftentimes difficult to navigate. How do you view your position in that market and what makes Timescale stand out from the other options?
In your blog post that explains the design decisions for how Timescale is implemented you call out the fact that the inserted data is largely append only which simplifies the index management. How does Timescale handle out of order timestamps, such as from infrequently connected sensors or mobile devices?
How is Timescale implemented and how has the internal architecture evolved since you first started working on it?

What impact has the 10.0 release of PostgreSQL had on the design of the project?
Is Timescale compatible with systems such as Amazon RDS or Google Cloud SQL?

For someone who wants to start using Timescale what is involved in deploying and maintaining it?
What are the axes for scaling Timescale and what are the points where that scalability breaks down?

Are you aware of anyone who has deployed it on top of Citus for scaling horizontally across instances?

What has been the most challenging aspect of building and marketing Timescale?
When is Timescale the wrong tool to use for time series data?
One of the use cases that you call out on your website is for systems metrics and monitoring. How does Timescale fit into that ecosystem and can it be used along with tools such as Graphite or Prometheus?
What are some of the most interesting uses of Timescale that you have seen?
Which came first, Timescale the business or Timescale the database, and what is your strategy for ensuring that the open source project and the company around it both maintain their health?
What features or improvements do you have planned for future releases of Timescale?

Contact Info

Ajay

LinkedIn @acoustik on Twitter Timescale Blog

Mike

Website LinkedIn @michaelfreedman on Twitter Timescale Blog

Timescale

Website @timescaledb on Twitter GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Timescale PostgreSQL Citus Timescale Design Blog Post MIT NYU Stanford SDN Princeton Machine Data Timeseries Data List of Timeseries Databases NoSQL Online Transaction Processing (OLTP) Object Relational Mapper (ORM) Grafana Tableau Kafka When Boring Is Awesome PostgreSQL RDS Google Cloud SQL Azure DB Docker Continuous Aggregates Streaming Replication PGPool II Kubernetes Docker Swarm Citus Data

Website Data Engineering Podcast Interview

Database Indexing B-Tree Index GIN Index GIST Index STE Energy Redis Graphite Prometheus pg_prometheus OpenMetrics Standard Proposal Timescale Parallel Copy Hadoop PostGIS KDB+ DevOps Internet of Things MongoDB Elastic DataBricks Apache Spark Confluent New Enterprise Associates MapD Benchmark Ventures Hortonworks Two Sigma Ventures CockroachDB Cloudflare EMC Timescale Blog: Why SQL is beating NoSQL, and what this means for the future of data

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

The data available to marketers -- literally at their fingertips by way of a few mouse clicks -- has exploded over the last decade. Yet, while there is more data -- and it is more accessible -- than it has ever been, the way we think about and use data has hardly evolved at all. With the recent advances in cloud computing and processing power, the industry is abuzz with talk of machine learning and artificial intelligence. How, then, will we get from the world of Microsoft Excel (or Tableau) to a world where "the machines" are automatically and dynamically optimizing all aspects of our marketing?

Learning Google BigQuery

If you're ready to tap the potential of data analytics in the cloud, 'Learning Google BigQuery' will take you from understanding foundational concepts to mastering advanced techniques of this powerful platform. Through hands-on examples, you'll learn how to query and analyze massive datasets efficiently, develop custom applications, and integrate your results seamlessly with other tools. What this Book will help me do Understand the fundamentals of Google Cloud Platform and how BigQuery operates within it. Migrate enterprise-scale data seamlessly into BigQuery for further analytics. Master SQL techniques for querying large-scale datasets in BigQuery. Enable real-time data analytics and visualization with tools like Tableau and Python. Learn to create dynamic datasets, manage partition tables and use BigQuery APIs effectively. Author(s) Berlyant, Haridass, and Brown are specialists with years of experience in data science, big data platforms, and cloud technologies. They bring their expertise in data analytics and teaching to make advanced concepts accessible. Their hands-on approach and real-world examples ensure readers can directly apply the skills they acquire to practical scenarios. Who is it for? This book is tailored for developers, analysts, and data scientists eager to leverage cloud-based tools for handling and analyzing large-scale datasets. If you seek to gain hands-on proficiency in working with BigQuery or want to enhance your organization's data capabilities, this book is a fit. No prior BigQuery knowledge is needed, just a willingness to learn.
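As a taste of the querying workflow the book covers, here is a short sketch using Google's official google-cloud-bigquery Python client against one of BigQuery's public sample datasets; credentials are assumed to be configured in the environment.

```python
# Minimal BigQuery sketch; requires `pip install google-cloud-bigquery` and
# GOOGLE_APPLICATION_CREDENTIALS (or other default credentials) set up.
from google.cloud import bigquery

client = bigquery.Client()  # picks up project and credentials from the environment

query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""
# result() blocks until the query job finishes, then yields rows.
for row in client.query(query).result():
    print(row.name, row.total)
```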

Summary

We have tools and platforms for collaborating on software projects and linking them together, wouldn’t it be nice to have the same capabilities for data? The team at data.world are working on building a platform to host and share data sets for public and private use that can be linked together to build a semantic web of information. The CTO, Bryon Jacob, discusses how the company got started, their mission, and how they have built and evolved their technical infrastructure.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure.
When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
Continuous delivery lets you get new features in front of your users as fast as possible without introducing bugs or breaking production and GoCD is the open source platform made by the people at Thoughtworks who wrote the book about it. Go to dataengineeringpodcast.com/gocd to download and launch it today. Enterprise add-ons and professional support are available for added peace of mind.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
You can help support the show by checking out the Patreon page which is linked from the site.
To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers.
This is your host Tobias Macey and today I’m interviewing Bryon Jacob about the technology and purpose that drive data.world.

Interview

Introduction
How did you first get involved in the area of data management?
What is data.world and what is its mission and how does your status as a B Corporation tie into that?
The platform that you have built provides hosting for a large variety of data sizes and types. What does the technical infrastructure consist of and how has that architecture evolved from when you first launched?
What are some of the scaling problems that you have had to deal with as the amount and variety of data that you host has increased?
What are some of the technical challenges that you have been faced with that are unique to the task of hosting a heterogeneous assortment of data sets that are intended for shared use?
How do you deal with issues of privacy or compliance associated with data sets that are submitted to the platform?
What are some of the improvements or new capabilities that you are planning to implement as part of the data.world platform?
What are the projects or companies that you consider to be your competitors?
What are some of the most interesting or unexpected uses of the data.world platform that you are aware of?
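For a concrete sense of the linked-data platform described here, the sketch below uses the official datadotworld Python SDK against data.world's own intro sample dataset; the dataset slug and table name come from their tutorial material, and an API token is assumed to be configured locally.

```python
# Hedged sketch using the datadotworld SDK (`pip install datadotworld`);
# assumes a token configured via `dw configure` or DW_AUTH_TOKEN.
import datadotworld as dw

# Run a SQL query against a hosted dataset and pull it into pandas.
results = dw.query("jonloyens/an-intro-to-dataworld-dataset",
                   "SELECT * FROM DataDotWorldBBallStats")
print(results.dataframe.head())

# The same call supports SPARQL, reflecting the RDF/semantic-web side
# of the platform discussed in the interview.
sparql = dw.query("jonloyens/an-intro-to-dataworld-dataset",
                  "SELECT * WHERE {?s ?p ?o} LIMIT 10",
                  query_type="sparql")
print(sparql.table[:3])
```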

Contact Information

@bryonjacob on Twitter bryonjacob on GitHub LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

data.world HomeAway Semantic Web Knowledge Engineering Ontology Open Data RDF CSVW SPARQL DBPedia Triplestore Header Dictionary Triples Apache Jena Tabula Tableau Connector Excel Connector Data For Democracy Jonathan Morgan

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Advanced Analytics with R and Tableau

In "Advanced Analytics with R and Tableau," you will learn how to combine the statistical computing power of R with the excellent data visualization capabilities of Tableau to perform advanced analysis and present your findings effectively. This book guides you through practical examples to understand topics such as classification, clustering, and predictive analytics while creating compelling visual dashboards. What this Book will help me do Integrate advanced statistical computations in R with Tableau's visual analysis for comprehensive analytics. Master making R function calls from Tableau through practical applications such as RServe integration. Develop predictive and classification models in R, visualized wonderfully in Tableau dashboards. Understand clustering and unsupervised learning concepts, applied to real-world datasets for business insights. Leverage the combination of Tableau and R for making impactful, data-driven decisions in your organization. Author(s) Ruben Oliva Ramos, Jen Stirrup, and Roberto Rösler are accomplished professionals with extensive experience in data science and analytics. Their combined expertise brings practical insights into combining R and Tableau for advanced analytics. Advocates for hands-on learning, they emphasize clarity and actionable knowledge in their writing. Who is it for? "Advanced Analytics with R and Tableau" is ideal for business analysts, data scientists, and Tableau professionals eager to expand their capabilities into advanced analytics. Readers should be familiar with Tableau and have basic knowledge of R, though the book starts with accessible examples. If you're looking to enhance your analytics with R's statistical power seamlessly integrated into Tableau, this book is for you.

Big Data Visualization

Dive into 'Big Data Visualization' and uncover how to tackle the challenges of visualizing vast quantities of complex data. With a focus on scalable and dynamic techniques, this guide explores the nuances of effective data analysis. You'll master tools and approaches to display, interpret, and communicate data in impactful ways. What this Book will help me do Understand the fundamentals of big data visualization, including unique challenges and solutions. Explore practical techniques for using D3 and Python to visualize and detect anomalies in big data. Learn to leverage dashboards like Tableau to present data insights effectively. Address and improve data quality issues to enhance analysis accuracy. Gain hands-on experience with real-world use cases for tools such as Hadoop and Splunk. Author(s) James D. Miller is an IBM-certified expert specializing in data analytics and visualization. With years of experience handling massive datasets and extracting actionable insights, he is dedicated to sharing his expertise. His practical approach is evident in how he combines tool mastery with a clear understanding of data complexities. Who is it for? This book is designed for data analysts, data scientists, and others involved in interpreting and presenting big datasets. Whether you are a beginner looking to understand big data visualization or an experienced professional seeking advanced tools and techniques, this guide suits your needs perfectly. A foundational knowledge in programming languages like R and big data platforms such as Hadoop is recommended to maximize your learning.
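To make the anomaly-detection thread concrete, here is a small self-contained Python illustration (not from the book) that flags points more than three standard deviations from the series mean and plots them, the kind of detect-then-visualize loop the chapters describe.

```python
# Illustrative z-score anomaly detection with numpy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
series = np.sin(np.linspace(0, 20, 500)) + rng.normal(0, 0.2, 500)
series[[120, 300, 420]] += 3.0  # inject a few synthetic anomalies

mean, std = series.mean(), series.std()
z = np.abs(series - mean) / std   # z-score of each point vs. the series mean
outliers = np.where(z > 3)[0]     # indices beyond 3 sigma

plt.plot(series, lw=0.8, label="signal")
plt.scatter(outliers, series[outliers], color="red", label="anomalies")
plt.legend()
plt.title("Z-score anomaly detection")
plt.show()
```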

Tableau Cookbook - Recipes for Data Visualization

"Tableau Cookbook - Recipes for Data Visualization" walks you through the features and tools of Tableau, one of the industry-leading platforms for building data visualizations. Using over 50 hands-on recipes, you'll learn to create professional dashboards and storyboards to effectively present data trends and patterns. What this Book will help me do Understand the Tableau interface and connect it to various data sources. Build basic and advanced charts, from bar graphs to histograms and maps. Design interactive dashboards that link multiple visual components. Utilize parameters and calculations for advanced data visualizations. Integrate multiple data sources and leverage Tableau's data blending features. Author(s) Shweta Savale brings years of experience in data visualization and analytics to her writing of this cookbook. As a Tableau expert, Shweta has taught and consulted with professionals across industries, empowering them to gain insights from data. Her step-by-step instructional style makes learning both engaging and approachable. Who is it for? This book caters to both beginners looking to learn Tableau from scratch and advanced users needing a quick reference guide. It's perfect for data professionals, analysts, and anyone seeking to visualize and interpret data effectively. If you're looking to simplify Tableau's functions or sharpen your visualization skills, this book is for you.

Pro Tableau: A Step-by-Step Guide

Leverage the power of visualization in business intelligence and data science to make quicker and better decisions. Use statistics and data mining to make compelling and interactive dashboards. This book will help those familiar with Tableau software chart their journey to being a visualization expert. Pro Tableau demonstrates the power of visual analytics and teaches you how to: Connect to various data sources such as spreadsheets, text files, relational databases (Microsoft SQL Server, MySQL, etc.), non-relational databases (NoSQL such as MongoDB, Cassandra), R data files, etc. Write your own custom SQL, etc. Perform statistical analysis in Tableau using R Use a multitude of charts (pie, bar, stacked bar, line, scatter plots, dual axis, histograms, heat maps, tree maps, highlight tables, box and whisker, etc.) What you'll learn Connect to various data sources such as relational databases (Microsoft SQL Server, MySQL), non-relational databases (NoSQL such as MongoDB, Cassandra), write your own custom SQL, join and blend data sources, etc. Leverage table calculations (moving average, year over year growth, LOD (Level of Detail) expressions, etc.) Integrate Tableau with R Tell a compelling story with data by creating highly interactive dashboards Who this book is for All levels of IT professionals, from executives responsible for determining IT strategies to systems administrators, to data analysts, to decision makers responsible for driving strategic initiatives, etc. The book will help those familiar with Tableau software chart their journey to a visualization expert.