talk-data.com

Topic: Analytics

Tags: data_analysis, insights, metrics

4552 activities tagged

Activity Trend: peak of 398 activities per quarter, 2020-Q1 to 2026-Q1

Activities (4552 · newest first)

In this podcast @DanDeGrazia from @IBM spoke with @Vishaltx from @AnalyticsWeek to discuss where the chief data scientist role meets open source. He sheds light on some of the big opportunities in open source and how businesses can work together to achieve progress in data science. Dan also shares the importance of smooth communication for success as a data scientist.

TIMELINE: 0:29 Dan's journey. 9:40 Dan's role at IBM. 11:26 Tips on staying consistent while creating a database. 16:23 Where the chief data scientist role and open source come together. 20:28 The state of open source when it comes to data. 23:50 Evaluating the market to understand business requirements. 29:19 Future of the data and open-source market. 33:23 Exciting opportunities in data. 37:06 The data scientist's role in integrating business and data. 49:41 Ingredients of a successful data scientist. 53:04 Data science and trust issues. 59:35 The human element behind data. 1:01:20 Dan's success mantra. 1:06:52 Key takeaways.

Dan's Recommended Reads:
The Five Temptations of a CEO, Anniversary Edition: A Leadership Fable by Patrick Lencioni https://amzn.to/2Jcm5do
What Every BODY is Saying: An Ex-FBI Agent's Guide to Speed-Reading People by Joe Navarro and Marvin Karlins https://amzn.to/2J1RXxO

Podcast Link: https://futureofdata.org/where-chief-data-scientist-open-source-meets-dandegrazia-futureofdata-podcast/

Dan's BIO: Dan has almost 30 years of experience working with large data sets. Starting with the unusual work of analyzing potential jury pools in the 1980s, Dan also did some of the first PC-based voter registration analytics in the Chicago area, including putting the first complete list of registered voters on a PC (as hard as that is to imagine today, a 50-megabyte hard drive on a DOS system was staggering). Interested in almost anything new and technical, he worked at The Chicago Board of Trade, teaching himself BASIC to write algorithms while working as an arbitrageur in financial futures. After the military, Dan moved to San Francisco, where he worked with several small companies and startups designing and implementing some of the first PC-based fax systems (who cares now!), enterprise accounting software, and early middleware connections using the early 3GL/4GL languages. Always pursuing the technical edge cases, Dan worked for InfoBright, a column-store database startup, in the US and EMEA; at Lingotek, an In-Q-Tel-funded company working on large data set translations; and at big data analytics companies like Datameer, before taking his current position as Chief Data Scientist for Open Source in the IBM Channels organization. Dan's current just-for-fun project is an app that will record and analyze bird songs and provide the user with information on the bird and the specifics of the current song.

About #Podcast:

FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners together to discuss their journeys in creating the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Introduction to IBM Common Data Provider for z Systems

IBM Common Data Provider for z Systems collects, filters, and formats IT operational data in near real time and provides that data to target analytics solutions. It enables authorized IT operations teams to use a single web-based interface to specify which IT operational data to gather and how it should be handled. The data is provided to both on- and off-platform analytic solutions in a consistent, consumable format for analysis. This Redpaper discusses the value of IBM Common Data Provider for z Systems, provides a high-level reference architecture, and introduces the key components of that architecture. It shows how the product provides operational data to various analytic solutions, and it offers high-level integration guidance, preferred practices, planning tips, and example integration scenarios.

In this podcast, @BillFranksGA talks about the ingredients of a successful analytics ecosystem. He shares his analytics journey and his perspective on how businesses are engaging in data analytics practice. He also sheds some light on best practices that businesses can adopt to execute a successful data strategy.

Timeline: 0:28 Bill's journey. 4:00 Bill's journey as an analyst. 9:29 Maturity of the analytics market. 11:56 Business, IT, and data. 16:18 Introducing a centralized analytics practice in an enterprise. 19:50 Tips and strategies for chief data officers to deliver the goods. 26:07 What don't businesses get about data analytics? 29:40 Is the future aligned with data or analytics? 34:25 Importance for leadership to understand analytics. 36:35 The role of analytics professionals in the age of AI. 41:42 Upgrading analytics models. 47:50 How much should a business experiment with AI? 55:25 Evaluating blockchain. 59:50 Bill's success mantra. 1:05:25 Bill's favorite reads. 1:07:17 Key takeaway.

Podcast Link: https://futureofdata.org/billfranksga-on-the-ingredients-of-successful-analytics-ecosystem-futureofdata-podcast/

Bill's BIO: Bill Franks is Chief Analytics Officer for The International Institute For Analytics (IIA), where he provides perspective on trends in the analytics and big data space and helps clients understand how IIA can support their efforts to improve analytic performance. He also serves on the advisory boards of multiple university and professional analytic programs. He has held a range of executive positions in the analytics space in the past, including several years as Chief Analytics Officer for Teradata (NYSE: TDC).

Bill is the author of the book Taming The Big Data Tidal Wave (John Wiley & Sons). In the book, he applies his two decades of experience working with clients on large-scale analytics initiatives to outline what it takes to succeed in today’s world of big data and analytics. The book made Tom Peters’ 2014 “Must Read” list and also the Top 10 Most Influential Translated Technology Books list from CSDN in China.

His focus has always been to help translate complex analytics into terms that business users can understand, and then to help organizations implement the results effectively within their processes. His work has spanned clients in a variety of industries, ranging in size from Fortune 100 companies to small non-profit organizations.

He earned a Bachelor’s degree in Applied Statistics from Virginia Tech and a Master’s degree in Applied Statistics from North Carolina State University.

About #Podcast:

FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners together to discuss their journeys in creating the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Data pipelines become chaotic under the pressures of agile development, democratization, self-service, and organizational “pockets” of analytics. From enterprise BI to self-service analysis, data pipeline management should ensure that analysis results are traceable, reproducible, and of production strength. Robust data pipelines rely on eight critical components.

Originally published at https://www.eckerson.com/articles/the-complexities-of-modern-data-pipelines

Location Analytics for Business

It’s estimated that 80 percent of an organization’s data contains location attributes, but many organizations don’t understand how to unlock the potential of this data to make better decisions. By finding this book, you have just been handed the keys. Readers will unlock these methods by learning about location analytics and by taking a deep dive into the Planned Grocery® platform, created in part by the author. The Planned Grocery® location analytics platform has been mentioned in the Wall Street Journal (twice), Forbes, Bloomberg, and Business Insider. A sampling of Planned Grocery® clients includes Phillips Edison & Company, Just Fresh, Slate Retail REIT, Wegmans, and Whole Foods. The practical information in this book is designed to prepare you to recognize and take advantage of situations where you and your organization can become more successful using location analytics. This will be accomplished by taking you through the fundamentals of location analytics, looking at various case studies, learning how to identify and analyze spatial data sets, and learning about the companies doing interesting work in this space.

In this podcast @AndyPalmer from @Tamr sat with @Vishaltx from @AnalyticsWeek to talk about the emergence of, need for, and market around DataOps, a specialized capability that merges data engineering with the DevOps ecosystem in response to increasingly convoluted data silos and complicated processes. Andy shared his view on what some businesses and their leaders are doing wrong and how businesses need to rethink their data silos to future-proof themselves. This is a good podcast for any data leader thinking about cracking the code on getting high-quality insights from data.

Timeline: 0:28 Andy's journey. 4:56 What's Tamr? 6:38 What's Andy's role at Tamr? 8:16 What's DataOps? 13:07 The right time for a business to incorporate DataOps. 15:56 Data exhaust vs. DataOps. 21:05 Tips for executives in dealing with data. 23:15 Suggestions for businesses working with data. 25:48 Creating buy-in for experimenting with new technologies. 28:47 Using DataOps in the acquisition of new companies. 31:58 DataOps vs. DevOps. 36:40 Big opportunities in data science. 39:35 AI and DataOps. 44:28 Parameters for a successful start-up. 47:49 What still surprises Andy? 50:19 Andy's success mantra. 52:48 Andy's favorite reads. 54:25 Final remarks.

Andy's Recommended Reads:
Enlightenment Now: The Case for Reason, Science, Humanism, and Progress by Steven Pinker https://amzn.to/2Lc6WqK
The Three-Body Problem by Cixin Liu, translated by Ken Liu https://amzn.to/2rQyPvp

Andy's BIO: Andy Palmer is a serial entrepreneur who specializes in accelerating the growth of mission-driven startups. Andy has helped found and/or fund more than 50 innovative companies in technology, health care, and the life sciences. Andy’s unique blend of strategic perspective and disciplined tactical execution is suited to environments where uncertainty is the rule rather than the exception. Andy has a specific passion for projects at the intersection of computer science and the life sciences.

Most recently, Andy co-founded Tamr, a next-generation data curation company, and Koa Labs, a start-up club in the heart of Harvard Square, Cambridge, MA.

Specialties: Software, Sales & Marketing, Web Services, Service-Oriented Architecture, Drug Discovery, Database, Data Warehouse, Analytics, Startup, Entrepreneurship, Informatics, Enterprise Software, OLTP, Science, Internet, eCommerce, Venture Capital, Bootstrapping, Founding Teams, Early-Stage Venture, Corporate Development

About #Podcast:

FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners together to discuss their journeys in creating the data-driven future.

Podcast link: https://futureofdata.org/emergence-of-dataops-age-andypalmer-futureofdata-podcast/

Wanna Join? If you or anyone you know wants to join in, register your interest by email at [email protected]

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this episode, Wayne Eckerson and Jen Underwood explore a new era of analytics. Data volumes and complexity have exceeded the limits of current manual drag-and-drop analytics solutions. Data moves at the speed of light while speed-to-insight lags farther and farther behind. It is time to explore intelligent, next-generation, machine-powered analytics to retain your competitive edge. It is time to combine the best of the human mind and the machine.

Underwood is an analytics expert and founder of Impact Analytix. She is a former product manager at Microsoft who spearheaded the design and development of the reinvigorated version of Power BI, which has since become a market-leading BI tool. Underwood is an IBM Analytics Insider, SAS contributor, former Tableau Zen Master, Top 10 Women Influencer, and active analytics community member. She is keenly interested in the intersection of data visualization and data science and writes and speaks persuasively about these topics.

Getting Started with Kudu

Fast data ingestion, serving, and analytics in the Hadoop ecosystem have forced developers and architects to choose solutions using the least common denominator—either fast analytics at the cost of slow data ingestion or fast data ingestion at the cost of slow analytics. There is an answer to this problem. With the Apache Kudu column-oriented data store, you can easily perform fast analytics on fast data. This practical guide shows you how. Begun as an internal project at Cloudera, Kudu is an open source solution compatible with many data processing frameworks in the Hadoop environment. In this book, current and former solutions professionals from Cloudera provide use cases, examples, best practices, and sample code to help you get up to speed with Kudu.

- Explore Kudu’s high-level design, including how it spreads data across servers
- Fully administer a Kudu cluster, enable security, and add or remove nodes
- Learn Kudu’s client-side APIs, including how to integrate Apache Impala, Spark, and other frameworks for data manipulation
- Examine Kudu’s schema design, including basic concepts and primitives necessary to make your project successful
- Explore case studies for using Kudu for real-time IoT analytics, predictive modeling, and in combination with another storage engine
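As a flavor of the client-side API coverage, here is a minimal sketch using the kudu-python client. The master host and table name are hypothetical, and error handling is omitted for brevity:

```python
import kudu
from kudu.client import Partitioning

# Connect to a (hypothetical) Kudu master
client = kudu.connect(host='kudu-master.example.com', port=7051)

# Define a simple schema with a primary key column
builder = kudu.schema_builder()
builder.add_column('key').type(kudu.int64).nullable(False).primary_key()
builder.add_column('metric', type_=kudu.string)
schema = builder.build()

# Hash-partition rows across 3 buckets by key
partitioning = Partitioning().add_hash_partitions(column_names=['key'], num_buckets=3)
client.create_table('example_metrics', schema, partitioning)

# Insert a row through a session, then flush the buffered writes
table = client.table('example_metrics')
session = client.new_session()
session.apply(table.new_insert({'key': 1, 'metric': 'cpu_util'}))
session.flush()

# Scan the table back (fine for small results, not for large scans)
scanner = table.scanner()
print(scanner.open().read_all_tuples())
```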

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus, OH), Walt Hickey (Numlock Newsletter), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Once upon a time, there was some data. And that data cried out to be extracted and analyzed and packaged up like the most exquisite of gifts and then presented gloriously to an eager and excited group of stakeholders. But, alas! Will this data story have a happy ending? Perhaps. Perhaps not! And that's the subject of this episode. Sort of. Our intrepid hosts ask the question, "How can we communicate more effectively by applying the tricks of the data journalism trade?" To answer that question, Walt Hickey, late of fivethirtyeight.com and now the founder and curator of the daily Numlock Newsletter, joins the gang to chat about how he combined an education in applied mathematics with an interest in news media to become a data journalist. Along the way, the discussion explores how Walt's insights can be applied to business analytics. And there's a terrible analogy about meat that gets butchered along the way (thanks, Tim!). For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

PySpark Cookbook

Dive into the world of big data processing and analytics with the PySpark Cookbook. This book provides over 60 hands-on recipes for implementing efficient data-intensive solutions using Apache Spark and Python. By mastering these recipes, you'll be equipped to tackle challenges in large-scale data processing, machine learning, and stream analytics.

What this book will help me do: Set up and configure PySpark environments effectively, including working with Jupyter for enhanced interactivity. Understand and utilize DataFrames for data manipulation, analysis, and transformation tasks. Develop end-to-end machine learning solutions using the ML and MLlib modules in PySpark. Implement structured streaming and graph-processing solutions to analyze and visualize data streams and relationships. Deploy PySpark applications to cloud infrastructure efficiently using best practices.

Author(s): This book is co-authored by Denny Lee and Tomasz Drabas, experienced professionals in data processing and analytics leveraging Python and Apache Spark. With their deep technical expertise and a passion for teaching through practical examples, they aim to make the complex concepts of PySpark accessible to developers of varied experience levels.

Who is it for? This book is ideal for Python developers who are keen to delve into the Apache Spark ecosystem. Whether you're just starting with big data or have some experience with Spark, this book provides practical recipes to enhance your skills. Readers looking to solve real-world data-intensive challenges using PySpark will find this resource invaluable.
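As a taste of the DataFrame recipes, here is a minimal, self-contained PySpark sketch; the data and app name are made up for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-sketch").getOrCreate()

# A small in-memory DataFrame standing in for a real data source
df = spark.createDataFrame(
    [("2018-01-01", "web", 42), ("2018-01-01", "mobile", 17), ("2018-01-02", "web", 23)],
    ["day", "channel", "events"],
)

# Aggregate events per day, the kind of transformation the recipes build on
daily = df.groupBy("day").agg(F.sum("events").alias("total_events"))
daily.orderBy("day").show()

spark.stop()
```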

Streaming Change Data Capture

There are many benefits to becoming a data-driven organization, including the ability to accelerate and improve business decision accuracy through the real-time processing of transactions, social media streams, and IoT data. But those benefits require significant changes to your infrastructure. You need flexible architectures that can copy data to analytics platforms at near-zero latency while maintaining 100% production uptime. Fortunately, a solution already exists. This ebook demonstrates how change data capture (CDC) can meet the scalability, efficiency, real-time, and zero-impact requirements of modern data architectures. Kevin Petrie, Itamar Ankorion, and Dan Potter—technology marketing leaders at Attunity—explain how CDC enables faster and more accurate decisions based on current data and reduces or eliminates full reloads that disrupt production and efficiency. The book examines:

- How CDC evolved from a niche feature of database replication software to a critical data architecture building block
- Architectures where data workflow and analysis take place, and their integration points with CDC
- How CDC identifies and captures source data updates to assist high-speed replication to one or more targets
- Case studies on cloud-based streaming and streaming to a data lake, and related architectures
- Guiding principles for effectively implementing CDC in cloud, data lake, and streaming environments
- The Attunity Replicate platform for efficiently loading data across all major database, data warehouse, cloud, streaming, and Hadoop platforms
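To make the core idea concrete, here is a toy Python sketch of CDC-style replication. This is a conceptual illustration only, not the Attunity Replicate API; all names in it are invented:

```python
# A toy change stream: each event describes one source-row mutation
change_stream = [
    {"op": "insert", "key": 1, "row": {"name": "Ada", "balance": 100}},
    {"op": "update", "key": 1, "row": {"name": "Ada", "balance": 250}},
    {"op": "delete", "key": 2, "row": None},
]

# Target replica, already holding one row from an initial load
target = {2: {"name": "Bob", "balance": 75}}

# Applying only the deltas keeps the replica current without a full reload
for event in change_stream:
    if event["op"] in ("insert", "update"):
        target[event["key"]] = event["row"]  # upsert semantics
    else:  # delete
        target.pop(event["key"], None)

print(target)  # {1: {'name': 'Ada', 'balance': 250}}
```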

Hortonworks Data Platform with IBM Spectrum Scale: Reference Guide for Building an Integrated Solution

This IBM® Redpaper™ publication provides guidance on building an enterprise-grade data lake by using IBM Spectrum™ Scale and Hortonworks Data Platform for performing in-place Hadoop or Spark-based analytics. It covers the benefits of the integrated solution and gives guidance about the types of deployment models and considerations during the implementation of these models. Hortonworks Data Platform (HDP) is a leading Hadoop and Spark distribution. HDP addresses the complete needs of data-at-rest, powers real-time customer applications, and delivers robust analytics that accelerate decision making and innovation. IBM Spectrum Scale™ is flexible and scalable software-defined file storage for analytics workloads. Enterprises around the globe have deployed IBM Spectrum Scale to form large data lakes and content repositories that perform high-performance computing (HPC) and analytics workloads. It can scale both performance and capacity without bottlenecks.

podcast_episode
by Wayne Eckerson (Eckerson Group), Carl Gerber (various financial services and manufacturing firms; currently an independent consultant and Eckerson Group partner)

In this podcast, Carl Gerber and Wayne Eckerson discuss Gerber’s top five data governance best practices: Motivation, Assessment, Data Assets Catalog, CxO Alliance, and Data Quality.

Gerber is a long-time chief data officer and data leader at several large, diverse financial services and manufacturing firms; he is now an independent consultant and an Eckerson Group partner.

He helps large organizations develop data strategies, modernize analytics, and establish enterprise data governance programs that ensure data quality, operational efficiency, regulatory compliance, and business outcomes. He also mentors and coaches Chief Data Officers and fills that role on an interim basis.

Summary

Web and mobile analytics are an important part of any business, and difficult to get right. The most frustrating part is realizing that you haven’t been tracking a key interaction, having to write custom logic to add that event, and then waiting to collect data. Heap is a platform that automatically tracks every event so that you can retroactively decide which actions are important to your business and easily build reports with or without SQL. In this episode Dan Robinson, CTO of Heap, describes how they have architected their data infrastructure, how they build their tracking agents, and the data virtualization layer that enables users to define their own labels.
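The "retroactive" idea is easy to illustrate. The sketch below is not Heap's actual implementation, just a toy Python model of defining an event label after the raw interactions were already captured; all field names are invented:

```python
from datetime import datetime

# Autocaptured raw interactions: every click is recorded up front
raw_events = [
    {"ts": datetime(2018, 6, 1), "type": "click", "selector": "#signup-btn", "path": "/pricing"},
    {"ts": datetime(2018, 6, 1), "type": "click", "selector": ".nav-link", "path": "/"},
    {"ts": datetime(2018, 6, 2), "type": "click", "selector": "#signup-btn", "path": "/"},
]

# A label defined *after* collection: "Signup Clicked" is any click on #signup-btn.
# Because the raw stream was captured all along, the label applies retroactively.
def signup_clicked(event):
    return event["type"] == "click" and event["selector"] == "#signup-btn"

signups = [e for e in raw_events if signup_clicked(e)]
print(len(signups))  # 2, including clicks recorded before the label existed
```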

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14-day trial and get a sweet new T-shirt. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. Your host is Tobias Macey and today I’m interviewing Dan Robinson about Heap and their approach to collecting, storing, and analyzing large volumes of data.

Interview

Introduction

How did you get involved in the area of data management?

Can you start by giving a brief overview of Heap?

One of your differentiating features is the fact that you capture every interaction on web and mobile platforms for your customers. How do you prevent the user experience from suffering as a result of network congestion, while ensuring the reliable delivery of that data?

Can you walk through the lifecycle of a single event from source to destination and the infrastructure components that it traverses to get there?

Data collected in a user’s browser can often be messy due to various browser plugins, variations in runtime capabilities, etc. How do you ensure the integrity and accuracy of that information?

What are some of the difficulties that you have faced in establishing a representation of events that allows for uniform processing and storage?

What is your approach for merging and enriching event data with the information that you retrieve from your supported integrations?

What challenges does that pose in your processing architecture?

What are some of the problems that you have had to deal with to allow for processing and storing such large volumes of data?

How has that architecture changed or evolved over the life of the company? What are some changes that you are anticipating in the near future?

Can you describe your approach for synchronizing customer data with their individual Redshift instances and the difficulties that entails?

What are some of the most interesting challenges that you have faced while building the technical and business aspects of Heap?

What changes have been necessary as a result of GDPR?

What are your plans for the future of Heap?

Contact Info

@danlovesproofs on Twitter
[email protected]
@drob on GitHub
heapanalytics.com / @heap on Twitter
https://heapanalytics.com/blog/category/engineering

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management?

Under the guise of a discussion about making the leap into a new technology, this bonus mini-episode (hopefully) clears up the ongoing confusion about the Kiss Sisters. Moe sat down with her big sister, Michele, to chat about jumping into learning an entirely new skill when time is short, expectations are high, and the learning curve is steep. The specific example they chat about is Michele's dive into Google Analytics data in BigQuery using SQL, but the tips and thoughts are applicable to any new and intimidating platform.
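For readers curious what that first dive might look like, here is a hedged sketch using the google-cloud-bigquery Python client against a Google Analytics BigQuery export. The project and dataset names are hypothetical; the query assumes the GA export schema, where each session row nests its hits:

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# Hypothetical project/dataset names; UNNEST(hits) flattens the nested
# hits records so individual pageviews can be counted per day.
query = """
    SELECT date, COUNT(*) AS pageviews
    FROM `my-project.my_ga_dataset.ga_sessions_*`, UNNEST(hits) AS h
    WHERE _TABLE_SUFFIX BETWEEN '20200101' AND '20200107'
      AND h.type = 'PAGE'
    GROUP BY date
    ORDER BY date
"""

for row in client.query(query).result():
    print(row.date, row.pageviews)
```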

In this podcast, Harsh Tiwari, former CDO of CUNA Mutual Group, sheds light on data science leadership in the financial/risk sector. He shares key insights for aspiring leaders on managing a large enterprise data science practice, discusses the importance of collaboration and a growth mindset built through partnership, and explains his "So what" approach to problem-solving. This podcast is great for any listener who wants to understand best practices for being a data-driven leader.

Timeline: 0:28 Harsh's journey. 5:44 Harsh's current role. 10:17 Ideal location for a chief data officer. 14:42 Ideal CDO role and placement. 20:15 Capital One's best practices in managing data. 25:28 How are credit unions and regional banks placed in terms of data management? 31:20 Introducing data to well-performing banks. 38:05 Getting started as a CDO in a bank. 43:21 Checklist for a business to hire a CDO. 48:35 Keeping oneself sane during the technological disruption. 54:13 Harsh's success mantra. 58:51 Harsh's favorite read. 1:02:14 Parting thoughts.

Harsh's Recommended Read: Good to Great: Why Some Companies Make the Leap and Others Don't by Jim Collins https://amzn.to/2I7DHGM

Podcast Link: https://futureofdata.org/harsh-tiwari-talks-about-fabric-of-data-driven-leader-in-financial-sector-futureofdata-podcast/

Harsh's BIO: Harsh Tiwari is the Senior Vice President and Chief Data Officer for CUNA Mutual Group in Madison, Wisconsin. His primary responsibilities include leading enterprise-wide data initiatives and providing strategy and policy guidance for data acquisition, usage, and management. He joined the company in July 2015. Before joining CUNA Mutual Group, Harsh spent many years working in information technology, analytics, and data intelligence. He worked at Capital One Financial Group in Plano, Texas, for 17 years, where he most recently focused on creating an effective data and business intelligence environment to manage risks across the company as the Head of Risk Management Data and Business Intelligence. He has also served as the Divisional CIO for Small Business Credit Card and Consumer Lending, Head of Portfolio and Delivery Management, Head of Auto Finance Data and Business Intelligence, Business Information Officer of Capital One Canada, and Analyst–Senior Manager of Small Business Data & System Analysis.

A native of India, Harsh earned a B.S. in Mechanical Engineering from Mysore University in Mysore, Karnataka, India, and an M.B.A. in Finance/MIS from Drexel University in Philadelphia, Pennsylvania. In his spare time, Harsh enjoys golfing and spending time with his wife, Rashmi, their 12-year-old son, and their 8-year-old daughter.

About #Podcast:

FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners together to discuss their journeys in creating the data-driven future.

Wanna Join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Next-Generation Big Data: A Practical Guide to Apache Kudu, Impala, and Spark

Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies. Next-Generation Big Data takes a holistic approach, covering the most important aspects of modern enterprise big data. The book covers not only the main technology stack but also the next-generation tools and applications used for big data warehousing, data warehouse optimization, real-time and batch data ingestion and processing, real-time data visualization, big data governance, data wrangling, big data cloud deployments, and distributed in-memory big data computing. Finally, the book has extensive and detailed coverage of big data case studies from Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard.

What You’ll Learn:

- Install Apache Kudu, Impala, and Spark to modernize enterprise data warehouse and business intelligence environments, complete with real-world, easy-to-follow examples and practical advice
- Integrate HBase, Solr, Oracle, SQL Server, MySQL, Flume, Kafka, HDFS, and Amazon S3 with Apache Kudu, Impala, and Spark
- Use StreamSets, Talend, Pentaho, and CDAP for real-time and batch data ingestion and processing
- Utilize Trifacta, Alteryx, and Datameer for data wrangling and interactive data processing
- Turbocharge Spark with Alluxio, a distributed in-memory storage platform
- Deploy big data in the cloud using Cloudera Director
- Perform real-time data visualization and time series analysis using Zoomdata, Apache Kudu, Impala, and Spark
- Understand enterprise big data topics such as big data governance, metadata management, data lineage, impact analysis, and policy enforcement, and how to use Cloudera Navigator to perform common data governance tasks
- Implement big data use cases such as big data warehousing, data warehouse optimization, Internet of Things, real-time data ingestion and analytics, complex event processing, and scalable predictive modeling
- Study real-world big data case studies from innovative companies, including Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard

Who This Book Is For: BI and big data warehouse professionals interested in gaining practical and real-world insight into next-generation big data processing and analytics using Apache Kudu, Impala, and Spark, and those who want to learn more about other advanced enterprise topics.
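As one small taste of the Kudu-and-Spark integration the book covers, here is a sketch using the kudu-spark connector's DataSource API. It assumes the connector jar is on the classpath, and the master address and table name are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kudu-spark-sketch").getOrCreate()

# Read a Kudu table into a DataFrame via the kudu-spark connector
sales = (spark.read
         .format("org.apache.kudu.spark.kudu")
         .option("kudu.master", "kudu-master.example.com:7051")
         .option("kudu.table", "default.sales")
         .load())

# Registered as a view, the Kudu table is queryable with plain Spark SQL
sales.createOrReplaceTempView("sales")
spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region").show()
```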

Implementing IBM FlashSystem 900 Model AE3

Abstract

Today’s global organizations depend on being able to unlock business insights from massive volumes of data. Now, with IBM® FlashSystem 900 Model AE3, powered by IBM FlashCore® technology, they can make faster decisions based on real-time insights and unleash the power of the most demanding applications, including online transaction processing (OLTP) and analytics databases, virtual desktop infrastructures (VDIs), technical computing applications, and cloud environments. This IBM Redbooks® publication introduces clients to the IBM FlashSystem® 900 Model AE3. It provides in-depth knowledge of the product architecture, software and hardware, and implementation, along with hints and tips. It also illustrates use cases that show real-world solutions for tiering, flash-only, and preferred-read configurations, and gives examples of the benefits gained by integrating FlashSystem storage into business environments. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand how to implement this new and exciting technology.