talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516

tagged

Activities

1516 activities · Newest first

In this conversation, Angela & Joshua from Booz Allen sat down with Vishal to discuss The Mathematical Corporation. They discussed some of the stories, challenges, and opportunities facing any corporation on its journey to becoming a mathematical corporation.

Timeline: 0:29 Angela and Josh's introduction. 4:00 Inspiration behind their book. 8:20 Machine Intelligence. 12:27 Ideal fit for a business to embrace mathematical organization. 14:54 Convincing executives towards data science. 19:20 Working with data science skeptics. 23:02 How much percent data-driven should a company be? 26:38 Mathematical models being predictable in an unpredictable environment. 30:25 Mathematical organization response to open sources. 34:14 AI the job enabler. 36:52 Advice for small businesses to be data-driven. 39:36 A fully mature mathematical corporation. 44:16 Sleeping on the wheel. 45:28 An ideal reader for the book "Mathematical Corporation". 48:37 The aha moment while writing the book. 50:40 Creating awareness for business to be data-driven. 54:18 Takeaways from "Mathematical Corporation".

Their book, The Mathematical Corporation, is out to help businesses stay data-driven and competitive. You can find the book @ http://amzn.to/2hNsoaH

THE MATHEMATICAL CORPORATION: Where Machine Intelligence and Human Ingenuity Achieve the Impossible (PublicAffairs; June 6, 2017), by Booz Allen Hamilton machine intelligence experts Joshua Sullivan and Angela Zutavern, is the first book to show business leaders how to compete in this new era: by combining the mathematical smarts of machines with the intellect of visionary leaders.

About the Guests: DR. JOSH SULLIVAN is SVP at Booz Allen Hamilton. One of the world’s leading experts in data science and machine intelligence, he was among the first people to hold the title “data scientist.” He has appeared on CNBC.

ANGELA ZUTAVERN is VP at Booz Allen Hamilton and pioneered the application of machine intelligence to leadership and strategy. Together, they’re radically transforming how Fortune 500 companies, nonprofits, and major government agencies perform by helping leaders shatter long-held constraints and reveal hidden truths in their organizations. They live in Washington, D.C.

Podcast link: https://futureofdata.org/angelazutavern-joshdsullivan-boozallen-discussed-mathematical-corporation-futureofdata/

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this session, Brett McLaughlin, Chief Data Strategist at Akamai, discussed his journey to creating a forecasting solution. He shed light on some limitations, some innovative thinking, and some hacks that one could use to structure a good forecasting model.

Timeline: 0:29 Brett's journey. 15:06 Data scientist fulfilling the vision of the CEO. 24:25 Art of doing business and science of doing business. 29:23 Data science and mathematics. 34:55 Salesforce defining the value of algorithms. 38:14 Capturing feedback to improve data models. 46:14 First steps in building a futuristic data model. 54:27 Using algorithms to forecast. 1:01 Tips for data leaders to build a team.

Podcast link: https://futureofdata.org/discussing-forecasting-brett-mclaughlin-akabret-akamai/

Here's Brett's Bio: Twenty-one years of experience transforming business operations through more intelligent use of data. Expertise in leading organizations in data transformation, predictive analytics (e.g., forecasting, linear programming, operational simulations, etc.), world-class visualizations and interfaces, and tight integration into existing operations.

In this podcast, Robin discussed how an analytics organization functions in a collaborative culture. He shed some light on preparing a robust framework while working in a policy-rich setup. This talk is a must for anyone building an analytics organization in a culture-rich or policy-rich environment.

Timeline: 0:29 Robin's journey. 6:02 Challenges in working as a chief data scientist. 9:50 Two breeds of data scientists. 13:38 Introducing data science into large companies. 16:57 Creating a center of excellence with data. 19:52 Challenges in working with a government agency. 22:57 Creating a self-serving system. 26:29 Defining chief data officer, chief analytics officer, chief data scientist. 28:28 Designing an architecture for a rapidly changing company culture. 31:39 Future of analytics and data leaders. 35:47 Art of doing business and science of doing business. 42:26 Perfect data science hire. 45:08 Closing remarks.

Podcast link: https://futureofdata.org/futureofdata-with-robin-thottungal-chief-data-scientist-at-epa/

Here's Robin's bio on his current EPA role:

- Leading the data analytics effort of a 15,000+ member agency by providing strategic vision, program development, evangelizing the value of data-driven decision making, bringing a lean start-up approach to the public sector, and building an advanced data analytics platform capable of real-time/batch analysis.

- Serving as Chief Data Scientist for the agency, including directing, coordinating, and overseeing the division’s leadership of EPA’s multimedia data analytics, visualization, and predictive analysis work along with related tools, application development, and services.

- Developing and overseeing the implementation of Agency policy on integration analysis of environmental data, including multimedia analysis and assessments of environmental quality, status, and trends.

- Developing, marketing, and implementing tactical and strategic plans for the Agency’s data management, advanced data analytics, and predictive analysis work.

- Leading cross-federal, state, tribal, and local government data partnerships as well as information partnerships with other entities.

The security challenges of a particular business are often proportional to the amount of data it needs to capture, process, and interpret. As businesses grow, their security needs become ever more complex and challenging as the volume, velocity, and variety of data increase. Forward-thinking organizations are using data science to better process and interpret vast data stores, both on-premises and in the cloud, to identify threats and intrusions to their local networks and beyond.

Join us to participate in a dynamic discussion with practitioners who have deep experience in the areas of data science or information security, including:

• Bob Rudis, Chief Security Data Scientist, Rapid7, frequent blogger at rud.is, co-author of Data Driven Security, and ardent R open source contributor. Previously, Bob was at Verizon and responsible for the Data Breach Investigations Report (DBIR), known in the security industry as "an unparalleled source of information on cybersecurity threats."

• Mark Gerner, Sr. Economic Data Scientist / Analytics Leader with 10+ years of experience designing, implementing, and communicating the results of analyses in support of customer engagement, strategic planning, and programmatic portfolio management related activities.

• Kalpesh Sheth, Co-founder & CEO, Yaxa, with 20+ years of technical expertise in data networking, network security, Intelligence Surveillance and Reconnaissance (ISR), and cluster computing. Before co-founding Yaxa, Sheth was Senior Technical Director at DRS Technologies (acquired by Finmeccanica S.p.A.), Director at RiverDelta Networks (acquired by Motorola and now part of Arris), and the fifth employee of Digital Technology (acquired by Agilent Technologies). He is a co-author of VITA 41.6, an ANSI standard, and has spoken at numerous trade conferences as an expert panel member.

Venue Sponsor: @BoozAllen Media Sponsor: X.TAO.ai

Practical Predictive Analytics

Dive into the world of predictive analytics with 'Practical Predictive Analytics.' This comprehensive guide walks you through analyzing current and historical data to predict future outcomes. Using tools like R and Spark, you will master practical skills, solve real-world challenges, and apply predictive analytics across domains like marketing, healthcare, and retail.

What this Book will help me do

• Learn the six steps for successfully implementing predictive analytics projects.
• Acquire practical skills in data cleaning, input, and model deployment using tools like R and Spark.
• Understand core predictive analytics algorithms and their applications in various industries.
• Apply data analytics techniques to solve problems in fields such as healthcare and marketing.
• Master methods for handling big data analytics using Databricks and Spark for effective prediction.

Author(s)

The author, Winters, is an experienced data scientist and technical educator. With an extensive background in predictive analytics, Winters specializes in applying statistical methods and techniques to real-world consultation scenarios. Winters brings a practical and accessible approach to this text, ensuring that learners can follow along and apply their newfound expertise effectively.

Who is it for?

This book is ideal for statisticians and analysts with some programming background in languages like R, who want to master predictive analytics skills. It caters to intermediate learners who aim to enhance their ability to solve complex analytical problems. Whether you're looking to advance your career or improve your proficiency in data science, this book will serve as a valuable resource for learning and growth.

In this session, Jon talked about analytics in the agency business. He discussed best practices and some operational hacks to help leaders become successful in the world of analytics in the marketing domain (one of the early adopters of technology).

Timeline: 0:29 Jon's journey. 6:07 Use cases for the benchmark studies at L2. 7:16 The struggles and challenges in the digital industry. 11:30 How much data is good data. 14:55 Staying relevant during times of disruption. 20:18 Analysing data of various cultures for a global company. 24:30 Art of doing business and science of doing business. 27:22 Jon's current role. 30:06 How much of L2 is in-facing and out-facing? 31:45 Qualifying a source/platform. 35:20 Integrating a new source into the existing algorithm. 38:16 Building classifiers. 40:00 Jon's leadership style. 43:00 Client facing a leadership. 45:12 Jon's magic data science hire. 47:28 Suggestion for starting a data practice in a dissimilar industry. 50:55 World without survey. 53:11 Future of data in the digital industry.

Podcast link: https://futureofdata.org/futureofdata-jon-gibs-chief-data-officer-l2-inc/

Bio- Jon Gibs is the Chief Data Officer and Chief Data Scientist at L2, a digital research, benchmarking, and advisory services company recently acquired by the Gartner Group. Prior to his time at L2, Jon founded and was the group vice president of data science and analytics at Huge, a digital agency in Brooklyn, and spent 10 years at Nielsen running its digital analytics practice.

Jon's graduate work has been in Geography and spatial statistics at The University at Buffalo.

Practical Data Science Cookbook, Second Edition

The Practical Data Science Cookbook, Second Edition provides hands-on, practical recipes that guide you through all aspects of the data science process using R and Python. Starting with setting up your programming environment, you'll work through a series of real-world projects to acquire, clean, analyze, and visualize data efficiently.

What this Book will help me do

• Set up R and Python environments effectively for data science tasks.
• Acquire, clean, and preprocess data tailored to analysis with practical steps.
• Develop robust predictive and exploratory models for actionable insights.
• Generate analytic reports and share findings with impactful visualizations.
• Construct tree-based models and master random forests for advanced analytics.

Author(s)

Authored by a team of experienced professionals in the field of data science and analytics, this book reflects their collective expertise in tackling complex data challenges using programming. With backgrounds spanning industry and academia, the authors bring a practical, application-focused approach to teaching data science.

Who is it for?

This book is ideal for aspiring data scientists who want hands-on experience with real-world projects, regardless of prior experience. Beginners will gain step-by-step understanding of data science concepts, while seasoned professionals will appreciate the structured projects and use of R and Python for advanced analytics and modeling.

Advanced Object-Oriented Programming in R: Statistical Programming for Data Science, Analysis and Finance

Learn how to write object-oriented programs in R and how to construct classes and class hierarchies in the three object-oriented systems available in R. This book gives an introduction to object-oriented programming in the R programming language and shows you how to use and apply R in an object-oriented manner. You will then be able to use this powerful programming style in your own statistical programming projects to write flexible and extendable software. After reading Advanced Object-Oriented Programming in R, you'll come away with a practical project that you can reuse in your own analytics coding endeavors. You'll then be able to visualize your data as objects that have state and then manipulate those objects with polymorphic or generic methods. Your projects will benefit from the high degree of flexibility provided by polymorphism, where the choice of concrete method to execute depends on the type of data being manipulated.

What You'll Learn

• Define and use classes and generic functions using R
• Work with the R class hierarchies
• Benefit from implementation reuse
• Handle operator overloading
• Apply the S4 and R6 classes

Who This Book Is For

Experienced programmers and those with at least some prior experience with the R programming language.
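The polymorphism this blurb describes, where the concrete method that runs depends on the type of the data, can be illustrated with a small sketch. The book itself works in R; the example below is only a rough analog in Python using the standard library's `functools.singledispatch`, and the function name `describe` and its messages are hypothetical, not taken from the book.

```python
from functools import singledispatch


# Generic function: this body is the fallback used when no
# more specific method has been registered for the argument's type.
@singledispatch
def describe(x):
    return f"object of type {type(x).__name__}"


# Method chosen when the argument is an int.
@describe.register
def _(x: int):
    return f"integer {x}"


# Method chosen when the argument is a list.
@describe.register
def _(x: list):
    return f"list of {len(x)} items"


if __name__ == "__main__":
    print(describe(3))
    print(describe([1, 2]))
    print(describe(2.5))  # no float method registered, so the fallback runs
```

The caller always writes `describe(value)`; which implementation executes is decided by the type of `value`, which is the flexibility the blurb attributes to generic methods.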

Development Workflows for Data Scientists

Data science teams often borrow best practices from software development, but since the product of a data science project is insight, not code, software development workflows are not a perfect fit. How can data scientists create workflows tailored to their needs? Through interviews with several data-driven organizations, this practical report reveals how data science teams are improving the way they define, enforce, and automate a development workflow. Data science workflows differ from team to team because their tasks, goals, and skills vary so much. In this report, author Ciara Byrne talked to teams from BinaryEdge, Airbnb, GitHub, Scotiabank, Fast Forward Labs, Datascope, and others about their approaches to the data science process, including their procedures for:

• Defining team structure and roles
• Asking interesting questions
• Examining previous work
• Collecting, exploring, and modeling data
• Testing, documenting, and deploying code to production
• Communicating the results

With this report, you’ll also examine a complete data science workflow developed by the team from Swiss cybersecurity firm BinaryEdge that includes steps for preliminary data analysis, exploratory data analysis, knowledge discovery, and visualization.

Agile Data Science 2.0

Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools. Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You’ll learn an iterative approach that lets you quickly change the kind of analysis you’re doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization.

• Build value from your data in a series of agile sprints, using the data-value pyramid
• Extract features for statistical models from a single dataset
• Visualize data with charts, and expose different aspects through interactive reports
• Use historical data to predict the future via classification and regression
• Translate predictions into actions
• Get feedback from users after each sprint to keep your project on track

Advanced Analytics with Spark, 2nd Edition

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming. You’ll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques (including classification, clustering, collaborative filtering, and anomaly detection) to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you’ll find the book’s patterns useful for working on your own data applications.

With this book, you will:

• Familiarize yourself with the Spark programming model
• Become comfortable within the Spark ecosystem
• Learn general approaches in data science
• Examine complete implementations that analyze large public data sets
• Discover which machine learning tools make sense for particular problems
• Acquire code that can be adapted to many uses

Data Science with Java

Data Science is booming thanks to R and Python, but Java brings the robustness, convenience, and ability to scale critical to today’s data science applications. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java. You’ll learn the critical roles that data IO, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout this book, you’ll find code examples you can use in your applications.

• Examine methods for obtaining, cleaning, and arranging data into its purest form
• Understand the matrix structure that your data should take
• Learn basic concepts for testing the origin and validity of data
• Transform your data into stable and usable numerical values
• Understand supervised and unsupervised learning algorithms, and methods for evaluating their success
• Get up and running with MapReduce, using customized components suitable for data science algorithms

ERRATA (as reported by Peter): The book Peter mentioned (at 46:20) by Stuart Russell, "Do the Right Thing", was published in 2003, and not recently.

In this session, Peter Morgan, CEO of Deep Learning Partnership, sat with Vishal Kumar, CEO of AnalyticsWeek, and shared his thoughts on deep learning, machine learning, and artificial intelligence. They discussed some of the best practices for picking the right solution and the right vendor, and what some of the keywords mean.

Here's Peter's Bio: Peter Morgan is a scientist-entrepreneur who started out in high-energy physics, enrolled in the PhD program at the University of Massachusetts at Amherst. After leaving UMass and founding his own company, Peter moved into computer networks, designing, implementing, and troubleshooting global IP networks for companies such as Cisco, IBM, and BT Labs. After getting an MBA and dabbling in financial trading algorithms, Peter worked for three years on an experiment led by Stanford University to measure the mass of the neutrino. Since 2012, he has been working in data science and deep learning, founding an AI solutions company in Jan 2016.

As an entrepreneur, Peter has founded companies in the AI, social media, and music industries. He has also served on the advisory board of technology startups. Peter is a popular speaker at conferences, meetups, and webinars. He has cofounded and currently organizes meetups in the deep learning space. Peter has business experience in the USA, UK, and Europe.

Today, as CEO of Deep Learning Partnership, he leads the strategic direction and business development across product and services. This includes sales and marketing, lead generation, client engagement, recruitment, content creation, and platform development. Deep learning technologies used include computer vision and natural language processing, and frameworks like TensorFlow, Keras, and MXNet. Deep Learning Partnership designs and implements AI solutions for its clients across all business domains.

Interested in sharing your thought leadership with our global listeners? Register your interest @ http://play.analyticsweek.com/guest/

Metaprogramming in R: Advanced Statistical Programming for Data Science, Analysis and Finance

Learn how to manipulate functions and expressions to modify how the R language interprets itself. This book is an introduction to metaprogramming in the R language, so you will write programs to manipulate other programs. Metaprogramming in R shows you how to treat code as data that you can generate, analyze, or modify. R is a very high-level language where all operations are functions and all functions are data that can be manipulated. This book shows you how to leverage R's natural flexibility in how function calls and expressions are evaluated, to create small domain-specific languages to extend R within the R language itself.

What You'll Learn

• Find out about the anatomy of a function in R
• Look inside a function call
• Work with R expressions and environments
• Manipulate expressions in R
• Use substitutions

Who This Book Is For

Those with at least some experience with R and certainly for those with experience in other programming languages.
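The idea of treating code as data that can be analyzed and modified, as this blurb puts it, can be sketched briefly. The book does this in R; the example below is only an analogous illustration in Python using the standard `ast` module, and the expression, the variable names, and the `SwapAddToSub` class are hypothetical choices made for the sketch.

```python
import ast

# Parse an expression into a syntax tree instead of running it directly.
source = "x + 2 * y"
tree = ast.parse(source, mode="eval")

# Analyze: collect the variable names the expression refers to.
names = sorted(n.id for n in ast.walk(tree) if isinstance(n, ast.Name))

# Evaluate the unmodified expression for comparison.
env = {"x": 10, "y": 3}
original_value = eval(compile(tree, "<ast>", "eval"), dict(env))


class SwapAddToSub(ast.NodeTransformer):
    """Rewrite every addition in the tree into a subtraction."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested operations first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


# Modify: rewrite the tree, then compile and evaluate the new program.
new_tree = ast.fix_missing_locations(SwapAddToSub().visit(tree))
modified_value = eval(compile(new_tree, "<ast>", "eval"), dict(env))

if __name__ == "__main__":
    print(names)
    print(original_value, modified_value)
```

The same three steps (parse into a data structure, inspect or rewrite it, evaluate the result) are what the blurb means by writing programs that manipulate other programs.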

Learning Social Media Analytics with R

Explore the intricacies of using R for social media analytics with 'Learning Social Media Analytics with R'. This comprehensive guide introduces readers to tools and techniques to extract, analyze, and visualize data from popular platforms like Twitter and Facebook. Gain insights into advanced methods such as sentiment analysis, topic modeling, and social network analysis.

What this Book will help me do

• Master the art of leveraging R to retrieve, process, and clean data from major social media platforms.
• Use actionable insights from sentiment analysis and topic modeling to improve decision-making processes.
• Develop an understanding of social network structures by analyzing community connections and user interactions.
• Create impactful data visualizations that showcase trends and insights effectively using the R ecosystem.
• Integrate advanced R packages such as ggplot2, dplyr, and caret to streamline data analysis workflows.

Author(s)

The authors of this book, Sarkar, Karthik Ganapathy, Raghav Bali, and Sharma, are experts in data science and R programming with extensive experience in the industry. They bring a passion for teaching and a clear, step-by-step methodology to help learners grasp complex concepts.

Who is it for?

This book is ideal for data scientists, analysts, IT professionals, and social media marketers who aim to gain actionable insights from social data. Whether you're a beginner or have some experience with R, this book is highly approachable and beneficial. Readers will find practical examples and comprehensive tutorials tailored for their level of expertise.

In this session, Nathaniel discussed how NFPA uses data to empower fire stations worldwide with data-driven insights. We discussed the future of fire in this tech-driven world.

Timeline: 0:29 Nathaniel's journey. 3:50 What's NFPA? 6:12 Nathaniel's role in NFPA. 8:50 Nathaniel's book. 12:21 The data science team at NFPA. 15:01 Working with the government. 18:50 Interesting use cases of NFPA. 25:49 Fine-tuning the data model at NFPA. 28:11 NFPA alliance with the insurance industry. 31:33 Recruiting an idea concept or tool. 33:16 How to approach NFPA? 36:03 Nathaniel's role: in-facing or out-facing? 40:41 Suggestions for non-profits to build a data science practice. 43:49 Putting together a data science team. 46:34 Predicting the fire outcome. 48:11 Closing remarks.

Podcast link: https://futureofdata.org/futureofdata-nathaniel-lin-chief-data-scientist-nfpa/

Bio- Nathaniel Lin has an extensive background in business and marketing analytics with strategic roles in both start-ups and Fortune 500 companies. He offers the National Fire Protection Association (NFPA) agency and client perspective gleaned from his work at Fidelity Investments, OgilvyOne, Aspen Marketing, and IBM Worldwide. During his tenure with IBM Asia Pacific, he also built and led a marketing analytics group that won a DMA/NCDM Gold Award in B2B Marketing.

Lin served as an adjunct professor of business analytics at Boston College and Georgia Tech College of Management. He is also the founder of two LinkedIn groups related to big data analytics and is the author of Applied Business Analytics – Integrating Business Process, Big Data, and Advanced Analytics (2014). Lin has an MBA in Management of Technology/Sloan Fellows from MIT Sloan School of Management and earned both a Ph.D. in Environmental Engineering and an Honors B.S. from Birmingham University in England.

Founded in 1896, NFPA is a global, nonprofit organization devoted to eliminating death, injury, property, and economic loss due to fire, electrical, and related hazards. The association delivers information and knowledge through more than 300 consensus codes and standards, research, training, education, outreach, and advocacy, and by partnering with others who share an interest in furthering the NFPA mission. For more information, visit www.nfpa.org.

The podcast is sponsored by: TAO.ai (https://tao.ai), Artificial Intelligence Driven Career Coach

Practical Statistics for Data Scientists

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.

With this book, you’ll learn:

• Why exploratory data analysis is a key preliminary step in data science
• How random sampling can reduce bias and yield a higher quality dataset, even with big data
• How the principles of experimental design yield definitive answers to questions
• How to use regression to estimate outcomes and detect anomalies
• Key classification techniques for predicting which categories a record belongs to
• Statistical machine learning methods that “learn” from data
• Unsupervised learning methods for extracting meaning from unlabeled data

Breaking Data Science Open

Over the past decade, data science has come out of the back office to become a force of change across the entire organization. At the forefront of this change is the open data science movement that advocates the use of open source tools in a powerful, connected ecosystem. This report explores how open data science can help your organization break free from the shackles of proprietary tools, embrace a more open and collaborative work style, and unleash new intelligent applications quickly. Authors Michele Chambers and Christine Doig explain how open source tools have helped bring about many facets of the data science evolution, including collaboration, self-service, and deployment. But you’ll discover that open data science is about more than tools; it’s about a new way of working as an organization.

• Learn how data science, particularly open data science, has become part of everyday business
• Understand how open data science engages people from other disciplines, not just statisticians
• Examine tools and practices that enable data science to be open across technical, operational, and organizational aspects
• Learn benefits of open data science, including rich resources, agility, transparency, and collective intelligence
• Explore case studies that demonstrate different ways to implement open data science
• Discover how open data science can help you break down department barriers and make bold market moves

Michele Chambers, Chief Marketing Officer and VP Products at Continuum Analytics, is an entrepreneurial executive with over 25 years of industry experience. Prior to Continuum Analytics, Michele held executive leadership roles at several database and analytic companies, including Netezza, IBM, Revolution Analytics, MemSQL, and RapidMiner.

Christine Doig is a senior data scientist at Continuum Analytics, where she's worked on several projects, including MEMEX, a DARPA-funded open data science project to help stop human trafficking. She has 5+ years of experience in analytics, operations research, and machine learning in a variety of industries.

In this week's episode of Data Skeptic, host Kyle Polich talks with guest Maura Church, Patreon's data science manager. Patreon is a fast-growing crowdfunding platform that allows artists and creators of all kinds to build their own subscription content service. The platform allows fans to become patrons of their favorite artists, an idea similar to Renaissance times, when musicians would rely on benefactors to become their patrons so they could make more art. At Patreon, Maura's data science team strives to provide creators with insight, information, and tools, so that creators can focus on what they do best: making art. On the show, Maura talks about some of her projects with the data science team at Patreon. Topics discussed during the episode include optical music recognition (OMR) to translate musical scores to electronic format, network analysis to understand the connection between creators and patrons, growth forecasting and modeling in a new market, and churn modeling to determine predictors of long-term support. A more detailed explanation of Patreon's A/B testing framework can be found here. Other useful links to topics mentioned during the show: OMR research, Patreon blog, Patreon HQ blog, Amanda Palmer, Fran Meneses.

Mastering Spark for Data Science

Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products.

About This Book
Develop and apply advanced analytical techniques with Spark
Learn how to tell a compelling story with data science using Spark’s ecosystem
Explore data at scale and work with cutting-edge data science methods

Who This Book Is For
This book is for those who have beginner-level familiarity with the Spark architecture and data science applications, especially those who are looking for a challenge and want to learn cutting-edge techniques. This book assumes working knowledge of data science, common machine learning methods, and popular data science tools, and assumes you have previously run proof-of-concept studies and built prototypes.

What You Will Learn
Learn the design patterns that integrate Spark into industrialized data science pipelines
See how commercial data scientists design scalable and reusable code for data science services
Explore cutting-edge data science methods so that you can study trends and causality
Discover advanced programming techniques using RDD and the DataFrame and Dataset APIs
Find out how Spark can be used as a universal ingestion engine tool and as a web scraper
Practice the implementation of advanced topics in graph processing, such as community detection and contact chaining
Get to know the best practices when performing Extended Exploratory Data Analysis, commonly used in commercial data science teams
Study advanced Spark concepts, solution design patterns, and integration architectures
Demonstrate powerful data science pipelines

In Detail
Data science seeks to transform the world using data, and this is typically achieved through disrupting and changing real processes in real industries. In order to operate at this level you need to build data science solutions of substance: solutions that solve real problems.
Spark has emerged as the big data platform of choice for data scientists due to its speed, scalability, and easy-to-use APIs. This book deep dives into using Spark to deliver production-grade data science solutions. This process is demonstrated by exploring the construction of a sophisticated global news analysis service that uses Spark to generate continuous geopolitical and current affairs insights. You will learn all about the core Spark APIs and take a comprehensive tour of advanced libraries, including Spark SQL, Spark Streaming, MLlib, and more. You will be introduced to advanced techniques and methods that will help you to construct commercial-grade data products. Focusing on a sequence of tutorials that deliver a working news intelligence service, you will learn about advanced Spark architectures, how to work with geographic data in Spark, and how to tune Spark algorithms so they scale linearly.

Style and approach
This is an advanced guide for those with beginner-level familiarity with the Spark architecture and with data science applications. Mastering Spark for Data Science is a practical tutorial that uses core Spark APIs and takes a deep dive into advanced libraries including Spark SQL, visual streaming, and MLlib. This book expands on titles like Machine Learning with Spark and Learning Spark. It is the next step on the learning curve for those comfortable with Spark and looking to improve their skills.