talk-data.com talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516

tagged

Activity Trend

68 peak/qtr
2020-Q1 2026-Q1

Activities

1516 activities · Newest first

Statistical Analysis with R For Dummies

Understanding the world of R programming and analysis has never been easier Most guides to R, whether books or online, focus on R functions and procedures. But now, thanks to Statistical Analysis with R For Dummies, you have access to a trusted, easy-to-follow guide that focuses on the foundational statistical concepts that R addresses—as well as step-by-step guidance that shows you exactly how to implement them using R programming. People are becoming more aware of R every day as major institutions are adopting it as a standard. Part of its appeal is that it's a free tool that's taking the place of costly statistical software packages that sometimes take an inordinate amount of time to learn. Plus, R enables a user to carry out complex statistical analyses by simply entering a few commands, making sophisticated analyses available and understandable to a wide audience. Statistical Analysis with R For Dummies enables you to perform these analyses and to fully understand their implications and results. Gets you up to speed on the #1 analytics/data science software tool Demonstrates how to easily find, download, and use cutting-edge community-reviewed methods in statistics and predictive modeling Shows you how R offers intel from leading researchers in data science, free of charge Provides information on using R Studio to work with R Get ready to use R to crunch and analyze your data—the fast and easy way!

Monetizing Your Data

Transforming data into revenue generating strategies and actions Organizations are swamped with data—collected from web traffic, point of sale systems, enterprise resource planning systems, and more , but what to do with it? Monetizing your Data provides a framework and path for business managers to convert ever-increasing volumes of data into revenue generating actions through three disciplines: decision architecture, data science, and guided analytics. There are large gaps between understanding a business problem and knowing which data is relevant to the problem and how to leverage that data to drive significant financial performance. Using a proven methodology developed in the field through delivering meaningful solutions to Fortune 500 companies, this book gives you the analytical tools, methods, and techniques to transform data you already have into information into insights that drive winning decisions. Beginning with an explanation of the analytical cycle, this book guides you through the process of developing value generating strategies that can translate into big returns. The companion website, www.monetizingyourdata.com, provides templates, checklists, and examples to help you apply the methodology in your environment, and the expert author team provides authoritative guidance every step of the way. This book shows you how to use your data to: Monetize your data to drive revenue and cut costs Connect your data to decisions that drive action and deliver value Develop analytic tools to guide managers up and down the ladder to better decisions Turning data into action is key; data can be a valuable competitive advantage, but only if you understand how to organize it, structure it, and uncover the actionable information hidden within it through decision architecture and guided analytics. From multinational corporations to single-owner small businesses, companies of every size and structure stand to benefit from these tools, methods, and techniques; Monetizing your Data walks you through the translation and transformation to help you leverage your data into value creating strategies.

Beginning Data Science in R: Data Analysis, Visualization, and Modelling for the Data Scientist

Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. This book teaches you techniques for both data manipulation and visualization and shows you the best way for developing new software packages for R. Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this. This book is based on a number of lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually a minimum of programming. What You Will Learn Perform data science and analytics using statistics and the R programming language Visualize and explore data, including working with large data sets found in big data Build an R package Test and check your code Practice version control Profile and optimize your code Who This Book Is For Those with some data science or analytics background, but not necessarily experience with the R programming language.

Think Like a Data Scientist

Think Like a Data Scientist presents a step-by-step approach to data science, combining analytic, programming, and business perspectives into easy-to-digest techniques and thought processes for solving real world data-centric problems. About the Technology Data collected from customers, scientific measurements, IoT sensors, and so on is valuable only if you understand it. Data scientists revel in the interesting and rewarding challenge of observing, exploring, analyzing, and interpreting this data. Getting started with data science means more than mastering analytic tools and techniques, however; the real magic happens when you begin to think like a data scientist. This book will get you there. About the Book Think Like a Data Scientist teaches you a step-by-step approach to solving real-world data-centric problems. By breaking down carefully crafted examples, you'll learn to combine analytic, programming, and business perspectives into a repeatable process for extracting real knowledge from data. As you read, you'll discover (or remember) valuable statistical techniques and explore powerful data science software. More importantly, you'll put this knowledge together using a structured process for data science. When you've finished, you'll have a strong foundation for a lifetime of data science learning and practice. What's Inside The data science process, step-by-step How to anticipate problems Dealing with uncertainty Best practices in software and scientific thinking About the Reader Readers need beginner programming skills and knowledge of basic statistics. About the Author Brian Godsey has worked in software, academia, finance, and defense and has launched several data-centric start-ups. Quotes Explains difficult concepts and techniques concisely and approachably. - Jenice Tom, CVS Health Goes beyond simple tools and techniques and helps you to conceptualize and solve challenging, real-world data science problems. - Casimir Saternos, Synchronoss Technologies A successful attempt to put the mind of a data scientist on paper. - David Krief, Altansia The book that changed my career path! - Nicolas Boulet-Lavoie, DL Innov

Data Science For Dummies, 2nd Edition

Your ticket to breaking into the field of data science! Jobs in data science are projected to outpace the number of people with data science skills—making those with the knowledge to fill a data science position a hot commodity in the coming years. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of an organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you'll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization. Provides a background in data science fundamentals and preparing your data for analysis Details different data visualization techniques that can be used to showcase and summarize your data Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark It's a big, big data world out there—let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

The Data Science Handbook

A comprehensive overview of data science covering the analytics, programming, and business skills necessary to master the discipline Finding a good data scientist has been likened to hunting for a unicorn: the required combination of technical skills is simply very hard to find in one person. In addition, good data science is not just rote application of trainable skill sets; it requires the ability to think flexibly about all these areas and understand the connections between them. This book provides a crash course in data science, combining all the necessary skills into a unified discipline. Unlike many analytics books, computer science and software engineering are given extensive coverage since they play such a central role in the daily work of a data scientist. The author also describes classic machine learning algorithms, from their mathematical foundations to real-world applications. Visualization tools are reviewed, and their central importance in data science is highlighted. Classical statistics is addressed to help readers think critically about the interpretation of data and its common pitfalls. The clear communication of technical results, which is perhaps the most undertrained of data science skills, is given its own chapter, and all topics are explained in the context of solving real-world data problems. The book also features: • Extensive sample code and tutorials using Python™ along with its technical libraries • Core technologies of “Big Data,” including their strengths and limitations and how they can be used to solve real-world problems • Coverage of the practical realities of the tools, keeping theory to a minimum; however, when theory is presented, it is done in an intuitive way to encourage critical thinking and creativity • A wide variety of case studies from industry • Practical advice on the realities of being a data scientist today, including the overall workflow, where time is spent, the types of datasets worked on, and the skill sets needed The Data Science Handbook is an ideal resource for data analysis methodology and big data software tools. The book is appropriate for people who want to practice data science, but lack the required skill sets. This includes software professionals who need to better understand analytics and statisticians who need to understand software. Modern data science is a unified discipline, and it is presented as such. This book is also an appropriate reference for researchers and entry-level graduate students who need to learn real-world analytics and expand their skill set. FIELD CADY is the data scientist at the Allen Institute for Artificial Intelligence, where he develops tools that use machine learning to mine scientific literature. He has also worked at Google and several Big Data startups. He has a BS in physics and math from Stanford University, and an MS in computer science from Carnegie Mellon.

Scala: Guide for Data Science Professionals

Scala will be a valuable tool to have on hand during your data science journey for everything from data cleaning to cutting-edge machine learning About This Book Build data science and data engineering solutions with ease An in-depth look at each stage of the data analysis process — from reading and collecting data to distributed analytics Explore a broad variety of data processing, machine learning, and genetic algorithms through diagrams, mathematical formulations, and source code Who This Book Is For This learning path is perfect for those who are comfortable with Scala programming and now want to enter the field of data science. Some knowledge of statistics is expected. What You Will Learn Transfer and filter tabular data to extract features for machine learning Read, clean, transform, and write data to both SQL and NoSQL databases Create Scala web applications that couple with JavaScript libraries such as D3 to create compelling interactive visualizations Load data from HDFS and HIVE with ease Run streaming and graph analytics in Spark for exploratory analysis Bundle and scale up Spark jobs by deploying them into a variety of cluster managers Build dynamic workflows for scientific computing Leverage open source libraries to extract patterns from time series Master probabilistic models for sequential data In Detail Scala is especially good for analyzing large sets of data as the scale of the task doesn’t have any significant impact on performance. Scala’s powerful functional libraries can interact with databases and build scalable frameworks — resulting in the creation of robust data pipelines. The first module introduces you to Scala libraries to ingest, store, manipulate, process, and visualize data. Using real world examples, you will learn how to design scalable architecture to process and model data — starting from simple concurrency constructs and progressing to actor systems and Apache Spark. After this, you will also learn how to build interactive visualizations with web frameworks. Once you have become familiar with all the tasks involved in data science, you will explore data analytics with Scala in the second module. You’ll see how Scala can be used to make sense of data through easy to follow recipes. You will learn about Bokeh bindings for exploratory data analysis and quintessential machine learning with algorithms with Spark ML library. You’ll get a sufficient understanding of Spark streaming, machine learning for streaming data, and Spark graphX. Armed with a firm understanding of data analysis, you will be ready to explore the most cutting-edge aspect of data science — machine learning. The final module teaches you the A to Z of machine learning with Scala. You’ll explore Scala for dependency injections and implicits, which are used to write machine learning algorithms. You’ll also explore machine learning topics such as clustering, dimentionality reduction, Naïve Bayes, Regression models, SVMs, neural networks, and more. This learning path combines some of the best that Packt has to offer into one complete, curated package. It includes content from the following Packt products: Scala for Data Science, Pascal Bugnion Scala Data Analysis Cookbook, Arun Manivannan Scala for Machine Learning, Patrick R. Nicolas Style and approach A complete package with all the information necessary to start building useful data engineering and data science solutions straight away. It contains a diverse set of recipes that cover the full spectrum of interesting data analysis tasks and will help you revolutionize your data analysis skills using Scala. Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code file.

2017 European Data Science Salary Survey

How do data science salaries for people in Europe compare to their counterparts in the rest of the world? Among the more than 1000 people who responded to O’Reilly’s 2016 Data Science Salary Survey, 359 live and work in various European countries as data scientists, analysts, engineers, and related professions. This report takes a deep dive into the survey results from respondents in various regions of Europe, including the tools they use, the compensation they receive, and the roles they play in their respective organizations. Even if you didn’t take part in the survey, you can still plug your own information into the survey’s simple linear model to see where you fit. With this report, you’ll learn: How salaries vary by country and specific regions in Europe Average size of companies by region How salary is affected by a country’s GDP Top industries for data scientists, including software, banking, finance, retail, and ecommerce Most commonly used tools vs tools used by respondents with above-average salaries Primary and secondary job tasks performed by survey respondents To stay up-to-date on this research, your participation is crucial. The survey is now open for the 2017 report; please take just 5 to 10 minutes to participate in the survey here.

In this episode, we talk about a high-level description of deep learning.  Kyle presents a simple game (pictured below), which is more of a puzzle really, to try and give  Linh Da the basic concept.  

Thanks to our sponsor for this week, the Data Science Association. Please check out their upcoming Dallas conference at dallasdatascience.eventbrite.com

Versioning isn't just for source code. Being able to track changes to data is critical for answering questions about data provenance, quality, and reproducibility. Daniel Whitenack joins me this week to talk about these concepts and share his work on Pachyderm. Pachyderm is an open source containerized data lake. During the show, Daniel mentioned the Gopher Data Science github repo as a great resource for any data scientists interested in the Go language. Although we didn't mention it, Daniel also did an interesting analysis on the 2016 world chess championship that complements our recent episode on chess well. You can find that post here Supplemental music is Lee Rosevere's Let's Start at the Beginning.   Thanks to Periscope Data for sponsoring this episode. More about them at periscopedata.com/skeptics

Somewhere along the spectrum of "logging into Google Analytics" and "the machines are in control" is the world of the power analyst who interacts with the data on the fly, applies statistics to large data sets, and develops interactive visualizations that go well beyond the capabilities of Excel. Those power analysts are operating on the fringes of the domain of the "data scientist" -- a role for which no one can really agree on a concrete definition! In this session, Tim -- who has never claimed to be and never will claim to be a data scientist -- will share what he has learned from trying to understand the scope and nature of that role. And, beyond that, how he has grown as a digital analyst, expanded his skills to "program with data" with R, and increased his value to the organizations with which he works as a result.

Logistic Regression is a popular classification algorithm. In this episode, we discuss how it can be used to determine if an audio clip represents one of two given speakers. It assumes an output variable (isLinhda) is a linear combination of available features, which are spectral bands in the discussion on this episode.   Keep an eye on the dataskeptic.com blog this week as we post more details about this project.   Thanks to our sponsor this week, the Data Science Association.  Please check out their upcoming conference in Dallas on Saturday, February 18th, 2017 via the link below.   dallasdatascience.eventbrite.com

In this session, Juan F. Gorricho, Chief Data & Analytics Officer, PFCU / Walt Disney, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey as an analytics executive, best practices, hacks for upcoming executives, and some challenges/opportunities he's observing as a Chief Data & Analytics Officer. Juan discussed creating a data-driven culture and what leaders could do to get buy-ins for building strong data science capabilities.

Timeline: 0:29 Juan's journey. 4:57 Defining International society of chief data officers. 7:08 Joining International society of CDO. 7:45 Being in a credit union and being a CDO. 10:33 Hacks to creating a data-driven culture. 16:25 Being a partner of Walt Disney. 19:20 Data sharing with Disney. 21:50 Data officers vs. analytics officer. 25:59 Getting the leadership onboard on data. 30:44 The business decision making of a CDO at PFCU. 33:33 Collaboration with IT. 37:48 Challenges Juan faces in his current role. 45:03 Building data solutions or buying data solutions? 49:05 Advice for data leaders.

Podcast link: https://futureofdata.org/analyticsweek-leadership-podcast-with-juan-f-gorricho-disney/

Here's Juan F. Gorricho Bio: Juan F. Gorricho is currently the Chief Data & Analytics Officer for Partners Federal Credit Union. In this role, Juan leads the data and analytics strategy development and execution for Partners, one of the top credit unions in the country, exclusively serving the more than 100,000 cast members of The Walt Disney Company. Juan has more than 15 years of experience in the data and analytics space, including multiple speaking engagements. In his prior roles with Disney, Juan led multiple multimillion-dollar projects to implement business intelligence and analytical solutions for key lines of business such as Labor Operations and Merchandise. Juan has an Industrial Engineering degree from Universidad de los Andes in Bogotá, Colombia, and an MBA from the Darden Graduate School of Business Administration at the University of Virginia. Juan is married and lives with his wife and two children in Orlando, Florida, United States of America.

Follow @jgorricho

The podcast is sponsored by: TAO.ai(https://tao.ai), Artificial Intelligence Driven Career Coach

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to Join? If you or any you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Strategies in Biomedical Data Science

An essential guide to healthcare data problems, sources, and solutions Strategies in Biomedical Data Science provides medical professionals with much-needed guidance toward managing the increasing deluge of healthcare data. Beginning with a look at our current top-down methodologies, this book demonstrates the ways in which both technological development and more effective use of current resources can better serve both patient and payer. The discussion explores the aggregation of disparate data sources, current analytics and toolsets, the growing necessity of smart bioinformatics, and more as data science and biomedical science grow increasingly intertwined. You'll dig into the unknown challenges that come along with every advance, and explore the ways in which healthcare data management and technology will inform medicine, politics, and research in the not-so-distant future. Real-world use cases and clear examples are featured throughout, and coverage of data sources, problems, and potential mitigations provides necessary insight for forward-looking healthcare professionals. Big Data has been a topic of discussion for some time, with much attention focused on problems and management issues surrounding truly staggering amounts of data. This book offers a lifeline through the tsunami of healthcare data, to help the medical community turn their data management problem into a solution. Consider the data challenges personalized medicine entails Explore the available advanced analytic resources and tools Learn how bioinformatics as a service is quickly becoming reality Examine the future of IOT and the deluge of personal device data The sheer amount of healthcare data being generated will only increase as both biomedical research and clinical practice trend toward individualized, patient-specific care. Strategies in Biomedical Data Science provides expert insight into the kind of robust data management that is becoming increasingly critical as healthcare evolves.

We close out 2016 with a discussion of a basic interview question which might get asked when applying for a data science job. Specifically, how a library might build a model to predict if a book will be returned late or not.  

Pro Tableau: A Step-by-Step Guide

Leverage the power of visualization in business intelligence and data science to make quicker and better decisions. Use statistics and data mining to make compelling and interactive dashboards. This book will help those familiar with Tableau software chart their journey to being a visualization expert. Pro Tableau demonstrates the power of visual analytics and teaches you how to: Connect to various data sources such as spreadsheets, text files, relational databases (Microsoft SQL Server, MySQL, etc.), non-relational databases (NoSQL such as MongoDB, Cassandra), R data files, etc. Write your own custom SQL, etc. Perform statistical analysis in Tableau using R Use a multitude of charts (pie, bar, stacked bar, line, scatter plots, dual axis, histograms, heat maps, tree maps, highlight tables, box and whisker, etc.) What you'll learn Connect to various data sources such as relational databases (Microsoft SQL Server, MySQL), non-relational databases (NoSQL such as MongoDB, Cassandra), write your own custom SQL, join and blend data sources, etc. Leverage table calculations (moving average, year over year growth, LOD (Level of Detail), etc. Integrate Tableau with R Tell a compelling story with data by creating highly interactive dashboards Who this book is for All levels of IT professionals, from executives responsible for determining IT strategies to systems administrators, to data analysts, to decision makers responsible for driving strategic initiatives, etc. The book will help those familiar with Tableau software chart their journey to a visualization expert.

Apache Spark for Data Science Cookbook

In "Apache Spark for Data Science Cookbook," you'll delve into solving real-world analytical challenges using the robust Apache Spark framework. This book features hands-on recipes that cover data analysis, distributed machine learning, and real-time data processing. You'll gain practical skills to process, visualize, and extract insights from large datasets efficiently. What this Book will help me do Master using Apache Spark for processing and analyzing large-scale datasets effectively. Harness Spark's MLLib for implementing machine learning algorithms like classification and clustering. Utilize libraries such as NumPy, SciPy, and Pandas in conjunction with Spark for numerical computations. Apply techniques like Natural Language Processing and text mining using Spark-integrated tools. Perform end-to-end data science workflows, including data exploration, modeling, and visualization. Author(s) Nagamallikarjuna Inelu and None Chitturi bring their extensive experience working with data science and distributed computing frameworks like Apache Spark. Nagamallikarjuna specializes in applying machine learning algorithms to big data problems, while None has contributed to various big data system implementations. Together, they focus on providing practitioners with practical and efficient solutions. Who is it for? This book is primarily intended for novice and intermediate data scientists and analysts who are curious about using Apache Spark to tackle data science problems. Readers are expected to have some familiarity with basic data science tasks. If you want to learn practical applications of Spark in data analysis and enhance your big data analytics skills, this resource is for you.

In this session, Scott Zoldi, Chief Analytics Officer, FICO, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey as an analytics executive, best practices, and hacks for upcoming executives challenges/opportunities he's observing as a Chief Analytics Officer. Scott discussed creating the data-driven culture and what leaders could do to get buy-ins for building strong data science capabilities. Scott discussed his passion for security analytics. He shared some best practices to put-up a Cyber Security Center of Excellence. Scott also shared what traits future leaders should have.

Timeline:

0:29 Scott's journey. 5:10 On Falcon's fraud manager. 9:12 Area in secuity where AI works. 11:40 FICO's dealing with new products. 15:30 Centre of excellence for cyber security. 22:00 Should a center of excellence be inside out or in partnership? 28:22 The CAO role in FICO. 31:14 Is FICO in facing or out facing? 32:12 Being analytical in a gutt based organization. 35:54 Art of doing business and science of doing business. 38:22 Challenges as CAO in FICO. 41:09 Opportunity for data science in the security space. 45:54 Qualities required for a CAO. 48:54 Tips for a data scientist to get hired at FICO.

Podcast link: https://futureofdata.org/analyticsweek-leadership-podcast-with-scott-zoldi-cao-fico/

Here's Scott Zoldi's Bio: Scott Zoldi is Chief Analytics Officer at FICO, responsible for the analytic development of FICO’s product and technology solutions, including the FICO™ Falcon® Fraud Manager product, which protects about two-thirds of the world’s payment card transactions from fraud. While at FICO, Scott has been responsible for authoring 72 analytic patents, 36 patents granted, and 36 in process. Scott is actively involved in developing new analytic products and Big Data analytics applications, many of which leverage new streaming artificial intelligence innovations such as adaptive analytics, collaborative profiling, and self-learning models. Scott is most recently focused on the applications of streaming self-learning analytics for real-time detection of Cyber Security attacks and Money Laundering. Scott serves on two boards of directors, including Software San Diego and Cyber Center of Excellence. Scott received his Ph.D. in theoretical physics from Duke University.

Follow @scottzoldi

The podcast is sponsored by: TAO.ai(https://tao.ai), Artificial Intelligence Driven Career Coach

Principles of Data Science

If you've ever wondered how to bridge the gap between mathematics, programming, and actionable data insights, 'Principles of Data Science' is the guide for you. This book explores the full data science pipeline, providing you with tools and knowledge to transform raw data into impactful decisions. With practical lessons and hands-on tutorials, you'll master the essential skills of a data scientist. What this Book will help me do Understand and apply the five core steps of the data science process. Gain insight into data cleaning, visualization, and effective communication of results. Learn and implement foundational machine learning models using Python or R. Bridge gaps between mathematics, statistics, and programming to solve data-driven problems. Evaluate machine learning models using key metrics for better predictive capabilities. Author(s) The author, a seasoned data scientist with years of professional experience in analytics and software development, brings a rich perspective to the topic. Combining a strong foundation in mathematics with expertise in Python and R, they have worked on diverse real-world data projects. Their teaching philosophy emphasizes clarity and practical application, ensuring you not only gain knowledge but also know how to apply it effectively. Who is it for? This book is intended for individuals with a basic understanding of algebra and some programming experience in Python or R. It is perfect for programmers who wish to dive into the world of data science or for those with math skills looking to apply them practically. If you seek to turn raw data into valuable insights and predictions, this book is tailored for you.

Efficient R Programming

There are many excellent R resources for visualization, data science, and package development. Hundreds of scattered vignettes, web pages, and forums explain how to use R in particular domains. But little has been written on how to simply make R work effectively—until now. This hands-on book teaches novices and experienced R users how to write efficient R code. Drawing on years of experience teaching R courses, authors Colin Gillespie and Robin Lovelace provide practical advice on a range of topics—from optimizing the set-up of RStudio to leveraging C++—that make this book a useful addition to any R user’s bookshelf.