talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516

tagged

Activity Trend: peak of 68 activities per quarter, 2020-Q1 to 2026-Q1

Activities

1516 activities · Newest first

Build powerful AI apps with Copilot in Microsoft Fabric | BRK225

Build new analytics and AI models and supercharge your intelligent app strategy across your organization. Increase developer velocity with Copilot in Fabric and empower your data scientists and data analysts with Semantic Link, bridging the world of business intelligence and AI. Train custom ML models with Azure ML and Fabric Data Science, democratizing AI across lines-of-business and increasing collaboration between data professionals and ML professionals.

To learn more, please check out these resources:
* https://aka.ms/Ignite23CollectionsBRK225H
* https://info.microsoft.com/ww-landing-contact-me-for-events-m365-in-person-events.html?LCID=en-us&ls=407628-contactme-formfill
* https://aka.ms/azure-ignite2023-dataaiblog

𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Justyna Lucznik * Nellie Gustafsson * Misha Desai * Thasmika Gokal * Abhishek Narain * Alex Powers * Alex van Grootel * Ed Donahue * Lukasz Pawlowski * Raj Rikhy * Wilson Lee

𝗦𝗲𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

BRK225 | English (US) | Data

MSIgnite

Near Extensions and Alignment of Data in R^n

Comprehensive resource illustrating the mathematical richness of Whitney Extension Problems, enabling readers to develop new insights, tools, and mathematical techniques.

Near Extensions and Alignment of Data in R^n demonstrates a range of hitherto unknown connections between current research problems in engineering, mathematics, and data science, exploring the mathematical richness of near Whitney Extension Problems, and presenting a new nexus of applied, pure and computational harmonic analysis, approximation theory, data science, and real algebraic geometry. For example, the book uncovers connections between near Whitney Extension Problems and the problem of alignment of data in Euclidean space, an area of considerable interest in computer vision.

Written by a highly qualified author, Near Extensions and Alignment of Data in R^n includes information on:
* Areas of mathematics and statistics, such as harmonic analysis, functional analysis, and approximation theory, that have driven significant advances in the field
* Development of algorithms to enable the processing and analysis of huge amounts of data and data sets
* Why and how the mathematical underpinning of many current data science tools needs to be better developed to be useful
* New insights, potential tools, and mathematical techniques to solve problems in Whitney extensions, signal processing, shortest paths, clustering, computer vision, optimal transport, manifold learning, minimal energy, and equidistribution

Providing comprehensive coverage of several subjects, Near Extensions and Alignment of Data in R^n is an essential resource for mathematicians, applied mathematicians, and engineers working on problems related to data science, signal processing, computer vision, manifold learning, and optimal transport.

The post-pandemic period and companies' return to in-person work have opened up the possibility of new, more flexible ways of working in the data field.

So we explore the new frontiers of the professional environment to uncover the secrets behind effective remote work, examine the dynamics of hybrid teams, and understand how physical presence can influence the data environment.

In this episode of Data Hackers, the largest AI and Data Science community in Brazil, we talk with the co-founders of the Mulheres em Dados community. Meet them: Marcela Galeotti, Senior Data Analyst at MasterClass; Raquel Reis, Senior Data Analyst at Hotmart; and Tassia Giovanelli, Data Analyst at PicPay.

Remember that you can find all of the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in the post!

Meet our guests:

Marcela Galeotti, Senior Data Analyst at MasterClass; Raquel Reis, Senior Data Analyst at Hotmart; Tassia Giovanelli, Data Analyst at PicPay.

Our Data Hackers panel:

Paulo Vasconcellos, Co-founder; Monique Femme, Head of Community Management; Gabriel Lages, Co-founder

Reference links:

Take part in the State of Data survey: http://www.stateofdata.com.br/podcast
Mulheres em Dados community (Instagram): https://www.instagram.com/mulheresemdados/
Mulheres em Dados community (LinkedIn): https://www.linkedin.com/company/mulheresemdados/?originalSubdomain=br

Fundamentals of Data Science

Fundamentals of Data Science: Theory and Practice presents basic and advanced concepts in data science along with real-life applications. The book provides students, researchers and professionals at different levels a good understanding of the concepts of data science, machine learning, data mining and analytics. Users will find the authors’ research experiences and achievements in data science applications, along with in-depth discussions on topics that are essential for data science projects, including pre-processing, which is carried out before applying predictive and descriptive data analysis tasks, and proximity measures for numeric, categorical and mixed-type data. The authors include a systematic presentation of many predictive and descriptive learning algorithms, including recent developments that have successfully handled large datasets with high accuracy. In addition, a number of descriptive learning tasks are included.

* Presents the foundational concepts of data science along with advanced concepts and real-life applications for applied learning
* Includes coverage of a number of key topics such as data quality and pre-processing, proximity and validation, predictive data science, descriptive data science, ensemble learning, association rule mining, Big Data analytics, as well as incremental and distributed learning
* Provides updates on key applications of data science techniques in areas such as Computational Biology, Network Intrusion Detection, Natural Language Processing, Software Clone Detection, Financial Data Analysis, and Scientific Time Series Data Analysis
* Covers computer program code for implementing descriptive and predictive algorithms

Google Cloud Platform for Data Science: A Crash Course on Big Data, Machine Learning, and Data Analytics Services

This book is your practical and comprehensive guide to learning Google Cloud Platform (GCP) for data science, using only the free tier services offered by the platform. Data science and machine learning are increasingly becoming critical to businesses of all sizes, and the cloud provides a powerful platform for these applications. GCP offers a range of data science services that can be used to store, process, and analyze large datasets, and train and deploy machine learning models. The book is organized into seven chapters covering various topics such as GCP account setup, Google Colaboratory, Big Data and Machine Learning, Data Visualization and Business Intelligence, Data Processing and Transformation, Data Analytics and Storage, and Advanced Topics. Each chapter provides step-by-step instructions and examples illustrating how to use GCP services for data science and big data projects. Readers will learn how to set up a Google Colaboratory account and run Jupyter notebooks, access GCP services and data from Colaboratory, use BigQuery for data analytics, and deploy machine learning models using Vertex AI. The book also covers how to visualize data using Looker Data Studio, run data processing pipelines using Google Cloud Dataflow and Dataprep, and store data using Google Cloud Storage and SQL.

What You Will Learn
* Set up a GCP account and project
* Explore BigQuery and its use cases, including machine learning
* Understand Google Cloud AI Platform and its capabilities
* Use Vertex AI for training and deploying machine learning models
* Explore Google Cloud Dataproc and its use cases for big data processing
* Create and share data visualizations and reports with Looker Data Studio
* Explore Google Cloud Dataflow and its use cases for batch and stream data processing
* Run data processing pipelines on Cloud Dataflow
* Explore Google Cloud Storage and its use cases for data storage
* Get an introduction to Google Cloud SQL and its use cases for relational databases
* Get an introduction to Google Cloud Pub/Sub and its use cases for real-time data streaming

Who This Book Is For
Data scientists, machine learning engineers, and analysts who want to learn how to use Google Cloud Platform (GCP) for their data science and big data projects

Learn Live: Train & track ML models with MLflow in Microsoft Fabric | BRK405LL

As a data scientist, you want to experiment with different machine learning models to find the best one. Microsoft Fabric offers you a familiar notebook experience to perform your data science workloads, while integrating with MLflow to allow you to easily track and manage your models. This LIVE session is presented by two experts, and our moderators will answer your questions directly in the chat.

𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Christopher MANEU * Kinfey Lo * Tim Fish * Frederick Anaafi * Konstantin Berezovsky * Kimberly Murphy

𝗦𝗲𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

BRK405LL | English (US) | Data

MSIgnite

The need for an independent semantic layer continues to rise as data science gains traction in the enterprise. Its five primary elements—metrics, caching, metadata management, APIs, and access controls—support AI/ML use cases as part of data science projects. Published at: https://www.eckerson.com/articles/why-and-how-to-enable-data-science-with-an-independent-semantic-layer

Data Smart, 2nd Edition
book
by Jordan Goldmeier (Booz Allen Hamilton; The Perduco Group; EY; Excel TV; Wake Forest University; Anarchy Data)

Want to jump into data science but don't know where to start? Let's be real, data science is presented as something mystical and unattainable without the most powerful software, hardware, and data expertise. Real data science isn't about technology. It's about how you approach the problem. In this updated edition of Data Smart: Using Data Science to Transform Information into Insight, award-winning data scientist and bestselling author Jordan Goldmeier shows you how to implement data science problems using Excel while exposing how things work behind the scenes. Data Smart is your field guide to building statistics, machine learning, and powerful artificial intelligence concepts right inside your spreadsheet.

Inside you'll find:
* Four-color data visualizations that highlight and illustrate the concepts discussed in the book
* Tutorials explaining complicated data science using just Microsoft Excel
* How to take what you’ve learned and apply it to everyday problems at work and life
* Advice for using formulas, Power Query, and some of Excel's latest features to solve tough data problems
* Smart data science solutions for common business challenges
* Explanations of what algorithms do, how they work, and what you can tweak to take your Excel skills to the next level

Data Smart is a must-read for students, analysts, and managers ready to become data science savvy and share their findings with the world.

Python for Data Science For Dummies, 3rd Edition

Let Python do the heavy lifting for you as you analyze large datasets

Python for Data Science For Dummies lets you get your hands dirty with data using one of the top programming languages. This beginner’s guide takes you step by step through getting started, performing data analysis, understanding datasets and example code, working with Google Colab, sampling data, and beyond. Coding your data analysis tasks will make your life easier, make you more in-demand as an employee, and open the door to valuable knowledge and insights. This new edition is updated for the latest version of Python and includes current, relevant data examples.

* Get a firm background in the basics of Python coding for data analysis
* Learn about data science careers you can pursue with Python coding skills
* Integrate data analysis with multimedia and graphics
* Manage and organize data with cloud-based relational databases

Python careers are on the rise. Grab this user-friendly Dummies guide and gain the programming skills you need to become a data pro.

Poor data engineering is like building a shaky foundation for a house: it leads to unreliable information, wasted time and money, and even legal problems, making everything less dependable and more troublesome in our digital world. In the retail industry specifically, data engineering is particularly important for managing and analyzing large volumes of sales, inventory, and customer data, enabling better demand forecasting, inventory optimization, and personalized customer experiences. It helps retailers make informed decisions, streamline operations, and remain competitive in a rapidly evolving market. Insight and frameworks learned from data engineering practices can be applied to a multitude of people and problems, and in turn, learning from someone who has been at the forefront of data engineering is invaluable.

Mohammad Sabah is SVP of Engineering and Data at Thrive Market, and was appointed to this role in 2018. He joined the company from The Honest Company where he served as VP of Engineering & Chief Data Scientist. Sabah joined The Honest Company following its acquisition of Insnap, which he co-founded in 2015. Over the course of his career, Sabah has held various data science and engineering roles at companies including Facebook, Workday, Netflix, and Yahoo!

In the episode, Richie and Mo explore the importance of using AI to identify patterns and proactively address common errors, the use of tools like dbt and SODA for data pipeline abstraction and stakeholder involvement in data quality, data governance and data quality as foundations for strong data engineering, validation layers at each step of the data pipeline to ensure data quality, collaboration between data analysts and data engineers for holistic problem-solving and reusability of patterns, ownership mentality in data engineering and much more.

Links from the show: PagerDuty, Domo, OpsGene, Career Track: Data Engineer

If you dream of diving into the world of data, we explore the strategies and skills needed to follow the path to becoming a data scientist in 2024. Find out how to prepare for the opportunities ahead and master the world of data science!

In this episode of Data Hackers, the largest AI and Data Science community in Brazil, meet this pair of experts:

Mikaeri Ohana: AI and ML Lead at CI&T, content creator at Explica Mi, recognized by Google as a Google Developer Expert in ML and by Microsoft as a Microsoft Most Valuable Professional in AI, master's student at Unicamp, and founder of Escola Tesseract.
Nilton Ueda: Global Data Product Manager at AB-Inbev/Ambev, MBA professor at FIAP/MACKENZIE/IMPACTA/IBMEC, 3x LATAM Tableau Ambassador.

Remember that you can find all of the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in the post!

Meet our guests:

Mikaeri Ohana; Nilton Ueda

Data Hackers panel:

Paulo Vasconcellos; Monique Femme; Gabriel Lages

Reference links:

Take part in the State of Data survey: http://www.stateofdata.com.br/podcast
Where to find Mikaeri: Http://Instagram.com/explicami https://medium.com/@mikaeriohana https://www.linkedin.com/in/mikaeriohana
Where to find Nilton: https://www.linkedin.com/in/niltonkazuyukiueda/

podcast_episode
by Don Brown (UVA School of Data Science) , Bill Basener (UVA School of Data Science)

The latest episode of UVA Data Points features Don Brown, the senior associate dean for research at the School of Data Science, and professor Bill Basener as they discuss remote sensing, which is the process of collecting data about an object without contacting it.

The discussion traces the history of remote sensing, its many applications, and the challenges involved in gathering accurate information. The two take an in-depth look at Basener’s research, including his work with LiDAR and hyperspectral imaging.  Basener also explains the one aspect of this burgeoning technology that keeps him up at night.

Data Science: The Hard Parts

This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline: machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one. Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries.

With this book, you will:
* Understand how data science creates value
* Deliver compelling narratives to sell your data science project
* Build a business case using unit economics principles
* Create new features for an ML model using storytelling
* Learn how to decompose KPIs
* Perform growth decompositions to find root causes for changes in a metric

Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).

Today I’m joined by Marnix van de Stolpe, Product Owner at Coolblue in the area of data science. Throughout our conversation, Marnix shares the story of how he joined a data science team that was developing a solution that was too focused on the delivery of a data-science metric that was not on track to solve a clear customer problem. We discuss how Marnix came to the difficult decision to throw out 18 months of data science work, what it was like to switch to a human-centered, product approach, and the challenges that came with it. Marnix shares the impact this decision had on his team and the stakeholders involved, as well as the impact on his personal career and the advice he would give to others who find themselves in the same position. Marnix is also a Founding Member of the Data Product Leadership Community and will be going much more into the details and his experience live on Zoom on November 16 @ 2pm ET for members.

Highlights/ Skip to:

* I introduce Marnix, Product Owner at Coolblue and one of the original members of the Data Product Leadership Community (00:35)
* Marnix describes what Coolblue does and his role there (01:20)
* Why and how Marnix decided to throw away 18 months of machine learning work (02:51)
* How Marnix determined that the KPI (metric) being created wasn’t enough to deliver a valuable product (07:56)
* Marnix describes the conversation with his data science team on mapping the solution back to the desired outcome (11:57)
* What the culture is like at Coolblue now when developing data products (17:17)
* Marnix’s advice for data product managers who are coming into an environment where existing work is not tied to a desired outcome (18:43)
* Marnix and I discuss why data literacy is not the solution to making more impactful data products (21:00)
* The impact that Marnix’s human-centered approach to data product development has had on the stakeholders at Coolblue (24:54)
* Marnix shares the ultimate outcome of the product his team was developing to measure product returns (31:05)
* How you can get in touch with Marnix (33:45)

Links
* Coolblue: https://www.coolblue.nl
* LinkedIn: https://www.linkedin.com/in/marnixvdstolpe/

R Bioinformatics Cookbook - Second Edition

R Bioinformatics Cookbook is your guide to leveraging the power of R for advanced bioinformatics tasks. This updated second edition uses a recipe-based method to teach data analysis, visualization, and machine learning tailored for biological datasets. You'll gain hands-on experience with popular tools like Bioconductor, ggplot2, and tidyverse to solve real-world genomics problems.

What this Book will help me do
* Set up a reproducible bioinformatics analysis environment using R.
* Clean, analyze, and visualize biological data with R's powerful packages.
* Apply RNA-seq and ChIP-seq workflows to study genetic information effectively.
* Incorporate machine learning techniques into bioinformatics pipelines using R.
* Automate tasks and create professional-grade reports using functional programming and reporting tools.

Author(s)
The author, Dan MacLean, brings years of expertise in bioinformatics and computational biology. Known for clear explanations and practical approaches, they ensure the material is accessible yet challenging. With a strong focus on real-world applications, this book reflects their commitment to bridging bioinformatics and modern data science.

Who is it for?
This book is perfect for bioinformaticians, researchers, and data scientists with prior R experience. It's tailored for those looking to delve deeper into genomics, data visualization, and bioinformatics techniques. Intermediate knowledge of bioinformatics concepts and familiarity with R programming are assumed for readers to fully benefit from the content.

Kristen McGarry is a Principal Account Technical Lead for the Financial Services Market at IBM. Based in New York City, she engages daily with the largest financial institutions globally to identify business opportunities for innovation, accelerate time to value, and operationalize new solutions across software, hardware and services offerings.

02:57 An Intro to Kristen McGarry
04:36 Why IBM?
09:25 The Attraction of Data Science
11:51 A Day in the Life of an Account Technical Leader
13:30 Technical Sales versus Sales
15:05 Continuing to Innovate
19:09 Dealing with Wall Street
20:17 The Methodology
22:23 The How of Technical Sales
23:05 Continuous Learning
28:03 Management System
30:34 Wall Street Learnings
32:20 Biggest Challenge
33:08 The Data Challenge
34:22 Best Data Science Use Cases in Finance
36:14 What Do Clients Miss on AI?
38:09 Predictions

LinkedIn: https://www.linkedin.com/in/kristen-mcgarry/
Website: https://www.ibm.com/

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Driving toward data mesh at Rivian - Coalesce 2023

This session shares Rivian’s journey building an analytics ecosystem from scratch over the last two years, centered around dbt and dbt Cloud. Through this work, Rivian is driving towards a healthy data mesh that links data and developers across many domains at the company, to enable rapid growth and value as they ramp to produce their new fleet of EVs and keep the world adventurous.

Speakers: Will Bishop, Manager, Data Science Analytics, Rivian

Register for Coalesce at https://coalesce.getdbt.com

Notion’s blueprint for adapting data science models to changing sales processes - Coalesce 2023

Prioritizing the right sales opportunities is pivotal for any SaaS company's growth, but what happens after your initial success? Jessica Zhang, Data Science Manager at Notion, traces Notion's footsteps from its foundational days to its present-day lead scoring techniques. Learn how modern tools like dbt, Census, and Snowflake enable the Notion team to iterate quickly. More than a journey, this session is a lesson on evolving a data science model in response to changing business assumptions and fresh user insights.

Speakers: Jessica Zhang, Data Science Manager, Notion; Jeff Sloan, Sr. Data Community Advocate, Census

The Statistics and Machine Learning with R Workshop

This book guides readers through the essentials of applied statistics and machine learning using the R programming language. By delving into robust data processing techniques, visualization, and statistical modeling with R, you will develop skills to effectively analyze data and design predictive models. Each chapter includes hands-on exercises to reinforce the concepts in a practical, intuitive way.

What this Book will help me do
* Understand and apply key statistical concepts such as probability distributions and hypothesis testing to analyze data.
* Master foundational mathematical principles like linear algebra and calculus relevant to data science and machine learning.
* Develop proficiency in data manipulation and visualization using robust R libraries such as dplyr and ggplot2.
* Build predictive models through practical exercises and learn advanced concepts like Bayesian statistics and linear regression.
* Gain the practical knowledge needed to apply statistical and machine learning methodologies in real-world scenarios.

Author(s)
Liu Peng is an accomplished author with a strong academic and practical background in statistics and data science. Armed with extensive experience in applying R to real-world problems, he brings a blend of technical mastery and teaching expertise. His commitment is to transform complex concepts into accessible, enriching learning experiences for readers.

Who is it for?
This book is ideal for data scientists and analysts ranging from beginners to those at an intermediate level. It caters especially to those interested in practicing statistical modeling and learning R in depth. If you have basic familiarity with statistics and are looking to expand your data science capabilities using R, this book is well-suited for you.