Excel often gets unfair criticism from data practitioners. Many of us will remember a time when Excel was looked down upon: why would anyone use Excel when we have powerful tools like Python, R, SQL, or BI tools? Like it or not, though, Excel is here to stay, and there's a meme, bordering on reality, that Excel is carrying a large chunk of the world's GDP. But when it really comes down to it, can you do data science in Excel? Jordan Goldmeier is an entrepreneur, a consultant, a best-selling author of four books on data, and a digital nomad. He started his career as a data scientist in the defense industry at Booz Allen Hamilton and The Perduco Group, before moving into consultancy with EY and then teaching people how to use data at Excel TV, Wake Forest University, and now Anarchy Data. He also writes a newsletter called The Money Making Machine, and he's on a mission to create 100 entrepreneurs. In the episode, Adel and Jordan explore Excel in data science, Excel's popularity, use cases for Excel in data science, the impact of GenAI on Excel, Power Query and data transformation, advanced Excel features, Excel for prototyping and generating buy-in, the limitations of Excel and what other tools might emerge in its place, and much more.
Links Mentioned in the Show:
Data Smart: Using Data Science to Transform Information Into Insight by Jordan Goldmeier
[Webinar] Developing a Data Mindset: How to Think, Speak, and Understand Data
[Course] Data Analysis in Excel
Related Episode: Do Spreadsheets Need a Rethink? With Hjalmar Gislason, CEO of GRID
Rewatch sessions from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
Being able to present your analysis and convince your teammates to take action is a huge part of the job for any Data Analyst or Data Scientist. But for many of us, delivering effective presentations isn't something that comes naturally. Fortunately, everyone (including you) can improve their communication skills if they know what to focus on. In this session, we'll be sharing some of the best strategies and actionable advice to help you capture your audience, tell a story with your data, and, most importantly, drive impact for your organization. You'll leave with specific tips that you'll be able to use immediately to take your presentation game to the next level.
What You'll Learn:
Why most presentations flop and how you can succeed
How to stop sharing data and start telling stories instead
The scientific approach to getting your audience to listen
Register for free to be part of the next live session: https://bit.ly/3XB3A8b
About our guest: Christopher Chin is a techie turned leadership communication coach. He previously worked for Fortune 500 tech companies like Thermo Fisher Scientific, Humana, and Fannie Mae in the specialties of data journalism, data science, data visualization, and business intelligence. Each time, he saw extremely talented colleagues struggle to get the opportunities they deserved because they couldn't present, tell a story, and speak with confidence. Now he works as Founder & CEO of The Hidden Speaker, a training consultancy that puts tech professionals on the path to confident communication. He has returned to Fortune 500 companies to train their technical teams with highly specialized communication workshops, and has taught for companies and universities around the world. As a speaker, coach, and trainer, Christopher's work has helped thousands demonstrate leadership through communication, and he is passionate about convincing every introverted techie out there that they, too, can bring out their hidden speaker. Check out Christopher's free e-book + newsletter: The Ultimate Data Storytelling and Presentation Guide
Follow us on Socials: LinkedIn | YouTube | Instagram (Mavens of Data) | Instagram (Maven Analytics) | TikTok | Facebook | Medium | X/Twitter
Despite GPT, Claude, Gemini, Llama, and the host of other LLMs we have access to, a variety of organizations are still exploring their options when it comes to custom LLMs. Logging in to ChatGPT is easy enough, and so is creating a 'custom' OpenAI GPT, but what does it take to create a truly custom LLM? When and why might this be useful, and will it be worth the effort? Vincent Granville is a pioneer in the AI and machine learning space. He is Co-Founder of Data Science Central, Founder of MLTechniques.com, a former VC-funded executive, author, and patent owner. Vincent's corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences. Vincent has published in the Journal of Number Theory, the Journal of the Royal Statistical Society, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the author of multiple books, including "Synthetic Data and Generative AI". In the episode, Richie and Vincent explore why you might want to create a custom LLM, including issues with standard LLMs and the benefits of custom ones, the development and features of custom LLMs, architecture and technical details, corporate use cases, technical innovations, ethics and legal considerations, and much more.
Links Mentioned in the Show:
Read Articles by Vincent
Synthetic Data and Generative AI by Vincent Granville
Connect with Vincent on LinkedIn
[Course] Developing LLM Applications with LangChain
Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
Rewatch sessions from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
To explore the stages of a selection process in the data field and how to prepare for a strong interview, we debated with our guests the controversies surrounding the process, such as technical challenges and the use of AI platforms in hiring, and we also share tips and highlight the skills valued by recruiters and by technical interviewers.
In this episode of Data Hackers — the largest AI and Data Science community in Brazil — meet Renata Biaggi, Data Scientist & AI Solutions at Celfocus, and Luna Metz, Tech Recruiter Specialist.
Together, they share how to prepare for interviews and selection processes in data, along with remarkable stories from their own journeys.
Remember that you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcasts, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in this post!
Our Data Hackers panel:
Paulo Vasconcellos — Co-founder of Data Hackers and Principal Data Scientist at Hotmart
Monique Femme — Head of Community Management at Data Hackers
Gabriel Lages — Co-founder of Data Hackers and Data & Analytics Sr. Director at Hotmart
References:
Website: www.renatabiaggi.com
Instagram: @prof.renatabiaggi: www.instagram.com/prof.renatabiaggi
The DEFINITIVE method for passing Data Science interviews: https://www.youtube.com/watch?v=wF_zkr2vTz4&t=67s
The role of the data scientist is changing. Some organizations are splitting the role into more narrowly focused jobs, while others are broadening it. The latter approach, known as the Full Stack Data Scientist, is derived from the concept of a full stack software engineer, and the role often includes software engineering tasks. In particular, one of the key functions of a full stack data scientist is to take machine learning models and get them into production inside software. So, what separates projects from production? Savin Goyal is the Co-Founder & CTO at Outerbounds. In addition to his work at Outerbounds, Savin is the creator of the open-source machine learning management platform Metaflow. Previously, Savin worked as a Software Engineer at Netflix and LinkedIn. In the episode, Richie and Savin explore the definition of production in data science, steps to move from internal projects to production, the lifecycle of a machine learning project, success stories in data science, challenges in quality control, Metaflow, scalability and robustness in production, AI and MLOps, advice for organizations, and much more.
Links Mentioned in the Show:
Outerbounds
Metaflow
Connect with Savin on LinkedIn
[Course] Developing Machine Learning Models for Production
Related Episode: Why ML Projects Fail, and How to Ensure Success with Eric Siegel, Founder of Machine Learning Week, Former Columbia Professor, and Bestselling Author
Rewatch sessions from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
In this episode, we take a deep dive into data pipeline automation and its impact on operational efficiency. Discover how automation technologies are revolutionizing data management and boosting team productivity.
In this episode of Data Hackers — the largest AI and Data Science community in Brazil — meet Murilo Viveiros — Product Manager at BMC Software; Fabiana Delfino — Sr. Solution Engineer at BMC Software; and Luiz Pereira — Data Architecture Manager at Gerdau.
Remember that you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcasts, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in this post!
What we covered in the episode
Our Data Hackers panel:
Monique Femme — Head of Community Management at Data Hackers
Gabriel Lages — Co-founder of Data Hackers and Data & Analytics Sr. Director at Hotmart
References:
Data4all: https://ada.tech/sou-aluno/plataforma/gerdau-data4all
Learn about BMC: https://www.bmcsoftware.pt/
Open-house morning with a choice of courses in Data Analysis, Data Science, or Web Dev, hosted by the Le Wagon Paris team.
Enhance your data science programming and analysis with the Wolfram Language and Mathematica, an applied mathematical tools suite. This second edition introduces the latest Wolfram LLM capabilities, delves into the exploration of data types in Mathematica, covers key programming concepts, and includes code performance and debugging techniques for code optimization. You'll gain a deeper understanding of data science from a theoretical and practical perspective using Mathematica and the Wolfram Language. Learning this language improves your data science code because it is highly intuitive and comes with built-in functions that offer a welcoming experience to those coming from other programming languages. Existing topics have been reorganized for better context and to accommodate the introduction of Notebook styles. The book also incorporates new functionality from Wolfram Language versions 13 and 14 for imported and exported data. You'll see how to use Mathematica wherever data management and mathematical computation are needed. Along the way, you'll appreciate how Mathematica provides an entirely integrated platform: its symbolic and numerical calculations combine in a mixed syntax, allowing it to carry out various processes without superfluous lines of code. You'll learn to use its notebooks as a standard format, which also serves to create detailed reports of the processes carried out.
What You Will Learn:
Create datasets, work with data frames, and create tables
Import, export, analyze, and visualize data
Work with the Wolfram data repository
Build reports on the analysis
Use Mathematica for machine learning, with different algorithms, including linear, multiple, and logistic regression; decision trees; and data clustering
Who This Book Is For:
Data scientists who are new to using Wolfram and Mathematica as a programming language or tool. Readers should have some prior programming experience, but can be new to the Wolfram Language.
From data science to software engineering, Large Language Models (LLMs) have emerged as pivotal tools in shaping the future of programming. In this session, Michele Catasta, VP of AI at Replit, Jordan Tigani, CEO at MotherDuck, and Ryan J. Salva, VP of Product at GitHub, will explore practical applications of LLMs in coding workflows, how to best approach integrating AI into the workflows of data teams, what the future holds for AI-assisted coding, and a lot more.
Links Mentioned in the Show:
Rewatch Session from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
In his presentation, Elad will provide a novel take on Airflow, highlighting its versatility beyond conventional use for scheduled pipelines. He'll discuss its application as an on-demand tool for initiating and halting jobs, mainly in data science settings such as dataset enrichment and batch prediction via API calls, complete with real-time status tracking and alerts. The talk aims to encourage a fresh approach to Airflow utilization, but will also delve into the technical aspects of implementing DAG triggering and cancellation logic (sketched in code below).
What the audience will learn:
A real-life use case of leveraging Airflow capabilities beyond traditional pipeline scheduling, with innovative integration as the infrastructure for an ML platform
Triggering on-demand DAGs through the API
Cancelling running DAGs
A demonstration of an end-to-end ML pipeline utilizing AWS SageMaker for batch predictions
Some more Airflow best practices
Join us to learn from Wix's experience and best practices!
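To make the trigger-and-cancel pattern concrete, here is a minimal sketch against Airflow 2.x's stable REST API. This is not Wix's implementation: the base URL, credentials, and DAG name are placeholder assumptions, and cancellation is approximated by patching the run's state, which the "Modify a DAG run" endpoint supports in recent Airflow versions.

```python
import requests

AIRFLOW_API = "http://localhost:8080/api/v1"  # assumed webserver URL
AUTH = ("user", "pass")                       # assumed basic-auth credentials

def trigger_dag(dag_id: str, conf: dict) -> str:
    """Start an on-demand run of a DAG, passing runtime parameters via conf."""
    resp = requests.post(
        f"{AIRFLOW_API}/dags/{dag_id}/dagRuns",
        json={"conf": conf},
        auth=AUTH,
    )
    resp.raise_for_status()
    return resp.json()["dag_run_id"]

def cancel_dag_run(dag_id: str, dag_run_id: str) -> None:
    """Halt a running DAG by patching the run's state to 'failed'."""
    resp = requests.patch(
        f"{AIRFLOW_API}/dags/{dag_id}/dagRuns/{dag_run_id}",
        json={"state": "failed"},
        auth=AUTH,
    )
    resp.raise_for_status()

# Example: kick off a hypothetical enrichment DAG, then cancel it.
run_id = trigger_dag("dataset_enrichment", {"dataset": "users"})
cancel_dag_run("dataset_enrichment", run_id)
```

Status tracking fits the same pattern: a GET on the same dagRuns endpoint returns the run's current state, which is how real-time tracking and alerting can be layered on top.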
Jupyter Notebooks are widely used by data scientists and engineers to prototype and experiment with data. However, these engineers are often required to work with other data or platform engineers to productionize these experiments, due to the complexity of navigating infrastructure and systems. In this talk, we will deep dive into this PR, https://github.com/apache/airflow/pull/34840, and share how Airflow can be leveraged as a platform to execute notebook pipelines (Python, Scala, or Spark) in dynamic environments like Kubernetes for various heterogeneous use cases. We will demonstrate how data scientists can use a Jupyter extension to easily build and manage such pipelines, which are executed using Airflow, streamlining data science workflow development and supercharging productivity.
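The linked PR concerns executing notebooks in dynamic Kubernetes environments; as a simpler illustration of the underlying idea of notebooks as Airflow tasks, here is a hedged sketch using the community Papermill provider (Airflow 2.4+ style; the DAG name, notebook paths, and parameters are hypothetical):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.papermill.operators.papermill import PapermillOperator

with DAG(
    dag_id="notebook_pipeline",        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule=None,                     # run on demand rather than on a schedule
    catchup=False,
):
    # Execute a parameterized Jupyter notebook and save the executed copy.
    run_notebook = PapermillOperator(
        task_id="run_experiment",
        input_nb="/notebooks/experiment.ipynb",                # hypothetical path
        output_nb="/notebooks/out/experiment_{{ ds }}.ipynb",  # templated per run date
        parameters={"sample_size": 10000},                     # injected into the notebook
    )
```

Each run produces a fully executed notebook as an artifact, which is part of what makes this pattern attractive for data science workflows: the output doubles as a report of what actually ran.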
While Airflow is widely known for orchestrating and managing workflows, particularly in the context of data engineering, data science, ML (Machine Learning), and ETL (Extract, Transform, Load) processes, its flexibility and extensibility make it a highly versatile tool suitable for a variety of use cases beyond these domains. In fact, Cloudflare has publicly shared in the past an example of how Airflow was leveraged to build a system that automates datacenter expansions. In this talk, I will share a few more of our use cases beyond traditional data engineering, demonstrating Airflow's sophisticated capabilities for orchestrating a wide variety of complex workflows. I will also discuss how Airflow played a crucial role in building some of the highly successful autonomous systems at Cloudflare, from handling automated bare-metal server diagnostics and recovery at scale, to the Zero Touch Provisioning that is helping us accelerate the rollout of inference-optimized GPUs in 150+ cities across multiple countries.
Data science is expanding across industries at a rapid pace, and the companies first to adopt best practices will gain a significant advantage. To reap the benefits, decision makers need to have a confident understanding of data science and its application in their organization. This third edition delves into the latest advancements in AI, particularly focusing on large language models (LLMs), with clear distinctions made between AI and traditional data science, including AI's ability to emulate human decision-making. Author Stylianos Kampakis introduces you to the critical aspect of ethics in AI, an area of growing importance and scrutiny. The narrative examines the ethical considerations intrinsic to the development and deployment of AI technologies, including bias, fairness, transparency, and accountability. You'll be provided with the expertise and tools required to develop a solid data strategy that is continuously effective. Ethics and legal issues surrounding data collection and algorithmic bias are some common pitfalls that Kampakis helps you avoid, while guiding you on the path to building a thriving data science culture at your organization. This updated edition also includes plenty of case studies, tools for project assessment, and expanded content on hiring and managing data scientists. Data science is a language that everyone at a modern company should understand across departments. Friction in communication arises most often when management does not connect with what a data scientist is doing or how impactful data collection and storage can be for their organization. The Decision Maker's Handbook to Data Science bridges this gap and readies you for both the present and future of your workplace in this engaging, comprehensive guide.
What You Will Learn:
Integrate AI with other innovative technologies
Explore anticipated ethical, regulatory, and technical landscapes that will shape the future of AI and data science
Discover how to hire and manage data scientists
Build the right environment in order to make your organization data-driven
Who This Book Is For:
Startup founders, product managers, higher-level managers, and any other non-technical decision makers who are thinking of implementing data science in their organization and hiring data scientists. A secondary audience includes people looking for a soft introduction to the subject of data science.
The Data Product Management In Action podcast, brought to you by Soda and executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. In Season 01, Episode 004, it's time to meet host Nadiem von Heydebrand, CEO and Co-founder at Mindfuel. About our host Nadiem von Heydebrand: Nadiem is the CEO and Co-Founder of Mindfuel. In 2019, he merged his passion for data science with product management, becoming a thought leader in data product management. Nadiem is dedicated to demonstrating the true value contribution of data. With over a decade of experience in the data industry, Nadiem leverages his expertise to scale data platforms, implement data mesh concepts, and transform AI performance into business performance, delighting consumers at global organizations that include Volkswagen, Munich Re, Allianz, Red Bull, and Vorwerk. Connect with Nadiem on LinkedIn. All views and opinions expressed are those of the individuals and do not necessarily reflect those of their employers or anyone else. Join the conversation on LinkedIn.
The Data Product Management In Action podcast, brought to you by Soda and executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. In Season 01, Episode 005, host Nadiem von Heydebrand (CEO and Co-founder at Mindfuel) sits down with Clemence Chee (VP of Data and Analytics at Babbel). Clemence shares his journey, the unique challenges of data product management, and the critical role of creating tangible business value and return on investment. About our host Nadiem von Heydebrand: Nadiem is the CEO and Co-Founder of Mindfuel. In 2019, he merged his passion for data science with product management, becoming a thought leader in data product management. Nadiem is dedicated to demonstrating the true value contribution of data. With over a decade of experience in the data industry, Nadiem leverages his expertise to scale data platforms, implement data mesh concepts, and transform AI performance into business performance, delighting consumers at global organizations that include Volkswagen, Munich Re, Allianz, Red Bull, and Vorwerk. Connect with Nadiem on LinkedIn.
About our guest Clemence Chee: With over 10 years as a data and technology enthusiast, Clemence has extensive experience in Venture Development, Operations, and Business Intelligence. Prior to his current role as VP Data & Analytics at Babbel, he spent 7 years at HelloFresh as Global Senior Director of Data and has been fortunate to contribute to and build companies from ideation through pre-seed, Series A-D, IPO, and DAX40. Connect with Clemence on LinkedIn. All views and opinions expressed are those of the individuals and do not necessarily reflect those of their employers or anyone else. Join the conversation on LinkedIn #dataproductmanagementwednesday
Ben Shneiderman is a leading figure in the field of human-computer interaction (HCI). Having founded one of the oldest HCI research centers in the country at the University of Maryland in 1983, Shneiderman has been intently studying the design of computer technology and its use by humans. Currently, Ben is a Distinguished University Professor in the Department of Computer Science at the University of Maryland and is working on a new book on human-centered artificial intelligence.
I’m so excited to welcome this expert from the field of UX and design to today’s episode of Experiencing Data! Ben and I talked a lot about the complex intersection of human-centered design and AI systems.
In our chat, we covered:
Ben's career studying human-computer interaction and computer science. (0:30)
'Building a culture of safety': Creating and designing 'safe, reliable and trustworthy' AI systems. (3:55)
'Like zoning boards': Why Ben thinks we need independent oversight of privately created AI. (12:56)
'There's no such thing as an autonomous device': Designing human control into AI systems. (18:16)
A/B testing, usability testing and controlled experiments: The power of research in designing good user experiences. (21:08)
Designing 'comprehensible, predictable, and controllable' user interfaces for explainable AI systems and why explainable AI (XAI) matters. (30:34)
Ben's upcoming book on human-centered AI. (35:55)
Resources and Links:
People-Centered Internet: https://peoplecentered.net/
Designing the User Interface (one of Ben's earlier books): https://www.amazon.com/Designing-User-Interface-Human-Computer-Interaction/dp/013438038X
Bridging the Gap Between Ethics and Practice: https://doi.org/10.1145/3419764
Partnership on AI: https://www.partnershiponai.org/
AI Incident Database: https://www.partnershiponai.org/aiincidentdatabase/
University of Maryland Human-Computer Interaction Lab: https://hcil.umd.edu/
ACM Conference on Intelligent User Interfaces: https://iui.acm.org/2021/hcai_tutorial.html
Human-Computer Interaction Lab, University of Maryland, Annual Symposium: https://hcil.umd.edu/tutorial-human-centered-ai/
Ben on Twitter: https://twitter.com/benbendc
Quotes from Today’s Episode The world of AI has certainly grown and blossomed — it’s the hot topic everywhere you go. It’s the hot topic among businesses around the world — governments are launching agencies to monitor AI and are also making regulatory moves and rules. … People want explainable AI; they want responsible AI; they want safe, reliable, and trustworthy AI. They want a lot of things, but they’re not always sure how to get them. The world of human-computer interaction has a long history of giving people what they want, and what they need. That blending seems like a natural way for AI to grow and to accommodate the needs of real people who have real problems. And not only the methods for studying the users, but the rules, the principles, the guidelines for making it happen. So, that’s where the action is. Of course, what we really want from AI is to make our world a better place, and that’s a tall order, but we start by talking about the things that matter — the human values: human rights, access to justice, and the dignity of every person. We want to support individual goals, a person’s sense of self-efficacy — they can do what they need to in the world, their creativity, their responsibility, and their social connections; they want to reach out to people. So, those are the sort of high aspirational goals that become the hard work of figuring out how to build it. And that’s where we want to go. - Ben (2:05)
The software engineering teams creating AI systems have got real work to do. They need the right kind of workflows, engineering patterns, and Agile development methods that will work for AI. The AI world is different because it’s not just programming, but it also involves the use of data that’s used for training. The key distinction is that the data that drives the AI has to be the appropriate data, it has to be unbiased, it has to be fair, it has to be appropriate to the task at hand. And many people and many companies are coming to grips with how to manage that. This has become controversial, let’s say, in issues like granting parole, or mortgages, or hiring people. There was a controversy that Amazon ran into when its hiring algorithm favored men rather than women. There’s been bias in facial recognition algorithms, which were less accurate with people of color. That’s led to some real problems in the real world. And that’s where we have to make sure we do a much better job and the tools of human-computer interaction are very effective in building these better systems in testing and evaluating. - Ben (6:10)
Every company will tell you, “We do a really good job in checking out our AI systems.” That’s great. We want every company to do a really good job. But we also want independent oversight of somebody who’s outside the company — someone who knows the field, who’s looked at systems at other companies, and who can bring ideas and bring understanding of the dangers as well. These systems operate in an adversarial environment — there are malicious actors out there who are causing trouble. You need to understand what the dangers and threats are to the use of your system. You need to understand where the biases come from, what dangers are there, and where the software has failed in other places. You may know what happens in your company, but you can benefit by learning what happens outside your company, and that’s where independent oversight from accounting companies, from governmental regulators, and from other independent groups is so valuable. - Ben (15:04)
There’s no such thing as an autonomous device. Someone owns it; somebody’s responsible for it; someone starts it; someone stops it; someone fixes it; someone notices when it’s performing poorly. … Responsibility is a pretty key factor here. So, if there’s something going on, if a manager is deciding to use some AI system, what they need is a control panel, let them know: what’s happening? What’s it doing? What’s going wrong and what’s going right? That kind of supervisory autonomy is what I talk about, not full machine autonomy that’s hidden away and you never see it because that’s just head-in-the-sand thinking. What you want to do is expose the operation of a system, and where possible, give the stakeholders who are responsible for performance the right kind of control panel and the right kind of data. … Feedback is the breakfast of champions. And companies know that. They want to be able to measure the success stories, and they want to know their failures, so they can reduce them. The continuous improvement mantra is alive and well. We do want to keep tracking what’s going on and make sure it gets better. Every quarter. - Ben (19:41)
Google has had some issues regarding hiring in the AI research area, and so has Facebook with elections and the way that algorithms tend to become echo chambers. These companies — and this is not through heavy research — probably have the heaviest investment of user experience professionals within data science organizations. They have UX, ML-UX people, UX for AI people, they’re at the cutting edge. I see a lot more generalist designers in most other companies. Most of them are rather unfamiliar with any of this or what the ramifications are on the design work that they’re doing. But even these largest companies that have, probably, the biggest penetration into the most number of people out there are getting some of this really important stuff wrong. - Brian (26:36)
Explainability is a competitive advantage for an AI system. People will gravitate towards systems that they understand, that they feel in control of, that are predictable. So, the big discussion about explainable AI focuses on what's usually called post-hoc explanations, and Shapley, LIME, and other methods are usually tied to the post-hoc approach. That is, you use an AI model, you get a result and you say, "What happened?" Why was I denied a parole, or a mortgage, or a job? At that point, you want to get an explanation. Now, that idea is appealing, but I'm afraid I haven't seen too many success stories of that working. … I've been diving through this for years now, and I've been looking for examples of good user interfaces of post-hoc explanations. It took me a long time till I found one. The culture of AI model-building would be much bolstered by an infusion of thinking about what the user interface will be for these explanations. And even the DARPA's XAI—Explainable AI—project, which has 11 projects within it—has not really grappled with this in a good way about designing what it's going to look like. Show it to me. … There is another way. And the strategy is basically prevention. Let's prevent the user from getting confused and so they don't have to request an explanation. We walk them along, let the user walk through the step—this is like Amazon checkout process, seven-step process—and you know what's happened in each step, you can go back, you can explore, you can change things in each part of it. It's also what TurboTax does so well, in really complicated situations, and walks you through it. … You want to have a comprehensible, predictable, and controllable user interface that makes sense as you walk through each step. - Ben (31:13)
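For readers unfamiliar with what a post-hoc explanation looks like in practice, here is a minimal, illustrative sketch using the shap library. The model and dataset are stand-ins, not anything from the episode; the point is that the output is a raw vector of per-feature attributions, which still needs exactly the kind of user interface Ben is asking for.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Train a stand-in model on a public dataset.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Post-hoc explanation: Shapley-value attributions for one prediction.
# A sample of the data serves as the background/masker distribution.
explainer = shap.Explainer(model.predict, X.iloc[:100])  # model-agnostic
explanation = explainer(X.iloc[:1])

# Per-feature contributions for a single decision -- raw material, not a UI.
print(dict(zip(X.columns, explanation.values[0])))
```

The gap Ben describes is visible here: the library answers "which features moved this prediction," but turning those numbers into something a parole board or loan applicant can comprehend is a design problem the attribution method alone does not solve.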
The R programming language is a remarkably powerful tool for data analysis and visualization, but its steep learning curve can be intimidating for some. If you just want to automate repetitive tasks or visualize your data, without the need for complex math, R for the Rest of Us is for you. Inside you'll find a crash course in R, a quick tour of the RStudio programming environment, and a collection of real-world applications that you can put to use right away. You'll learn how to create informative visualizations, streamline report generation, and develop interactive websites—whether you're a seasoned R user or have never written a line of R code. You'll also learn how to:
Manipulate, clean, and parse your data with tidyverse packages like dplyr and tidyr to make data science operations more user-friendly
Create stunning and customized plots, graphs, and charts with ggplot2 to effectively communicate your data insights
Import geospatial data and write code to produce visually appealing maps automatically
Generate dynamic reports, presentations, and interactive websites with R Markdown and Quarto that seamlessly integrate code, text, and graphics
Develop custom functions and packages tailored to your specific needs, allowing you to extend R's functionality and automate complex tasks
Unlock a treasure trove of techniques to transform the way you work. With R for the Rest of Us, you'll discover the power of R to get stuff done. No advanced statistics degree required.
In this episode, we dive deep into the world of the data journey, data engineering, and analytics with specialists from Itaú. Learn about the data strategies shaping the bank's future, and hear what it's like to work at a company where intensive use of data is essential to making the best decisions.
In this episode of Data Hackers — the largest AI and Data Science community in Brazil — meet the people who play a crucial role in Itaú's data infrastructure:
Priscila Militão — Data Engineer at Itaú Unibanco
Vinicius Rio — Data Analyst at Itaú Unibanco
Thiago Panini — Analytics Engineer at Itaú Unibanco
Carlos Vaccáro — Analytics Engineering Manager at Itaú Unibanco
Get ready for an immersion in the future of data at Itaú and discover how these brilliant minds are shaping the global financial landscape with powerful insights.
Remember that you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcasts, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in this post!
Our Data Hackers panel:
Monique Femme — Head of Community Management at Data Hackers
Paulo Vasconcellos — Co-founder of Data Hackers and Principal Data Scientist at Hotmart
Gabriel Lages — Co-founder of Data Hackers and Data & Analytics Sr. Director at Hotmart
References:
Job posting | Data Battle — Data Engineering and Analytics: https://vemproitau.gupy.io/jobs/7312897
Itaú Quantum Computing episode: https://medium.com/data-hackers/o-que-%C3%A9-computa%C3%A7%C3%A3o-qu%C3%A2ntica-data-hackers-podcast-84-0389a3b299ab
Unlock the full potential of DuckDB with 'Getting Started with DuckDB,' your guide to mastering data analysis efficiently. By reading this book, you'll discover how to load, transform, and query data using DuckDB, leveraging its unique capabilities for processing large datasets. Gain hands-on experience with SQL, Python, and R to enhance your data science and engineering workflows.
What this book will help me do:
Effectively load and manage various types of data in DuckDB for seamless processing
Gain hands-on experience writing and optimizing SQL queries tailored for analytical tasks
Integrate DuckDB capabilities into Python and R workflows for streamlined data analysis
Understand DuckDB's optimizations and extensions for specialized data applications
Explore the broader ecosystem of data tools that complement DuckDB's capabilities
Author(s): Simon Aubury and Ned Letcher are seasoned experts in the field of data analytics and engineering. With extensive experience in using both SQL and programming languages like Python and R, they bring practical insights into the innovative uses of DuckDB. They have designed this book to provide a hands-on and approachable way to learn DuckDB, making complex concepts accessible.
Who is it for? This book is well suited for data analysts aiming to accelerate their data analysis workflows, data engineers looking for effective tools for data processing, and data scientists searching for a versatile library for scalable data manipulation. Prior exposure to SQL and programming in Python or R will help readers maximize their learning.
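As a taste of the workflow the book teaches, here is a minimal sketch of DuckDB's Python API querying a CSV file in place (the file name and columns are hypothetical):

```python
import duckdb

con = duckdb.connect()  # in-memory database; no server to set up

# Query the file directly with SQL -- no explicit load step required.
result = con.sql("""
    SELECT category, AVG(price) AS avg_price
    FROM read_csv_auto('products.csv')   -- hypothetical input file
    GROUP BY category
    ORDER BY avg_price DESC
""")

df = result.df()  # hand the result off to pandas for further analysis
print(df.head())
```

Being able to run analytical SQL over files and then drop straight into a pandas DataFrame is the kind of SQL-plus-Python integration the book's workflow chapters focus on.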
This episode features the second part of an engaging discussion between Raja Iqbal, Founder and CEO of Data Science Dojo, and Bob van Luijt, Co-founder and CEO of Weaviate, a prominent open-source vector database in the industry. Raja and Bob trace the evolution of AI over the years, the current LLM landscape, and its outlook for the future. They further dive deep into various LLM concepts such as RAG, fine-tuning, challenges in enterprise adoption, vector search, context windows, the potential of SLMs, the generative feedback loop, and more. Lastly, Raja and Bob explore Artificial General Intelligence (AGI) and whether it could become a reality in the near future. This episode is a must-watch for anyone interested in a comprehensive outlook on the current state and future trajectory of AI.