talk-data.com

Topic

LLM

Large Language Models (LLM)

nlp ai machine_learning

1405 tagged

Activity Trend: 158 peak/qtr (2020-Q1 to 2026-Q1)

Activities

1405 activities · Newest first

Let's dive into the fascinating world of computer vision with Carlos Melo, Computer Vision Engineer, who will guide us from the basic concepts to how computer vision models work and where they show up in our daily lives.

In this episode of Data Hackers, the largest AI and Data Science community in Brazil, meet Carlos Melo, Computer Vision Engineer, who also tackles controversial topics such as the prejudices and biases these technologies can propagate, and discusses how the arrival of Large Language Models (LLMs) may affect the future of computer vision.

Remember that you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in the post!

Our Data Hackers panel:

Paulo Vasconcellos: Co-founder of Data Hackers and Principal Data Scientist at Hotmart.
Monique Femme: Head of Community Management at Data Hackers.
Gabriel Lages: Co-founder of Data Hackers and Data & Analytics Sr. Director at Hotmart.

References:

Visit our Medium.

By now, many of us are convinced that generative AI chatbots like ChatGPT are useful at work. However, many executives are rightfully worried about the risks of having business and customer conversations recorded by AI chatbot platforms. Some privacy- and security-conscious organizations are going so far as to block these AI platforms completely. For organizations such as EY, a company that derives value from its intellectual property, leaders need to strike a balance between privacy and productivity.

John Thompson runs the department for the ideation, design, development, implementation, and use of innovative Generative AI, Traditional AI, and Causal AI solutions across all of EY's service lines, operating functions, and geographies, and for EY's clients. His team has built the world's largest secure, private LLM-based chat environment. John also runs the Marketing Sciences consultancy, advising clients on monetization strategies for data. He is the author of four books on data, including "Data for All" and "Causal Artificial Intelligence". Previously, he was the Global Head of AI at CSL Behring, an Adjunct Professor at Lake Forest Graduate School of Management, and an Executive Partner at Gartner.

In the episode, Richie and John explore the adoption of GenAI at EY, data privacy and security, GenAI use cases and productivity improvements, GenAI for decision making, causal AI and synthetic data, industry trends and predictions, and much more.

Links Mentioned in the Show:
Azure OpenAI
Causality by Judea Pearl
[Course] AI Ethics
Related Episode: Data & AI at Tesco with Venkat Raghavan, Director of Analytics and Science at Tesco
Catch John talking about AI Maturity this September
Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

Artificial Intelligence

Artificial Intelligence (AI) revolves around creating and utilizing intelligent machines through science and engineering. This book delves into the theory and practical applications of computer science methods that incorporate AI across many domains. It covers techniques such as Machine Learning (ML), Convolutional Neural Networks (CNN), Deep Learning (DL), and Large Language Models (LLM) to tackle complex issues and overcome various challenges.

Data Hackers News is on the air! The hottest topics of the week, with the top news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast!

Hit play and listen to this week's Data Hackers News now!

To keep up with everything happening in the data field, subscribe to the weekly newsletter:

https://www.datahackers.news/

Meet our Data Hackers News commentators:

Monique Femme

Stories/topics discussed:

Stack Overflow survey shows that developers are not afraid of losing their jobs to AI;

OpenAI announces its own search engine to compete with Google;

Twitter (X) will use your data to train AI

Podcast mentioned: Podcast Data Hackers #85 - Você deveria continuar aprendendo programação ? (Should you keep learning to program?)

Download the full State of Data Brazil report and the survey highlights:

https://stateofdata.datahackers.com.br/

State of Data Brazil 2023 data released on Kaggle;

Other Data Hackers channels:

Site, LinkedIn, Instagram, TikTok, YouTube

While you're at it, follow us on Spotify, Apple Podcasts, or your favorite podcast player!

In this podcast episode, we talked with Guillaume Lemaître about navigating scikit-learn and imbalanced-learn.

🔗 CONNECT WITH Guillaume Lemaître
LinkedIn - https://www.linkedin.com/in/guillaume-lemaitre-b9404939/
Twitter - https://x.com/glemaitre58
Github - https://github.com/glemaitre
Website - https://glemaitre.github.io/

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/

🔗 CONNECT WITH ALEXEY
Twitter - https://twitter.com/Al_Grigor
Linkedin - https://www.linkedin.com/in/agrigorev/

🎙 ABOUT THE PODCAST At DataTalksClub, we organize live podcasts that feature a diverse range of guests from the data field. Each podcast is a free-form conversation guided by a prepared set of questions, designed to learn about the guests’ career trajectories, life experiences, and practical advice. These insightful discussions draw on the expertise of data practitioners from various backgrounds.

We stream the podcasts on YouTube, where each session is also recorded and published on our channel, complete with timestamps, a transcript, and important links.

You can access all the podcast episodes here - https://datatalks.club/podcast.html

📚 Check our free online courses
ML Engineering course - http://mlzoomcamp.com
Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp
MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp
Analytics in Stock Markets - https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp
LLM course - https://github.com/DataTalksClub/llm-zoomcamp
Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html

👋🏼 GET IN TOUCH If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev

If you're a company and want to support us, contact at [email protected]

Retrieval-Augmented Generation (RAG) has become a popular method to address this issue, augmenting LLMs with an external knowledge base. However, implementing RAG introduces distinct challenges. In this presentation, Joanna will share practical insights into the challenges encountered while implementing RAG systems, alongside strategies for overcoming them. You'll be equipped with the tools and methodologies needed to navigate these challenges successfully.
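To make the retrieve-then-augment idea concrete, here is a minimal sketch in Python. It is not taken from the presentation: the tiny knowledge base, the function names (`retrieve`, `build_prompt`), and the bag-of-words cosine scoring are all illustrative assumptions, and a production system would more likely use learned embeddings and a vector index. The final prompt would then be sent to whichever LLM you use.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index far more text.
DOCUMENTS = [
    "Meta released Llama 3.1 as an open model.",
    "Retrieval-augmented generation adds retrieved passages to the prompt.",
    "DuckDB is an in-process analytical database.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Augment the question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("What is retrieval-augmented generation?", DOCUMENTS))
```

Even in a toy like this, the typical failure points are visible: retrieval quality depends entirely on the scoring function, and the model only ever sees the top `k` passages.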

In this presentation, we will delve into the key aspects of aligning Large Language Models (LLMs) and explore how to set up the necessary infrastructure to maintain a versatile alignment pipeline. Specifically, we will cover:
Incorporating LLMs into the data collection for supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to maximize efficiency.
Techniques for instilling desired behaviors in LLMs with the use of prompt tuning.
A cutting-edge workflow management approach, and how it facilitates rapid prototyping of highly intensive distributed training procedures.
This session is tailored for machine learning engineers who are deploying their LLMs and seeking to improve their models.

Meta has been at the absolute edge of the open-source AI ecosystem, and with the recent release of Llama 3.1, they have officially created the largest open-source model to date. So, what's the secret behind the performance gains of Llama 3.1? What will the future of open-source AI look like? Thomas Scialom is a Senior Staff Research Scientist (LLMs) at Meta AI, and is one of the co-creators of the Llama family of models. Prior to joining Meta, Thomas worked as a Teacher, Lecturer, Speaker and Quant Trading Researcher.

In the episode, Adel and Thomas explore Llama 405B, its new features and improved performance, the challenges in training LLMs, best practices for training LLMs, pre- and post-training processes, the future of LLMs and AI, open- vs closed-source models, the GenAI landscape, scalability of AI models, current research and future trends, and much more.

Links Mentioned in the Show:
Meta - Introducing Llama 3.1: Our most capable models to date
Download the Llama Models
[Course] Working with Llama 3
[Skill Track] Developing AI Applications
Related Episode: Creating Custom LLMs with Vincent Granville, Founder, CEO & Chief AI Scientist at GenAItechLab.com
Rewatch sessions from RADAR: AI Edition

Data Hackers News is on the air! The hottest topics of the week, with the top news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast!

Hit play and listen to this week's Data Hackers News now!

To keep up with everything happening in the data field, subscribe to the weekly newsletter:

https://www.datahackers.news/

Meet our Data Hackers News commentators:

Monique Femme;

Paulo Vasconcellos.

Stories/topics discussed:

OpenAI announces a new low-cost model: GPT-4o mini;

Meta suspends generative AI features in Brazil;

Alexa is losing Amazon billions.


Ready for more ideas about UX for AI and LLM applications in enterprise environments? In part 2 of my topic on UX considerations for LLMs, I explore how an LLM might be used for a fictitious use case at an insurance company—specifically, to help internal tools teams get rapid access to primary qualitative user research. (Yes, it’s a little “meta”, and I’m also trying to nudge you with this hypothetical example—no secret!) ;-) My goal with these episodes is to share questions you might want to ask yourself such that any use of an LLM is actually contributing to a positive UX outcome.

Join me as I cover the implications for design, the importance of foundational data quality, the balance between creative inspiration and factual accuracy, and the never-ending discussion of how we might handle hallucinations and errors posing as “facts”—all with a UX angle. At the end, I also share a personal story where I used an LLM to help me do some shopping for my favorite product: TRIP INSURANCE! (NOT!)

Highlights/ Skip to:

(1:05) I introduce a hypothetical internal LLM tool and what the goal of the tool is for the team who would use it
(5:31) Improving access to primary research findings for better UX
(10:19) What “quality data” means in a UX context
(12:18) When LLM accuracy maybe doesn’t matter as much
(14:03) How AI and LLMs are opening the door for fresh visioning work
(15:38) Brian’s overall take on LLMs inside enterprise software as of right now
(18:56) Final thoughts on UX design for LLMs, particularly in the enterprise
(20:25) My inspiration for these 2 episodes—and how I had to use ChatGPT to help me complete a purchase on a website that could have integrated this capability right into their website

Quotes from Today’s Episode

“If we accept that the goal of most product and user experience research is to accelerate the production of quality services, products, and experiences, the question is whether or not using an LLM for these types of questions is moving the needle in that direction at all. And secondly, are the potential downsides like hallucinations and occasional fabricated findings, is that all worth it? So, this is a design for AI problem.” - Brian T. O’Neill (8:09)

“What’s in our data? Can the right people change it when the LLM is wrong? The data product managers and AI leaders reading this or listening know that the not-so-secret path to the best AI is in the foundational data that the models are trained on. But what does the word quality mean from a product standpoint and a risk reduction one, as seen from an end-users’ perspective? Somebody who’s trying to get work done? This is a different type of quality measurement.” - Brian T. O’Neill (10:40)

“When we think about fact retrieval use cases in particular, how easily can product teams—internal or otherwise—and end-users understand the confidence of responses? When responses are wrong, how easily, if at all, can users and product teams update the model’s responses? Errors in large language models may be a significant design consideration when we design probabilistic solutions, and we no longer control what exactly our products and software are going to show to users. If bad UX can include leading people down the wrong path unknowingly, then AI is kind of like the team on the other side of the tug of war that we’re playing.” - Brian T. O’Neill (11:22)

“As somebody who writes a lot for my consulting business, and composes music in another, one of the hardest parts for creators can be the zero-to-one problem of getting started—the blank page—and this is a place where I think LLMs have great potential. But it also means we need to do the proper research to understand our audience, and when or where they’re doing truly generative or creative work—such that we can take a generative UX to the next level that goes beyond delivering banal and obviously derivative content.” - Brian T. O’Neill (13:31)

“One thing I actually like about the hype, investment, and excitement around GenAI and LLMs in the enterprise is that there is an opportunity for organizations here to do some fresh visioning work. And this is a place that designers and user experience professionals can help data teams as we bring design into the AI space.” - Brian T. O’Neill (14:04)

“If there was ever a time to do some new visioning work, I think now is one of those times. However, we need highly skilled design leaders to help facilitate this in order for this to be effective. Part of that skill is knowing who to include in exercises like this, and my perspective, one of those people, for sure, should be somebody who understands the data science side as well, not just the engineering perspective. And as I posited in my seminar that I teach, the AI and analytical data product teams probably need a fourth member. It’s a quartet and not a trio. And that quartet includes a data expert, as well as that engineering lead.” - Brian T. O’Neill (14:38)

Links

Perplexity.ai: https://perplexity.ai
Ideaflow: https://www.amazon.com/Ideaflow-Only-Business-Metric-Matters/dp/0593420586
My article that inspired this episode

Summary

Generative AI has rapidly gained adoption for numerous use cases. To support those applications, organizational data platforms need to add new features and data teams have increased responsibility. In this episode Lior Gavish, co-founder of Monte Carlo, discusses the various ways that data teams are evolving to support AI powered features and how they are incorporating AI into their work.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
Your host is Tobias Macey and today I'm interviewing Lior Gavish about the impact of AI on data engineers.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by clarifying what we are discussing when we say "AI"?
Previous generations of machine learning (e.g. deep learning, reinforcement learning, etc.) required new features in the data platform. What new demands is the current generation of AI introducing?
Generative AI also has the potential to be incorporated in the creation/execution of data pipelines. What are the risk/reward tradeoffs that you have seen in practice?
What are the areas where LLMs have proven useful/effective in data engineering?
Vector embeddings have rapidly become a ubiquitous data format as a result of the growth in retrieval augmented generation (RAG) for AI applications. What are the end-to-end operational requirements to support this use case effectively?
As with all data, the reliability and quality of the vectors will impact the viability of the AI application. What are the different failure modes/quality metrics/error conditions that they are subject to?
As much as vectors, vector databases, RAG, etc. seem exotic and new, it is all ultimately shades of the same work that we have been doing for years. What are the areas of overlap in the work required for running the current generation of AI, and what are the areas where it diverges?
What new skills do data teams need to acquire to be effective in supporting AI applications?
What are the most interesting, innovative, or unexpected ways that you have seen AI impact data engineering teams?
What are the most interesting, unexpected, or challenging lessons that you have learned while working with the current generation of AI?
When is AI the wrong choice?
What are your predictions for the future impact of AI on data engineering teams?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your

Links

Monte Carlo
Podcast Episode
NLP == Natural Language Processing
Large Language Models
Generative AI
MLOps
ML Engineer
Feature Store
Retrieval Augmented Generation (RAG)
Langchain

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

This special episode of DataFramed was made in collaboration with Analytics on Fire! Nowadays, the hype around generative AI is only the tip of the iceberg. There are so many ideas being touted as the next big thing that it’s difficult to keep up. More importantly, it’s challenging to discern which ideas will become the next ChatGPT and which will end up like the next NFT. How do we cut through the noise? Mico Yuk is the Community Manager at Acryl Data and Co-Founder at Data Storytelling Academy. Mico is also an SAP Mentor Alumni, and the Founder of the popular weblog Everything Xcelsius and the 'Xcelsius Gurus’ Network. She was named one of the Top 50 Analytics Bloggers to follow, as well as a highly regarded BI influencer and sought-after global keynote speaker in the Analytics ecosystem.

In the episode, Richie and Mico explore AI and productivity at work, the future of work and AI, GenAI and data roles, AI for training and learning, training at scale, decision intelligence, soft skills for data professionals, GenAI hype, and much more.

Links Mentioned in the Show:
Analytics on Fire Podcast
Data Visualization for Dummies by Mico Yuk and Stephanie Diamond
Connect with Mico
[Skill Track] AI Fundamentals
Related Episode: What to Expect from AI in 2024 with Craig S. Smith, Host of the Eye on A.I Podcast
Rewatch sessions from RADAR: AI Edition

Data Hackers News is on the air! The hottest topics of the week, with the top news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast!

Hit play and listen to this week's Data Hackers News now!

Meet our Data Hackers News commentators:

Monique Femme; Paulo Vasconcellos.

Stories/topics discussed:

State of Data Brazil 2023 data released;

Data Hackers News is on the air! The hottest topics of the week, with the top news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast!

Hit play and listen to this week's Data Hackers News now!

Meet our Data Hackers News commentators:

Monique Femme; Paulo Vasconcellos.

Stories/topics discussed:

State of Data Brazil 2023 data released;

Meta will launch a more powerful version of Llama on July 23; New OpenAI model is in production; Google's carbon emissions rise 50% due to AI.

Despite GPT, Claude, Gemini, Llama, and the host of other LLMs that we have access to, a variety of organizations are still exploring their options when it comes to custom LLMs. Logging in to ChatGPT is easy enough, and so is creating a 'custom' OpenAI GPT, but what does it take to create a truly custom LLM? When and why might this be useful, and will it be worth the effort? Vincent Granville is a pioneer in the AI and machine learning space. He is Co-Founder of Data Science Central, Founder of MLTechniques.com, former VC-funded executive, author, and patent owner. Vincent’s corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences. Vincent has published in the Journal of Number Theory, Journal of the Royal Statistical Society, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the author of multiple books, including “Synthetic Data and Generative AI”.

In the episode, Richie and Vincent explore why you might want to create a custom LLM, including issues with standard LLMs and benefits of custom LLMs, the development and features of custom LLMs, architecture and technical details, corporate use cases, technical innovations, ethics and legal considerations, and much more.

Links Mentioned in the Show:
Read Articles by Vincent
Synthetic Data and Generative AI by Vincent Granville
Connect with Vincent on Linkedin
[Course] Developing LLM Applications with LangChain
Related Episode: The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone
Rewatch sessions from RADAR: AI Edition

Links:

LinkedIn: https://www.linkedin.com/company/frontline100/
Ba Linh Le's LinkedIn: https://www.linkedin.com/in/ba-linh-le-/
Sabina's LinkedIn: https://www.linkedin.com/in/sabina-firtala/
Twitter: https://x.com/frontline_100?mx=2
Website: https://www.frontline100.com/

Free LLM course: https://github.com/DataTalksClub/llm-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style!

Bookmarklet Maker: Discover how to automate tasks with the Bookmarklet Maker, a tool for turning scripts into handy browser bookmarks.
RouteLLM Framework: Explore the RouteLLM framework by LMSys and Anyscale, designed to optimize the cost-performance ratio of LLM routers. Learn more about this collaboration at LMSys and Anyscale.
Q for SQL on CSV/TSV: Meet Q, a command-line tool that lets you run SQL queries directly on CSV or TSV files, simplifying data exploration from your terminal.
DuckDB Community Extensions: Check out the latest updates in DuckDB's community extensions and see how this database system is evolving.
Apple Intelligence and AI Maximalism: Explore Apple's AI strategy, their avoidance of chat UIs, risk management with OpenAI, and the shift of compute costs to users.
Being Glue: Delve into the challenges of being "Glue" at work. Explore why women are more likely to take on non-promotable work and how this affects career progression and workplace dynamics.
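For readers curious what "SQL on a CSV" looks like in practice, the idea can be sketched with Python's standard library alone. This is not the Q tool or its interface: the helper name `query_csv`, the table name `t`, and the sample data are assumptions made for illustration, and the sketch simply loads the CSV into an in-memory SQLite table before querying it.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = "name,team,score\nana,data,3\nbo,ml,5\ncy,data,4\n"


def query_csv(csv_text, sql, table="t"):
    """Load CSV text into an in-memory SQLite table and run one SQL query."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    try:
        # Column names come from the CSV header; values are stored as text.
        columns = ", ".join(f'"{name}"' for name in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    totals = query_csv(
        CSV_TEXT,
        "SELECT team, SUM(CAST(score AS INTEGER)) FROM t GROUP BY team ORDER BY team",
    )
    print(totals)
```

Because every value is loaded as text, numeric columns need an explicit `CAST` in the query; dedicated tools typically handle type detection for you.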

Let’s talk about design for AI (which more and more, I’m agreeing means GenAI to those outside the data space). The hype around GenAI and LLMs—particularly as it relates to dropping these in as features into a software application or product—seems to me, at this time, to largely be driven by FOMO rather than real value. In this “part 1” episode, I look at the importance of solid user experience design and outcome-oriented thinking when deploying LLMs into enterprise products. Challenges with immature AI UIs, the role of context, the constant game of understanding what accuracy means (and how much this matters), and the potential impact on human workers are also examined. Through a hypothetical scenario, I illustrate the complexities of using LLMs in practical applications, stressing the need for careful consideration of benchmarks and the acceptance of GenAI's risks. 

I also want to note that LLMs are a very immature space in terms of UI/UX design—even if the foundation models continue to mature at a rapid pace. As such, this episode is more about the questions and mindset I would be considering when integrating LLMs into enterprise software more than a suggestion of “best practices.” 

Highlights/ Skip to:

(1:15) Currently, many LLM feature initiatives seem to be mostly driven by FOMO
(2:45) UX Considerations for LLM-enhanced enterprise applications
(5:14) Challenges with LLM UIs / user interfaces
(7:24) Measuring improvement in UX outcomes with LLMs
(10:36) Accuracy in LLMs and its relevance in enterprise software
(11:28) Illustrating key considerations for implementing an LLM-based feature
(19:00) Leadership and context in AI deployment
(19:27) Determining UX benchmarks for using LLMs
(20:14) The dynamic nature of LLM hallucinations and how we design for the unknown
(21:16) Closing thoughts on Part 1 of designing for AI and LLMs

Quotes from Today’s Episode

“While many product teams continue to race to deploy some sort of GenAI and especially LLMs into their products—particularly this is in the tech sector for commercial software companies—the general sense I’m getting is that this is still more about FOMO than anything else.” - Brian T. O’Neill (2:07)

“No matter what the technology is, a good user experience design foundation starts with not doing any harm, and hopefully going beyond usable to be delightful. And adding LLM capabilities into a solution is really no different. So, we still need to have outcome-oriented thinking on both our product and design teams when deploying LLM capabilities into a solution. This is a cornerstone of good product work.” - Brian T. O’Neill (3:03)

“So, challenges with LLM UIs and UXs, right, user interfaces and experiences, the most obvious challenge to me right now with large language model interfaces is that while we’ve given users tremendous flexibility in the form of a Google search-like interface, we’ve also in many cases, limited the UX of these interactions to a text conversation with a machine. We’re back to the CLI in some ways.” - Brian T. O’Neill (5:14)

“Before and after we insert an LLM into a user’s workflow, we need to know what an improvement in their life or work actually means.” - Brian T. O’Neill (7:24)

“If it would take the machine a few seconds to process a result versus what might take a day for a worker, what’s the role and purpose of that worker going forward? I think these are all considerations that need to be made, particularly if you’re concerned about adoption, which a lot of data product leaders are.” - Brian T. O’Neill (10:17)

“So, there’s no right or wrong answer here. These are all range questions, and they’re leadership questions, and context really matters. They are important to ask, particularly when we have this risk of reacting to incorrect information that looks plausible and believable because of how these LLMs tend to respond to us with a positive sheen much of the time.” - Brian T. O’Neill (19:00)

Links

View Part 1 of my article on UI/UX design considerations for LLMs in enterprise applications:  https://designingforanalytics.com/resources/ui-ux-design-for-enterprise-llms-use-cases-and-considerations-for-data-and-product-leaders-in-2024-part-1/