In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors.

Florian explains the technical challenges of building a recommender system for cultural heritage materials, including dealing with sparse user-item interaction matrices, the cold start problem, and the need for multi-modal similarity approaches that can handle text, images, metadata, and historical context. The platform leverages various embedding techniques and gives users control over weighting different modalities—whether they're searching based on text similarity, visual imagery, or diplomatic features like issuers and receivers.

A key insight from Florian's research is the importance of balancing serendipity with utility, collection representation to prevent bias, and system explainability while maintaining effectiveness. The discussion also touches on unique evaluation challenges in non-commercial recommendation contexts, including Florian's "research funnel" framework that considers discovery, interaction, integration, and impact stages.

Looking ahead, Florian envisions recommendation systems becoming standard tools for exploration across digital archives and cultural heritage repositories throughout Europe, potentially transforming how researchers discover and engage with historical materials.
The new version of Monasterium.net, set to launch with enhanced semantic search and recommendation features, represents an important step toward making cultural heritage more accessible and discoverable for everyone.
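The user-controlled weighting of modalities described above can be sketched as a weighted blend of per-modality similarity scores. This is an illustrative sketch, not Monasterium.net's actual implementation; the modality names and scores are hypothetical:

```python
def combined_score(similarities, weights):
    """Blend per-modality similarity scores with user-chosen
    weights, as a normalized weighted average."""
    total = sum(weights.values())
    return sum(similarities[m] * w for m, w in weights.items()) / total

# Hypothetical scores for one candidate charter vs. the query charter
sims = {"text": 0.82, "image": 0.40, "metadata": 0.90}

# A user who weights textual similarity most heavily
score = combined_score(sims, {"text": 3, "image": 1, "metadata": 1})  # ≈ 0.752
```

In a real system the per-modality scores would come from cosine similarity over embeddings; the point here is only that exposing the weights to the user is what turns a fixed ranking into an exploratory tool.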
In this episode of Data Unchained, host Molly Presley sits down with Paul Lekas, Head of Global Public Policy for the Software and Information Industry Association (SIIA), to explore the future of data, AI, and public policy. From privacy legislation to the challenges of AI training data, Paul offers a unique perspective on how industry, government, and civil society must work together to build a trustworthy data ecosystem. You can find out more about Paul and SIIA by visiting their website: https://www.siia.net/ Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic Music promoted by https://www.free-stock-music.com Creative Commons Attribution 3.0 Unported License https://creativecommons.org/licenses/by/3.0/deed.en_US
Hosted on Acast. See acast.com/privacy for more information.
Chris Mohr, President of the Software & Information Industry Association (SIIA), joins us to unpack the legal and policy challenges shaping the future of data, AI, and digital information. Discover how companies, policymakers, and innovators can prepare for an era where AI regulation, copyright liability, and privacy standards are evolving faster than ever. If you’re a CIO, CTO, or business leader navigating decentralized data, compliance, and digital transformation, this episode will give you the insights you need to stay ahead of the curve. Be sure to check out Chris's podcast The Business of Information: https://www.siia.net/the-business-of-information/ You can find out more about Chris and SIIA by visiting their website: https://www.siia.net/ Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic Music promoted by https://www.free-stock-music.com Creative Commons Attribution 3.0 Unported License https://creativecommons.org/licenses/by/3.0/deed.en_US
#AI #ArtificialIntelligence #Copyright #DataPrivacy #AIRegulation #TechnologyPolicy #DigitalTransformation #Section230 #DataUnchained #TechPodcast #CloudData #DecentralizedData #CIO #CTO #SIIA
Hosted on Acast. See acast.com/privacy for more information.
Supported by Our Partners • Statsig — The unified platform for flags, analytics, experiments, and more. • Sinch — Connect with customers at every step of their journey. • Modal — The cloud platform for building AI applications. — How has Microsoft changed since its founding in 1975, especially in how it builds tools for developers? In this episode of The Pragmatic Engineer, I sit down with Scott Guthrie, Executive Vice President of Cloud and AI at Microsoft. Scott has been with the company for 28 years. He built the first prototype of ASP.NET, led the Windows Phone team, led Azure, and helped shape many of Microsoft’s most important developer platforms. We talk about Microsoft’s journey from building early dev tools to becoming a top cloud provider—and how it actively worked to win back and grow its developer base. In this episode, we cover: • Microsoft’s early years building developer tools • Why Visual Basic faced resistance from devs back in the day, even though it simplified development at the time • How .NET helped bring a new generation of server-side developers into Microsoft’s ecosystem • Why Windows Phone didn’t succeed • The 90s Microsoft dev stack: docs, debuggers, and more • How Microsoft Azure went from being the #7 cloud provider to the #2 spot today • Why Microsoft created VS Code • How VS Code and open source led to the acquisition of GitHub • What Scott’s excited about in the future of developer tools and AI • And much more!
— Timestamps (00:00) Intro (02:25) Microsoft’s early years building developer tools (06:15) How Microsoft’s developer tools helped Windows succeed (08:00) Microsoft’s first tools were built to allow less technically savvy people to build things (11:00) A case for embracing the technology that’s coming (14:11) Why Microsoft built Visual Studio and .NET (19:54) Steve Ballmer’s speech about .NET (22:04) The origins of C# and Anders Hejlsberg’s impact on Microsoft (25:29) The 90’s Microsoft stack, including documentation, debuggers, and more (30:17) How productivity has changed over the past 10 years (32:50) Why Gergely was a fan of Windows Phone—and Scott’s thoughts on why it didn’t last (36:43) Lessons from working on (and fixing) Azure under Satya Nadella (42:50) Codeplex and the acquisition of GitHub (48:52) 2014: Three bold projects to win the hearts of developers (55:40) What Scott’s excited about in new developer tools and cloud computing (59:50) Why Scott thinks AI will enhance productivity but create more engineering jobs — The Pragmatic Engineer deepdives relevant for this episode: • Microsoft is dogfooding AI dev tools’ future • Microsoft’s developer tools roots • Why are Cloud Development Environments spiking in popularity, now? • Engineering career paths at Big Tech and scaleups • How Linux is built with Greg Kroah-Hartman — See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
The data through January tell a story of a moderating expansion, but the ongoing US policy turmoil is weighing on February sentiment readings. Absent a détente in the US war on trade and other domestic austerity measures, we put the risk of recession this year at 40%. Next week’s Fed meeting should aim to not make waves.
Speakers:
Bruce Kasman
Joseph Lupton
This podcast was recorded on 14 March 2025.
See the Daily consumer spending tracker (https://jpmm-internal.jpmchase.net/research/open/latest/publication/9002054) for disclaimers and methodology for Chase Card Tracker data.
This communication is provided for information purposes only. Institutional clients please visit www.jpmm.com/research/disclosures for important disclosures. © 2025 JPMorgan Chase & Co. All rights reserved. This material or any portion hereof may not be reprinted, sold or redistributed without the written consent of J.P. Morgan. It is strictly prohibited to use or share without prior written consent from J.P. Morgan any research material received from J.P. Morgan or an authorized third-party (“J.P. Morgan Data”) in any third-party artificial intelligence (“AI”) systems or models when such J.P. Morgan Data is accessible by a third-party. It is permissible to use J.P. Morgan Data for internal business purposes only in an AI system or model that protects the confidentiality of J.P. Morgan Data so as to prevent any and all access to or use of such J.P. Morgan Data by any third-party.
The first episode of The Pragmatic Engineer Podcast is out. Expect similar episodes every other Wednesday. You can add the podcast in your favorite podcast player, and have future episodes downloaded automatically. Listen now on Apple, Spotify, and YouTube. Brought to you by: • Codeium: Join the 700K+ developers using the IT-approved AI-powered code assistant. • TLDR: Keep up with tech in 5 minutes — On the first episode of the Pragmatic Engineer Podcast, I am joined by Simon Willison. Simon is one of the best-known software engineers experimenting with LLMs to boost his own productivity: he’s been doing this for more than three years, blogging about it in the open. Simon is the creator of Datasette, an open-source tool for exploring and publishing data. He works full-time developing open-source tools for data journalism, centered on Datasette and SQLite. Previously, he was an engineering director at Eventbrite, joining through the acquisition of Lanyrd, a Y Combinator startup he co-founded in 2010. Simon is also a co-creator of the Django Web Framework. He has been blogging about web development since the early 2000s. In today’s conversation, we dive deep into the realm of Gen AI and talk about the following: • Simon’s initial experiments with LLMs and coding tools • Why fine-tuning is generally a waste of time—and when it’s not • RAG: an overview • Interacting with GPT’s voice mode • Simon’s day-to-day LLM stack • Common misconceptions about LLMs and ethical gray areas • How Simon’s productivity has increased and his generally optimistic view on these tools • Tips, tricks, and hacks for interacting with GenAI tools • And more! I hope you enjoy this episode. — In this episode, we cover: (02:15) Welcome (05:28) Simon’s ‘scary’ experience with ChatGPT (10:58) Simon’s initial experiments with LLMs and coding tools (12:21) The languages that LLMs excel at (14:50) To start LLMs by understanding the theory, or by playing around?
(16:35) Fine-tuning: what it is, and why it’s mostly a waste of time (18:03) Where fine-tuning works (18:31) RAG: an explanation (21:34) The expense of running testing on AI (23:15) Simon’s current AI stack (29:55) Common misconceptions about using LLM tools (30:09) Simon’s stack – continued (32:51) Learnings from running local models (33:56) The impact of Firebug and the introduction of open-source (39:42) How Simon’s productivity has increased using LLM tools (41:55) Why most people should limit themselves to 3-4 programming languages (45:18) Addressing ethical issues and resistance to using generative AI (49:11) Are LLMs plateauing? Is AGI overhyped? (55:45) Coding vs. professional coding, looking ahead (57:27) The importance of systems thinking for software engineers (1:01:00) Simon’s advice for experienced engineers (1:06:29) Rapid-fire questions — Where to find Simon Willison: • X: https://x.com/simonw • LinkedIn: https://www.linkedin.com/in/simonwillison/ • Website: https://simonwillison.net/ • Mastodon: https://fedi.simonwillison.net/@simon — Referenced: • Simon’s LLM project: https://github.com/simonw/llm • Jeremy Howard’s fast.ai: https://www.fast.ai/ • jq programming language: https://en.wikipedia.org/wiki/Jq_(programming_language) • Datasette: https://datasette.io/ • GPT Code Interpreter: https://platform.openai.com/docs/assistants/tools/code-interpreter • OpenAI Playground: https://platform.openai.com/playground/chat • Advent of Code: https://adventofcode.com/ • Rust programming language: https://www.rust-lang.org/ • Applied AI Software Engineering: RAG: https://newsletter.pragmaticengineer.com/p/rag • Claude: https://claude.ai/ • Claude 3.5 Sonnet: https://www.anthropic.com/news/claude-3-5-sonnet • ChatGPT can now see, hear, and speak: https://openai.com/index/chatgpt-can-now-see-hear-and-speak/ • GitHub Copilot: https://github.com/features/copilot • What are Artifacts and how do I use them?:
https://support.anthropic.com/en/articles/9487310-what-are-artifacts-and-how-do-i-use-them • Large Language Models on the command line: https://simonwillison.net/2024/Jun/17/cli-language-models/ • Llama: https://www.llama.com/ • MLC chat on the app store: https://apps.apple.com/us/app/mlc-chat/id6448482937 • Firebug: https://en.wikipedia.org/wiki/Firebug_(software) • NPM: https://www.npmjs.com/ • Django: https://www.djangoproject.com/ • Sourceforge: https://sourceforge.net/ • CPAN: https://www.cpan.org/ • OOP: https://en.wikipedia.org/wiki/Object-oriented_programming • Prolog: https://en.wikipedia.org/wiki/Prolog • SML: https://en.wikipedia.org/wiki/Standard_ML • Stable Diffusion: https://stability.ai/ • Chain of thought prompting: https://www.promptingguide.ai/techniques/cot • Cognition AI: https://www.cognition.ai/ • In the Race to Artificial General Intelligence, Where’s the Finish Line?: https://www.scientificamerican.com/article/what-does-artificial-general-intelligence-actually-mean/ • Black swan theory: https://en.wikipedia.org/wiki/Black_swan_theory • Copilot workspace: https://githubnext.com/projects/copilot-workspace • Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems: https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321 • Bluesky Global: https://www.blueskyglobal.org/ • The Atrocity Archives (Laundry Files #1): https://www.amazon.com/Atrocity-Archives-Laundry-Files/dp/0441013651 • Rivers of London: https://www.amazon.com/Rivers-London-Ben-Aaronovitch/dp/1625676158/ • Vanilla JavaScript: http://vanilla-js.com/ • jQuery: https://jquery.com/ • Fly.io: https://fly.io/ — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
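As a companion to the RAG discussion in the episode: the core pattern is to retrieve the documents most relevant to a query and prepend them to the prompt. A minimal sketch with hypothetical toy documents, using word overlap as a stand-in for the embedding similarity a real system would use:

```python
def tokenize(text):
    """Lowercase and strip simple punctuation before splitting."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, documents, k=2):
    """Score documents by word overlap with the query and return the
    top-k — a toy stand-in for embedding-based similarity search."""
    q = tokenize(query)
    return sorted(documents, key=lambda d: -len(q & tokenize(d)))[:k]

def build_prompt(query, documents):
    """Prepend the retrieved context to the question — the core RAG pattern."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Datasette is a tool for exploring and publishing data.",
    "Django is a Python web framework.",
    "SQLite is an embedded relational database.",
]
prompt = build_prompt("What is Datasette?", docs)
```

The prompt would then go to the LLM unchanged; as the episode notes, this is usually far cheaper and more flexible than fine-tuning, because updating knowledge only means updating the document store.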
Ben Shneiderman is a leading figure in the field of human-computer interaction (HCI). Having founded one of the oldest HCI research centers in the country at the University of Maryland in 1983, Shneiderman has been intently studying the design of computer technology and its use by humans. Currently, Ben is a Distinguished University Professor in the Department of Computer Science at the University of Maryland and is working on a new book on human-centered artificial intelligence.
I’m so excited to welcome this expert from the field of UX and design to today’s episode of Experiencing Data! Ben and I talked a lot about the complex intersection of human-centered design and AI systems.
In our chat, we covered:
Ben's career studying human-computer interaction and computer science. (0:30)
'Building a culture of safety': Creating and designing ‘safe, reliable and trustworthy’ AI systems. (3:55)
'Like zoning boards': Why Ben thinks we need independent oversight of privately created AI. (12:56)
'There’s no such thing as an autonomous device': Designing human control into AI systems. (18:16)
A/B testing, usability testing and controlled experiments: The power of research in designing good user experiences. (21:08)
Designing ‘comprehensible, predictable, and controllable’ user interfaces for explainable AI systems and why [explainable] XAI matters. (30:34)
Ben's upcoming book on human-centered AI. (35:55)
Resources and Links:
People-Centered Internet: https://peoplecentered.net/
Designing the User Interface (one of Ben’s earlier books): https://www.amazon.com/Designing-User-Interface-Human-Computer-Interaction/dp/013438038X
Bridging the Gap Between Ethics and Practice: https://doi.org/10.1145/3419764
Partnership on AI: https://www.partnershiponai.org/
AI incident database: https://www.partnershiponai.org/aiincidentdatabase/
University of Maryland Human-Computer Interaction Lab: https://hcil.umd.edu/
ACM Conference on Intelligent User Interfaces: https://iui.acm.org/2021/hcai_tutorial.html
Human-Computer Interaction Lab, University of Maryland, Annual Symposium: https://hcil.umd.edu/tutorial-human-centered-ai/
Ben on Twitter: https://twitter.com/benbendc
Quotes from Today’s Episode
The world of AI has certainly grown and blossomed — it’s the hot topic everywhere you go. It’s the hot topic among businesses around the world — governments are launching agencies to monitor AI and are also making regulatory moves and rules. … People want explainable AI; they want responsible AI; they want safe, reliable, and trustworthy AI. They want a lot of things, but they’re not always sure how to get them. The world of human-computer interaction has a long history of giving people what they want, and what they need. That blending seems like a natural way for AI to grow and to accommodate the needs of real people who have real problems. And not only the methods for studying the users, but the rules, the principles, the guidelines for making it happen. So, that’s where the action is. Of course, what we really want from AI is to make our world a better place, and that’s a tall order, but we start by talking about the things that matter — the human values: human rights, access to justice, and the dignity of every person. We want to support individual goals, a person’s sense of self-efficacy — they can do what they need to in the world, their creativity, their responsibility, and their social connections; they want to reach out to people. So, those are the sort of high aspirational goals that become the hard work of figuring out how to build it. And that’s where we want to go. - Ben (2:05)
The software engineering teams creating AI systems have got real work to do. They need the right kind of workflows, engineering patterns, and Agile development methods that will work for AI. The AI world is different because it’s not just programming, but it also involves the use of data that’s used for training. The key distinction is that the data that drives the AI has to be the appropriate data, it has to be unbiased, it has to be fair, it has to be appropriate to the task at hand. And many people and many companies are coming to grips with how to manage that. This has become controversial, let’s say, in issues like granting parole, or mortgages, or hiring people. There was a controversy that Amazon ran into when its hiring algorithm favored men rather than women. There’s been bias in facial recognition algorithms, which were less accurate with people of color. That’s led to some real problems in the real world. And that’s where we have to make sure we do a much better job and the tools of human-computer interaction are very effective in building these better systems in testing and evaluating. - Ben (6:10)
Every company will tell you, “We do a really good job in checking out our AI systems.” That’s great. We want every company to do a really good job. But we also want independent oversight of somebody who’s outside the company — someone who knows the field, who’s looked at systems at other companies, and who can bring ideas and bring understanding of the dangers as well. These systems operate in an adversarial environment — there are malicious actors out there who are causing trouble. You need to understand what the dangers and threats are to the use of your system. You need to understand where the biases come from, what dangers are there, and where the software has failed in other places. You may know what happens in your company, but you can benefit by learning what happens outside your company, and that’s where independent oversight from accounting companies, from governmental regulators, and from other independent groups is so valuable. - Ben (15:04)
There’s no such thing as an autonomous device. Someone owns it; somebody’s responsible for it; someone starts it; someone stops it; someone fixes it; someone notices when it’s performing poorly. … Responsibility is a pretty key factor here. So, if there’s something going on, if a manager is deciding to use some AI system, what they need is a control panel, let them know: what’s happening? What’s it doing? What’s going wrong and what’s going right? That kind of supervisory autonomy is what I talk about, not full machine autonomy that’s hidden away and you never see it because that’s just head-in-the-sand thinking. What you want to do is expose the operation of a system, and where possible, give the stakeholders who are responsible for performance the right kind of control panel and the right kind of data. … Feedback is the breakfast of champions. And companies know that. They want to be able to measure the success stories, and they want to know their failures, so they can reduce them. The continuous improvement mantra is alive and well. We do want to keep tracking what’s going on and make sure it gets better. Every quarter. - Ben (19:41)
Google has had some issues regarding hiring in the AI research area, and so has Facebook with elections and the way that algorithms tend to become echo chambers. These companies — and this is not through heavy research — probably have the heaviest investment of user experience professionals within data science organizations. They have UX, ML-UX people, UX for AI people, they’re at the cutting edge. I see a lot more generalist designers in most other companies. Most of them are rather unfamiliar with any of this or what the ramifications are on the design work that they’re doing. But even these largest companies that have, probably, the biggest penetration into the most number of people out there are getting some of this really important stuff wrong. - Brian (26:36)
Explainability is a competitive advantage for an AI system. People will gravitate towards systems that they understand, that they feel in control of, that are predictable. So, the big discussion about explainable AI focuses on what’s usually called post-hoc explanations, and the Shapley, and LIME, and other methods are usually tied to the post-hoc approach. That is, you use an AI model, you get a result and you say, “What happened?” Why was I denied a parole, or a mortgage, or a job? At that point, you want to get an explanation. Now, that idea is appealing, but I’m afraid I haven’t seen too many success stories of that working. … I’ve been diving through this for years now, and I’ve been looking for examples of good user interfaces of post-hoc explanations. It took me a long time till I found one. The culture of AI model-building would be much bolstered by an infusion of thinking about what the user interface will be for these explanations. And even DARPA’s XAI—Explainable AI—project, which has 11 projects within it, has not really grappled with this in a good way about designing what it’s going to look like. Show it to me. … There is another way. And the strategy is basically prevention. Let’s prevent the user from getting confused so they don’t have to request an explanation. We walk them along, let the user walk through the steps—this is like Amazon’s checkout process, a seven-step process—and you know what’s happened in each step, you can go back, you can explore, you can change things in each part of it. It’s also what TurboTax does so well, in really complicated situations, and walks you through it. … You want to have a comprehensible, predictable, and controllable user interface that makes sense as you walk through each step. - Ben (31:13)
GenAI in Marketing. Making Data Simple welcomes Michael Cohen, Chief Data Analytics Officer and ML and AI product and marketing expert in consumer data technologies: marketing operations, automated decision activation, measurement and analytics, info security and privacy.

01:15 Meeting Michael Cohen
03:33 The Plus Company
08:06 Traditional Approaches to Marketing
12:03 The Future of Marketing
17:31 Data Augmentation's Role
24:46 Data Inputs
26:18 The AIOS Product
31:39 Algorithms
34:03 2 Min Plus Pitch
41:13 Aggressive Innovation Roadmaps
44:44 Next Marketing Disruption
46:33 For Fun

LinkedIn: www.linkedin.com/in/macohen1/
Website: www.macohen.net, https://pluscompany.com

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.
We talked about:
Rob’s background
Going from software engineering to Bayesian modeling
Frequentist vs Bayesian modeling approach
About integrals
Probabilistic programming and samplers
MCMC and Hakaru
Language vs library
Encoding dependencies and relationships into a model
Stan, HMC (Hamiltonian Monte Carlo), and NUTS
Sources for learning about Bayesian modeling
Reaching out to Rob
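The MCMC samplers mentioned above (random-walk Metropolis at the simple end, HMC and NUTS at the sophisticated end) share one idea: propose a move, then accept or reject it based on the target density. A minimal random-walk Metropolis sketch in plain Python — not from the episode, and the standard-normal target is just for illustration:

```python
import math
import random

def metropolis(logp, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: propose x' = x + eps, accept with
    probability min(1, p(x') / p(x)); otherwise keep the current x."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        if math.log(rng.random()) < logp(proposal) - logp(x):
            x = proposal
        samples.append(x)
    return samples

# Toy target: a standard normal, via its log-density (up to a constant)
def log_post(x):
    return -0.5 * x * x

draws = metropolis(log_post, x0=0.0, n_samples=20000)
mean = sum(draws) / len(draws)  # should be near 0
```

HMC and NUTS replace the blind Gaussian proposal with gradient-informed trajectories, which is why Stan scales to high-dimensional posteriors where this random walk would crawl.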
Links:
Book 1: https://bayesiancomputationbook.com/welcome.html
Book/Course: https://xcelab.net/rm/statistical-rethinking/
Free ML Engineering course: http://mlzoomcamp.com
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
When did anyone ever teach you what you need to know BEFORE, during, and after taking out a loan? Never, right? 😳 Well, this episode is a guide for you. We talk from our own experience about the things you NEED to know before taking out your next loan.
Links: Loan Calculator (Calculator.net)
Follow us on:
Instagram: https://www.instagram.com/economicsdata/
Tiktok: https://www.tiktok.com/@economicsdata
Youtube: https://youtube.com/@economicsdata257
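Not covered in the episode itself, but the math behind loan calculators like the one linked is the standard amortized-payment formula, M = P*r / (1 - (1 + r)**-n), with monthly rate r and n total payments. A small sketch:

```python
def monthly_payment(principal, annual_rate, years):
    """Amortized loan payment: M = P*r / (1 - (1 + r)**-n),
    where r is the monthly rate and n the number of payments."""
    r = annual_rate / 12
    n = years * 12
    if r == 0:
        return principal / n  # interest-free: just divide evenly
    return principal * r / (1 - (1 + r) ** -n)

# e.g. a $10,000 loan at 6% APR over 5 years
payment = monthly_payment(10_000, 0.06, 5)  # ≈ 193.33 per month
```

Multiplying the payment by n and subtracting the principal gives the total interest paid, which is exactly the "know before you borrow" number the episode is about.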
In this brief retrospective episode, Shane, Alcine and their 'Magic Millennial' producer, Maya Cueva look back at Season 1, reflecting on the moments that nested deepest in their hearts. You’ll get to hear or revisit impactful clips from guests in Season 1 and hear about what our producer Maya Cueva is up to on her other projects. The hosts also talk about current innovations from outside of education, including sobriety “quit lit” and Dr. Gabor Maté’s incredible work on childhood development, trauma and the potential lifelong impacts on physical and mental health conditions that show up daily in our schools and classrooms. As we prepare to launch Season 2 in February, Alcine invokes Dr. Jamila Dugan’s invitation in Episode 4: “How do I dream bigger and in community? Who do I need to be in community with so that my dreams become bigger?” Join us and dream with us about next-generation schools that affirm love and value every child!
For Further Learning:
Learn about Producer Maya Cueva’s PBS project On the Divide: https://www.onthedividemovie.com
Host a screening of On The Divide via GOOD Docs: https://gooddocs.net/products/on-the-divide
Episodes mentioned and excerpted include:
Episode 4: “What Does it Mean to Freedom Dream?”: Disrupting Traps and Tropes with Dr. Jamila Dugan
Episode 6: “We Need to Marginalize Standardized Testing” with Young Whan Choi
Episode 8: “Connecting Present to Past”: The Impact of Critical Pedagogy with Rocky Rivera and Norma Gallegos
If you’re interested in listening to Tales of The Town, the podcast about Oakland — listen here. You can also get tickets to the Tales of The Town film: https://www.talesofthetown.info
Tales of the Town Podcast: https://podcasts.apple.com/us/podcast/introducing-tales-of-the-town-a-podcast-about-black-oakland/id1235932328?i=1000579592977
Get Dr. Gholdy Muhammad’s Cultivating Genius
We talked about:
Chris’s background
Switching careers multiple times
Freedom at companies
Chris’s role as an internal consultant
Chris’s sabbatical
ChatGPT
How being a generalist helped Chris in his career
The cons of being a generalist and the importance of T-shaped expertise
The importance of learning things you’re interested in
Tips to enjoy learning new things
Recruiting generalists
The job market for generalists vs for specialists
Narrowing down your interests
Chris’s book recommendations
Links:
Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman
Andrej Karpathy, former Senior Director of AI at Tesla, who's now focused on teaching and sharing his knowledge: https://www.youtube.com/@AndrejKarpathy
Beautifully done videos on engineering of things in the real world: https://www.youtube.com/@RealEngineering
Chris' website: https://szafranek.net/
Zalando Tech Radar: https://opensource.zalando.com/tech-radar/
Modal Labs, a new way of deploying code to the cloud, also useful for testing ML code on GPUs: https://modal.com
Excellent Twitter account to follow to learn more about prompt engineering for ChatGPT: https://twitter.com/goodside
Image prompts for Midjourney: https://twitter.com/GuyP
Machine Learning Workflows in Production - Krzysztof Szafranek: https://www.youtube.com/watch?v=CO4Gqd95j6k
From Data Science to DataOps: https://datatalks.club/podcast/s11e03-from-data-science-to-dataops.html
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
We talked about:
David’s background
A day in the life of a professor
David’s current projects
Starting a school
The different types of professors
David’s recent papers
Similarities and differences between research labs and startups
Finding (or creating) good datasets
David’s lab
Balancing research and teaching as a professor
David’s most rewarding research project
David’s most underrated research project
David’s virtual data science seminars on YouTube
Teaching at universities without doing research
Staying up-to-date in research
David’s favorite conferences
Selecting topics for research
Convincing students to stay in academia and competing with industry
Finding David online
Links:
David A. Bader: https://davidbader.net/ NJIT Institute for Data Science: https://datascience.njit.edu/ Arkouda: https://github.com/Bears-R-Us/arkouda NJIT Data Science YouTube Channel: https://www.youtube.com/c/NJITInstituteforDataScience
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
We talked about:
Christiaan’s background
Usual ways of collecting and curating data
Getting the buy-in from experts and executives
Starting an annotation booklet
Pre-labeling
Dataset collection
Human level baseline and feedback
Using the annotation booklet to boost annotation productivity
Putting yourself in the shoes of annotators (and measuring performance)
Active learning
Distant supervision
Weak labeling
Dataset collection in career positioning and project portfolios
IPython widgets
GDPR compliance and non-English NLP
Finding Christiaan online
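Of the techniques listed above, active learning is the easiest to illustrate: after each annotation round, ask the model which unlabeled examples it is least sure about and route those to annotators first. A minimal uncertainty-sampling sketch (the probabilities are hypothetical; a real setup would get them from a trained classifier):

```python
def uncertainty_sample(probs, k):
    """Return indices of the k predicted probabilities closest to 0.5 —
    the examples the model is least certain about."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

# Hypothetical classifier probabilities for five unlabeled examples
probs = [0.95, 0.48, 0.10, 0.55, 0.80]
to_annotate = uncertainty_sample(probs, 2)  # indices 1 and 3
```

Spending annotation effort on the most uncertain examples is what lets a small labeling budget move the model the furthest, which is the productivity point the episode makes about annotation booklets and pre-labeling.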
Links:
My personal blog: https://useml.net/
Comtura, my company: https://comtura.ai/
LinkedIn: https://www.linkedin.com/in/christiaan-swart-51a68967/
Twitter: https://twitter.com/swartchris8/
In this episode, Bryce and Conor interview Kate Gregory about her career history.

Link to Episode 92 on Website

About the Guest: Kate Gregory is an author, sought-after conference speaker, trainer, Microsoft Most Valuable Professional (MVP), and partner at Gregory Consulting. Kate has been using C++ since before Microsoft had a C++ compiler. She is an early adopter of many software technologies and tools, and a well-connected member of the software development community. Kate is one of the founders of #include, whose goal is a more welcoming and inclusive C++ community. She also serves on the board of directors of Cpp Toronto, a non-profit organization that provides an open, inclusive, and collaborative place where software developers can meet and discuss topics related to C++ software development.

Show Notes
Date Recorded: 2022-08-15
Date Released: 2022-08-26

Podcast Appearances:
- CppCast Episode 30: Stop Teaching C (When Teaching C++)
- CppCast Episode 148: C++ Simplicity
- CppCast Episode 238: Beautiful C++
- .NET Rocks! Episode 88: Kate Gregory on C++, VB.NET, and VSTO
- Other .NET Rocks episodes (search “Kate Gregory”)
- CoRecursive Episode 56: Memento Mori With Kate Gregory

Other Links:
- C++Now 2019: Conor Hoekstra “Algorithm Intuition”
- CppCon 2015: Kate Gregory “Stop Teaching C”
- Keynote: “Am I A Good Programmer?” - Kate Gregory - CppNorth 2022
- Beautiful C++: 30 Core Guidelines for Writing Clean, Safe, and Fast Code by Guy Davidson & Kate Gregory
- WATFOR: the University of Waterloo FORTRAN IV compiler
- WATFIV
- Pluralsight: Kate Gregory
- NDC TechTown, Magazinet Kongsberg (29 Aug – 1 Sept)

Intro Song Info:
Miss You by Sarah Jansen: https://soundcloud.com/sarahjansenmusic
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library: https://youtu.be/iYYxnasvfx8
We talked about:
- DataTalks.Club intro
- Tereza’s background
- Working as a coach
- Identifying the mismatches between your needs and those of a company
- How to avoid misalignments
- Considering what’s mentioned in the job description, what isn’t, and why
- Diversity and culture of a company
- Lack of a salary in the job description
- Ways of doing research about the company where you will potentially work
- How to avoid a mismatch with a company other than learning from your mistakes
- Before data, during data, after data (a company’s data maturity level)
- The company’s tech stack
- Finding Tereza online
Links:
- Decoding Data Science Job Descriptions (talk at ConnectForward): https://www.youtube.com/watch?v=WAs9vSNTza8
- Slides: https://www.slideshare.net/terezaif/decoding-data-science-job-descriptions-250687704
- Talk at DataLift: https://www.youtube.com/watch?v=pCtQ0szJiLA
- Slides: https://www.slideshare.net/terezaif/lessons-learned-from-hiring-and-retaining-data-practitioners
MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
We talked about:
- Daynan’s background
- Astronomy vs. cosmology
- Applications of data science and machine learning in astronomy
- Determining signal vs. noise
- What the data looks like in astronomy
- Determining the features of an object in space
- Ground truth for space objects
- Why water is an important resource in the space economy
- Other useful resources that can be found in asteroids
- Sources of asteroids
- The data team at an asteroid mining company
- Open datasets for hobbyists
- Mission and hardware design for asteroid mining
- Partnerships and hires
Links:
- LinkedIn: https://www.linkedin.com/in/daynan/
- We're looking for a Sr Data Engineer: https://boards.eu.greenhouse.io/karmanplus/jobs/4027128101?gh_jid=4027128101
- Minor Planet Center: https://minorplanetcenter.net/
- JPL Horizons has a nice set of APIs for accessing data related to small bodies (including asteroids): https://ssd.jpl.nasa.gov/api.html
- ESA has NEODyS: https://newton.spacedys.com/neodys
- IRSA catalog that contains image and catalog data related to the WISE/NEOWISE data (and other infrared platforms): https://irsa.ipac.caltech.edu/frontpage/
- NASA also has an archive of data collected from their various missions, including a node related to small bodies: https://pds-smallbodies.astro.umd.edu/
- Sub-node directly related to asteroids: https://sbn.psi.edu/pds/
- Size, Mass, and Density of Asteroids (SiMDA) is a nice catalog of observed asteroid attributes (and an indication of how small our sample size is!): https://astro.kretlow.de/?SiMDA
- Source survey data (several surveys are useful for asteroids): Pan-STARRS (https://outerspace.stsci.edu/display/PANSTARRS)
Thanks for tuning in to the Data Driven Strength Podcast!
Timestamps:
01:04 New review paper on the limitations of proximity to failure research
13:50 What is the effect size of interest?
41:00 Strength Limiter Discussion
To learn more about 1 on 1 coaching: https://datadrivenstrength.typeform.com/to/JR3Gzm?typeform-source=linktr.ee
If you'd like to sign up to our email list, please visit the bottom section of our website via this link: https://www.data-drivenstrength.com
If you’d like to submit a question for a future episode, please follow the link provided: https://forms.gle/c5aCswfCq6XUDTiAA
Link to Individualized Programming + Self Coaching Toolkit Product Page: https://www.data-drivenstrength.com/individualized-programming
Links to papers discussed:
- https://www.researchgate.net/publication/359044527_Methods_for_Controlling_and_Reporting_Resistance_Training_Proximity_to_Failure_Current_Issues_and_Future_Directions
- https://pubmed.ncbi.nlm.nih.gov/35263685/
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7725035/
- https://journals.lww.com/nsca-jscr/Fulltext/2021/02001/Impact_of_Two_High_Volume_Set_Configuration.21.aspx?fbclid=IwAR3_TzZaaPIK6ZaoK5Tty3e22GgJU_1sJMtdHICV1Okq0AQSVfqvwdNu1fM
Follow us on Instagram at: @datadrivenstrength @zac.datadrivenstrength @josh.datadrivenstrength @jake.datadrivenstrength @drake.datadrivenstrength
Q&A Episode 25: General Update, Minimum Effective Training Dose, Rate of Force Development, and More
Thanks for tuning in to the Data Driven Strength Podcast!
Timestamps:
00:00 Intro/general update
10:02 Home gym equipment and setting up weighted pushups
23:47 Training frequency and proximity to failure
36:14 Rate of force development discussion
01:09:00 Minimum effective dose for strength and hypertrophy
To learn more about 1 on 1 coaching: https://datadrivenstrength.typeform.com/to/JR3Gzm?typeform-source=linktr.ee
If you'd like to sign up to our email list, please visit the bottom section of our website via this link: https://www.data-drivenstrength.com
If you’d like to submit a question for a future episode please follow the link provided: https://forms.gle/c5aCswfCq6XUDTiAA
Link to Individualized Programming + Self Coaching Toolkit Product Page: https://www.data-drivenstrength.com/individualized-programming
Training-to-failure fatigue meta-analysis discussed:
- https://link.springer.com/article/10.1007/s40279-021-01602-x
Links to RFD papers discussed:
- https://onlinelibrary.wiley.com/doi/pdf/10.1111/sms.13775?casa_token=EkLP_ZxQKuEAAAAA:pZoYDR1zERHdyTDFJhdkxJByWY4POb2kilm1JQnhf2o4-K-wWGKwk_iPxKpYJPrIwXHfxUfC1eso4yI
- https://link.springer.com/content/pdf/10.1007/s00421-016-3439-2.pdf
- https://www.mdpi.com/2076-3417/11/1/45
- https://www.tandfonline.com/doi/pdf/10.1080/02640414.2015.1119299?needAccess=true
- https://europepmc.org/article/med/34100789
- https://pubmed.ncbi.nlm.nih.gov/29577974/
- https://www.researchgate.net/publication/325748706_Functional_and_physiological_adaptations_following_concurrent_training_using_sets_with_and_without_concentric_failure_in_elderly_men_A_randomized_clinical_trial
- https://pubmed.ncbi.nlm.nih.gov/32049887/
- https://journals.humankinetics.com/view/journals/ijspp/14/1/article-p46.xml
Links to MED papers discussed:
- https://pubmed.ncbi.nlm.nih.gov/21131862/
- https://www.frontiersin.org/articles/10.3389/fphys.2021.735932/full
- https://pubmed.ncbi.nlm.nih.gov/31373973/
Follow us on Instagram at: @datadrivenstrength @zac.datadrivenstrength @josh.datadrivenstrength @jake.datadrivenstrength @drake.datadrivenstrength
Simson Garfinkel, Senior Computer Scientist for Confidentiality and Data Access at the US Census Bureau, discusses his work modernizing the Census Bureau's disclosure avoidance system, moving from private to public disclosure avoidance techniques based on differential privacy. Some of the discussion revolves around the topics in the paper Randomness Concerns When Deploying Differential Privacy.
WORKS MENTIONED:
- "Calibrating Noise to Sensitivity in Private Data Analysis" by Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith
- "Issues Encountered Deploying Differential Privacy" by Simson L. Garfinkel, John M. Abowd, and Sarah Powazek
- "Randomness Concerns When Deploying Differential Privacy" by Simson L. Garfinkel and Philip Leclerc
Check out: https://simson.net/page/Differential_privacy
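The core idea behind "calibrating noise to sensitivity" can be sketched in a few lines: a counting query has sensitivity 1 (adding or removing one person changes the true count by at most 1), so adding Laplace noise with scale 1/epsilon gives epsilon-differential privacy. The sketch below is purely illustrative; the function names are our own and the naive floating-point sampler is exactly the kind of shortcut the Census Bureau's production system cannot take.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling.

    Illustrative only: floating-point Laplace sampling like this can
    leak information, one of the randomness concerns Garfinkel and
    Leclerc discuss for real deployments.
    """
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, epsilon: float) -> float:
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1, so Laplace noise with scale
    sensitivity / epsilon suffices (Dwork, McSherry, Nissim, Smith).
    """
    sensitivity = 1.0
    return len(records) + laplace_noise(sensitivity / epsilon)
```

Smaller epsilon means more noise and stronger privacy; the scale of the noise depends only on epsilon and the query's sensitivity, never on the data itself.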
Thank you to our sponsor, BetterHelp. Professional and confidential in-app counseling for everyone. Save 10% on your first month of services with www.betterhelp.com/dataskeptic