talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
183 - Part II: Designing with the Flow of Work: Accelerating Sales in B2B Analytics and AI Products by Minimizing Behavior Change
2025-11-27 · 02:00
Brian T. O’Neill
– host
In this second part of my three-part series (catch Part I via episode 182), I dig deeper into the key idea that sales in commercial data products can be accelerated by designing for actual user workflows—vs. going wide with a “many-purpose” AI and analytics solution that “does more” but is misaligned with how users’ most important work actually gets done. To do this, I explain the concept of user experience (UX) outcomes, and how building your solution to enable these outcomes may be a dependency for getting sales traction and for your customer to see the value of your solution. I also share practical steps to improve UX outcomes in commercial data products, from establishing a baseline definition of UX quality to mapping out users’ current workflows (and future ones, when agentic AI changes their job). Finally, I talk about how approaching product development as small “bets” helps you build small and learn fast so you can accelerate value creation.

Highlights / Skip to:
- Continuing the journey: designing for users, workflows, and tasks (00:32)
- How UX impacts sales—not just usage and adoption (02:16)
- Understanding how you can leverage users’ frustrations and perceived risks as fuel for building an indispensable data product (04:11)
- Definition of a UX outcome (07:30)
- Establishing a baseline definition of product (UX) quality, so you know how to observe and measure improvement (11:04)
- Spotting friction and solving the right customer problems first (15:34)
- Collecting actionable user feedback (20:02)
- Moving users along the scale from frustration to satisfaction to delight (23:04)
- Unique challenges of designing B2B AI and analytics products used for decision intelligence (25:04)

Quotes from Today’s Episode

One of the hardest parts of building anything meaningful, especially in B2B or data-heavy spaces, is pausing long enough to ask what the actual ‘it’ is that we’re trying to solve.
People rush into building the fix, pitching the feature, or drafting the roadmap before they’ve taken even a moment to define what the user keeps tripping over in their day-to-day environment. And until you slow down and articulate that shared, observable frustration, you’re basically operating on vibes and assumptions instead of behavior and reality. What you want is not a generic problem statement but an agreed-upon description of the two or three most painful frictions that are obvious to everyone involved: frictions the user experiences visibly and repeatedly in the flow of work. Once you have that grounding, everything else (prioritization, design decisions, sequencing, even organizational alignment) suddenly becomes much easier, because you’re no longer debating abstractions; you’re working against the same measurable anchor. And the irony is that the faster you try to skip this step, the longer the project drags on, because every downstream conversation becomes a debate about interpretive language rather than a conversation about a shared, observable experience.

__

Want people to pay for your product? Solve an observable problem, not a vague information or data problem. What do I mean?

“When you’re trying to solve a problem for users, especially in analytical or AI-driven products, one of the biggest traps is relying on interpretive statements instead of observable ones. Interpretive phrasing like ‘they’re overwhelmed’ or ‘they don’t trust the data’ feels descriptive, but it hides the important question of what, exactly, we can see them doing that signals the problem. If you can’t film it happening, if you can’t watch the behavior occur in real time, then you don’t actually have a problem definition you can design around. Observable frustration might be the user jumping between four screens, copying and pasting the same value into different systems, or re-running a query five times because something feels off even though they can’t articulate why.
Those concrete behaviors are what allow teams to converge and say, ‘Yes, that’s the thing; that is the friction we agree must change,’ and that shift from interpretation to observation becomes the foundation for better design, better decision-making, and far less wasted effort. And once you anchor the conversation in visible behavior, you eliminate so many circular debates and give everyone, from engineering to leadership, a shared starting point that’s grounded in reality instead of theory.”

__

One of the reasons that measuring the usability, utility, and satisfaction of your product’s UX might seem hard is that you don’t have a baseline definition of how satisfactory (or not) the product is right now. As a result, it’s very hard to tell whether you’re just making product changes, or making improvements that might make the product worth paying for at all, worth paying more for, or easier to buy.

“It’s surprisingly common for teams to claim they’re improving something when they’ve never taken the time to document what the current state even looks like. If you want to create a meaningful improvement, something a user actually feels, you need to understand the baseline level of friction they tolerate today, not what you imagine that friction might be. Establishing a baseline is not glamorous work, but it’s the work that prevents you from building changes that make sense on paper but do nothing to the real flow of work. When you diagram the existing workflow, when you map the sequence of steps the user actually takes, the mismatches between your mental model and their lived experience become crystal clear, and the design direction becomes far less ambiguous. That act of grounding yourself in the current state allows every subsequent decision (prioritizing fixes, determining scope, measuring progress) to be aligned with reality rather than assumptions.
And without that baseline, you risk designing solutions that float in conceptual space, disconnected from the very pains you claim to be addressing.”

__

Prototypes are a great way to learn—if you’re actually treating them as a means to learn, and not a product you intend to deliver regardless of the feedback customers give you.

“People often think prototyping is about validating whether their solution works, but the deeper purpose is to refine the problem itself. Once you put even a rough prototype in front of someone and watch what they do with it, you discover the edges of the problem more accurately than any conversation or meeting can reveal. Users will click in surprising places, ignore the part you thought mattered most, or reveal entirely different frictions just by trying to interact with the thing you placed in front of them. That process doesn’t just improve the design; it improves the team’s understanding of which parts of the problem are real and which parts were just guesses. Prototyping becomes a kind of externalization of assumptions, forcing you to confront whether you’re solving the friction that actually holds back the flow of work or a friction you merely predicted. And every iteration becomes less about perfecting the interface and more about sharpening the clarity of the underlying problem, which is why the teams that prototype early tend to build faster, with better alignment, and far fewer detours.”

__

Most founders and data people tend to measure UX quality by “counting usage” of their solution: tracking usage stats, analytics on sessions, etc. The problem is that this tells you nothing useful about whether people are satisfied (“meets spec”) or delighted (“a product they can’t live without”). These are product metrics, but they don’t reflect how people feel.
There are better measurements for evaluating users’ experience that go beyond “willingness to pay.” Payment is great, but in B2B products, buyers aren’t always users—and we’ve all bought something based on the promise of what it would do for us, only to have the promise fall short.

“In B2B analytics and AI products, the biggest challenge isn’t complexity; it’s ambiguity around what outcome the product is actually responsible for changing. Teams often define success in terms of internal goals like ‘adoption,’ ‘usage,’ or ‘efficiency,’ but those metrics don’t tell you what the user’s experience is supposed to look like once the product is working well. A product tied to vague business outcomes tends to drift, because no one agrees on what the improvement should feel like in the user’s real workflow. What you want are visible, measurable, user-centric outcomes: outcomes that describe how the user’s behavior or experience will change once the solution is in place, down to the concrete actions they’ll no longer need to take. When you articulate outcomes at that level, it forces the entire organization to align around a shared target, reduces the scope bloat that normally plagues enterprise products, and gives you a way to evaluate whether you’re actually removing friction rather than just adding more layers of tooling. And ironically, the clearer the user outcome is, the easier it becomes to achieve the business outcome, because the product is no longer floating in abstraction; it’s anchored in the lived reality of the people who use it.”

Links
- Listen to part one: Episode 182
- Schedule a Design-Eyes Assessment with me and get clarity, now. |
Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design) |
|
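The episode above argues for establishing a baseline of UX quality from observable behavior before claiming improvement. As a purely illustrative sketch of that idea (nothing below comes from the episode: the session records, field names, and the choice of metrics are all assumptions), one could summarize logged task sessions into a baseline like this:

```python
from statistics import median

# Hypothetical session logs: each record is one observed attempt at a task.
# These fields and values are invented for illustration only.
sessions = [
    {"task": "build_report", "steps": 14, "seconds": 420, "completed": True},
    {"task": "build_report", "steps": 18, "seconds": 610, "completed": False},
    {"task": "build_report", "steps": 12, "seconds": 380, "completed": True},
    {"task": "export_data",  "steps": 6,  "seconds": 95,  "completed": True},
]

def baseline(sessions, task):
    """Summarize observable friction for one task: completion rate,
    plus median steps and median time across completed attempts."""
    rows = [s for s in sessions if s["task"] == task]
    done = [s for s in rows if s["completed"]]
    return {
        "completion_rate": len(done) / len(rows),
        "median_steps": median(s["steps"] for s in done),
        "median_seconds": median(s["seconds"] for s in done),
    }

print(baseline(sessions, "build_report"))
```

A baseline like this is only a starting point; the episode's larger point is that without some documented current state, you cannot tell whether a product change is an improvement at all.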
Bridging Data and Decision-Making: AI's Role in Modern Analytics
2025-08-12 · 00:35
Summary

In this episode of the Data Engineering Podcast, Lucas Thelosen and Drew Gilson from Gravity talk about their development of Orion, an autonomous data analyst that bridges the gap between data availability and business decision-making. Lucas and Drew share their backgrounds in data analytics and how their experiences have shaped their approach to leveraging AI for data analysis, emphasizing the potential of AI to democratize data insights and make sophisticated analysis accessible to companies of all sizes. They discuss the technical aspects of Orion, a multi-agent system designed to automate data analysis and provide actionable insights, highlighting the importance of integrating AI into existing workflows with accuracy and trustworthiness in mind. The conversation also explores how AI can free data analysts from routine tasks, enabling them to focus on strategic decision-making and stakeholder management, as well as the future of AI in data analytics and its transformative impact on businesses.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks?
Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey, and today I'm interviewing Lucas Thelosen and Drew Gilson about the engineering and impact of building an autonomous data analyst.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Orion is and the story behind it?
- How do you envision the role of an agentic analyst in an organizational context?
- There have been several attempts at building LLM-powered data analysis, many of which are essentially a text-to-SQL interface. How have the capabilities and architectural patterns grown in the past ~2 years to enable a more capable system?
- One of the key success factors for a data analyst is their ability to translate business questions into technical representations. How can an autonomous AI-powered system understand the complex nuance of the business to build effective analyses?
- Many agentic approaches to analytics require a substantial investment in data architecture, documentation, and semantic models to be effective. What are the gradations of effectiveness for autonomous analytics for companies who are at different points on their journey to technical maturity?
- Beyond raw capability, there is also a significant need to invest in user experience design for an agentic analyst to be useful.
  What are the key interaction patterns that you have found to be helpful as you have developed your system?
- How does the introduction of a system like Orion shift the workload for data teams?
- Can you describe the overall system design and technical architecture of Orion?
- How has that changed as you gained further experience and understanding of the problem space?
- What are the most interesting, innovative, or unexpected ways that you have seen Orion used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Orion?
- When is Orion/agentic analytics the wrong choice?
- What do you have planned for the future of Orion?

Contact Info
- Lucas: LinkedIn
- Drew: LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- Orion
- Looker
- Gravity
- VBA (Visual Basic for Applications)
- Text-to-SQL
- One-shot
- LookML
- Data Grain
- LLM as a Judge
- Google Large Time Series Model

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
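The interview above contrasts agentic analysts like Orion with the simpler "text-to-SQL interface" pattern it mentions. As a hedged illustration of that simpler pattern only (this is not Orion's architecture), here is a minimal sketch in which a rule-based `fake_llm` stands in for a real model call; the `orders` table and the question-to-SQL mapping are invented:

```python
import sqlite3

# Toy dataset standing in for a warehouse table (invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EU", 120.0), ("EU", 80.0), ("US", 200.0)])

def fake_llm(question: str) -> str:
    """Stand-in for an LLM call: maps one known question shape to SQL."""
    if "total" in question and "region" in question:
        return "SELECT region, SUM(amount) FROM orders GROUP BY region"
    raise ValueError("question not understood")

def answer(question: str):
    sql = fake_llm(question)             # 1. translate natural language to SQL
    return conn.execute(sql).fetchall()  # 2. execute it and return rows

print(answer("total sales by region"))
```

A real system would replace `fake_llm` with a model call given schema context, and agentic approaches layer planning, validation, and iteration on top of this single-shot translate-and-execute loop.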
167 - AI Product Management and Design: How Natalia Andreyeva and Team at Infor Nexus Create B2B Data Products that Customers Value
2025-04-16 · 11:59
Brian T. O’Neill
– host
,
Natalia Andreyeva
– Senior Director of Product Management
@ Infor
Today, I’m talking with Natalia Andreyeva from Infor about AI/ML product management and its application to supply chain software. Natalia is a Senior Director of Product Management for the Nexus AI/ML Solution Portfolio, and she walks us through what is new, and what is not, about designing AI capabilities in B2B software. We also get into why user experience is so critical in data-driven products, and the role of design in ensuring AI produces value. During our chat, Natalia hits on the importance of really nailing down customer needs through solid discovery, and the role of product leaders in this non-technical work. We also tackle some of the trickier aspects of designing for GenAI and digital assistants, the need to keep efforts strongly grounded in value creation for customers, and how even the best ML-based predictive analytics need to consider UX and the amount of evidence customers need in order to believe the recommendations. Throughout the episode, Natalia emphasizes a huge key to her work’s success: keeping customers and users in the loop throughout the product development lifecycle.

Highlights / Skip to:
- What Natalia does as a Senior Director of Product Management for Infor Nexus (1:13)
- Who the people using Infor Nexus products are, and what they accomplish when using them (2:51)
- Breaking down who makes up Natalia's team (4:05)
- What role does AI play in Natalia's work? (5:32)
- How do designers work with Natalia's team? (7:17)
- The problem that had Natalia rethink the discovery process when working with AI and machine learning applications (10:28)
- Why Natalia isn’t worried about competitors catching up to her team's design work (14:24)
- How Natalia works with Infor Nexus customers to help them understand the solutions her team is building (23:07)
- The biggest challenges Natalia faces with building GenAI and machine learning products (27:25)
- Natalia’s four steps to success in building AI products and capabilities (34:53)
- Where you can find more from Natalia (36:49)

Quotes from Today’s Episode

“I always launch discovery with customers in the presence of the UX specialist [our designer]. We do the interviews together, and [regardless of who is facilitating] the goal is to understand the pain points of our customers by listening to how they do their jobs today. We do a series of these interviews and distill them into customer needs: the problems we need to really address for the customers. And then we start thinking about how to [address these needs]. Data products are a particular challenge because it’s not always the case that you can easily create a UX that allows users to realize the value they’re searching for from the solution. And even if we can deliver it, consuming that is typically a challenge, too. So, this is where [design becomes really important]. [...] What I found through the years of experience is that it’s very difficult to explain to people around you what it is that you’re building when you’re dealing with a data-driven product. Is it a dashboard? Is it a workboard? They understand the word data, but that’s not what we are creating. We are creating the actual experience for the outcome that data will deliver to them indirectly, right? So, that’s typically how we work.” - Natalia Andreyeva (7:47)

“[When doing discovery for products without AI], we already have ideas for what we want to get out.
We know that there is a space in the market for those solutions to come to life. We just have to understand where. For AI-driven products, it’s not only about [the user’s] understanding of the problem or the design; it is also about understanding whether the data exists and whether it’s feasible to build the solution to address [the user’s] problem. [Data] feasibility is an extremely important piece, because it will drive the UX as well.” - Natalia Andreyeva (10:50)

“When [the team] discussed the problem, it sounded like a simple calculation that needed to be created [for users]. In reality, it was an entire process of thinking of multiple people in the chain [of command] to understand whether or not a medical product was safe to be consumed. That’s the outcome we needed to produce, and when we finally did, we actually celebrated with our customers and with our designers. It was one of the most difficult things that we had to design. So why did this problem actually get solved, and why were we the ones who solved it? Because we took the time to understand the current user experience through [our customer] interviews. We connected the dots and translated it all into a visual solution. We would never have been able to do that without the proper UX and design in place for the data.” - Natalia Andreyeva (13:16)

“Everybody is pressured to come up with a strategy [for AI] or explain how AI is being incorporated into their solutions and platform, but it is still essential for all of my peers in product management to focus on the value [we’re] creating for customers. You cannot bypass discovery. Discovery is the essential portion where you have to spend time with your customers, champions, advisors, and their leads, but especially users who are doing this [supply chain] job every single day—so we understand where the pain point really is for them, we solve that pain, and we solve it with our design team as a partner, so that solution can surface value.” - Natalia Andreyeva (22:08)

“GenAI is a new field and new technology. It’s evolving quickly, and nobody really knows how to properly adapt or drive the adoption of AI solutions. The speed of innovation [in the AI field] is a challenge for everybody. People who work on the front lines (i.e., product and engineering teams) have to stay way ahead of the market. Meanwhile, customers who are going to be using these [AI] solutions are not going to trust the [initial] outcomes. It’s going to take some time for people to become comfortable with them. But it doesn’t mean that your solution is bad or didn’t find market fit; it’s just not time for your [solution] yet. Educating our users on the value of the solution is also part of that challenge, and [designers] have to be very careful that solutions are accessible. Users do not adopt intimidating solutions.” - Natalia Andreyeva (27:41)

“First, discovery—where we search for the problems. From my experience, [discovery] works better if you’re very structured. I always provide [a customer] with an outline of what needs to happen, so it’s not a secret. Then, do the prototyping phase and keep the customer engaged so they can see the quick outcomes of those prototypes. This is where you also have to really include the feasibility of the data if you’re building an AI solution, right? [Prototyping] can be short or long, but you need to keep the customer engaged throughout that phase so they see quick outcomes. Keep on validating the concept—on a napkin, in Figma, it doesn’t really matter; you have to keep them engaged. Then, once you validate it works and the customer likes it, build. Don’t go into the deep development work until you know [all of this]! When you do build, create a beta solution; it only has to work well enough to prove the value. Then run the pilot, and if it’s successful, build the MVP, then launch. It’s simple, but it is a lot of work, and you have to keep your customers really engaged through all of those phases. If something doesn’t work [along the way], try to pivot early enough so you still have a viable product at the end.” - Natalia Andreyeva (34:53)

Links
- Natalia's LinkedIn |
Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design) |
|
DevFest Berlin 2024
2024-11-23 · 08:00
DevFest Berlin is back! This year we're back at Humboldt University of Berlin. With more than 25 talks & workshops, you can expect a whole day of learning, socialising, and engaging with a vibrant Berlin tech community!

🎫 Get your ticket here: pretix.eu/devfestberlin/2024/
🖍 Call for Papers still open: pretalx.com/devfest-berlin-2024/cfp

Agenda

Day 1

9:00 AM: Registration & Coffee 🥐 ☕️
9:45 AM: 🎤 Welcoming

10:00 AM: 🎤 Katya Vinnichenko - Introduction to Google Principles of Responsible AI
This year's DevFest explores how AI can improve lives globally, from business to healthcare to education. At Google we acknowledge AI's potential, while also recognising the challenges it presents. Thus, we are committed to helping you build and use AI responsibly, ensuring fairness and ethical practices. In my talk you will learn: the main principles of responsible AI at Google; the ethical implications of AI; best practices for developing AI systems and integrating AI into Google products and services; and, last but not least, how AI will change the role of the developer as we know it.

10:50 AM: 🎤 Oleksii Antypov - DMARC Demystified
Discover the essential framework behind DMARC and how it secures email communication across the internet. This session covers the historical evolution of email security, dives into the common challenges of implementing DMARC, and provides actionable best practices for protecting your domain. Ideal for developers, security professionals, and anyone interested in safe email practices. In a world where phishing and email spoofing are constant threats, DMARC stands as a vital defense mechanism. “DMARC Demystified” takes you through a journey from the origins of email security to the modern challenges and solutions that DMARC offers. We'll explore how DMARC works with SPF and DKIM, why it’s essential for organizations of all sizes, and the practical steps to ensure smooth implementation.
Expect an interactive timeline tracing the milestones of email security, detailed breakdowns of real-world cases, and insights into optimizing DMARC. Walk away with a deeper understanding of email protection, armed with knowledge to strengthen your email systems and protect against threats.

11:40 AM: 🎤 Marcin Chudy - Demystifying App Architecture: The LeanCode Guide
At LeanCode we have developed over 40 Flutter apps, spanning from huge enterprise apps to nimble startup ventures. Some were developed by a single Flutter dev; some came to light through collaborative efforts across multiple teams. Each of them was different. Each of them presented unique challenges and taught us invaluable lessons. In this talk, we invite you to explore different approaches to architecting Flutter apps. Central to our narrative will be the concept of architectural drivers: key factors or priorities that steer our decisions about how the app is structured and designed. We'll show how we leverage our experience when approaching new projects. Drawing from our successes and failures, we'll present our current Flutter stack, which enables us to craft robust, scalable, and maintainable applications. While there is no silver bullet for Flutter architecture, we can still have some sensible defaults. Why do we use BLoC for state management? Why not Riverpod? Why do we love hook

12:30 PM: 🎤 Danny Preussler - Ten things you heard about testing that might be wrong
Testing became an essential part of Android development. Many conference talks have been given, and even more best practices have been written. But what if, as time evolved, some of the things we thought were true changed? Let’s start questioning some of these in this talk: Are flaky tests fixable? Are mocks even harmful? Is DI about testing? Did we understand testing in isolation properly? Is the test pyramid still valid? And in times of AI, should we generate tests? Come and join my session to learn more!
1:10 PM: Lunch 🍔🥤

2:40 PM: 🎤 Andrey Sitnik - Privacy-first architecture: alternatives to GDPR popup and local-first
Why and how modern developers can increase the privacy of the modern web. The popularity of clouds, the rise of huge monopolies across the internet, and the growth of shady data brokers have recently made the world a much more dangerous place for ordinary people—here is how we fix it. In this talk, Andrey Sitnik, the creator of PostCSS and the privacy-first open-source RSS reader, will explain how we can stop this dangerous trend and make the web a private place again.
- Beginners will find simple steps which can be applied to any website
- Advanced developers will get practical insights into the new local-first architecture
- Privacy experts will find useful, unique privacy tricks from a global perspective, going beyond just U.S. privacy risks

3:30 PM: 🎤 Raphaël VO - Largest Contentful Paint - The unheard story
Largest Contentful Paint (LCP) is more than a speed metric: it's the unseen factor shaping user experiences and impacting SEO. While often overlooked, LCP reveals when a page’s core content is truly ready, affecting how users perceive load time and usability. This talk uncovers LCP’s role, why it matters more than we think, and simple strategies to boost LCP for better engagement and rankings. Discover the hidden story behind one of web performance’s most crucial, yet understated, metrics. Did you know the speed of a single webpage element could decide whether users stay or leave? Largest Contentful Paint is that hidden hero, quietly working to load the most important content quickly. This talk unveils LCP’s role in creating faster, more engaging web experiences and why it’s key to winning user loyalty. Dive into the “unheard story” of LCP and discover practical tips to make your site not only faster but unforgettable.
4:20 PM: 🎤 Ash Davies - Navigation in a Multiplatform World: Choosing the Right Framework for your App
Navigation in mobile, desktop, and web applications is a fundamental part of how we structure our architecture, both to obtain functional clarity and to abstract away platform-level implementation. For a long time there have been options available specific to each platform, and even options that are part of the platform framework itself, though it can be difficult to find the right option for platform-agnostic code that ensures consistency. Some go one step further, providing an opinionated guide on how to architect your application. In this talk, I'll evaluate the options available, how they differ, and what types of applications they are best suited to, including how to get started with them and best-practice guidelines on how to get the most out of them for your application.

5:10 PM: 🎤 Vadim Makeev - You don’t know MathML. Almost nobody does
Do you speak math? Me neither. Still, math formulas have always been around: from Wikipedia articles to JavaScript APIs and even CSS docs. It looks so alien that I never had a clue how to express it on the web. Apparently, there’s a markup language for that. HTML for content, SVG for vector graphics, and MathML for math! And it’s pretty cross-browser, too. Let’s dive into the basics and quirks of the language of the universe. Even if math is not your love language, you might learn something interesting about the web platform.

Day 2

9:00 AM: Registration & Coffee 🥐 ☕️

10:00 AM: 🎤 Alex Mir – Accessibility matters
The regulators are here, and now businesses will care about a11y. Let's make a11y compliance more than just a formal check. I believe it is our job as industry experts to understand why it is important and to get our products ready for all groups of people.
10:50 AM: 🎤 Marco Gomiero - From Android to Multiplatform and beyond
With Kotlin Multiplatform getting increasingly established, many Android libraries have become multiplatform. But how do you make an existing Android library multiplatform? In this talk, we will cover the common challenges faced while migrating Android libraries to Kotlin Multiplatform, like handling platform-specific dependencies, reorganizing the project structure without losing the contributors' history, testing on multiple platforms, and publishing the library.

11:20 AM: 🎤 Muhammad Salman Bediya - Crucial Performance Issue in Flutter Apps: Memory Leaks
Memory leaks can be hard to spot but have a big impact on the performance of Flutter apps, especially those running for long periods. In this talk, we’ll explore the most common reasons memory leaks happen in Flutter and Dart, focusing on how asynchronous programming and Streams can make them more challenging. You’ll learn practical tips to identify and fix these issues, helping your apps run smoother and more efficiently.

11:40 AM: 🎤 Andrii Raikov - Maximizing Scalability with Go and Redis: A Telemetry Processing Journey
At Delivery Hero, we process 10,000 requests per second using Go and Redis. Join us to learn how this powerful duo handles high-load telemetry data efficiently and cost-effectively, with scalability, resource optimization, and continuous innovation through customized data flows.

12:30 PM: 🎤 Tomek Porozynski - Can You Outsmart an AI? Adventures in Prompt Hacking
In this talk combined with hands-on elements, participants will engage in a series of live prompt hacking challenges, accessible directly through their mobile devices. The workshop begins with simple prompt injection techniques and progressively moves to more sophisticated manipulation strategies. After each successful hack, I'll analyze what made it work and transform these insights into practical defense mechanisms.
Attendees will learn: common vulnerabilities in AI prompt design, practical techniques for prompt injection attacks, essential strategies for securing chatbot applications, best practices for implementing defensive layers, and real-world examples of prompt security failures and successes. Perfect for developers working with AI models, security enthusiasts, or anyone interested in building safer AI applications. No specialized tools needed—just bring your phone and creativity! You'll leave with concrete techniques for both testing and securing your AI systems against prompt manipulation attacks.

1:10 PM: Lunch 🍔🥤

2:40 PM: 🎤 Cesar Martinez - Domain Driven Design Fundamentals for Frontend Developers
What we can learn from Domain-Driven Design, and how to start applying its teachings in your frontend codebase.

3:30 PM: 🎤 Vadym Pinchuk - Effortless optimization of Flutter apps: performance tips for developers
In this session, we’ll dive into effortless yet impactful ways to optimize your Flutter applications. Performance improvements don’t always require a full rewrite—sometimes, small adjustments can lead to big gains. We'll explore practical tips and tricks for enhancing app speed, responsiveness, and efficiency with minimal effort. From reducing widget rebuilds to handling large data efficiently and managing state effectively, this talk will provide developers with actionable insights to deliver a smoother user experience. Whether you’re a beginner or an experienced Flutter dev, you’ll walk away with easy-to-apply techniques to optimize your apps without breaking a sweat.

4:20 PM: 🎤 Ian Ballantyne - Generative AI on Mobile and Web with Google AI Edge
Generative AI is no longer limited to execution in the cloud. Small language models, such as Gemma 2B, are quickly becoming small and powerful enough for on-device AI, offering benefits like low latency, offline functionality, privacy, and cost-effectiveness.
Google AI Edge, with MediaPipe and LiteRT (formerly TensorFlow Lite), enables the development and deployment of efficient on-device AI models. These frameworks handle the complexities of model execution and hardware acceleration, allowing developers to focus on creating innovative AI experiences. Think generative AI is just about chatbots? Think again. This talk will go beyond basic conversations with language models and explore how on-device generative AI can be integrated into everyday apps, ready to help with tasks, answer questions, and provide creative inspiration, all powered by the information located on-device. Imagine truly useful apps that are quick to respond and still work without an internet connection.

5:10 PM: 🎤 Bogdan Plieshka - Automated Testing Layers in a Multidimensional Monorepo: Fast-tracking Quality for Hundreds of Apps
In this talk, I'll dive into the testing layers that make up our quality pipeline at Zattoo, including static analysis, unit, system, and end-to-end testing. We'll discuss the concept of quality gates, the shift-left approach, and affected-domain recognition, which helps us maintain reliability across a large, dynamic codebase, bringing total quality feedback for contributors down to 3 minutes. I'll share practices for achieving scalable, fast testing in a high-complexity environment, offering insights for anyone working with large-scale applications or monorepos and looking to streamline QA processes.

Day 3

9:00 AM: Registration & Coffee 🥐 ☕️

10:00 AM: 🎤 Inès Mir & Doruk Deniz Kutukculer - Fellowship of Product. How your team setup affects your experience
Did you know there are two types of team formation in tech? These formations can drastically change your experience on a team, and you'd better recognise them early to adjust your expectations of the job. Even more importantly, you need to show different qualities in job interviews to get a job in a particular team formation!
Doruk Deniz Kutukculer, a head of engineering, and Inès Mir, a principal product designer, are trying to figure out how design and engineering can effectively work together in these setups.

10:50 AM: 🎤 Alireza Rahmaty - How we automate the App Release Monitoring at GetYourGuide
App release monitoring (ARM) represents a suite of innovative tools designed to monitor the health and stability of iOS and Android app releases. These tools provide real-time updates by sending notifications to Slack channels and logging the app's status throughout the release process. At GetYourGuide, we have developed an ARM to monitor the rollout of our Android and iOS apps from the moment they are submitted to the App Store & Google Play until they are fully released. We ship releases faster and with more confidence using ARM!

11:40 AM: 🎤 Aleksandr Gorbunov - Flutter for frontenders, or There and Back Again
Every developer, regardless of specialization, may encounter the need to create a UI for a client application. The choice of technology may depend on the developer, or it may be pre-determined by the client, as happened in my case. The peculiarity is that, coming from frontend development in JavaScript, I started building user interfaces in Flutter. Today, there is a vast number of technologies that enable the development of cross-platform applications. These technologies are evolving rapidly, attracting large communities, and companies are adopting them more and more frequently. For example, Flutter is a powerful framework that allows developers to create cross-platform applications. With high probability, every developer may encounter the need to use such development tools, and it's great that frameworks like Flutter come with detailed documentation and extensive community support, making it relatively easy to start developing with them. Still, at first glance not everything is smooth, and the desire to revert to familiar methods may arise.
12:05 PM: 🎤 Muhammad Salman Bediya - Crucial Performance Issue in Flutter Apps: Memory Leaks
Memory leaks can be hard to spot but have a big impact on the performance of Flutter apps, especially those running for long periods. In this talk, we'll explore the most common reasons memory leaks happen in Flutter and Dart, focusing on how asynchronous programming and Streams can make them more challenging. You'll learn practical tips to identify and fix these issues, helping your apps run smoother and more efficiently.

12:30 PM: 🎤 Ole Bulbuk - Native GUIs For All
Traditionally, native GUIs are highly platform-dependent and often specific to one programming language. In this talk we will explore a way to create GUI applications that supports virtually all platforms and any programming language. It is very effective and easy to use, too.

1:10 PM: Lunch 🍔🥤

2:40 PM: 🎤 Nicole Terc - Tap it! Shake it! Fling it! Sheep it! - The Gesture Animations Dance!
Let's have fun with animations, gestures and sensors! Using Compose Multiplatform, we'll go over how to create animations using gestures and sensor events for Android & iOS. We'll cover some basics like how to get the device motion and position information, how to track gestures on the screen, and how you can combine them with animations to have fun! After this talk, you'll have a better understanding of how to use the sensor frameworks, how to make your own gesture effects, and how to create interesting animations in an easy way. Keep it fun, keep it animated!

3:30 PM: 🎤 Andrii Khrystian - From waves to widgets: Sound processing in Flutter
In this talk, we'll explore how to work with sound in Flutter apps. We'll go over the basics of adding sound effects and processing audio to make your apps more interesting. You'll learn how to handle audio files and integrate them smoothly with your Flutter projects. This session is great for anyone looking to add audio features to their apps simply and effectively.
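The leak pattern the memory-leaks talk describes, a long-lived stream holding subscriber references that are never cancelled, is language-agnostic. Here is a minimal Python sketch (not Dart; the `EventBus` and `Listener` classes are hypothetical stand-ins) showing how a strong subscription keeps a listener alive while a weak one does not:

```python
import gc
import weakref

class EventBus:
    """Toy stand-in for a long-lived stream; in Flutter the analogue is a
    StreamSubscription that is never cancelled in dispose()."""
    def __init__(self):
        self.strong_handlers = []
        self.weak_handlers = []

    def subscribe(self, handler):
        # Stores a strong reference: the object behind a bound method can
        # never be garbage collected while the bus is alive.
        self.strong_handlers.append(handler)

    def subscribe_weak(self, handler_obj):
        # Stores only a weak reference, so the bus does not extend lifetime.
        self.weak_handlers.append(weakref.ref(handler_obj))

class Listener:
    def on_event(self, payload):
        pass

bus = EventBus()

leaked = Listener()
bus.subscribe(leaked.on_event)      # bound method keeps `leaked` alive
leaked_probe = weakref.ref(leaked)
del leaked
gc.collect()
print(leaked_probe() is not None)   # True: the listener "leaked"

fine = Listener()
bus.subscribe_weak(fine)
fine_probe = weakref.ref(fine)
del fine
gc.collect()
print(fine_probe() is None)         # True: weak subscription allowed collection
```

The fix in Flutter is analogous: cancel the subscription when the owner is disposed, so the long-lived stream stops holding a strong reference to it.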
4:20 PM: 🎤 Randy Nel Gupta - From Practice: Migration of an Order Processing System to the Cloud
A case study on how an order processing system, processing 50,000 orders daily for an international retailer spread across multiple continents and jurisdictions, is being migrated to the cloud. The legacy system is implemented in PL/SQL and must be migrated during ongoing operations. The presentation will cover all aspects from testing and monitoring to development and the application of Site Reliability Engineering. Furthermore, less technical topics will be introduced, such as the systematic composition of teams to ensure the necessary technical as well as domain-specific expertise.

4:50 PM: 🎤 Wietse Venema - Running open large language models in production with serverless GPUs
Many developers are interested in running open large language models, such as Google's Gemma and Llama. Open models give you full control over the deployment options, the timing of model upgrades, the private data that goes into the model, and the ability to fine-tune on specific tasks such as data extraction. Hugging Face TGI is a popular open-source LLM inference server, and Hugging Face TRL is excellent for fine-tuning. You'll learn how to build and deploy an application that uses an open model on Google Cloud Run with cost-effective GPUs that scale down to zero instances.

Day 4

9:00 AM: Registration & Coffee 🥐 ☕️

10:00 AM: 🎤 Daniel Stamer & Diana Nanova - Workshop: From Prototype to Production
In this hands-on technical workshop, participants will work on a hilarious web service prototype and deploy it to the cloud, set up build and deployment pipelines, extend the code base to leverage GenAI functionality, use SRE practices to effectively operate the application, and finally strengthen the security posture of the overall software delivery process to guard against supply chain attacks.
1:10 PM: Lunch 🍔🥤

2:40 PM: 🎤 John Nguyen - Building a Chrome Extension using Gemini and Langchain
In this workshop, you will learn the basics of creating a Google Chrome Extension (which will also work on any Chromium-based browser). We will build a simple page summarizer using Bun, TypeScript, Gemini, and LangChain. We will learn the anatomy of the manifest.json for building a Chrome Extension, Bun's bundler, how to interact with Gemini, and why LangChain is a good idea here.

3:45 PM: 🎤 Guillaume Vernade - How to make the most of Gemini multimodal capabilities?
We all know that in tech there are always dozens of ways of doing anything. But what if we could use only an LLM for a first investigation? Let me show you how I'm trying to solve the mystery of who killed my pond's fishes using the power of Gemini.

Day 5

9:00 AM: Registration & Coffee 🥐 ☕️

10:00 AM: 🎤 Mario Bodemann & Joost van Dijk - Workshop: Passkeys on Android: How to get rid of passwords
Passwords. Or two factors? What about multiple factors? Which email did you register with? Why is 'password123' not working on this site, given that the password is shared everywhere else? If you recognize some of those questions, I am happy to add another couple: What are passkeys? Or: how can passkeys replace passwords in an Android app? In this workshop I will walk through the latter two questions: how to build an Android app that registers and signs users in using passkeys. Expect a quick explanation of this fancy new technology, why it will replace passwords, and how you can store passkeys either on your mobile devices or on dedicated hardware. Following that, a fictional application and service will be built to show you how to use those passkeys and which moving pieces you will need. Expect to use your Android Studio with Kotlin and common best practices to build an Android app talking to a publicly available backend.
11:05 AM: 🎤 Anton Borries - Workshop: Adding Homescreen Widgets to Flutter Apps
Home screen widgets are a great way to provide more information to your users right on their home screens, giving your app more ways to appear in users' lives and help them achieve their goals. In this workshop we'll look at the steps needed to add home screen widgets to Flutter apps using the home_widget package.

12:10 PM: 🎤 Elena Grahovac - Workshop: Mastering Multiple Engineering Leadership Roles for Maximum Impact
As an engineering manager or technical leader, navigating multiple roles that demand a diverse set of skills is a common yet challenging part of the job. In this workshop, we will explore how to effectively balance these multiple roles and responsibilities in a complex engineering environment. Participants will be guided through the creation of their own leadership framework, tailored to adapt to the unique situations and styles of each individual. Beginning with identifying core values and responsibilities, the framework is elaborated into an actionable plan for success. This workshop not only offers an opportunity for reflection on personal and professional development but also provides tools and insights to enhance management capabilities and team dynamics. Join us to cultivate a comprehensive approach to leadership that aligns with your unique role, responsibilities, and personal style.

1:10 PM: Lunch 🍔🥤

2:40 PM: 🎤 Gus Martins - Workshop: Gemma for Everyone: Your First Steps with Open Models and AI
Dive into the world of open models and AI with Gemma! This workshop will guide you through the basics of using Gemma, Google's powerful family of language models. Learn how to harness Gemma's capabilities for tasks like text generation, question answering, and more. We'll also explore how to fine-tune Gemma on your own data, allowing you to create custom AI solutions tailored to your needs. No prior experience with large language models is required!
3:45 PM: 🎤 Shahriyar Rzayev - Learn Flask the hard way: Introduce Architecture Patterns
Flask is a popular and flexible web framework for Python, but building scalable and maintainable Flask applications can be challenging without a solid understanding of architecture patterns. This workshop aims to provide participants with a detailed explanation of applying architecture patterns to Flask projects. By exploring various design principles and best practices, attendees will learn how to structure their Flask applications for improved scalability, modularity, and maintainability. Focusing on the Repository, Unit of Work, and Use Case patterns, attendees will gain experience in applying these patterns to enhance code organization, maintainability, and testability. All these layers are wired together using Dependency Injection, which is yet another powerful tool to use in your applications. The application we are going to build is stored at https://github.com/ShahriyarR/hexagonal-flask-blog-tutorial We are going to completely rewrite the official Blog application described in the Flask documentation by applying architecture patterns. All abstraction layers are covered by unit and integration tests, which will give attendees a detailed view of why it is important to structure the application using architecture patterns.

Speakers

Aleksandr Gorbunov - Smart Steel Technologies (Full Stack Developer)
A skilled developer specializing in JavaScript (JS) and TypeScript (TS), with strong expertise in frontend development. Proficient in the Vue ecosystem (Vue2, Vue3, Composition API, Nuxt 3), using Webpack and Vite for project bundling. Experienced in testing with Vitest, Cypress, and Jest. Adept in CSS preprocessors like SASS and Stylus.
Additionally, has solid knowledge of Flutter and experie…

Andrey Sitnik - Evil Martians (Lead Engineer)
With more than 20 years in open source, Andrey Sitnik created a few popular CSS tools (PostCSS, Autoprefixer), a local-first framework (Logux), and many small libraries with millions of downloads (like Nano ID).

Andrii Khrystian - Dynatrace (Senior Flutter Developer)
GDG Linz organiser. Senior Flutter Developer at Dynatrace. Public speaker and tech writer.

Andrii Raikov - Delivery Hero SE (Principal Software Engineer)
Andrii is a Principal Software Engineer at Delivery Hero. He has a total of 15 years of experience with Ruby and has been very passionate about Go for the last 5 years.

Anton Borries - 1KOMMA5° (Software Engineer)
Anton is a Software Engineer working at 1KOMMA5°. He loves building great UI and UX using Flutter. Coming from an Android background, the gap between Flutter and native features has always piqued his interest. This has led him to improving the experience of developing home screen widgets for Flutter apps.

Ash Davies
Google Developer Expert for Android, enthusiastic speaker, lead engineer at ImmobilienScout24, Kotlin aficionado, spends more time travelling than working.

Daniel Stamer - Google (Cloud Customer Engineer)
Daniel is passionate about building modern cloud-native applications on Google's serverless technologies. He works with digital natives out of Germany's startup capital Berlin and helps to modernize applications or build brand new ones in the cloud.

Danny Preussler - SoundCloud (Android Platform Lead)
Danny is a developer by heart, living in Berlin and leading the Android team at SoundCloud. He worked for companies like Groupon, Viacom, eBay and Alcatel, and started his mobile career long before Android, with Java ME and Blackberry applications. Danny writes and talks about mobile development and testing regularly and is a Google Developer Expert for Android and Kotlin.
Elena Grahovac - FerretDB (Director of Engineering)
Elena has been in software engineering since 2007, focusing on backend systems and infrastructure. Having played the roles of both individual contributor and engineering manager, Elena is passionate about combining technical expertise with strong team collaboration. A dedicated advocate of DevOps practices, she aims to enhance workflows and bring teams together. Elena believes in helping peopl…

Gus Martins - Google (Developer Advocate)

Katya Vinnichenko - Google (Program Manager)
Katya is a Program Manager on the Google Developer Relations team. Currently she is leading the Google Developer Groups across Europe, the Middle East and Africa.

Marcin Chudy - LeanCode (Senior Flutter Developer)
Marcin is a Senior Flutter Developer at LeanCode, currently playing a tech lead role in a big project for the banking sector. He previously worked on backend, then web frontend with React, finally settling on mobile and falling in love with Flutter at first sight. After work, he enjoys dancing salsa and bachata and attends metal concerts.

Marco Gomiero - Airalo (Senior Android Developer | Kotlin GDE)
Marco is an Android engineer, currently working at Airalo. He is a Google Developer Expert for Kotlin; he loves Kotlin and has experience with native Android and native iOS development, as well as cross-platform development with Flutter and Kotlin Multiplatform. In his spare time, he writes and maintains open-source code, shares his dev experience by writing on his blog, speaking a…

Mario Bodemann - Yubico (Android Developer Advocate)
Speaker of talks, coder of code, doer of dones.

Muhammad Salman Bediya
Muhammad Salman is a Senior Software Engineer specializing in mobile app development with a focus on building scalable, high-quality applications using Flutter, React Native, Xamarin, and Swift.
With experience leading frontend teams on enterprise-level projects that have reached over 1.5 million users, he brings a strong commitment to creating impactful, user-centered solutions. A dedic…

Nicole Terc
Android GDE, board game lover, video game addict and origami enthusiast, Nicole taught herself to code and has been fooling around with the Android ecosystem for more than 10 years. She has participated in a diverse variety of projects for several clients around the world, including video streaming, news, social media and public transport applications. Regardless of what the current adventu…

Ole Bulbuk - Ardan Labs
Ole has been a backend engineer since the nineties. He has worked for many companies big and small and has seen many projects fail or succeed. He loves being part of the global Go community and working on projects that make the world a better place. In his spare time he co-organises the Berlin chapter of GDG Golang, develops open source software and enjoys time with his family.

Oleksii Antypov - DmarcDkim.com (Founder & CEO)
Experienced CTO specializing in early-stage startups. Formerly with Rocket Internet and PocketBook, now focused on accelerating global DMARC adoption. Originally from Ukraine, I relocated to Berlin in 2015 to deepen my expertise in building successful startups from the ground up.

Raphaël VO - Ekino (Senior Software Engineer)
I'm Raphael Vo, a passionate Senior Software Engineer with over 10 years of experience, specializing in Angular and frontend development. I love turning complex ideas into delightful user experiences and tackling challenges creatively and enthusiastically. When I'm not coding, you'll find me diving into the latest tech trends or enjoying epic board game nights with friends. As an aspiring spea…

Vadim Makeev
Frontend developer in love with the Web, browsers, bicycles, and podcasting. He/him, MDN technical writer, Google Developer Expert.
Alex Mir - mobile.de (Frontend Engineer)
Frontend Engineer at car retail platform mobile.de (part of Adevinta / ex-eBay)

Alireza Rahmaty - GetYourGuide (Android Developer)
I am Alireza, an Android developer with 6+ years of experience building apps. I have experience building server-driven UI apps, complex UI, localisation and testing, and CI/CD. I sometimes go hiking and play video games.

Cesar Martinez - Meyer Sound (Web Developer)
Web developer with around 10 years of experience and a passion for software architecture. Currently working at Meyer Sound.

Bogdan Plieshka - Zattoo (Principal Engineer)
Engineer with over a decade of frontend development experience, passionate about automation, accessibility, and scaling complex systems. Working at Zattoo as a Principal Engineer, focusing on delivering frontend solutions across Web, React, and React Native for streaming media content. Organizer of the React Berlin Meetup, actively contributing to the development community.

Diana Nanova - Google (Customer Engineering Manager)
Diana is a Customer Engineering Manager at Google Cloud. Based in the German tech startup capital Berlin, Diana helps digital native customers and startups across various industries to leverage the capabilities of Google Cloud and loves championing Google culture.

Doruk Deniz Kutukculer - Zalando (Head of Engineering)
IT professional and leader with over 15 years of experience in the industry. Currently a Head of Engineering at Zalando.

Guillaume Vernade - Google (AI Dev Rel)
I've been a jack-of-all-trades in the tech industry, starting as a prototyper building apps on Google Glass and the first Android watches, then becoming a Product Owner and an Agile coach. I realized my childhood dream of becoming a video game producer, then came back to my other passion: AI.

Ian Ballantyne - Google (AI DevRel)
Ian is a Developer Relations Engineer for AI at Google. Currently he works on generative AI, such as Gemini and Gemma.
He is passionate about on-device AI, using technologies such as Google AI Edge to deploy artificial intelligence to web and mobile devices. He has been in Developer Relations at Google for 9 years, specializing in helping partners and developers unlock the capability of Google …

Inès Mir - Zalando (Principal Product Designer)
A principal product designer at Zalando and a content creator.

John Nguyen - Eon (Backend Developer)
Fullstack developer with a knack for whipping up code recipes using my secret ingredients: a dash of JavaScript, a pinch of Python, and a whole lot of serverless magic. John's journey in software development began as a PHP developer, but he later transitioned to front-end development and became passionate about all things related to JavaScript. While working as a data DevOps engineer in a…

Joost van Dijk - Yubico (Developer Advocate)
Joost van Dijk is a developer advocate at Yubico. As the inventor of the YubiKey, Yubico makes secure login easy and available for everyone. Joost focuses on securing digital identities and accelerating the adoption of open authentication standards as part of Yubico's developer program.

Randy Gupta
Randy is a Google Developer Expert for Cloud and also an organizer of GDG Düsseldorf. With more than 25 years of professional experience in software development, he is focused today on building microservices applications on top of Kubernetes.

Shahriyar Rzayev - Nord Security (Senior Software Engineer)
Senior Software Engineer @ Nord Security. Moving forward on Clean Code and Clean Architecture. Previous accomplishments include contributing to open source, providing technical direction, and sharing knowledge about Clean Code and architectural patterns. An empathetic team player and mentor. Azerbaijan Python Group Leader. Former QA Engineer and Bug Hunter.
Tomek Porożyński - Atos

Vadym Pinchuk - Sky (Mobile Software Engineer)
Vadym, a seasoned software engineer, possesses a wealth of experience in Android application development. He has skillfully transitioned his expertise to cross-platform development, utilizing Flutter. Throughout his career, Vadym has collaborated with a diverse range of companies, from industry giants like Samsung, Volvo, Bosch, and Instagram to smaller start-ups. Leveraging his extensiv…

Wietse Venema - Google (Google Cloud Engineer)
Wietse Venema is an engineer at Google Cloud. He wrote the O'Reilly book on Cloud Run.

Hosts

Seemran Xec - Sawayo (Software Engineer)
A focused developer with 6+ years of professional experience in software development for product-based and service-based industries, helping businesses acquire valuable insights and implement best practices. Collaborated with startups and other businesses as a freelancer/consultant to build, design, and manage products. I'm passionate about what I do and a lifelong learner.

Louis Tsai - Zalando SE (GDG Organizer)

Alex Mir - mobile.de (Frontend Engineer)
Frontend Engineer at car retail platform mobile.de (part of Adevinta / ex-eBay)

Jhoon Saravia - Greenmates (Mobile Engineer)
Software consultant and developer, experienced in Android, Flutter and full-stack. Interested in working on DEI initiatives as a complement to my core work. Particularly interested in technology, gadgetry, the future, the combination of those three, and the impact that driving Diversity, Equity and Inclusion has on all of them, both in and out of the workplace. Amateur photographer a…

Matthias Geisler - Thermondo (Senior Software Engineer)
True believer in (Kotlin) Multiplatform and working with it for over 4 years now. Builds solutions for Android. Maintainer and developer of KMock. Co-organizer of KUG Berlin, GDG Android Berlin, Rust Berlin and XTC Berlin.
Emy Jamalian - Atlas Metrics (Software QA Engineer)

Complete your event RSVP here: https://gdg.community.dev/events/details/google-gdg-berlin-presents-devfest-berlin-2024/. |
DevFest Berlin 2024
|
|
Hands-On with SurrealDB: Exploring Key Features in Practice
2024-11-04 · 20:00
Getting Started with SurrealDB is a 3-part online workshop series hosted by Women Coding Community in collaboration with SurrealDB. During these hands-on workshops, Caroline Morton - software engineer and senior research fellow at SurrealDB - will demonstrate SurrealDB's functionality and use cases, both as a traditional relational database and as a multi-model database, highlighting the additional features and benefits the latter brings. -- Hands-On with SurrealDB: Exploring Key Features in Practice This session will provide a practical exploration of some of SurrealDB's key out-of-the-box features. We'll focus on fundamental operations such as CRUD (Create, Read, Update, Delete), schema definition, and SurrealDB's unique "Get functions." Additionally, we'll take a look at the Surreal Store demo dataset to see how these features can be applied in real-world scenarios. The session will emphasise hands-on learning rather than covering every feature in detail, giving participants an opportunity to engage with the database directly. We'll also take time to reflect on the pros and cons of the features we explore, providing a balanced understanding of where SurrealDB excels and where it may have limitations. Agenda:
Speaker: Caroline Morton \| LinkedIn
Dr. Caroline Morton is a software engineer, epidemiologist and co-author of the Async Rust programming book. A Senior Clinical Research Fellow in Health Data Engineering at Queen Mary University London, Caroline writes software tools and data pipelines for the Genes and Health initiative, the world's largest community-embedded biobank. Simultaneously, Caroline is pursuing a Ph.D. in synthetic data generation for health data research, sponsored by SurrealDB. She also leads Yellow Bird, a London start-up changing medical training with its virtual emergency room product, Clinical Metrics.

Speaker: Alexander Fridriksson \| LinkedIn
Alexander has been with SurrealDB from the start, being employee number 5. He is passionate about storytelling with data and sharing lessons learned working across analytics, data science, data engineering, and consulting.

Host: Sonali Goel \| LinkedIn
As a Senior/Lead Engineer at Yoox-Net-a-Porter, Sonali brings over 14 years of experience in the technology industry, specializing in crafting solutions that prioritize reliability, flexibility, and scalability. Her focus is on managing and constructing extensive-scale e-commerce solutions. Sonali is committed to empowering women in technology and the workplace. As an active member of the DEI women's network at her organization and a leader in the Women Coding Community, she strives to support and advance diversity, equity, and inclusion initiatives that uplift women in the field.

Co-Host: Nevena Verbič \| LinkedIn
Nevena is a Software Engineer experienced in working with diverse technologies such as C, C++, C#, Python, JavaScript, HTML, PHP and Java. Working for an ISO 9001 certified company, she had the opportunity to learn best practices in software development and project organization. Nevena is currently improving her Python, Java 21, and Spring Boot skills.
SurrealDB
👉 Get started with SurrealDB: https://sdb.li/getstarted
👉 Explore the new and improved features of SurrealDB 2.0

Key features of SurrealDB:
⭐ Reduces development time: SurrealDB simplifies your database and API stack by removing the need for most server-side components, allowing you to build secure, performant apps faster and cheaper.
⭐ Real-time collaborative API backend service: SurrealDB functions as both a database and an API backend service, enabling real-time collaboration.
⭐ Support for multiple querying languages: SurrealDB supports SQL querying from client devices, GraphQL, ACID transactions, WebSocket connections, structured and unstructured data, graph querying, full-text indexing, and geospatial querying.
⭐ Granular access control: SurrealDB provides row-level permissions-based access control, giving you the ability to manage data access with precision. |
Hands-On with SurrealDB: Exploring Key Features in Practice
|
|
Build Your Second Brain One Piece At A Time
2024-04-28 · 16:00
Tsavo Knott
– Founder / Creator
@ Pieces
,
Tobias Macey
– host
Summary
Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. In order to simplify the integration of AI capabilities into developer workflows Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use. In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises.
And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
Your host is Tobias Macey and today I'm interviewing Tsavo Knott about Pieces, a personal AI toolkit to improve the efficiency of developers.

Interview
Introduction
How did you get involved in machine learning?
Can you describe what Pieces is and the story behind it?
The past few months have seen an endless series of personalized AI tools launched. What are the features and focus of Pieces that might encourage someone to use it over the alternatives?
model selections
architecture of Pieces application
local vs. hybrid vs. online models
model update/delivery process
data preparation/serving for models in context of Pieces app
application of AI to developer workflows
types of workflows that people are building with Pieces
What are the most interesting, innovative, or unexpected ways that you have seen Pieces used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pieces?
When is Pieces the wrong choice?
What do you have planned for the future of Pieces?

Contact Info
LinkedIn

Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?

Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it!
Email [email protected] with your story. Links Pieces; NPU == Neural Processing Unit; Tensor Chip; LoRA == Low Rank Adaptation; Generative Adversarial Networks; Mistral; Emacs; Vim; NeoVim; Dart; Flutter |
Data Engineering Podcast |
|
Futures Forum: AI/LLM Meetup
2024-03-13 · 18:30
🎙️ Featured Talks: Improve your Apps with Generative AI Christoffer Noring Panel Discussion: Opportunities presented by AI: what are the opportunities presented by AI both for consumers and companies? What is necessary to be able to seize those opportunities? -- Futures Forum is a new monthly meetup, powered by SurrealDB, that explores the AI revolution, with a particular focus on LLMs (Large Language Models). With actionable insights, the series will include introductory and advanced topic discussions by a dynamic and diverse range of expert speakers, live demos, news updates and panel discussions. Plus plenty of time for networking and refreshments. 📣 Talks and demos by experts in their fields 🗣️ Friendly networking 🍕 Delicious bites – Homeslice, 🥗 Kaleido Rolls and 🍦ice cream 🍹 Tasty drinks – including boozy and alcohol-free options, sponsored by Something & Nothing 😆 Informative, inclusive and fun! Agenda 18:30 - 19:00 Welcome Drinks Attendees arrive – grab a drink, explore the space and mingle. 19:00 - 19:30 Featured Talk Christoffer Noring: Improve your Apps with Generative AI Maybe you've used ChatGPT, or maybe everyone is talking about AI and you want to know how you can add it to your apps to benefit customers. In this talk you'll learn how to:
19:30 - 20:00 Social & Tasty Bites A fun and informal way to connect with others in the tech community. Grab a slice of pizza or summer roll, hop on the Oculus or chat with other attendees and the SurrealDB team. 20:00 - 20:45 Panel Discussion with Audience Q&As Opportunities presented by AI: what are the opportunities presented by AI both for consumers and companies? What is necessary to be able to seize those opportunities? 20:45 - 21:15 Social & Sweet Treats FAQs Who’s this event for? For anyone wanting to learn more about the way that AI and LLMs are shaping the world. What’s an LLM? A large language model (LLM) is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massive data sets to understand, summarise, generate and predict new content. LLMs power chat apps like ChatGPT, but they are also increasingly underpinning almost every aspect of AI. Is the venue accessible? Absolutely! There is a lift that takes you up to Level 4 where SurrealDB Social is held. What's a SurrealDB event like? Check out photos from our previous events at https://surrealdb.gallery. Who are SurrealDB? SurrealDB is a modern cloud-native multi-model database that allows users and developers to focus on building their applications rather than architecting and managing their infrastructure with features like SurrealQL, Live Queries, Search, Change Data Capture, and ML. Am I guaranteed a ticket at this event? Our events are tech-focused and in the interest of keeping our events relevant and meaningful for those attending, tickets are issued at our discretion. We therefore reserve the right to refund ticket orders before the event and to request proof of identity and/or professional background upon entry. Are there any House Rules? At SurrealDB, we are committed to providing live and online events that are safe and enjoyable for all attending. Please review our Code of Conduct and Privacy Policy for more information. |
Futures Forum: AI/LLM Meetup
|
|
How to Get the Most Out of Your Time Series Data
2024-02-27 · 17:00
Welcome to our upcoming webinar where we'll delve into the dynamic world of time series data analysis. Join us as we explore effective solutions to common challenges encountered when working with time series data. Be prepared for a live demonstration of CrateDB in action, showcasing essential techniques such as data ingestion, modeling, visualization, and machine learning model training. What you will learn: - Understanding the complexity of time series data and the importance of contextual information for accurate interpretation. - Initiating and developing your initial time series project with CrateDB, covering data ingestion, modeling, and analysis. - Harnessing CrateDB's distinctive features for storage, querying, and in-depth analysis of time series data. - Applying a variety of techniques for data visualization and training machine learning models to extract meaningful insights. Speakers: - Christian Kurze, VP Product at CrateDB - Marija Selakovic, Developer Advocate at CrateDB ➡️ Register for free: https://hubs.ly/Q02l8q_M0 |
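As a rough sketch of what the ingestion and aggregation steps above might look like in practice (table, column, and value names here are invented for illustration; consult the CrateDB documentation for exact syntax):

```sql
-- Hypothetical sensor table; CrateDB speaks SQL and supports
-- time-based partitioning via a generated column.
CREATE TABLE IF NOT EXISTS sensor_readings (
    ts TIMESTAMP WITH TIME ZONE,
    sensor_id TEXT,
    temperature DOUBLE PRECISION,
    month TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', ts)
) PARTITIONED BY (month);

-- Ingestion: plain INSERTs (bulk loading and client libraries also exist).
INSERT INTO sensor_readings (ts, sensor_id, temperature)
VALUES ('2024-02-27T10:00:00Z', 'sensor-1', 21.4);

-- Analysis: hourly aggregates, grouped per sensor for context.
SELECT sensor_id,
       date_trunc('hour', ts) AS hour,
       avg(temperature) AS avg_temp
FROM sensor_readings
GROUP BY sensor_id, hour
ORDER BY hour;
```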
How to Get the Most Out of Your Time Series Data
|
|
How to Get the Most Out of Your Time Series Data
2024-02-27 · 10:00
Welcome to our upcoming webinar where we'll delve into the dynamic world of time series data analysis. Join us as we explore effective solutions to common challenges encountered when working with time series data. Be prepared for a live demonstration of CrateDB in action, showcasing essential techniques such as data ingestion, modeling, visualization, and machine learning model training. What you will learn: - Understanding the complexity of time series data and the importance of contextual information for accurate interpretation. - Initiating and developing your initial time series project with CrateDB, covering data ingestion, modeling, and analysis. - Harnessing CrateDB's distinctive features for storage, querying, and in-depth analysis of time series data. - Applying a variety of techniques for data visualization and training machine learning models to extract meaningful insights. Speakers: - Christian Kurze, VP Product at CrateDB - Marija Selakovic, Developer Advocate at CrateDB ➡️ Register for free: https://hubs.ly/Q02l8q_M0 |
How to Get the Most Out of Your Time Series Data
|
|
Christmas Tech Talks: Dive into DuckDB & Hopsworks
2023-12-14 · 16:30
Christmas is just around the corner and what better way to end the year than with talks around DuckDB? Our last meetup of the year will welcome Max from DuckDB Labs and Fabio from Hopsworks. Max will introduce DuckDB, an innovative embedded data management system optimized for analytical SQL workloads and Fabio will introduce feature stores and the challenges & learnings of integrating DuckDB and Arrow Flight into the Hopsworks platform. Agenda: 17:30 - 18:00: Doors open 18:00 - 18:10: Welcome 18:10 - 18:40: DuckDB: Transforming Data Management and Analytics 18:40 - 19:10: Snacks & Refreshments 19:10 - 19:40: MLOps on the fly: Optimizing a feature store with DuckDB and ArrowFlight 19:40 - 20:30: Networking – Presentations: DuckDB: Transforming Data Management and Analytics Max Gabrielsson - Software Engineer at DuckDB Labs In this talk we present DuckDB, a novel embedded data management system designed for analytical SQL workloads. By incorporating decades of clever techniques and algorithms from the database research community, DuckDB empowers data engineering on a single machine to reach a whole new level of scale and performance, without the hassle and operational overhead commonly associated with traditional database and data warehouse systems. One way DuckDB aims to achieve this goal is through its unique in-depth integration with Python, allowing for seamless interoperability with the existing data science ecosystem through familiar APIs and zero-copy data sharing between staple libraries like Numpy, Pandas and Polars. This makes DuckDB an essential tool for the practical data scientist looking to squeeze the most out of their system without having to leave their comfort zone. 
We will introduce and explain the main strengths and characteristics of DuckDB such as its parallelized vectorized query execution engine, out-of-core beyond memory capabilities and transparent compression, and demonstrate how these features can be leveraged in a typical Python-based data science workflow through a series of examples mixing both SQL and dataframes. We will also showcase DuckDB's flexible extension system and illustrate how it can be used to bridge different data sources and domains. Speaker Bio: Max Gabrielsson is a software engineer at DuckDB Labs where he works on the DuckDB database system. While he generally tries to not stay confined to any specific part of the stack, he has a particular interest in geospatial data management and is the primary maintainer of the DuckDB spatial GIS extension. Max holds a BSc in Computer Science from Uppsala University and hopes to one day finish his MSc with a thesis on the topic of database systems. In his spare time he enjoys kickboxing and hacking on side projects, usually involving compilers, databases or cartography. MLOps on the fly: Optimizing a feature store with DuckDB and ArrowFlight Fabio Buso - VP of Engineering at Hopsworks Feature Stores are a vital part of the MLOps stack for managing machine learning features and ensuring data consistency. This talk introduces Feature Stores and the underlying data management architecture. We'll then discuss the challenges and learnings of integrating DuckDB and Arrow Flight into our Feature Store platform, and share benchmarks showing up to 30x speedups compared to Spark/Hive. Discover how DuckDB and ArrowFlight can also speed up your data management and machine learning pipelines. Speaker Bio: Fabio Buso is VP of Engineering at Hopsworks, leading the Feature Store development team. Fabio holds a master's degree in Cloud Computing and Services with a focus on data intensive applications. 
– About the event Date: December 14th, 17:30 - 20:30 Location: Hopsworks Office (Åsögatan 119, Plan 2, 116 24 Stockholm) The venue this time is at the Hopsworks Office. As the office is sometimes difficult to locate we have made this map for everyone to follow. See you then! Directions: 2-minute walk from Medborgarplatsen. Tickets: Sign up required. Anyone who is not on the list will not get in. The event is free of charge. Capacity: Space is limited to 70 participants. If you are signed up but unable to attend, please let us know by December 13th. Food and drinks: Snacks and drinks will be provided. Questions: Please contact the meetup organizers. – Code of Conduct The NumFOCUS Code of Conduct applies to this event; please familiarize yourself with it before attending. If you have any questions or concerns regarding the Code of Conduct, please contact the organizers. |
Christmas Tech Talks: Dive into DuckDB & Hopsworks
|
|
Data Relay is back in Manchester on Friday, 6 October. RSVP via the website: https://www.eventbrite.co.uk/e/datarelay-2023-manchester-tickets-636529215017 Data Relay will be in Manchester on Friday 6th October 2023, featuring top quality Microsoft Data Platform content from great speakers. The day consists of a series of 50 minute technical presentations for all levels - beginner, intermediate and advanced. You can switch tracks throughout the day to enjoy the sessions that interest you the most. Address: Odeon Great Northern Warehouse, 235 Deansgate, Manchester, M3 4EN Registration: Tickets are on a first come, first served basis. Doors open at 8:30am on the day with sessions starting at 9:00am. Competitions: If you allow a sponsor to scan your QR code you will be entered into their raffle prize draw. By doing so you opt in to receive sales and marketing information from them, using the contact details you provided during registration. All sponsors will provide an unsubscribe link in their emails. Food: We'll be providing a light lunch, plus tea & coffee at a couple of breaks. We will be catering for vegan food, according to your food preferences during registration. Dress Code: Casual First time at Data Relay? Everyone has a 'first event', including the speakers and organisers of this event. We're lucky to be a part of the UK Microsoft Data Platform community, one of the most friendly and welcoming tech communities out there. Whether you're in a group or on your own, you'll be made to feel welcome and will learn lots about new and existing technology. It's a great opportunity to meet your peers and maybe even your Data Platform heroes! If you have any queries, contact us at [email protected] and we'll help you out. 
Manchester Schedule
09:00 - Keynote & Welcome
09:30 - Track 1: Saving Your Wallet from the Cloud - Kevin Feasel
09:30 - Track 2: AI Driven Data Enrichment with Azure Cognitive Search - Matt How
09:30 - Track 3: Automating Database Deployments using Azure DevOps - Grant Fritchey
10:45 - Track 1: Delta Lake Tables 101 - Kamil Nowinski
10:45 - Track 2: Get a Load of this - Realistic Load Testing for Power BI, explained! - Matt Lakin
10:45 - Track 3: Introduction to Bicep for the Cloud DBA - Frank Geisler
11:40 - Track 1: Getting Insight from Anything: using IoT for good! - Andrew Blance
11:40 - Track 2: A Hybrid SQL Journey: The daily routine of a Hybrid DBA - Jose Manuel Jurado Diaz
11:40 - Track 3: Combining web apps and AI - how to unlock the power of the Azure OpenAI Service - Ben Jarvis
12:30 - Lunch
13:30 - Track 1: Securing Data Workloads with Azure Landing Zones - Chris Murray
13:30 - Track 2: Regex in PowerShell: How to grasp it and why you should grasp it - Ned Stratton
13:30 - Track 3: Sweet streams (are made of this: Hubs, Stream Analytics, Datalake) - Pálmi Símonarson
14:25 - Track 1: The Kusto experience in Fabric - Brian Bønk
14:25 - Track 2: Cognitive Services Extravaganza! - Gosia Borzęcka
14:25 - Track 3: Examples of query tuning in SQL Server - Torsten Strauß
15:40 - Track 1: Model Creation in Azure AutoML and Ingestion Through Excel - Lewis Prince
15:40 - Track 2: DAX Explained Through Dance, Memes and Dad Jokes - Barney Lawrence
15:40 - Track 3: Real-time Fraud Detection Challenges and Solutions - Fawaz Ghali
16:35 - Raffle & Close |
DATA RELAY 2023 - Manchester - talks on Azure, PBI, SQL, Bicep, Powershell & DAX
|
|
Eliminate The Bottlenecks In Your Key/Value Storage With SpeeDB
2022-03-27 · 18:00
Adi Gelvan
– guest
@ SpeeDB
,
Tobias Macey
– host
Summary At the foundational layer many databases and data processing engines rely on key/value storage for managing the layout of information on the disk. RocksDB is one of the most popular choices for this component and has been incorporated into popular systems such as ksqlDB. As these systems are scaled to larger volumes of data and higher throughputs the RocksDB engine can become a bottleneck for performance. In this episode Adi Gelvan shares the work that he and his team at SpeeDB have put into building a drop-in replacement for RocksDB that eliminates that bottleneck. He explains how they redesigned the core algorithms and storage management features to deliver ten times faster throughput, how the lower latencies work to reduce the burden on platform engineers, and how they are working toward an open source offering so that you can try it yourself with no friction. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Atlan is a collaborative workspace for data-driven teams, like Github for engineering or Figma for design teams. 
By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription. Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it’s often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values before it gets merged to production. No more shipping and praying, you can now know exactly what will change in your database! Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold. TimescaleDB, from your friends at Timescale, is the leading open-source relational database with support for time-series data. Time-series data is time stamped so you can measure how a system is changing. Time-series data is relentless and requires a database like TimescaleDB with speed and petabyte-scale. Understand the past, monitor the present, and predict the future. That’s Timescale. 
Visit them today at dataengineeringpodcast.com/timescale. Your host is Tobias Macey and today I’m interviewing Adi Gelvan about his work on SpeeDB, the "next generation data engine". Interview Introduction How did you get involved in the area of data management? Can you describe what SpeeDB is and the story behind it? What is your target market and customer? What are some of the shortcomings of RocksDB t |
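For readers unfamiliar with the layer the episode discusses: embedded key/value engines like RocksDB (and SpeeDB as its drop-in replacement) expose roughly a put/get/delete/range-scan interface over sorted byte keys, and the database built on top uses it to lay out data on disk. The toy in-memory stand-in below is not the real RocksDB or SpeeDB API; it only sketches the shape of that interface:

```python
from bisect import insort

# Toy stand-in for an embedded key/value engine's interface. Real engines
# like RocksDB persist data on disk via LSM trees; this sketch only mimics
# the API shape (put/get/delete/ordered range scan) in memory.
class ToyKVStore:
    def __init__(self):
        self._keys = []   # keys kept in sorted order, as an LSM engine would
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            insort(self._keys, key)
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)

    def delete(self, key: bytes) -> None:
        if key in self._data:
            del self._data[key]
            self._keys.remove(key)

    def scan(self, start: bytes, end: bytes):
        # Ordered range scan over [start, end) -- the access pattern that
        # sorted key/value engines are built to serve efficiently.
        for k in self._keys:
            if start <= k < end:
                yield k, self._data[k]

store = ToyKVStore()
store.put(b"user:1", b"alice")
store.put(b"user:2", b"bob")
store.put(b"item:9", b"widget")
# ';' sorts just after ':', so this range covers every "user:" key.
users = list(store.scan(b"user:", b"user;"))
```

Higher-level systems (ksqlDB state stores, secondary indexes, and so on) are built out of exactly these primitives, which is why the engine's throughput can become the whole system's bottleneck.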
|
|
Building A Community For Data Professionals at Data Council
2019-09-02 · 16:00
Pete Soderling
– founder
@ Data Council
,
Tobias Macey
– host
Summary Data professionals are working in a domain that is rapidly evolving. In order to stay current we need access to deeply technical presentations that aren’t burdened by extraneous marketing. To fulfill that need Pete Soderling and his team have been running the Data Council series of conferences and meetups around the world. In this episode Pete discusses his motivation for starting these events, how they serve to bring the data community together, and the observations that he has made about the direction that we are moving. He also shares his experiences as an investor in developer oriented startups and his views on the importance of empowering engineers to launch their own companies. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Listen, I’m sure you work for a ‘data driven’ company – who doesn’t these days? Does your company use Amazon Redshift? Have you ever groaned over slow queries or are just afraid that Amazon Redshift is gonna fall over at some point? Well, you’ve got to talk to the folks over at intermix.io. 
They have built the “missing” Amazon Redshift console – it’s an amazing analytics product for data engineers to find and re-write slow queries and gives actionable recommendations to optimize data pipelines. WeWork, Postmates, and Medium are just a few of their customers. Go to dataengineeringpodcast.com/intermix today and use promo code DEP at sign up to get a $50 discount! You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host is Tobias Macey and today I’m interviewing Pete Soderling about his work to build and grow a community for data professionals with the Data Council conferences and meetups, as well as his experiences as an investor in data oriented companies Interview Introduction How did you get involved in the area of data management? What was your original reason for focusing your efforts on fostering a community of data engineers? What was the state of recognition in the industry for that role at the time that you began your efforts? The current manifestation of your community efforts is in the form of the Data Council conferences and meetups. Previously they were known as Data Eng Conf and before that was Hakka Labs. Can you discuss the evolution of your efforts to grow this community? 
How has the community itself changed and grown over the past few years? Communities form around a huge variety of focal points. What are some of the complexities or challenges in building one based on something as nebulous as data? Where do you draw inspiration and direction for how to manage such a large and distributed community? What are some of the most interesting/challenging/unexpected aspects of community management that you have encountered? What are some ways that you have been surprised or delighted in your interactions with the data community? How do you approach sustainability of the Data Council community and the organization itself? The tagline that you have focused on for Data Council events is that they are no fluff, juxtaposing them against larger business oriented events. What are your guidelines for fulfilling that promise and why do you think that is an important distinction? In addition to your community building you are also an investor. How did you get involved in that side of your business and how does it fit into your overall mission? You also have a stated mission to help engineers build their own companies. In your opinion, how does an engineer led business differ from one that may be founded or run by a business oriented individual and why do you think that we need more of them? What are the ways that you typically work to empower engineering founders or encourage them to create their own businesses? What are some of the challenges that engineering founders face and what are some common difficulties or misunderstandings related to business? What are your opinions on venture-backed vs. "lifestyle" or bootstrapped businesses? What are the characteristics of a data business that you look at when evaluating a potential investment? What are some of the current industry trends that you are most excited by? What are some that you find concerning? What are your goals and plans for the future of Data Council? 
Contact Info @petesoder on Twitter LinkedIn @petesoder on Medium Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don’t forget to check out our other show, Podcast.init to learn about the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat Links Data Council Database Design For Mere Mortals Bloomberg Garmin 500 Startups Geeks On A Plane Data Council NYC 2019 Track Summary Pete’s Angel List Syndicate DataOps Data Kitchen Episode DataOps Vs DevOps Episode Great Expectations Podcast.init Interview Elementl Dagster Data Council Presentation Data Council Call For Proposals The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast |
|
|
New Feature: Labels Page
2019-07-31 · 04:00
Jason Joven
– host
@ Chartmetric
Highlights Are you a manager looking for a quick summary of a record label you’re talking to? Maybe you want to compare the label you work for with a competitor? We’ve got a solution for you in our new Labels Page! Mission Good morning, it’s Jason here at Chartmetric with your 3-minute Data Dump where we upload charts, artists and playlists into your brain so you can stay up on the latest in the music data world. We’re on the socials at “chartmetric”, that’s Chartmetric, no “S” - follow us on LinkedIn, Instagram, Twitter, or Facebook, and talk to us! We’d love to hear from you. Date This is your Data Dump for Wednesday, July 31st, 2019. New Feature: Labels Page Debuting this week is a new feature simply called the Labels Page! We’re still putting tweaks on how we’re visualizing and organizing the data, but we’re hoping you can get a good idea of what certain imprints have been up to lately. First and foremost, it isn’t an exhaustive list of labels in our entire database, because that would equal long load times. So we try to focus on more active labels with better known tracks. Specifically, this translates to labels that have released at least one track on Spotify with a Popularity Index of 40 or higher in the past three years. The Labels List defaults to the highest number of such popular releases in the past 3 years... with Sony Music Entertainment having the most with 747 as of today, Columbia Records in 2nd with 711 and then RCA Records in 3rd place with 432. So notice that we are currently displaying both parent group labels as well as label imprints alongside each other, since this is how we receive tracks’ metadata, but we’ll keep optimizing this down the road. Once you click on various labels, we display their artists who’ve released in the past 60 days, along with releases in the past three years and even each label’s social media followers over time! For example, you can check out that Sony Music Latin’s top performer in the past 60 days is CNCO, the Latin boy 
band that we blogged about back in December 2017 (link in the show notes). CNCO is at a 78 Spotify Popularity Index, with an impressive 5.7M followers on the platform and 9.3M monthly listeners. Their latest release on the label was “De Cero” a little over a month ago on June 23rd, which is also on 43 Apple Music playlists, 25 Amazon Music playlists and 18 Deezer ones... which you can see quickly here in one page. You will also be able to get quick personalities of labels via their release numbers... for example, Armada Music, as an electronic label, doesn’t have many releases that got over 60 on the Spotify Popularity Index in the past three years, as electronic as a genre has never been the strongest type of music on Spotify (which we also blogged about in Jan 2018, link in show notes). However, Armada has released 810 tracks in the past three years that we’re tracking, while similarly electronic genre labels keep a similar pace, like Monstercat at 785 and Ultra at 767. You can compare this with the 442 that 300 Entertainment has put out in the same time period, which is still a fast pace at over 12 releases per month... to be expected for a rap label in a genre that is well-known for constantly dropping new music. Get to know foreign labels like India’s T-Series or Brazil’s Som Livre via our Labels feature as well... we’ll surely include more label analysis down the road, but for now, check it out yourself with a free account at chartmetric.com Outro That’s it for your Daily Data Dump for Wednesday, July 31st, 2019. This is Jason from Chartmetric. Article links and show notes are at: podcast.chartmetric.com Happy Wednesday, see you on Friday! |
How Music Charts |
|
Continuously Query Your Time-Series Data Using PipelineDB with Derek Nelson and Usman Masood - Episode 62
2018-12-24 · 03:00
Summary Processing high velocity time-series data in real-time is a complex challenge. The team at PipelineDB has built a continuous query engine that simplifies the task of computing aggregates across incoming streams of events. In this episode Derek Nelson and Usman Masood explain how it is architected, strategies for designing your data flows, how to scale it up and out, and edge cases to be aware of. Preamble Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat Your host is Tobias Macey and today I’m interviewing Usman Masood and Derek Nelson about PipelineDB, an open source continuous query engine for PostgreSQL Interview Introduction How did you get involved in the area of data management? Can you start by explaining what PipelineDB is and the motivation for creating it? What are the major use cases that it enables? What are some example applications that are uniquely well suited to the capabilities of PipelineDB? What are the major concepts and components that users of PipelineDB should be familiar with? 
Given the fact that it is a plugin for PostgreSQL, what level of compatibility exists between PipelineDB and other plugins such as Timescale and Citus? What are some of the common patterns for populating data streams? What are the options for scaling PipelineDB systems, both vertically and horizontally? How much elasticity does the system support in terms of changing volumes of inbound data? What are some of the limitations or edge cases that users should be aware of? Given that inbound data is not persisted to disk, how do you guard against data loss? Is it possible to archive the data in a stream, unaltered, to a separate destination table or other storage location? Can a separate table be used as an input stream? Since the data being processed by the continuous queries is potentially unbounded, how do you approach checkpointing or windowing the data in the continuous views? What are some of the features that you have found to be the most useful which users might initially overlook? What would be involved in generating an alert or notification on an aggregate output that was in some way anomalous? What are some of the most challenging aspects of building continuous aggregates on unbounded data? What have you found to be some of the most interesting, complex, or challenging aspects of building and maintaining PipelineDB? What are some of the most interesting or unexpected ways that you have seen PipelineDB used? When is PipelineDB the wrong choice? What do you have planned for the future of PipelineDB now that you have hit the 1.0 milestone? Contact Info Derek derekjn on GitHub LinkedIn Usman @usmanm on Twitter Website Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? 
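To make the core idea concrete: a continuous view computes an aggregate incrementally as events arrive on a stream, so reading the result is a cheap lookup rather than a scan over unbounded raw data. A hedged sketch in the syntax of PipelineDB's 1.0 extension release (stream and column names invented for illustration; earlier standalone releases used `CREATE STREAM` / `CREATE CONTINUOUS VIEW` instead):

```sql
-- A stream is declared as a foreign table served by the pipelinedb extension.
CREATE FOREIGN TABLE page_views (
    url TEXT,
    latency_ms INTEGER
) SERVER pipelinedb;

-- A continuous view: the aggregate state is updated on every insert,
-- and the raw events themselves are not persisted.
CREATE VIEW url_stats WITH (action = materialize) AS
  SELECT url,
         count(*) AS views,
         avg(latency_ms) AS avg_latency
  FROM page_views
  GROUP BY url;

-- Writing to the stream is an ordinary INSERT...
INSERT INTO page_views (url, latency_ms) VALUES ('/home', 42);

-- ...and reading the up-to-date aggregate is an ordinary SELECT.
SELECT * FROM url_stats;
```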
Links PipelineDB Stride PostgreSQL Podcast Episode AdRoll Probabilistic Data Structures TimescaleDB Podcast Episode Hive Redshift Kafka Kinesis ZeroMQ Nanomsg HyperLogLog Bloom Filter The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast |
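PipelineDB's core idea is the continuous view: incoming events are folded into precomputed aggregates as they arrive, so the raw stream never needs to be stored and queries read cheap, already-computed state. A minimal Python sketch of that idea follows; the `ContinuousView` class and its field names are invented for illustration and are not PipelineDB's implementation.

```python
from collections import defaultdict

class ContinuousView:
    """Incrementally maintained aggregate, in the spirit of a
    PipelineDB continuous view (illustrative toy only)."""

    def __init__(self):
        # Per-key running state: count and sum are enough to serve
        # count/sum/avg queries without rescanning raw events.
        self.state = defaultdict(lambda: {"count": 0, "sum": 0.0})

    def ingest(self, event):
        # Each event updates the aggregate exactly once; afterwards
        # the raw event can be discarded (streams are not persisted).
        s = self.state[event["key"]]
        s["count"] += 1
        s["sum"] += event["value"]

    def query(self, key):
        s = self.state[key]
        avg = s["sum"] / s["count"] if s["count"] else None
        return {"count": s["count"], "sum": s["sum"], "avg": avg}

view = ContinuousView()
for v in [10.0, 20.0, 30.0]:
    view.ingest({"key": "sensor-1", "value": v})

print(view.query("sensor-1"))  # {'count': 3, 'sum': 60.0, 'avg': 20.0}
```

The trade-off this illustrates is the one discussed in the episode: because only the aggregate state is kept, questions you did not anticipate when defining the view cannot be answered retroactively.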
|
|
MarketStore: Managing Timeseries Financial Data with Hitoshi Harada and Christopher Ryan - Episode 24
2018-03-25 · 19:00
Hitoshi Harada
– CTO
@ Alpaca
,
Christopher Ryan
– Lead software engineer
@ Alpaca
,
Tobias Macey
– host
Summary The data that is used in financial markets is time oriented and multidimensional, which makes it difficult to manage in either relational or timeseries databases. To make this information more manageable the team at Alpaca built a new data store specifically for retrieving and analyzing data generated by trading markets. In this episode Hitoshi Harada, the CTO of Alpaca, and Christopher Ryan, their lead software engineer, explain their motivation for building MarketStore, how it operates, and how it has helped to simplify their development workflows. Preamble Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey and today I’m interviewing Christopher Ryan and Hitoshi Harada about MarketStore, a storage server for large volumes of financial timeseries data Interview Introduction How did you get involved in the area of data management? What was your motivation for creating MarketStore? 
What are the characteristics of financial time series data that make it challenging to manage? What are some of the workflows that MarketStore is used for at Alpaca and how were they managed before it was available? With MarketStore’s data coming from multiple third party services, how are you managing to keep the DB up-to-date and in sync with those services? What is the worst case scenario if there is a total failure in the data store? What guards have you built to prevent such a situation from occurring? Since MarketStore is used for querying and analyzing data having to do with financial markets and there are potentially large quantities of money being staked on the results of that analysis, how do you ensure that the operations being performed in MarketStore are accurate and repeatable? What were the most challenging aspects of building MarketStore and integrating it into the rest of your systems? Motivation for open sourcing the code? What is the next planned major feature for MarketStore, and what use-case is it aiming to support? Contact Info Christopher Hitoshi Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Links MarketStore GitHub Release Announcement Alpaca IBM DB2 GreenPlum Algorithmic Trading Backtesting OHLC (Open-High-Low-Close) HDF5 Golang C++ Timeseries Database List InfluxDB JSONRPC Slait CircleCI GDAX The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast |
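The OHLC (Open-High-Low-Close) bars mentioned in the links are a classic reduction of raw trade ticks, and a good example of the kind of time-bucketed aggregation a store like MarketStore serves. The sketch below is an invented helper for illustration, not MarketStore's API, and it assumes ticks arrive in time order.

```python
from datetime import datetime, timezone

def ohlc_bars(ticks, bar_seconds=60):
    """Bucket (timestamp, price) ticks into fixed-width OHLC bars.
    Illustrative sketch only; assumes ticks are time-ordered."""
    bars = {}
    for ts, price in ticks:
        # Align each tick to the start of its bar interval.
        bucket = int(ts.timestamp()) // bar_seconds * bar_seconds
        bar = bars.get(bucket)
        if bar is None:
            bars[bucket] = {"open": price, "high": price,
                            "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen wins
    return dict(sorted(bars.items()))

ticks = [
    (datetime(2018, 3, 25, 12, 0, 5, tzinfo=timezone.utc), 100.0),
    (datetime(2018, 3, 25, 12, 0, 30, tzinfo=timezone.utc), 102.0),
    (datetime(2018, 3, 25, 12, 0, 50, tzinfo=timezone.utc), 99.0),
    (datetime(2018, 3, 25, 12, 1, 10, tzinfo=timezone.utc), 101.0),
]
bars = ohlc_bars(ticks)
# First bar: open 100.0, high 102.0, low 99.0, close 99.0
```

Storing precomputed bars at several resolutions, rather than recomputing them from ticks on every query, is one reason a purpose-built store can simplify backtesting workflows.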
|
|
TimescaleDB: Fast And Scalable Timeseries with Ajay Kulkarni and Mike Freedman - Episode 18
2018-02-11 · 16:00
Mike Freedman
– Co-founder
@ Timescale
,
Ajay Kulkarni
– Co-founder
@ Timescale
,
Tobias Macey
– host
Summary As communications between machines become more commonplace the need to store the generated data in a time-oriented manner increases. The market for timeseries data stores has many contenders, but they are not all built to solve the same problems or to scale in the same manner. In this episode the founders of TimescaleDB, Ajay Kulkarni and Mike Freedman, discuss how Timescale was started, the problems that it solves, and how it works under the covers. They also explain how you can start using it in your infrastructure and their plans for the future. Preamble Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. You can help support the show by checking out the Patreon page which is linked from the site. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host is Tobias Macey and today I’m interviewing Ajay Kulkarni and Mike Freedman about TimescaleDB, a scalable timeseries database built on top of PostgreSQL Interview Introduction How did you get involved in the area of data management? Can you start by explaining what Timescale is and how the project got started? The landscape of time series databases is extensive and oftentimes difficult to navigate. How do you view your position in that market and what makes Timescale stand out from the other options? 
In your blog post that explains the design decisions for how Timescale is implemented you call out the fact that the inserted data is largely append only which simplifies the index management. How does Timescale handle out of order timestamps, such as from infrequently connected sensors or mobile devices? How is Timescale implemented and how has the internal architecture evolved since you first started working on it? What impact has the 10.0 release of PostgreSQL had on the design of the project? Is Timescale compatible with systems such as Amazon RDS or Google Cloud SQL? For someone who wants to start using Timescale what is involved in deploying and maintaining it? What are the axes for scaling Timescale and what are the points where that scalability breaks down? Are you aware of anyone who has deployed it on top of Citus for scaling horizontally across instances? What has been the most challenging aspect of building and marketing Timescale? When is Timescale the wrong tool to use for time series data? One of the use cases that you call out on your website is for systems metrics and monitoring. How does Timescale fit into that ecosystem and can it be used along with tools such as Graphite or Prometheus? What are some of the most interesting uses of Timescale that you have seen? Which came first, Timescale the business or Timescale the database, and what is your strategy for ensuring that the open source project and the company around it both maintain their health? What features or improvements do you have planned for future releases of Timescale? Contact Info Ajay LinkedIn @acoustik on Twitter Timescale Blog Mike Website LinkedIn @michaelfreedman on Twitter Timescale Blog Timescale Website @timescaledb on Twitter GitHub Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? 
Links Timescale PostgreSQL Citus Timescale Design Blog Post MIT NYU Stanford SDN Princeton Machine Data Timeseries Data List of Timeseries Databases NoSQL Online Transaction Processing (OLTP) Object Relational Mapper (ORM) Grafana Tableau Kafka When Boring Is Awesome PostgreSQL RDS Google Cloud SQL Azure DB Docker Continuous Aggregates Streaming Replication PGPool II Kubernetes Docker Swarm Citus Data Website Data Engineering Podcast Interview Database Indexing B-Tree Index GIN Index GIST Index STE Energy Redis Graphite Prometheus pg_prometheus OpenMetrics Standard Proposal Timescale Parallel Copy Hadoop PostGIS KDB+ DevOps Internet of Things MongoDB Elastic DataBricks Apache Spark Confluent New Enterprise Associates MapD Benchmark Ventures Hortonworks 2σ Ventures CockroachDB Cloudflare EMC Timescale Blog: Why SQL is beating NoSQL, and what this means for the future of data The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
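The design blog post linked above describes the hypertable abstraction: data is partitioned into fixed time-interval chunks, so inserts of mostly-recent timestamps land in a small, cache-friendly chunk, and a time-range query only has to touch the chunks that overlap the range. A simplified Python sketch of that partition-and-prune idea follows; the class and method names are invented for illustration and bear no resemblance to TimescaleDB's actual code.

```python
class Hypertable:
    """Toy time-partitioned table: rows are routed to fixed-interval
    chunks, and range queries prune non-overlapping chunks."""

    def __init__(self, chunk_interval):
        self.chunk_interval = chunk_interval
        self.chunks = {}  # chunk start time -> list of (ts, row)

    def insert(self, ts, row):
        # Route the row to the chunk covering its timestamp.
        start = ts - ts % self.chunk_interval
        self.chunks.setdefault(start, []).append((ts, row))

    def query(self, t0, t1):
        out = []
        for start, rows in self.chunks.items():
            # Chunk pruning: skip chunks wholly outside [t0, t1).
            if start + self.chunk_interval <= t0 or start >= t1:
                continue
            out.extend(r for ts, r in rows if t0 <= ts < t1)
        return out

ht = Hypertable(chunk_interval=3600)
ht.insert(100, "a")   # lands in chunk [0, 3600)
ht.insert(4000, "b")  # lands in chunk [3600, 7200)
print(ht.query(0, 3600))  # ['a']
```

Because chunks are bounded in size, the indexes on the "hot" chunk stay small even as total history grows, which is the append-mostly insight the blog post builds on.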
|
|
SiriDB: Scalable Open Source Timeseries Database with Jeroen van der Heijden - Episode 11
2017-12-18 · 03:00
Jeroen van der Heijden
– guest
,
Tobias Macey
– host
Summary Time series databases have long been the cornerstone of a robust metrics system, but the existing options are often difficult to manage in production. In this episode Jeroen van der Heijden explains his motivation for writing a new database, SiriDB, the challenges that he faced in doing so, and how it works under the hood. Preamble Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show. Continuous delivery lets you get new features in front of your users as fast as possible without introducing bugs or breaking production and GoCD is the open source platform made by the people at Thoughtworks who wrote the book about it. Go to dataengineeringpodcast.com/gocd to download and launch it today. Enterprise add-ons and professional support are available for added peace of mind. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. You can help support the show by checking out the Patreon page which is linked from the site. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host is Tobias Macey and today I’m interviewing Jeroen van der Heijden about SiriDB, a next generation time series database Interview Introduction How did you get involved in the area of data engineering? What is SiriDB and how did the project get started? What was the inspiration for the name? What was the landscape of time series databases at the time that you first began work on Siri? How does Siri compare to other time series databases such as InfluxDB, Timescale, KairosDB, etc.? 
What do you view as the competition for Siri? How is the server architected and how has the design evolved over the time that you have been working on it? Can you describe how the clustering mechanism functions? Is it possible to create pools with more than two servers? What are the failure modes for SiriDB and where does it fall on the spectrum for the CAP theorem? In the documentation it mentions needing to specify the retention period for the shards when creating a database. What is the reasoning for that and what happens to the individual metrics as they age beyond that time horizon? One of the common difficulties when using a time series database in an operations context is the need for high cardinality of the metrics. How are metrics identified in Siri and is there any support for tagging? What have been the most challenging aspects of building Siri? In what situations or environments would you advise against using Siri? Contact Info joente on Github LinkedIn Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Links SiriDB Oversight InfluxDB LevelDB OpenTSDB Timescale DB KairosDB Write Ahead Log Grafana The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast |
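The interview asks why SiriDB requires a retention period for shards at database-creation time. A common rationale for shard-based retention in timeseries stores generally is that expiring a whole fixed-duration shard is a cheap file or map deletion, while deleting individual aged points would require scanning and rewriting data. The toy sketch below illustrates that trade-off; the names are invented and this is not SiriDB's actual design.

```python
class ShardedSeries:
    """Toy timeseries store: points land in fixed-duration shards,
    and retention drops whole expired shards, never single points."""

    def __init__(self, shard_duration, retention):
        self.shard_duration = shard_duration
        self.retention = retention
        self.shards = {}  # shard start time -> list of (ts, value)

    def write(self, ts, value):
        start = ts - ts % self.shard_duration
        self.shards.setdefault(start, []).append((ts, value))

    def expire(self, now):
        # Dropping whole shards keeps retention O(number of shards),
        # with no per-point scan of the stored data.
        cutoff = now - self.retention
        for start in [s for s in self.shards
                      if s + self.shard_duration <= cutoff]:
            del self.shards[start]

series = ShardedSeries(shard_duration=60, retention=300)
series.write(0, 1.5)    # shard [0, 60)
series.write(400, 2.5)  # shard [360, 420)
series.expire(now=400)  # shard [0, 60) is now fully past the cutoff
```

The cost of this simplicity is granularity: a point cannot outlive its shard, which is why the retention choice has to be made up front.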
|
|
Michael Kay
– author
This book is primarily a practical reference book for professional XSLT developers. It assumes no previous knowledge of the language, and many developers have used it as their first introduction to XSLT; however, it is not structured as a tutorial, and there are other books on XSLT that provide a gentler approach for beginners. The book does assume a basic knowledge of XML, HTML, and the architecture of the Web, and it is written for experienced programmers. There's no assumption that you know any particular language such as Java or Visual Basic, just that you recognize the concepts that all programming languages have in common. The book is suitable both for XSLT 1.0 users upgrading to XSLT 2.0, and for newcomers to XSLT. The book is also equally suitable whether you work in the Java or .NET world. As befits a reference book, a key aim is that the coverage should be comprehensive and authoritative. It is designed to give you all the details, not just an overview of the 20 percent of the language that most people use 80 percent of the time. It's designed so that you will keep coming back to the book whenever you encounter new and challenging programming tasks, not as a book that you skim quickly and then leave on the shelf. If you like detail, you will enjoy this book; if not, you probably won't. But as well as giving the detail, this book aims to explain the concepts, in some depth. It's therefore a book for people who not only want to use the language but who also want to understand it at a deep level. The book aims to tell you everything you need to know about the XSLT 2.0 language. It gives equal weight to the things that are new in XSLT 2.0 and the things that were already present in version 1.0. The book is about the language, not about specific products. 
However, there are appendices about Saxon (the author's own implementation of XSLT 2.0), about the Altova XSLT 2.0 implementation, and about the Java and Microsoft APIs for controlling XSLT transformations, which will no doubt be upgraded to handle XSLT 2.0 as well as 1.0. A third XSLT 2.0 processor, Gestalt, was released shortly before the book went to press, too late to describe it in any detail. But the experience of XSLT 1.0 is that there has been a very high level of interoperability between different XSLT processors, and if you can use one of them, then you can use them all. In the previous edition we split XSLT 2.0 and XPath 2.0 into separate volumes. The idea was that some readers might be interested in XPath alone. However, many bought the XSLT 2.0 book without its XPath companion and were left confused as a result; so this time, the material is back together. The XPath reference information is in self-contained chapters, so it should still be accessible when you use XPath in contexts other than XSLT. The book does not cover XSL Formatting Objects, a big subject in its own right. Nor does it cover XML Schemas in any detail. If you want to use these important technologies in conjunction with XSLT, there are other books that do them justice. This book contains twenty chapters and eight appendixes (the last of which is a glossary) organized into four parts. The following section outlines what you can find in each part, chapter, and appendix. Part I: Foundations: The first part of the book covers essential concepts. You should read these before you start coding. If you ignore this advice, as most people do, then you read them when you get to that trough of despair when you find it impossible to make the language do anything but the most trivial tasks. XSLT is different from other languages, and to make it work for you, you need to understand how it was designed to be used. 
Chapter 1: XSLT in Context: This chapter explains how XSLT fits into the big picture: how the language came into being and how it sits alongside other technologies. It also has a few simple coding examples to keep you alert. Chapter 2: The XSLT Processing Model: This is about the architecture of an XSLT processor: the inputs, the outputs, and the data model. Understanding the data model is perhaps the most important thing that distinguishes an XSLT expert from an amateur; it may seem like information that you can't use immediately, but it's knowledge that will stop you making a lot of stupid mistakes. Chapter 3: Stylesheet Structure: XSLT development is about writing stylesheets, and this chapter takes a bird's eye view of what stylesheets look like. It explains the key concepts of rule-based programming using templates, and explains how to undertake programming-in-the-large by structuring your application using modules and pipelines. Chapter 4: Stylesheets and Schemas: A key innovation in XSLT 2.0 is that stylesheets can take advantage of knowledge about the structure of your input and output documents, provided in the form of an XML Schema. This chapter provides a quick overview of XML Schema to describe its impact on XSLT development. Not everyone uses schemas, and you can skip this chapter if you fall into that category. Chapter 5: The Type System: XPath 2.0 and XSLT 2.0 offer strong typing as an alternative to the weak typing approach of the 1.0 languages. This means that you can declare the types of your variables, functions, and parameters, and use this information to get early warning of programming errors. This chapter explains the data types available and the mechanisms for creating user-defined types. Part II: XSLT and XPath Reference: This section of the book contains reference material, organized in the hope that you can easily find what you need when you need it. 
It's not designed for sequential reading, though you might well want to leaf through the pages to discover what's there. Chapter 6: XSLT Elements: This monster chapter lists all the XSLT elements you can use in a stylesheet, in alphabetical order, giving detailed rules for the syntax and semantics of each element, advice on usage, and examples. This is probably the part of the book you will use most frequently as you become an expert XSLT user. It's a "no stone unturned" approach, based on the belief that as a professional developer you need to know what happens when the going gets tough, not just when the wind is in your direction. Chapter 7: XPath Fundamentals: This chapter explains the basics of XPath: the low-level constructs such as literals, variables, and function calls. It also explains the context rules, which describe how the evaluation of XPath expressions depends on the XSLT processing context in which they appear. Chapter 8: XPath: Operators on Items: XPath offers the usual range of operators for performing arithmetic, boolean comparison, and the like. However, these don't always behave exactly as you would expect, so it's worth reading this chapter to see what's available and how it differs from the last language that you used. Chapter 9: XPath: Path Expressions: Path expressions are what make XPath special; they enable you to navigate around the structure of an XML document. This chapter explains the syntax of path expressions, the 13 axes that you can use to locate the nodes that you need, and associated operators such as union, intersection, and difference. Chapter 10: XPath: Sequence Expressions: Unlike XPath 1.0, in version 2.0 all values are sequences (singletons are just a special case). Some of the most important operators in XPath 2.0 are those that manipulate sequences, notably the "for" expression, which translates one sequence into another by applying a mapping. 
Chapter 11: XPath: Type Expressions: The type system was explained in Chapter 5; this chapter explains the operations that you can use to take advantage of types. This includes the "cast" operation which is used to convert values from one type to another. A big part of this chapter is devoted to the detailed rules for how these conversions are done. Chapter 12: XSLT Patterns: This chapter returns from XPath to a subject that's specific to XSLT. Patterns are used to define template rules, the essence of XSLT's rule-based programming approach. The reason for explaining them now is that the syntax and semantics of patterns depends strongly on the corresponding rules for XPath expressions. Chapter 13: The Function Library: XPath 2.0 includes a library of functions that can be called from any XPath expression; XSLT 2.0 extends this with some additional functions that are available only when XPath is used within XSLT. The library has grown immensely since XPath 1.0. This chapter provides a single alphabetical reference for all these functions. Chapter 14: Regular Expressions: Processing of text is an area where XSLT 2.0 and XPath 2.0 are much more powerful than version 1.0, and this is largely through the use of constructs that exploit regular expressions. If you're familiar with regexes from languages such as Perl, this chapter tells you how XPath regular expressions differ. If you're new to the subject, it explains it from first principles. Chapter 15: Serialization: Serialization in XSLT means the ability to generate a textual XML document from the tree structure that's manipulated by a stylesheet. This isn't part of XSLT processing proper, so (following W3C's lead) it's separated into its own chapter. You can control serialization from the stylesheet using an <xsl:output> declaration, but many products also allow you to control it directly via an API. 
Part III: Exploitation: The final section of the book is advice and guidance on how to take advantage of XSLT to write real applications. It's intended to make you not just a competent XSLT coder, but a competent designer too. The best way of learning is by studying the work of others, so the emphasis here is on practical case studies. Chapter 16: Extensibility: This chapter describes the "hooks" provided in the XSLT specification to allow vendors and users to plug in extra functionality. The way this works will vary from one implementation to another, so we can't cover all possibilities, but one important aspect that the chapter does cover is how to use such extensions and still keep your code portable. Chapter 17: Stylesheet Design Patterns: This chapter explores a number of design and coding patterns for XSLT programming, starting with the simplest "fill-in-the-blanks" stylesheet, and extending to the full use of recursive programming in the functional programming style, which is needed to tackle problems of any computational complexity. This provides an opportunity to explain the thinking behind functional programming and the change in mindset needed to take full advantage of this style of development. Chapter 18: Case Study: XMLSpec: XSLT is often used for rendering documents, so where better to look for a case study than the stylesheets used by the W3C to render the XML and XSLT specifications, and others in the same family, for display on the web? The resulting stylesheets are typical of those you will find in any publishing organization that uses XML to develop a series of documents with a compatible look-and-feel. Chapter 19: Case Study: A Family Tree: Displaying a family tree is another typical XSLT application. This example works with semi-structured data—a mixture of fairly complex data and narrative text—that can be presented in many different ways for different audiences. 
It also shows how to tackle another typical XSLT problem, conversion of the data into XML from a legacy text-based format. As it happens, this uses nearly all the important new XSLT 2.0 features in one short stylesheet. But another aim of this chapter is to show a collection of stylesheets doing different jobs as part of a complete application. Chapter 20: Case Study: Knight's Tour: Finding a route around a chessboard where a knight visits every square without ever retracing its steps might sound a fairly esoteric application for XSLT, but it's a good way of showing how even the most complex of algorithms are within the capabilities of the language. You may not need to tackle this particular problem, but if you want to construct an SVG diagram showing progress against your project plan, then the problems won't be that dissimilar. Part IV: Appendices: Appendix A: XPath 2.0 Syntax Summary: Collects the XPath grammar rules and operator precedences into one place for ease of reference. Appendix B: Error Codes: A list of all the error codes defined in the XSLT and XPath language specifications, with brief explanations to help you understand what's gone wrong. Appendix C: Backward Compatibility: The list of things you need to look out for when converting applications from XSLT 1.0. Appendix D: Microsoft XSLT Processors: Although the two Microsoft XSLT processors don't yet support XSLT 2.0, we thought many readers would find it useful to have a quick summary here of the main objects and methods used in their APIs. Appendix E: JAXP: the Java API for XML Processing: JAXP is an interface rather than a product. Again, it doesn't have explicit support yet for XSLT 2.0, but Java programmers will often be using it in XSLT 2.0 projects, so the book includes an overview of the classes and methods available. 
Appendix F: Saxon: At the time of writing Saxon (developed by the author of this book) provides the most comprehensive implementation of XSLT 2.0 and XPath 2.0, so its interfaces and extensions are covered in some detail. Appendix G: Altova: Altova, the developers of XML Spy, have an XSLT 2.0 processor that can be used either as part of the development environment or as a freestanding component. This appendix gives details of its interfaces. Appendix H: Glossary Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file. |
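The path expressions covered in Chapter 9 can be previewed with the small XPath subset built into Python's standard library. This is nowhere near full XPath 2.0 (no type system, no sequence operators, limited predicates), but it shows the navigational style the book describes; the sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny sample document to navigate.
doc = ET.fromstring("""
<library>
  <book year="2008"><title>XSLT 2.0 and XPath 2.0</title></book>
  <book year="2001"><title>XSLT Programmer's Reference</title></book>
</library>""")

# Child axis plus a Python-side predicate: books published after 2005.
# (Full XPath would express the predicate inline: book[@year > 2005].)
titles = [b.findtext("title") for b in doc.findall("book")
          if int(b.get("year")) > 2005]
print(titles)  # ['XSLT 2.0 and XPath 2.0']

# Descendant axis: all title elements anywhere below the root.
all_titles = [t.text for t in doc.findall(".//title")]
```

The contrast is instructive: ElementTree stops at simple location paths, whereas the chapter's 13 axes and the union, intersection, and difference operators give XPath proper its expressive power.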
O'Reilly Data Engineering Books
|