talk-data.com

Topic: Oracle
Tags: database, enterprise_software, cloud
541 tagged activities

Activity Trend: peak of 33 activities per quarter (2020-Q1 to 2026-Q1)

Activities

541 activities · Newest first

Summary In this episode of the Data Engineering Podcast, Andy Warfield talks about the functionality of S3 Tables and S3 Vectors and their integration into modern data stacks. Andy shares his journey through the tech industry and his role at Amazon, where he works on enhancing S3's storage capabilities. He discusses the evolution of S3 from a simple storage solution into a sophisticated system supporting advanced data types like tables and vectors, which are crucial for analytics and AI-driven applications. He explains the motivations behind introducing S3 Tables and Vectors, highlighting their role in simplifying data management and improving performance for complex workloads, and shares insights into the technical challenges and design considerations involved in developing these features. The conversation explores potential applications of S3 Tables and Vectors in fields like AI, genomics, and media, and discusses future directions for S3's development to further support data-driven innovation.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to 6x while guaranteeing accuracy? Datafold's Migration Agent is the only AI-powered solution that doesn't just translate your code; it validates every single data point to ensure perfect parity between your old and new systems. Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multi-system migrations, they deliver production-ready code with a guaranteed timeline and fixed price. Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold to book a demo and see how they're turning months-long migration nightmares into week-long success stories.

Your host is Tobias Macey and today I'm interviewing Andy Warfield about S3 Tables and Vectors.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what your goals are with the Tables and Vector features of S3?
- How did the experience of building S3 Tables inform your work on S3 Vectors?
- There are numerous implementations of vector storage and search. How do you view the role of S3 in the context of that ecosystem?
- The most directly analogous implementation that I'm aware of is the Lance table format. How would you compare the implementation and capabilities of Lance with what you are building with S3 Vectors?
- What opportunity do you see for being able to offer a protocol-compatible implementation similar to the Iceberg compatibility that you provide with S3 Tables?
- Can you describe the technical implementation of the Vectors functionality in S3?
- What are the sources of inspiration that you looked to in designing the service?
- Can you describe some of the ways that S3 Vectors might be integrated into a typical AI application?
- What are the most interesting, innovative, or unexpected ways that you have seen S3 Tables/Vectors used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on S3 Tables/Vectors?
- When is S3 the wrong choice for Iceberg or Vector implementations?
- What do you have planned for the future of S3 Tables and Vectors?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- S3 Tables
- S3 Vectors
- S3 Express
- Parquet
- Iceberg
- Vector Index
- Vector Database
- pgvector
- Embedding Model
- Retrieval Augmented Generation
- TwelveLabs
- Amazon Bedrock
- Iceberg REST Catalog
- Log-Structured Merge Tree
- S3 Metadata
- Sentence Transformer
- Spark
- Trino
- Daft

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
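S3 Vectors itself is a managed service, but the core idea behind the vector search discussed in this episode can be sketched in a few lines: store embeddings keyed by document, then rank them by cosine similarity against a query embedding. The toy 3-dimensional vectors and helper functions below are purely illustrative and are not the S3 Vectors API.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(index, query, k=2):
    # Brute-force nearest-neighbor search over an in-memory index.
    # Real vector stores use approximate indexes to avoid scanning
    # every embedding on each query.
    scored = [(key, cosine_similarity(vec, query)) for key, vec in index.items()]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return [key for key, _ in scored[:k]]

# Toy "embeddings"; a real system would produce these with an
# embedding model such as a sentence transformer.
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(top_k(index, [1.0, 0.0, 0.0]))  # ['doc-a', 'doc-b']
```

In a retrieval-augmented generation pipeline, the returned document keys would then be used to fetch the source text that gets stuffed into the model's prompt.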

Summary In this episode of the Data Engineering Podcast Akshay Agrawal from Marimo discusses the innovative new Python notebook environment, which offers a reactive execution model, full Python integration, and built-in UI elements to enhance the interactive computing experience. He discusses the challenges of traditional Jupyter notebooks, such as hidden states and lack of interactivity, and how Marimo addresses these issues with features like reactive execution and Python-native file formats. Akshay also explores the broader landscape of programmatic notebooks, comparing Marimo to other tools like Jupyter, Streamlit, and Hex, highlighting its unique approach to creating data apps directly from notebooks and eliminating the need for separate app development. The conversation delves into the technical architecture of Marimo, its community-driven development, and future plans, including a commercial offering and enhanced AI integration, emphasizing Marimo's role in bridging the gap between data exploration and production-ready applications.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to 6x while guaranteeing accuracy? Datafold's Migration Agent is the only AI-powered solution that doesn't just translate your code; it validates every single data point to ensure perfect parity between your old and new systems. Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multi-system migrations, they deliver production-ready code with a guaranteed timeline and fixed price. Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold to book a demo and see how they're turning months-long migration nightmares into week-long success stories.

Your host is Tobias Macey and today I'm interviewing Akshay Agrawal about Marimo, a reusable and reproducible Python notebook environment.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Marimo is and the story behind it?
- What are the core problems and use cases that you are focused on addressing with Marimo?
- What are you explicitly not trying to solve for with Marimo?
- Programmatic notebooks have been around for decades now. Jupyter was largely responsible for making them popular outside of academia. How have the applications of notebooks changed in recent years?
- What are the limitations that have been most challenging to address in production contexts?
- Jupyter has long had support for multi-language notebooks/notebook kernels. What is your opinion on the utility of that feature as a core concern of the notebook system?
- Beyond notebooks, Streamlit and Hex have become quite popular for publishing the results of notebook-style analysis. How would you characterize the feature set of Marimo for those use cases?
- For a typical data team that is working across data pipelines, business analytics, ML/AI engineering, etc., how do you see Marimo applied within and across those contexts?
- One of the common difficulties with notebooks is that they are largely a single-player experience. They may connect into a shared compute cluster for scaling up execution (e.g. Ray, Dask, etc.). How does Marimo address the situation where a data platform team wants to offer notebooks as a service to reduce the friction to getting started with analyzing data in a warehouse/lakehouse context?
- How are you seeing teams integrate Marimo with orchestrators (e.g. Dagster, Airflow, Prefect)?
- What are some of the most interesting or complex engineering challenges that you have had to address while building and evolving Marimo?
- What are the most interesting, innovative, or unexpected ways that you have seen Marimo used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Marimo?
- When is Marimo the wrong choice?
- What do you have planned for the future of Marimo?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- Marimo
- Jupyter
- IPython
- Streamlit
- Podcast.init Episode
- Vector Embeddings
- Dimensionality Reduction
- Kaggle
- Pytest
- PEP 723 script dependency metadata
- MatLab
- Visicalc
- Mathematica
- RMarkdown
- RShiny
- Elixir Livebook
- Databricks Notebooks
- Papermill
- Pluto - Julia Notebook
- Hex
- Directed Acyclic Graph (DAG)
- Sumble (Kaggle founder Anthony Goldblum's startup)
- Ray
- Dask
- Jupytext
- nbdev
- DuckDB
- Podcast Episode
- Iceberg
- Superset
- jupyter-marimo-proxy
- JupyterHub
- Binder
- Nix
- AnyWidget
- Jupyter Widgets
- Matplotlib
- Altair
- Plotly
- DataFusion
- Polars
- MotherDuck

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Business intelligence has been transforming organizations for decades, yet many companies still struggle with widespread adoption. With less than 40% of employees in most organizations having access to BI tools, there's a significant 'information underclass' making decisions without data-driven insights. How can businesses bridge this gap and achieve true information democracy? While new technologies like generative AI and semantic layers offer promising solutions, the fundamentals of data quality and governance remain critical. What balance should organizations strike between investing in innovative tools and strengthening their data infrastructure? How can you ensure your business becomes a 'data athlete' capable of making hyper-decisive moves in an uncertain economic landscape? Howard Dresner is founder and Chief Research Officer at Dresner Advisory Services and a leading voice in Business Intelligence (BI), credited with coining the term "Business Intelligence" in 1989. He spent 13 years at Gartner as lead BI analyst, shaping its research agenda and earning recognition as Analyst of the Year, Distinguished Analyst, and Gartner Fellow. He also led Gartner's BI conferences in Europe and North America. Before founding Dresner Advisory in 2007, Howard was Chief Strategy Officer at Hyperion Solutions, where he drove strategy and thought leadership, helping position Hyperion as a leader in performance management prior to its acquisition by Oracle. Howard has written two books, The Performance Management Revolution - Business Results through Insight and Action, and Profiles in Performance - Business Intelligence Journeys and the Roadmap for Change, both published by John Wiley & Sons. In the episode, Richie and Howard explore the surprisingly low penetration of business intelligence in organizations, the importance of data governance and infrastructure, the evolving role of AI in BI, the strategic initiatives driving BI usage, and much more.
Links Mentioned in the Show:
- Dresner Advisory Services
- Howard's Book - Profiles in Performance: Business Intelligence Journeys and the Roadmap for Change
- Connect with Howard
- Skill Track: Power BI Fundamentals
- Related Episode: The Next Generation of Business Intelligence with Colin Zima, CEO at Omni
- Rewatch RADAR AI

New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business

Sponsored by: Anomalo | Reconciling IoT, Policy, and Insurer Data to Deliver Better Customer Discounts

As insurers increasingly leverage IoT data to personalize policy pricing, reconciling disparate datasets across devices, policies, and insurers becomes mission-critical. In this session, learn how Nationwide transitioned from prototype workflows in Dataiku to a hardened data stack on Databricks, enabling scalable data governance and high-impact analytics. Discover how the team orchestrates data reconciliation across Postgres, Oracle, and Databricks to align customer driving behavior with insurer and policy data—ensuring more accurate, fair discounts for policyholders. With Anomalo’s automated monitoring layered on top, Nationwide ensures data quality at scale while empowering business units to define custom logic for proactive stewardship. We’ll also look ahead to how these foundations are preparing the enterprise for unstructured data and GenAI initiatives.

How to Migrate From Oracle to Databricks SQL

Migrating your legacy Oracle data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic-to-complex code conversion, and validation and reconciliation best practices. Discover the pros and cons of loading CSV files with PySpark versus using pipelines into Databricks tables. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
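The validation-and-reconciliation step mentioned above can be sketched generically: after copying a table, compare the row count and an order-insensitive content checksum between source and target. The in-memory "tables" below are stand-ins for illustration; in practice the same checks would run as SQL against Oracle and Databricks, and this is not any particular vendor's tooling.

```python
import hashlib

def table_fingerprint(rows):
    # Order-insensitive checksum of a table's contents: hash each row,
    # then XOR-combine the digests so row order does not affect the result.
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).digest()
        combined ^= int.from_bytes(digest[:8], "big")
    return len(rows), combined

def reconcile(source_rows, target_rows):
    # A migrated table passes only if row counts and checksums both agree.
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)

source = [(1, "alice"), (2, "bob")]
target = [(2, "bob"), (1, "alice")]       # same rows, different order: passes
print(reconcile(source, target))          # True
print(reconcile(source, [(1, "alice")]))  # missing row: fails -> False
```

Row counts catch dropped or duplicated rows cheaply; the checksum additionally catches silent value corruption such as lossy data type conversions.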

panel
by Marat Valiullin (Ancestry), Tanping Wang (Visa), Animesh Singh (LinkedIn), Shardul Desai (Bank of America), Bruno Aziza (Google Cloud), Alisson Sol (Capital One), Morgan Brown (Dropbox), Jacqueline Karlin (PayPal), Tirthankar Lahiri (Oracle), Aishwarya Srinivasan (Fireworks AI), Naresh Dulam (JPMorgan Chase), Taimur Rashid (AWS), Rooshana Purnyn (Hyatt Hotels Corporation), Maya Ackerman (WaveAI), Venkatesh Shivanna (Electronic Arts (EA)), Jaishankar Sundararaman (Google), Eleonore Fournier-Tombs (United Nations)

Keynotes & panels featuring industry leaders from Google, AWS, IBM, PayPal, Bank of America, Capital One, Visa, JPMorgan Chase, Hyatt Hotels Corporation, United Nations, Fireworks AI, WaveAI, EA, Dropbox, Ancestry, Oracle, LinkedIn, and more.

In today’s data-driven world, organizations are challenged to extract meaningful insights from complex, distributed information. A modern data intelligence platform brings together data management, AI/ML, and analytics to turn raw data into strategic advantage. This session explores how unified data architectures, augmented analytics, and intelligent applications are enabling smarter decisions and better business outcomes across industries. Real-world use cases—from demand forecasting to regulatory compliance—highlight the transformative impact of data intelligence. Powered by Oracle, this approach helps enterprises stay agile, informed, and competitive.

Summary In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings. From his roots as an Oracle developer to leading the data platform at a major online travel company, Chakravarthy shares insights on managing diverse database technologies and providing databases as a service to streamline operations. He explains how his team has transitioned from DevOps to a platform engineering approach, centralizing expertise and automating repetitive tasks with AWS Service Catalog. Join them as they discuss the challenges of migrating legacy systems, integrating AI and ML for automation, and the importance of organizational buy-in in driving data platform success.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

This is a pharmaceutical ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome — also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business — automatically detecting anomalies before your CEO does. It's 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what "done" looks like — so you can stop fighting over column names, and start trusting your data again. Whether you're a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today for a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda's launch week. It starts June 9th.

Your host is Tobias Macey and today I'm interviewing Chakri Kotaru about scaling successful data operations through standardized platform offerings.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining the different ways that you have seen teams you work with fail due to lack of structure and opinionated design?
- Why NoSQL?
- Pairing different styles of NoSQL for different problems
- Useful patterns for each NoSQL style (document, column family, graph, etc.)
- Challenges in platform automation and scaling edge cases
- What challenges do you anticipate from the new pressures of AI applications?
- What are the most interesting, innovative, or unexpected ways that you have seen platform engineering practices applied to data systems?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data platform engineering?
- When is NoSQL the wrong choice?
- What do you have planned for the future of platform principles for enabling data teams/data applications?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- Riak
- DynamoDB
- SQL Server
- Cassandra
- ScyllaDB
- CAP Theorem
- Terraform
- AWS Service Catalog
- Blog Post

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Modernize your analytics capabilities by identifying the products that best meet your needs. See side-by-side, scripted demonstrations of three leading vendors: Strategy, Oracle and Tableau. What are the key features to consider and how do they compare in action? What are the main strengths and weaknesses of these vendors? What innovations are coming?

Join us to hear from Oracle Red Bull Racing and learn about the Oracle and Oracle Red Bull Racing partnership, highlighting how Oracle technology is the backbone of success for the Championship-winning Formula 1 team. Learn how Oracle is supporting the build of the 2026 powertrain engine and how analytics from the race simulator is helping form the next generation of drivers.

Data is the fuel for Oracle Red Bull Racing to analyze their practice and qualifying sessions. By using a modern technology stack on Oracle, the Oracle Red Bull Racing team was able to increase the number of simulations it could run, allowing the team to explore more variables and increase accuracy by focusing on track conditions, the pace of the car, and tire degradation to evolve the team's strategy. Focusing on competitor analysis also helps determine how to navigate rival strategies and give their drivers the best chance to win.

Discover how you can implement a winning data & analytics solution.

Embark on a journey to enhance your Oracle applications by migrating and modernizing them on Google Cloud. This session delves into real-world customer success stories and showcases the latest advancements. Discover best practices for seamless migrations and explore how Google Cloud optimizes performance and scalability for your Oracle applications. Uncover valuable insights from your Oracle databases and app-powered integrations with Google advanced AI, analytics, and monitoring tools.

Unlock the full potential of your mission-critical workloads on Google Cloud. Discover how our platform is purpose-built for Microsoft, Oracle, OpenShift, and more, enabling you to optimize total cost of ownership (TCO) and accelerate modernization. Learn firsthand from customers who have successfully transformed their businesses by bringing their workloads to Google Cloud.

Modernize your Oracle workloads on Google Cloud. Experience seamless migration, robust infrastructure, and familiar tools for mission-critical workloads. Unlock your data's potential with BigQuery and Vertex AI, driving business differentiation and cost reduction. Learn how the Google Cloud & Oracle partnership, combined with your expertise, can accelerate digital transformation, reduce costs and grow your customers' potential. 

Learn how Oracle Multicloud enables enterprise customers with best practices and use cases to seamlessly integrate, build, and operate across clouds by running Oracle Database on Google Cloud. This session showcases how you can extend your Oracle Database and Google Cloud investments and gain insights into architecture patterns, database migration strategies, performance tuning, and cost optimization techniques.

This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.

Newt Global's DMAP revolutionizes Oracle/MS-SQL to PostgreSQL migrations. This automated solution streamlines the entire process, from initial planning to final production deployment. Key advantages:
1. Container-driven parallelization: dramatically reduces migration timelines by harnessing powerful computing resources.
2. Unmatched speed: for medium-complexity databases, DMAP achieves in 12 weeks what other tools take 12 months, thanks to its advanced automation capabilities, including streamlined application and complex code translation.
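DMAP's internals are proprietary, but the parallelization idea is essentially to migrate independent tables concurrently rather than one at a time. A minimal sketch with a worker pool follows; the table names are hypothetical and the migrate step is a stub standing in for the real extract-convert-load work.

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_table(table):
    # Stub for the real work: extract from Oracle/MS-SQL, convert
    # data types and code, and load into PostgreSQL.
    return f"{table}: migrated"

# Hypothetical schema: these tables have no ordering constraints
# between them, so they can be migrated concurrently.
tables = ["customers", "orders", "order_items", "inventory"]

# Parallel fan-out is what compresses the migration timeline;
# Executor.map preserves the input order of results.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(migrate_table, tables))

print(results)
```

In a real migration the fan-out unit would be a container per table (or per partition of a large table), with dependency-ordered phases for constraints and indexes applied after the data load.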


Integrating data from Oracle ERP to Google BigQuery? Join this session and discover how to enable seamless data integration, creating a robust data and integration fabric on Google Cloud. This capability enhances data accessibility and analytics, empowering informed business decisions. We also developed an abstraction layer to streamline integrations, fostering synergy across third-party platforms, accelerating time-to-value, and supporting a scalable, data-driven enterprise.


Build modern applications with the power of Oracle Database 23ai and Google Cloud's Vertex AI and Gemini foundation models. Learn key strategies to seamlessly integrate Google Cloud's native development tools and services, including Kubernetes, Cloud Run, and BigQuery, with Oracle Database 23ai and Autonomous Database in modern application architectures. Cloud architects, developers, and DB administrators will gain actionable insights, best practices, and real-world examples to enhance performance and accelerate innovation with ODB@GC.


Changes to Oracle’s long-standing licensing policies regarding Google Cloud have opened up a new avenue for customers to run their Oracle databases. Learn how Google’s infrastructure and network serve as the perfect foundation for maintaining performance and availability on your critical applications, while giving you a path to integrate those applications with industry-leading AI. Find out how customers are modernizing their workloads on Google Cloud as well as what’s next for their applications.

This session presents the migration by Schnucks, a Midwest grocer, of their e-commerce application from Oracle Database to Cloud SQL for PostgreSQL. It will cover challenges such as addressing the complexities of even "simple" schemas, testing data movement options to minimize downtime, and transforming the database tier. Hear about the business impact, including cost savings, increased database-application proximity, and the potential for similar future migrations, allowing for direct integration options from Google Cloud SQL to Google Gemini AI.
