talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Scaling Data Governance For Global Businesses With A Data Hub Architecture
2020-03-09 · 14:00
Tim Ward
– CEO
@ CluedIn
,
Tobias Macey
– host
Summary: Data governance is a complex endeavor, but scaling it to meet the needs of a complex or globally distributed organization requires a well-considered and coherent strategy. In this episode Tim Ward describes an architecture that he has used successfully with multiple organizations to scale compliance. Treating it as a graph problem, where each hub in the network has localized control and inherits higher-level controls, reduces overhead and provides greater flexibility. Tim provides useful examples for understanding how to adopt this approach in your own organization, including some technology recommendations for making it maintainable and scalable. If you are struggling to scale data quality controls and governance requirements then this interview will provide some useful ideas to incorporate into your roadmap. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. 
We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host is Tobias Macey and today I’m interviewing Tim Ward about using an architectural pattern called data hub that allows for scaling data management across global businesses. Interview: Introduction. How did you get involved in the area of data management? Can you start by giving an overview of the goals of a data hub architecture? What are the elements of a data hub architecture and how do they contribute to the overall goals? What are some of the patterns or reference architectures that you drew on to develop this approach? What are some signs that an organization should implement a data hub architecture? What is the migration path for an organization that has an existing data platform but needs to scale its governance and localize storage and access? What are the features or attributes of an individual hub that allow hubs to be interconnected? What is the interface presented between hubs to allow for accessing information across these localized repositories? What is the process for adding a new hub and making it discoverable across the organization? How is discoverability of data managed within and between hubs? If someone wishes to access information between hubs or across several of them, how do you prevent data proliferation? If data is copied between hubs, how are record updates accounted for to ensure that they are replicated to the hubs that hold a copy of that entity? How are access controls and data masking managed to ensure that various compliance regimes are honored? 
In addition to compliance issues, another challenge of distributed data repositories is the |
|
|
Simplifying Data Integration Through Eventual Connectivity
2019-07-29 · 02:00
Tim Ward
– CEO
@ CluedIn
,
Tobias Macey
– host
Summary: The ETL pattern that has become commonplace for integrating data from multiple sources has proven useful, but complex to maintain. For a small number of sources it is a tractable problem, but as the overall complexity of the data ecosystem continues to expand it may be time to identify new ways to tame the deluge of information. In this episode Tim Ward, CEO of CluedIn, explains the idea of eventual connectivity as a new paradigm for data integration. Rather than manually defining all of the mappings ahead of time, we can rely on the power of graph databases and some strategic metadata to allow connections to occur as the data becomes available. If you are struggling to maintain a tangle of data pipelines then you might find some new ideas for reducing your workload. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! To connect with the startups that are shaping the future and take advantage of the opportunities that they provide, check out Angel List where you can invest in innovative businesses, find a job, or post a position of your own. Sign up today at dataengineeringpodcast.com/angel and help support this show. 
You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Upcoming events include the O’Reilly AI Conference, the Strata Data Conference, and the combined events of the Data Architecture Summit and Graphorum. Go to dataengineeringpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat. Your host is Tobias Macey and today I’m interviewing Tim Ward about his thoughts on eventual connectivity as a new pattern to replace traditional ETL. Interview: Introduction. How did you get involved in the area of data management? Can you start by discussing the challenges and shortcomings that you perceive in the existing practices of ETL? What is eventual connectivity and how does it address the problems with ETL in the current data landscape? In your white paper you mention the benefits of graph technology and how it solves the problem of data integration. Can you talk through an example use case? How do different implementations of graph databases impact their viability for this use case? Can you talk through the overall system architecture and data flow for an example implementation of eventual connectivity? How much up-front modeling is necessary to make this a viable approach to data integration? 
How do the volume and format of the source data impact the technology and archit |
|
|
Building An Enterprise Data Fabric At CluedIn
2019-03-25 · 13:00
Tim Ward
– CEO
@ CluedIn
,
Tobias Macey
– host
Summary: Data integration is one of the most challenging aspects of any data platform, especially as the variety of data sources and formats grows. Enterprise organizations feel this acutely due to the silos that occur naturally across business units. The CluedIn team experienced this issue first-hand in their previous roles, leading them to start a business aimed at building a managed data fabric for the enterprise. In this episode Tim Ward, CEO of CluedIn, joins me to explain how their platform is architected, how they manage the task of integrating with third-party platforms, automating entity extraction and master data management, and the work of providing multiple views of the same data for different use cases. I highly recommend listening closely to his explanation of how they manage consistency of the data that they process across different storage backends. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams. 
If you are tired of wasting your time cobbling together scripts and workarounds to give your developers, data scientists, and managers the permissions that they need then it’s time to talk to our friends at strongDM. They have built an easy-to-use platform that lets you leverage your company’s single sign-on for your data platform. Go to dataengineeringpodcast.com/strongdm today to find out how you can simplify your systems. Alluxio is an open source, distributed data orchestration layer that makes it easier to scale your compute and your storage independently. By transparently pulling data from underlying silos, Alluxio unlocks the value of your data and allows for modern computation-intensive workloads to become truly elastic and flexible for the cloud. With Alluxio, companies like Barclays, JD.com, Tencent, and Two Sigma can manage data efficiently, accelerate business analytics, and ease the adoption of any cloud. Go to dataengineeringpodcast.com/alluxio today to learn more and thank them for their support. You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to dataengineeringpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat. Your host is Tobias Macey and today I’m interviewing Tim Ward about CluedIn, an integration platform for implementing your company’s data fabric. Interview: Introduction. How did you get involved in t |
|
|
Discussing #Jobs #Data and #WhatsTheFuture with @TimOReilly #FutureOfData #Podcast
2018-08-30 · 15:00
Tim O’Reilly
– founder and CEO
@ O’Reilly Media, Inc.
This episode discusses Tim O'Reilly's forward-looking perspective on data, analytics, AI, jobs, and organizations. He sheds light on some things businesses could do to stay relevant and future-proof. He also discusses his book and shares key insights for anyone who wants to stay relevant in a world led by technology. A must-watch for anyone working! Timeline: 00:28 Tim's journey. 06:03 Tim's current occupation. 10:50 Interesting work for interesting people. 15:08 Thinking behind the title "What's the future". 23:41 Culture and technology evolution. 26:29 Creating value for the shareholder. 35:06 Learning a new skill. 38:12 Labor and technology. 47:07 Investing in humans or technology? 56:02 The role of AI in Media. 59:45 How can an employee stay relevant? 1:04:28 Tim's favorite books. 1:09:38 Key takeaways. Tim's Book: WTF?: What's the Future and Why It's Up to Us by Tim O'Reilly https://amzn.to/2N5WhOn Tim's Recommended Reads: AI Superpowers: China, Silicon Valley, and the New World Order by Kai-Fu Lee https://amzn.to/2N8VGLL Prediction Machines: The Simple Economics of Artificial Intelligence by Ajay Agrawal and Joshua Gans https://amzn.to/2ugQBKr The Long Twentieth Century: Money, Power and the Origins of Our Times by Giovanni Arrighi https://amzn.to/2ufhb6R Doughnut Economics: Seven Ways to Think Like a 21st-Century Economist by Kate Raworth https://amzn.to/2LcbLQc Winners Take All: The Elite Charade of Changing the World by Anand Giridharadas https://amzn.to/2utgeXF New Power: How Power Works in Our Hyperconnected World--and How to Make It Work for You by Jeremy Heimans and Henry Timms https://amzn.to/2NbBJ77 Seeing like a State: How Certain Schemes to Improve the Human Condition Have Failed by James C. Scott https://amzn.to/2ztnoRz The Struggle for Survival: An Historical, Political, and Socioeconomic Perspective of St. 
Lucia by Anderson Reynolds https://amzn.to/2uqF22w Podcast Link: https://futureofdata.org/discussing-jobs-data-and-whatsthefuture-with-timoreilly-futureofdata-podcast/ Tim's Bio: Tim O’Reilly is the founder and CEO of O’Reilly Media, Inc. His original business plan was “interesting work for interesting people,” which worked out pretty well. O’Reilly Media delivers online learning, publishes books, runs conferences, urges companies to create more value than they capture, and tries to change the world by spreading and amplifying the knowledge of innovators. Tim has a history of convening conversations that reshape the computer industry. In 1993, he launched the first commercial, ad-supported site on the internet. In 1998, he organized the meeting where the term “open source software” was agreed on and helped the business world understand its importance. In 2004, with the Web 2.0 Summit, he defined how “Web 2.0” represented not only the resurgence of the web after the dot com bust, but a new model for the computer industry, based on big data, collective intelligence, and the internet as a platform. In 2009, with his “Gov 2.0 Summit,” he framed a conversation about the modernization of government technology that has shaped policy and spawned initiatives at the Federal, State, and local level and around the world. He has now turned his attention to the implications of AI, the on-demand economy, and other technologies that are transforming the nature of work and the future shape of the business world. This is the subject of his forthcoming book from Harper Business, WTF: What’s the Future and Why It’s Up to Us. About #Podcast: The FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners together to discuss their journeys in creating the data-driven future. Want to join? If you or anyone you know wants to join in or sponsor, email us @ [email protected] Keywords: FutureOfData #DataAnalytics #Leadership #Futurist #Podcast #BigData #Strategy |
The Future of Data Podcast | conversation with leaders, influencers, and change makers in the World of Data & Analytics |
|
What Is Data Science?
2011-04-10
Mike Loukides
– author
We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets. The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products. |
O'Reilly Data Science Books
|