talk-data.com

Topic

Java

programming_language object_oriented enterprise

23

tagged

Activity Trend: 25 peak/qtr, 2020-Q1 to 2026-Q1

Activities

Showing filtered results

Filtering by: O'Reilly Data Science Books

DuckDB in Action

Dive into DuckDB and start processing gigabytes of data with ease, all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB, all through hands-on examples.

Open up DuckDB in Action and learn how to:

Read and process data from CSV, JSON and Parquet sources both locally and remotely
Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables
Use DuckDB from Python, both with SQL and its "Relational"-API, interacting with databases but also data frames
Prepare, ingest and query large datasets
Build cloud data pipelines
Extend DuckDB with custom functionality

Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation; you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines.

About the Technology

DuckDB makes data analytics fast and fun! You don’t need to set up a Spark cluster or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite and Postgres.

About the Book

DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action.

What's Inside

Prepare, ingest and query large datasets
Build cloud data pipelines
Extend DuckDB with custom functionality
Fast-paced SQL recap: From simple queries to advanced analytics

About the Reader

For data pros comfortable with Python and CLI tools.

About the Authors

Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

Quotes

I use DuckDB every day, and I still learned a lot about how DuckDB makes things that are hard in most databases easy! - Jordan Tigani, Founder, MotherDuck
An excellent resource! Unlocks possibilities for storing, processing, analyzing, and summarizing data at the edge using DuckDB. - Pramod Sadalage, Director, Thoughtworks
Clear and accessible. A comprehensive resource for harnessing the power of DuckDB for both novices and experienced professionals. - Qiusheng Wu, Associate Professor, University of Tennessee
Excellent! The book all we ducklings have been waiting for! - Gunnar Morling, Decodable

MATLAB Recipes: A Problem-Solution Approach

Learn from state-of-the-art examples in robotics, motors, detection filters, chemical processes, aircraft, and spacecraft. With this book you will review contemporary MATLAB coding including the latest MATLAB language features and use MATLAB as a software development environment including code organization, GUI development, and algorithm design and testing. Features now covered include the new graph and digraph classes for charts and networks; interactive documents that combine text, code, and output; a new development environment for building apps; locally defined functions in scripts; automatic expansion of dimensions; tall arrays for big data; the new string type; new functions to encode/decode JSON; handling non-English languages; the new class architecture; the Mocking framework; an engine API for Java; the cloud-based MATLAB desktop; the memoize function; and heatmap charts.

MATLAB Recipes: A Problem-Solution Approach, Second Edition provides practical, hands-on code snippets and guidance for using MATLAB to build a body of code you can turn to time and again for solving technical problems in your work. Develop algorithms, test them, visualize the results, and pass the code along to others to create a functional code base for your firm.

What You Will Learn

Get up to date with the latest MATLAB up to and including MATLAB 2020b
Code in MATLAB
Write applications in MATLAB
Build your own toolbox of MATLAB code to increase your efficiency and effectiveness

Who This Book Is For

Engineers, data scientists, and students wanting a book rich in examples using MATLAB.

SAS Viya

Learn how to access analytics from SAS Cloud Analytic Services (CAS) using Python and the SAS Viya platform. SAS Viya: The Python Perspective is an introduction to using the Python client on the SAS Viya platform. SAS Viya is a high-performance, fault-tolerant analytics architecture that can be deployed on both public and private cloud infrastructures. While SAS Viya can be used by various SAS applications, it also enables you to access analytic methods from SAS, Python, Lua, and Java, as well as through a REST interface using HTTP or HTTPS. This book focuses on the perspective of SAS Viya from Python.

SAS Viya is made up of multiple components. The central piece of this ecosystem is SAS Cloud Analytic Services (CAS). CAS is the cloud-based server that all clients communicate with to run analytical methods. The Python client is used to drive the CAS component directly using objects and constructs that are familiar to Python programmers. Some knowledge of Python would be helpful before using this book; however, there is an appendix that covers the features of Python that are used in the CAS Python client. Knowledge of CAS is not required to use this book. However, you will need to have a CAS server set up and running to execute the examples in this book.

With this book, you will learn how to:

Install the required components for accessing CAS from Python
Connect to CAS, load data, and run simple analyses
Work with CAS using APIs familiar to Python users
Grasp general CAS workflows and advanced features of the CAS Python client

SAS Viya: The Python Perspective covers topics that will be useful to beginners as well as experienced CAS users. It includes examples from creating connections to CAS all the way to simple statistics and machine learning, but it is also useful as a desktop reference.

Pentaho 8 Reporting for Java Developers

"Pentaho 8 Reporting for Java Developers" is your hands-on guide to mastering the Pentaho 8 reporting platform. Packed with practical examples and exercises, this book teaches you how to create highly functional, interactive reports for your data visualization needs. Updated for the latest version of Pentaho, it provides all the tools and techniques you need to succeed.

What this Book will help me do

Learn the fundamental concepts of Pentaho Reporting including setup and initial configurations.
Design and customize attractive, functional reports utilizing various data sources.
Integrate Pentaho reports seamlessly into Java applications with full control over their interactions and design.
Explore advanced reporting features like parameterization, localization, and complex layout configurations.
Incorporate Pentaho reports into the broader Pentaho suite, including the BA platform and Data Integration tools.

Author(s)

Jasmine Kaur and Corti bring their extensive expertise in information technology and Java development to this comprehensive guide. With years of hands-on experience working with Pentaho Reporting tools, they have a deep understanding of the challenges and solutions in report design. Their approachable writing style and emphasis on practical examples make learning intuitive and enjoyable.

Who is it for?

This book is ideally suited for Information Technologists who are familiar with databases and intermediate-level Java Developers looking to integrate advanced reporting functionalities into their projects. If you are eager to build pixel-perfect, professional reports or need insights into embedding reporting tools into Java applications, this book holds the answers.

Data Science with Java

Data Science is booming thanks to R and Python, but Java brings the robustness, convenience, and ability to scale critical to today’s data science applications. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java. You’ll learn the critical roles that data IO, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout this book, you’ll find code examples you can use in your applications.

Examine methods for obtaining, cleaning, and arranging data into its purest form
Understand the matrix structure that your data should take
Learn basic concepts for testing the origin and validity of data
Transform your data into stable and usable numerical values
Understand supervised and unsupervised learning algorithms, and methods for evaluating their success
Get up and running with MapReduce, using customized components suitable for data science algorithms
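
One of the steps listed above, transforming data into stable and usable numerical values, can be illustrated with z-score standardization. The sketch below is not taken from the book: the class and method names are hypothetical, and it assumes the population standard deviation is the desired scale.

```java
import java.util.Arrays;

// Hypothetical helper, not from the book: rescales a column of raw values
// to zero mean and unit variance (z-scores).
final class ZScoreExample {

    // Returns (x - mean) / stdDev for every element, using the population
    // standard deviation. A constant column maps to all zeros.
    static double[] standardize(double[] values) {
        if (values.length == 0) {
            return new double[0];
        }
        double mean = Arrays.stream(values).average().orElse(0.0);
        double variance = Arrays.stream(values)
                .map(v -> (v - mean) * (v - mean))
                .average()
                .orElse(0.0);
        double stdDev = Math.sqrt(variance);

        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = stdDev == 0.0 ? 0.0 : (values[i] - mean) / stdDev;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Made-up sample column with mean 5 and population standard deviation 2.
        double[] raw = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println(Arrays.toString(standardize(raw)));
    }
}
```

Standardizing each column this way keeps features with large raw ranges from dominating distance-based or gradient-based learning steps later in the pipeline.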

Cyber-Risk Informatics

This book provides a scientific modeling approach for conducting metrics-based quantitative risk assessments of cybersecurity vulnerabilities and threats. The author builds from a common understanding based on previous class-tested works to introduce the reader to current and innovative approaches to addressing maliciously-by-human-created (rather than by-chance-occurring) vulnerabilities and threats, and the related cost-effective management to mitigate such risk. This book is purely statistical and data-oriented (not deterministic) and employs computationally intensive techniques, such as Monte Carlo and Discrete Event Simulation. The enriched Java ready-to-go applications and solutions to exercises provided by the author at the book’s specifically preserved website will enable readers to work through the course-related problems.

• Enables the reader to use the applications on the book's website to implement and see results, and to use them to make ‘budgetary’ sense
• Utilizes a data analytical approach and provides clear entry points for readers of varying skill sets and backgrounds
• Developed out of necessity from the author's real in-class experience teaching advanced undergraduate and graduate courses

Cyber-Risk Informatics is a resource for undergraduate students, graduate students, and practitioners in the field of Risk Assessment and Management regarding Security and Reliability Modeling.

Mehmet Sahinoglu, a Professor (1990) Emeritus (2000), is the founder of the Informatics Institute (2009) and its SACS-accredited (2010) and NSA-certified (2013) flagship Cybersystems and Information Security (CSIS) graduate program (the first such full degree in-class program in Southeastern USA) at AUM, Auburn University’s metropolitan campus in Montgomery, Alabama. He is a fellow member of the SDPS Society, a senior member of the IEEE, and an elected member of ISI. Sahinoglu is the recipient of Microsoft's Trustworthy Computing Curriculum (TCC) award and the author of Trustworthy Computing (Wiley, 2007).
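
The description mentions Monte Carlo simulation as one of the book's techniques for quantitative risk assessment. As a rough idea of what that means, the toy sketch below estimates an expected loss by repeated random sampling. It is not the author's model or one of the book's Java applications: the class name, threat probabilities, and loss figures are all made-up assumptions, and threats are assumed independent.

```java
import java.util.Random;

// Illustrative only: a toy Monte Carlo estimate of expected loss.
// The inputs are invented and this is not the model used in the book.
final class MonteCarloRiskSketch {

    // Simulates `trials` independent scenarios. In each scenario every threat
    // materializes with its own probability and, if it does, adds its loss.
    // Returns the average loss per scenario.
    static double estimateExpectedLoss(double[] probabilities, double[] losses,
                                       int trials, long seed) {
        if (probabilities.length != losses.length) {
            throw new IllegalArgumentException("one loss figure is needed per threat");
        }
        Random random = new Random(seed); // fixed seed keeps runs repeatable
        double totalLoss = 0.0;
        for (int trial = 0; trial < trials; trial++) {
            for (int i = 0; i < probabilities.length; i++) {
                if (random.nextDouble() < probabilities[i]) {
                    totalLoss += losses[i];
                }
            }
        }
        return totalLoss / trials;
    }

    public static void main(String[] args) {
        // Hypothetical threats: probability of occurrence and loss if it occurs.
        double[] probabilities = {0.10, 0.02, 0.30};
        double[] losses = {50_000.0, 400_000.0, 5_000.0};

        double simulated = estimateExpectedLoss(probabilities, losses, 1_000_000, 42L);

        // For this simple model the exact answer is the sum of p * loss,
        // which gives a sanity check for the simulation.
        double analytic = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            analytic += probabilities[i] * losses[i];
        }
        System.out.printf("simulated: %.0f, analytic: %.0f%n", simulated, analytic);
    }
}
```

In this simple case the closed-form answer is available, so simulation adds nothing; the approach pays off when threats interact or losses follow distributions with no convenient formula.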

R for Programmers

Unlike other books about R, written from the perspective of statistics, this book is written from the perspective of programmers, providing a channel for programmers with expertise in other programming languages to quickly understand R. The contents are divided into four parts: the basics of R, the server of R, databases and big data, and the appendices, which introduce the installation of Java, various databases, and Hadoop. Because this is a reference book, there is no special sequence for reading all the chapters. Anyone new to the subject who wishes to master R comprehensively can simply follow the chapters in sequence.

Computer Science Illuminated, 6th Edition

Each new print copy includes Navigate 2 Advantage Access that unlocks a comprehensive and interactive eBook, student practice activities and assessments, a full suite of instructor resources, and learning analytics reporting tools.

Fully revised and updated, the Sixth Edition of the best-selling text Computer Science Illuminated retains the accessibility and in-depth coverage of previous editions, while incorporating all-new material on cutting-edge issues in computer science. Authored by the award-winning Nell Dale and John Lewis, Computer Science Illuminated’s unique and innovative layered approach moves through the levels of computing from an organized, language-neutral perspective.

Designed for the introductory computing and computer science course, this student-friendly Sixth Edition provides students with a solid foundation for further study, and offers non-majors a complete introduction to computing.

Key Features of the Sixth Edition include:

Access to Navigate 2 online learning materials including a comprehensive and interactive eBook, student practice activities and assessments, learning analytics reporting tools, and more
Completely revised sections on HTML and CSS
Updates regarding Top Level Domains, Social Networks, and Google Analytics
All-new section on Internet management, including ICANN control and net neutrality 
New design, including fully revised figures and tables
New and updated Did You Know callouts are included in the chapter margins
New and revised Ethical Issues and Biographies throughout emphasize the history and breadth of computing
Available in our customizable PUBLISH platform

A collection of programming language chapters is available as low-cost bundling options. Available chapters include: Java, C++, Python, Alice, SQL, VB.NET, Ruby, Perl, Pascal, and JavaScript.

With Navigate 2, technology and content combine to expand the reach of your classroom. Whether you teach an online, hybrid, or traditional classroom-based course, Navigate 2 delivers unbeatable value. Experience Navigate 2 today at www.jblnavigate.com/2

MATLAB Numerical Calculations

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. This book is designed for use as a scientific/business calculator so that you can get numerical solutions to problems involving a wide array of mathematics using MATLAB. Just look up the function you want in the book and you are ready to use it in MATLAB, or use the book to learn about the enormous range of options that MATLAB offers. MATLAB Numerical Calculations focuses on MATLAB capabilities to give you numerical solutions to problems you are likely to encounter in your professional or scholastic life. It introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. Starting with a look at basic MATLAB functionality with integers, rational numbers and real and complex numbers, and MATLAB's relationship with Maple, you will learn how to solve equations in MATLAB, and how to simplify the results. You will see how MATLAB incorporates vector, matrix and character variables, and functions thereof. MATLAB is a powerful tool used to define, manipulate and simplify complex algebraic expressions. With MATLAB you can also work with ease in matrix algebra, making use of commands which allow you to find eigenvalues, eigenvectors, determinants, norms and various matrix decompositions, among many other features. Lastly, you will see how you can write scripts and use MATLAB to explore numerical analysis, finding approximations of integrals, derivatives and numerical solutions of differential equations.

MATLAB Symbolic Algebra and Calculus Tools

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Symbolic Algebra and Calculus Tools introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. Starting with a look at symbolic variables and functions, you will learn how to solve equations in MATLAB, both symbolically and numerically, and how to simplify the results. Polynomial solutions, inequalities and systems of equations are covered in detail. You will see how MATLAB incorporates vector, matrix and character variables, and functions thereof. MATLAB is a powerful symbolic manipulator which enables you to factorize, expand and simplify complex algebraic expressions over all common fields (including over finite fields and algebraic field extensions of the rational numbers). With MATLAB you can also work with ease in matrix algebra, making use of commands which allow you to find eigenvalues, eigenvectors, determinants, norms and various matrix decompositions, among many other features. Lastly, you will see how you can use MATLAB to explore mathematical analysis, finding limits of sequences and functions, sums of series, integrals, derivatives and solving differential equations.

MATLAB Matrix Algebra

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Matrix Algebra introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. Starting with a look at symbolic and numeric variables, with an emphasis on vector and matrix variables, you will go on to examine functions and operations that support vectors and matrices as arguments, including those based on analytic parent functions. Computational methods for finding eigenvalues and eigenvectors of matrices are detailed, leading to various matrix decompositions. Applications such as change of bases, the classification of quadratic forms and how to solve systems of linear equations are described, with numerous examples. A section is dedicated to sparse matrices and other types of special matrices. In addition to its treatment of matrices, you will also learn how MATLAB can be used to work with arrays, lists, tables, sequences and sets.

MATLAB Optimization Techniques

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Optimization Techniques introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. It begins by introducing the MATLAB environment and the structure of MATLAB programming before moving on to the mathematics of optimization. The central part of the book is dedicated to MATLAB’s Optimization Toolbox, which implements state-of-the-art algorithms for solving multiobjective problems, non-linear minimization with boundary conditions and restrictions, minimax optimization, semi-infinitely constrained minimization and linear and quadratic programming. A wide range of exercises and examples are included, illustrating the most widely used optimization methods.

MATLAB Differential Equations

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Differential Equations introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. In addition to giving an introduction to the MATLAB environment and MATLAB programming, this book provides all the material needed to work on differential equations using MATLAB. It includes techniques for solving ordinary and partial differential equations of various kinds, and systems of such equations, either symbolically or using numerical methods (Euler’s method, Heun’s method, the Taylor series method, the Runge–Kutta method,…). It also describes how to implement mathematical tools such as the Laplace transform, orthogonal polynomials, and special functions (Airy and Bessel functions), and find solutions of finite difference equations.
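
The description names Euler’s and Heun’s methods among the numerical techniques covered. As a rough illustration of how the two differ, the sketch below applies both to the test problem y' = y, y(0) = 1, whose exact solution at t = 1 is e. It is written in Java rather than MATLAB and is not taken from the book; the class name, method names, and choice of test problem are assumptions made for this example.

```java
import java.util.function.DoubleBinaryOperator;

// Illustrative sketch, not from the book: fixed-step Euler and Heun
// integrators for a scalar ODE y' = f(t, y).
final class OdeStepSketch {

    // Euler's method: advance using the slope at the start of each step.
    static double euler(DoubleBinaryOperator f, double t0, double y0,
                        double tEnd, int steps) {
        double h = (tEnd - t0) / steps;
        double t = t0;
        double y = y0;
        for (int i = 0; i < steps; i++) {
            y += h * f.applyAsDouble(t, y);
            t += h;
        }
        return y;
    }

    // Heun's method: average the slope at the start of the step with the
    // slope at the Euler-predicted end point.
    static double heun(DoubleBinaryOperator f, double t0, double y0,
                       double tEnd, int steps) {
        double h = (tEnd - t0) / steps;
        double t = t0;
        double y = y0;
        for (int i = 0; i < steps; i++) {
            double k1 = f.applyAsDouble(t, y);
            double k2 = f.applyAsDouble(t + h, y + h * k1);
            y += 0.5 * h * (k1 + k2);
            t += h;
        }
        return y;
    }

    public static void main(String[] args) {
        DoubleBinaryOperator growth = (t, y) -> y; // y' = y, exact solution e^t
        double eulerError = Math.abs(euler(growth, 0.0, 1.0, 1.0, 100) - Math.E);
        double heunError = Math.abs(heun(growth, 0.0, 1.0, 1.0, 100) - Math.E);
        System.out.println("Euler error: " + eulerError);
        System.out.println("Heun error:  " + heunError);
    }
}
```

With the same step size, Heun's extra slope evaluation per step buys second-order accuracy, so its error on this problem is far smaller than Euler's.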

MATLAB Control Systems Engineering

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Control Systems Engineering introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. In addition to giving an introduction to the MATLAB environment and MATLAB programming, this book provides all the material needed to design and analyze control systems using MATLAB’s specialized Control System Toolbox. The Control System Toolbox offers an extensive range of tools for classical and modern control design. Using these tools you can create models of linear time-invariant systems in transfer function, zero-pole-gain or state space format. You can manipulate both discrete-time and continuous-time systems and convert between various representations. You can calculate and graph time response, frequency response and root loci. Other functions allow you to perform pole placement, optimal control and estimation. The Control System Toolbox is open and extensible, allowing you to create customized M-files to suit your specific applications.

Pentaho Data Integration Cookbook - Second Edition

This cookbook is a comprehensive guide to using Pentaho Data Integration (Kettle) for executing ETL processes effectively. With step-by-step recipes, it covers everything from connecting to diverse data sources to implementing advanced data handling workflows. This book is a valuable resource to streamline and enhance your data integration tasks.

What this Book will help me do

Learn to configure Kettle to connect with various databases and applications.
Understand how to embed Java code for optimized transformations.
Discover techniques to reuse and manage transformations and jobs.
Master the integration of Kettle with other Pentaho Suite components.
Explore advanced data flow control and manipulation tactics.

Author(s)

The authors of this book are experienced professionals in data integration and Pentaho tools. They bring years of practical industry experience and have a passion for sharing knowledge through clear, hands-on tutorials. Their approach to writing ensures readers can take actionable insights directly to their work.

Who is it for?

This book is ideal for developers familiar with the fundamental concepts of Kettle who aim to delve deeper into advanced functionalities. Readers should have basic ETL knowledge and the ambition to master Pentaho Data Integration. Experienced users will find valuable tips and learn about new features to automate and enhance their processes.

Mondrian in Action

Mondrian in Action teaches business users and developers how to use Mondrian and related tools for strategic business analysis. You'll learn how to design and populate a data warehouse and present the data via a multidimensional model. You'll follow examples showing how to create a Mondrian schema and then expand it to add basic security based on the users' roles.

About the Technology

Mondrian is an open source, lightning-fast data analysis engine designed to help you explore your business data and perform speed-of-thought analysis. Mondrian can be integrated into a wide variety of business analysis applications and learning it requires no specialized technical knowledge.

About the Book

Mondrian in Action teaches you to use Mondrian for strategic business analysis. In it, you'll learn how to organize and present data in a multidimensional manner. You'll follow apt and thoroughly explained examples showing how to create a Mondrian schema and then expand it to add basic security based on users' roles. Developers will discover how to integrate Mondrian using its olap4j Java API and web service calls via XML for Analysis.

What's Inside

Mondrian from the ground up -- no experience required
A primer on business analytics
Using Mondrian with a variety of leading applications
Optimizing and restricting business data for fast, secure analysis

About the Reader

Written for developers building data analysis solutions. Appropriate for tech-savvy business users and DBAs needing to query and report on data.

About the Authors

William D. Back is an Enterprise Architect and Director of Pentaho Services. Nicholas Goodman is a Business Intelligence pro who has authored training courses on OLAP and Mondrian. Julian Hyde founded Mondrian and is the project's lead developer.

Quotes

A wonderful introduction to Business Intelligence and Analytics. - Lorenzo De Leon, Authentify, Inc.
A great overview of the Mondrian engine that guided me through all the technical details. - Alexander Helf, veenion GmbH
A significant complement to the online documentation, and an excellent introduction to how to think about designing a data warehouse. - Mark Newman, Heads Up Analytics
Comprehensive ... highly recommended. - Najib Coutya, IMD Group

Computational Colour Science Using MATLAB, 2nd Edition

Computational Colour Science Using MATLAB 2nd Edition offers a practical, problem-based approach to colour physics. The book focuses on the key issues encountered in modern colour engineering, including efficient representation of colour information, Fourier analysis of reflectance spectra and advanced colorimetric computation. Emphasis is placed on the practical applications rather than the techniques themselves, with material structured around key topics. These topics include colour calibration of visual displays, computer recipe prediction and models for colour-appearance prediction. Each topic is carefully introduced at three levels to aid student understanding. First, theoretical ideas and background information are discussed, then explanations of mathematical solutions follow and finally practical solutions are presented using MATLAB.

The content includes:

A compendium of equations and numerical data required by the modern colour and imaging scientist.
Numerous examples of solutions and algorithms for a wide range of computational problems in colour science.
Example scripts using the MATLAB programming language.

This 2nd edition contains substantial new and revised material, including three innovative chapters on colour imaging, psychophysical methods, and physiological colour spaces; the MATLAB toolbox has been extended with a professional, optimized toolbox to go alongside the current teaching toolbox; and a Java toolbox has been added which will interest users who are writing web applications and/or applets or mobile phone applications. Computational Colour Science Using MATLAB 2nd Edition is an invaluable resource for students taking courses in colour science, colour chemistry and colour physics as well as technicians and researchers working in the area. In addition, it acts as a useful reference for professionals and researchers working in colour-dependent industries such as textiles, paints, print & electronic imaging.
Review from First Edition: "...highly recommended as a concise introduction to the practicalities of colour science..." (Color Technology, 2004)

Undocumented Secrets of MATLAB-Java Programming

Many people know that a major part of the functionality of the MATLAB software package is based on Java. But fewer people know how to manipulate Java to achieve improved appearance and functionality and thus heighten MATLAB software's applicability to real world, modern situations. Organized by related functionality/usage and ordered from facile to complex, this book presents examples, instruction, and code snippets in stand-alone, self-contained chapters. Requiring no prior Java knowledge, this book provides numerous online references and resources to show readers how to use and discover new components and functionalities using nothing but MATLAB itself as the discovery tool.

Essential Statistics, Regression, and Econometrics

Essential Statistics, Regression, and Econometrics provides students with a readable, deep understanding of the key statistical topics they need to understand in an econometrics course. It is innovative in its focus, including real data, pitfalls in data analysis, and modeling issues (including functional forms, causality, and instrumental variables). This book is unusually readable and non-intimidating, with extensive word problems that emphasize intuition and understanding. Exercises range from easy to challenging and the examples are substantial and real, to help the students remember the technique better.

Readable exposition and exceptional exercises/examples that students can relate to
Website includes Java applets and Excel applications
Focuses on key methods for econometrics students without including unnecessary topics
Covers data analysis not covered in other texts
Ideal presentation of material (topic order) for econometrics course

Entity Resolution and Information Quality

Entity Resolution and Information Quality presents topics and definitions, and clarifies confusing terminologies regarding entity resolution and information quality. It takes a very wide view of IQ, including its six-domain framework and the skills formed by the International Association for Information and Data Quality (IAIDQ). The book includes chapters that cover the principles of entity resolution and the principles of Information Quality, in addition to their concepts and terminology. It also discusses the Fellegi-Sunter theory of record linkage, the Stanford Entity Resolution Framework, and the Algebraic Model for Entity Resolution, which are the major theoretical models that support Entity Resolution. In relation to this, the book briefly discusses entity-based data integration (EBDI) and its model, which serve as an extension of the Algebraic Model for Entity Resolution. There is also an explanation of how the three commercial ER systems operate and a description of the non-commercial open-source system known as OYSTER. The book concludes by discussing trends in entity resolution research and practice. Students taking IT courses and IT professionals will find this book invaluable.

First authoritative reference explaining entity resolution and how to use it effectively
Provides practical system design advice to help you get a competitive advantage
Includes a companion site with synthetic customer data for applicatory exercises, and access to a Java-based Entity Resolution program.