talk-data.com

Topic: data (5765 tagged)

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

Solr in Action

Solr in Action is a comprehensive guide to implementing scalable search using Apache Solr. This clearly written book walks you through well-documented examples ranging from basic keyword searching to scaling a system for billions of documents and queries. It will give you a deep understanding of how to implement core Solr capabilities.

About the Technology

Whether you're handling big (or small) data, managing documents, or building a website, it is important to be able to quickly search through your content and discover meaning in it. Apache Solr is your tool: a ready-to-deploy, Lucene-based, open source, full-text search engine. Solr can scale across many servers to enable real-time queries and data analytics across billions of documents.

About the Book

Solr in Action teaches you to implement scalable search using Apache Solr. This easy-to-read guide balances conceptual discussions with practical examples to show you how to implement all of Solr's core capabilities. You'll master topics like text analysis, faceted search, hit highlighting, result grouping, query suggestions, multilingual search, advanced geospatial and data operations, and relevancy tuning.

What's Inside

How to scale Solr for big data
Rich real-world examples
Solr as a NoSQL data store
Advanced multilingual, data, and relevancy tricks
Coverage of versions through Solr 4.7

About the Reader

This book assumes basic knowledge of Java and standard database technology. No prior knowledge of Solr or Lucene is required.

About the Authors

Trey Grainger is a director of engineering at CareerBuilder. Timothy Potter is a senior member of the engineering team at LucidWorks. The authors work on the scalability and reliability of Solr, as well as on recommendation engine and big data analytics technologies.

Quotes

The knowledge and techniques you need. - From the Foreword by Yonik Seeley, Creator of Solr
Readable and immediately applicable ... an excellent book. - John Viviano, InterCorp, Inc.
The go-to guide for Solr ... a definitive resource for both beginners and experts. - Scott Anthony, Business Instruments
A well-dosed combination of deep technical knowledge and real-world experience. - Alexandre Madurell, Piksel, Inc.

Predictive Analytics For Dummies

Combine business sense, statistics, and computers in a new and intuitive way, thanks to Big Data Predictive analytics is a branch of data mining that helps predict probabilities and trends. Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions. Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more. Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses Helps readers see how to shepherd predictive analytics projects through their companies Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.

The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions

The era of Big Data has arrived, and most organizations are woefully unprepared. Slowly, many are discovering that stalwarts like Excel spreadsheets, KPIs, standard reports, and even traditional business intelligence tools aren't sufficient. These old standbys can't begin to handle today's increasing streams, volumes, and types of data. Amidst all of the chaos, though, a new type of organization is emerging. In The Visual Organization, award-winning author and technology expert Phil Simon looks at how an increasing number of organizations are embracing new dataviz tools and, more important, a new mind-set based upon data discovery and exploration. Simon adroitly shows how Amazon, Apple, Facebook, Google, Twitter, and other tech heavyweights use powerful data visualization tools to garner fascinating insights into their businesses. But make no mistake: these companies are hardly alone. Organizations of all types, industries, and sizes are representing their data in new and amazing ways. As a result, they are asking better questions and making better business decisions. Rife with real-world examples and case studies, The Visual Organization is a full-color tour-de-force.

Google Maps

Create custom applications with the Google Maps API Featuring step-by-step examples, this practical resource gets you started programming the Google Maps API with JavaScript in no time. Learn how to embed maps on web pages, annotate the embedded maps with your data, generate KML files to store and reuse your map data, and enable client applications to request spatial data through web services. Google Maps: Power Tools for Maximizing the API explains techniques for visualizing masses of data and animating multiple items on the map. You’ll also find out how to embed Google maps in desktop applications to combine the richness of the Windows interface with the unique features of the API. You can use the numerous samples included throughout this hands-on guide as your starting point for building customized applications. Create map-enabled web pages with a custom look Learn the JavaScript skills required to exploit the Google Maps API Create highly interactive interfaces for mapping applications Embed maps in desktop applications written in .NET Annotate maps with labels, markers, and shapes Understand geodesic paths and shapes and perform geodesic calculations Store geographical data in KML format Add GIS features to mapping applications Store large sets of geography data in databases and perform advanced spatial queries Use web services to request spatial data from within your script on demand Automate the generation of standalone web pages with annotated maps Use the Geocoding and Directions APIs Visualize large data sets using symbols and heatmaps Animate items on a map Bonus online content includes: A tutorial on The SQL Spatial application A bonus chapter on animating multiple airplanes Three appendices: debugging scripts in the browser; scalable vector graphics; and applying custom styles

Apache Hadoop™ YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop™ 2

“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.” - From the Foreword by Raymie Stata, CEO of Altiscale

The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN

Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances. YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment. You’ll find many examples drawn from the authors’ cutting-edge experience, first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.

Coverage includes YARN’s goals, design, architecture, and components, and how it expands the Apache Hadoop ecosystem; exploring YARN on a single node; administering YARN clusters and Capacity Scheduler; running existing MapReduce applications; developing a large-scale clustered YARN application; and discovering new open source frameworks that run under YARN.

Deployment Guide for InfoSphere Guardium

IBM® InfoSphere® Guardium® provides the simplest, most robust solution for data security and data privacy by assuring the integrity of trusted information in your data center. InfoSphere Guardium helps you reduce support costs by automating the entire compliance auditing process across heterogeneous environments. InfoSphere Guardium offers a flexible and scalable solution to support varying customer architecture requirements. This IBM Redbooks® publication provides a guide for deploying the Guardium solutions. This book also provides a roadmap process for implementing an InfoSphere Guardium solution that is based on years of experience and best practices that were collected from various Guardium experts. We describe planning, installation, configuration, monitoring, and administrating an InfoSphere Guardium environment. We also describe use cases and how InfoSphere Guardium integrates with other IBM products. The guidance can help you successfully deploy and manage an IBM InfoSphere Guardium system. This book is intended for the system administrators and support staff who are responsible for deploying or supporting an InfoSphere Guardium environment.

Statistical Hypothesis Testing with SAS and R

A comprehensive guide to statistical hypothesis testing with examples in SAS and R. When analyzing datasets the following questions often arise: Is there a shorthand procedure for a statistical test available in SAS or R? If so, how do I use it? If not, how do I program the test myself? This book answers these questions and provides an overview of the most common statistical test problems in a comprehensive way, making it easy to find and perform an appropriate statistical test. A general summary of statistical test theory is presented, along with a basic description for each test, including the necessary prerequisites, assumptions, the formal test problem and the test statistic. Examples in both SAS and R are provided, along with program code to perform the test, resulting output and remarks explaining the necessary program parameters. Key features: Provides examples in both SAS and R for each test presented. Looks at the most common statistical tests, displayed in a clear and easy-to-follow way. Supported by a supplementary website http://www.d-taeger.de featuring example program code. Academics, practitioners and SAS and R programmers will find this book a valuable resource. Students using SAS and R will also find it an excellent choice for reference and data analysis.

The SAP Materials Management Handbook

This handbook provides a complete understanding of how to configure and implement the SAP materials management module across various types of projects. It uses system screenshots of SAP environments to illustrate the complete flow of business transactions involved with SAP MM. Supplying detailed explanations of the steps involved, it presents case studies from actual projects that demonstrate how to convert theory into powerful SAP MM solutions. The book explains how to use the SAP MM module to take care of the complete range of business functions related to purchasing and inventory management.

Beginning Oracle SQL: for Oracle Database 12c, Third Edition

Beginning Oracle SQL is your introduction to the interactive query tools and specific dialect of SQL used with Oracle Database. These tools include SQL*Plus and SQL Developer. SQL*Plus is the one tool any Oracle developer or database administrator can always count on, and it is widely used in creating scripts to automate routine tasks. SQL Developer is a powerful, graphical environment for developing and debugging queries. Oracle's is possibly the most valuable dialect of SQL from a career standpoint. Oracle's database engine is widely used in corporate environments worldwide. It is also found in many government applications. Oracle SQL implements many features not found in competing products. No developer or DBA working with Oracle can afford to be without knowledge of these features and how they work, because of the performance and expressiveness they bring to the table. Written in an easygoing and example-based style, Beginning Oracle SQL is the book that will get you started down the path to successfully writing SQL statements and getting results from Oracle Database. Takes an example-based approach, with clear and authoritative explanations Introduces both SQL and the query tools used to execute SQL statements Shows how to create tables, populate them with data, and then query that data to generate business results What you'll learn Create database tables and define their relationships. Add data to your tables. Then change and delete that data. Write database queries that generate accurate results. Avoid common traps and pitfalls in writing SQL queries, especially from nulls. Reap the performance and expressiveness of analytic and window functions. Make use of Oracle Database's support for object types. Write recursive queries to query hierarchical data. Who this book is for Beginning Oracle SQL is aimed at developers and database administrators who must write SQL statements to execute against an Oracle database. No prior knowledge of SQL is assumed.

Process Modeling Style

Process Modeling Style focuses on aspects of process modeling beyond notation that are very important to practitioners. Many people who model processes focus on the specific notation used to create their drawings. While that is important, there are many other aspects to modeling, such as naming, creating identifiers, descriptions, interfaces, patterns, and creating useful process documentation. Experienced author John Long focuses on those non-notational aspects of modeling, which practitioners will find invaluable. Gives solid advice for creating roles, work products, and processes. Instructs on how to organize and structure the parts of a process. Gives examples of documents you should use to define a set of processes.

(MCTS) Microsoft BizTalk Server (70-595) Certification and Assessment Guide: Second Edition

This comprehensive guide prepares intermediate BizTalk developers to excel in the Microsoft BizTalk Server 2010 (70-595) certification exam. With in-depth coverage of essential concepts, practical examples, and end-to-end solutions, the book ensures you have the skills and knowledge necessary to become a BizTalk expert. What this Book will help me do Master the core architecture and functionalities of Microsoft BizTalk Server. Develop skills to create advanced schemas and maps with enhanced logic functionalities. Understand how to manage orchestrations, transactions, and handle exceptions efficiently. Learn administrative tasks, including configuration and troubleshooting, for BizTalk server environments. Explore integration with web services, WCF, and additional BizTalk features like EDI and BAM. Author(s) This book is written by a team of experienced BizTalk professionals who have hands-on working knowledge with Microsoft BizTalk Server. Their expertise encompasses enterprise-level solution architecture and implementation. They bring their comprehensive understanding and teaching aptitude together in this book, ensuring a balance of detailed technical content and accessible learning. Who is it for? This book is ideal for intermediate-level BizTalk developers focusing on obtaining the Microsoft BizTalk Server 2010 (70-595) certification. It is suitable for individuals with basic knowledge of BizTalk concepts and working with orchestrations. A foundation in WCF and understanding of EDI is recommended to benefit fully from the content of this book.

Microsoft Big Data Solutions

Tap the power of Big Data with Microsoft technologies Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies. Best of all, it helps you integrate these new solutions with technologies you already know, such as SQL Server and Hadoop. Walks you through how to integrate Big Data solutions in your company using Microsoft's HDInsight Server, HortonWorks Data Platform for Windows, and open source tools Explores both on-premises and cloud-based solutions Shows how to store, manage, analyze, and share Big Data through the enterprise Covers topics such as Microsoft's approach to Big Data, installing and configuring HortonWorks Data Platform for Windows, integrating Big Data with SQL Server, visualizing data with Microsoft and HortonWorks BI tools, and more Helps you build and execute a Big Data plan Includes contributions from the Microsoft and HortonWorks Big Data product teams If you need a detailed roadmap for designing and implementing a fully deployed Big Data solution, you'll want Microsoft Big Data Solutions.

Heuristics in Analytics: A Practical Perspective of What Influences Our Analytical World

A practical guide to deploying mathematical and statistical models when performing analytics. The Heuristics in Analytics describes analytic processes and how they fit into the heuristic world around us. In spite of the strong heuristic characteristics of the analytical processes, this important book emphasizes the need to have the proper tools to engage analytics. It describes the analytical process from the exploratory analysis with respect to business scenarios and corporate environments, to model developments; and from statistics, probability, stochastic, mathematics, and artificial intelligence; to the deployments and possible outcomes. Describes the overall analytical process in terms of modeling, deployment, and application. Offers a new perspective of the randomness in analytical modeling. Presents distinct analytical approaches such as statistical, probabilistic, stochastic, and mathematical. Includes case studies on the entire analytical process using telecom companies based in Brazil, Ireland, Turkey, United States, and Canada. Randomness holds a strong impact in the everyday world. It makes sense, then, that analytics are put in place to understand business occurrences, marketplace scenarios, and consumer behavior. The Heuristics in Analytics uniquely shows how random events on a daily basis might completely change expectations, predictions, and behaviors, particularly in corporate environments, and how companies can build a proper analytical strategy to diminish the effect of randomness in business actions.

Statistics in Action

Commissioned by the Statistical Society of Canada (SSC), this volume helps both general readers and users of statistics better appreciate the scope and importance of statistics. It presents the ways in which statistics is used while highlighting key contributions that Canadian statisticians are making to science, technology, business, government, and other areas. The book emphasizes the role and impact of computing in statistical modeling and analysis, including the issues involved with the huge amounts of data being generated by automated processes.

ODS Techniques

Enhance your SAS ODS output with this collection of basic to novel ideas.

SAS Output Delivery System (ODS) expert Kevin D. Smith has compiled a cookbook-style collection of his top ODS tips and techniques to teach you how to bring your reports to a new level and inspire you to see ODS in a new light.

This collection of code techniques showcases some of the most interesting and unusual methods you can use to enhance your reports within the SAS Output Delivery System. It includes general ODS tips, as well as techniques for styles, enhancing tabular output, ODS HTML, ODS PDF, ODS Microsoft Excel destinations, and ODS DOCUMENT.

Smith offers tips based on his own extensive knowledge of ODS, as well as those inspired by questions that frequently come up in his interactions with SAS users. There are simple techniques for beginners who have a minimal amount of ODS knowledge and advanced tips for the more adventurous SAS user. Together, these helpful methods provide a strong foundation for your ODS development and inspiration for building on and creating new, even more advanced techniques on your own.

This book is part of the SAS Press program.

SAS Programming in the Pharmaceutical Industry, Second Edition

This comprehensive resource provides on-the-job training for statistical programmers who use SAS in the pharmaceutical industry.

This one-stop resource offers a complete review of what entry- to intermediate-level statistical programmers need to know in order to help with the analysis and reporting of clinical trial data in the pharmaceutical industry.

SAS Programming in the Pharmaceutical Industry, Second Edition begins with an introduction to the pharmaceutical industry and the work environment of a statistical programmer. Then it gives a chronological explanation of what you need to know to do the job. It includes information on importing and massaging data into analysis data sets, producing clinical trial output, and exporting data. This edition has been updated for SAS 9.4, and it features new graphics as well as all new examples using CDISC SDTM or ADaM model data structures.

Whether you're a novice seeking an introduction to SAS programming in the pharmaceutical industry or a junior-level programmer exploring new approaches to problem solving, this real-world reference guide offers a wealth of practical suggestions to help you sharpen your skills.

This book is part of the SAS Press program.

Examples and Problems in Mathematical Statistics

Provides the necessary skills to solve problems in mathematical statistics through theory, concrete examples, and exercises. With a clear and detailed approach to the fundamentals of statistical theory, Examples and Problems in Mathematical Statistics uniquely bridges the gap between theory and application and presents numerous problem-solving examples that illustrate the related notations and proven results. Written by an established authority in probability and mathematical statistics, each chapter begins with a theoretical presentation to introduce both the topic and the important results in an effort to aid in overall comprehension. Examples are then provided, followed by problems, and finally, solutions to some of the earlier problems. In addition, Examples and Problems in Mathematical Statistics features: Over 160 practical and interesting real-world examples from a variety of fields including engineering, mathematics, and statistics to help readers become proficient in theoretical problem solving; more than 430 unique exercises with select solutions; and key statistical inference topics, such as probability theory, statistical distributions, sufficient statistics, information in samples, testing statistical hypotheses, statistical estimation, confidence and tolerance intervals, large sample theory, and Bayesian analysis. Recommended for graduate-level courses in probability and statistical inference, Examples and Problems in Mathematical Statistics is also an ideal reference for applied statisticians and researchers.

Expert Cube Development with SSAS Multidimensional Models

"Expert Cube Development with SSAS Multidimensional Models" is a comprehensive guide designed for professionals looking to elevate their competence in creating and optimizing SSAS cube solutions. Focused on the multidimensional model, this book provides a detailed, pragmatic approach to delivering high-performance Business Intelligence solutions. What this Book will help me do Master the core features of multidimensional modeling with SSAS. Develop efficient and scalable OLAP cubes for business analysis. Implement advanced calculations and measures using MDX. Optimize and troubleshoot SSAS performance for real-world scenarios. Integrate SSAS models for insightful data visualization. Author(s) The authors of this book are seasoned SSAS consultants and developers, each with years of hands-on experience working with Microsoft Analysis Services in enterprise environments. Their deep understanding of multidimensional modeling shines through in this detailed and well-structured book, providing readers with not only practical guidance but also invaluable tips drawn from real-world projects. Who is it for? This book is tailored for BI developers and data professionals who already have some familiarity with Microsoft Analysis Services and want to deepen their expertise in SSAS multidimensional models. It is ideal for those looking to enhance their ability to design, implement, and optimize robust cube solutions for complex business scenarios. With step-by-step tutorials, it caters to intermediate to advanced learners seeking to take their SSAS skills to the next level.

IBM High Performance Computing Cluster Health Check

This IBM® Redbooks® publication provides information about aspects of performing infrastructure health checks, such as checking the configuration and verifying the functionality of the common subsystems (nodes or servers, switch fabric, parallel file system, job management, problem areas, and so on). This IBM Redbooks publication documents how to monitor the overall health check of the cluster infrastructure, to deliver technical computing clients cost-effective, highly scalable, and robust solutions. This IBM Redbooks publication is targeted toward technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) responsible for delivering cost-effective Technical Computing and IBM High Performance Computing (HPC) solutions to optimize business results, product development, and scientific discoveries. This book provides a broad understanding of a new architecture.