Glossary of Terms

Abstract (data)

Combine several items into one, summarise, perform a computation, extract a sub-fact from one item to generate a new attribute, or perform other actions that change data.
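
For illustration, a minimal Python sketch of one such action: deriving a new attribute from a sub-fact of an existing item (the record and field names are hypothetical).

    # Derive a new attribute (postcode) as a sub-fact taken out of an
    # existing address field.
    order = {"customer": "Acme Pty Ltd", "address": "12 High St, Carlton VIC 3053"}
    order["postcode"] = order["address"].split()[-1]  # new derived attribute
    print(order["postcode"])  # -> 3053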

Aggregates

Pre-calculated summaries stored in the data warehouse to improve query performance. For example, there might be a pre-stored summary detailing the total number of licenses purchased by each VAR each month.
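
A minimal Python sketch of the idea, using hypothetical transaction records: the summary is calculated once at load time, so later queries read the stored total instead of re-scanning every transaction.

    from collections import defaultdict

    # Hypothetical license purchases: (VAR, month, licenses)
    sales = [("VAR-A", "2024-01", 10), ("VAR-A", "2024-01", 5),
             ("VAR-B", "2024-01", 7), ("VAR-A", "2024-02", 3)]

    # Pre-calculate and store the monthly total per VAR...
    aggregate = defaultdict(int)
    for var, month, licenses in sales:
        aggregate[(var, month)] += licenses

    # ...so a query becomes a single look-up rather than a scan.
    print(aggregate[("VAR-A", "2024-01")])  # -> 15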

 
Batch (transport)

Data can be processed for quality or distributed either as a set of many transactions ("batch" mode) or as individual transactions.

 
Bulk (transport)

When data is moved in bulk, the entire database is refreshed periodically. The opposite strategy is to refresh the database selectively, with only the data that has changed.

Business Intelligence Tools

Client products, typically residing on a PC, that act as the decision support systems (DSS) for the warehouse. These products provide the user with a means of viewing and manipulating the data.

Clean (data)

Process to check data for adherence to standards, internal consistency, referential integrity, and valid domains, and to replace or repair incorrect data with correct data; for example, replacing an invalid postcode with one derived from the state/city information. Data quality is checked and the data scrubbed by some combination of:
  1. look-up against valid data (e.g. a list of 20 million Australian mailing addresses)
  2. look-up against domain values (e.g. a list of valid state names)
  3. domain range checks (e.g. flagging employees younger than 15 or older than 90)
  4. consistency checks among table data
  5. pattern analysis of exceptions, correlations, and frequency distributions
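
A minimal Python sketch of two of these checks, a domain look-up and a domain range check, using hypothetical domain values and records.

    VALID_STATES = {"NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT"}

    def clean(record):
        # Domain look-up: repair case, reject values outside the domain.
        state = record["state"].upper()
        if state not in VALID_STATES:
            raise ValueError(f"invalid state: {record['state']}")
        record["state"] = state
        # Domain range check: flag implausible ages for later inspection.
        if not 15 <= record["age"] <= 90:
            record["suspect"] = True
        return record

    print(clean({"state": "vic", "age": 104}))
    # -> {'state': 'VIC', 'age': 104, 'suspect': True}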

 
Cleanse (data)

See Clean

Data Content Quality

The accuracy and validity of the actual values of the data, in contrast to issues of data structure and database design.

Data Mart

Separate, smaller warehouses, typically defined along an organisation's departmental needs. This selectivity of information results in better query performance and more manageable data. A warehouse is often built as a collection of data marts (functional warehouses), one for each of the organisation's business units.

Data Mining

"...a collection of powerful analysis techniques for making sense out of very large datasets." - R. Kimball

Data Modelling

The process of changing the format of production data to make it usable for heuristic business reporting. It also serves as a road map for the integration of data sources into the data warehouse.

Data Staging

"The data staging area is the data warehouse workbench. It is the place where raw data is brought in, cleaned, combined, archived, and eventually exported to one or more data marts." -R. Kimball

Data Transformation

Performed when data is extracted from the operational systems; may include integrating dissimilar data types, cleansing, summarising, and performing calculations.

Data Warehouse

An architected solution for making data available for business intelligence systems. It holds data from production (legacy) systems, together with external data, in a different environment (often a different database on a separate machine) to be used strictly for business analysis and querying, allowing the production machines to continue their traditional work of transaction processing.

Drill Down

The process of navigating from a top-level view of overall sales down through the sales territories to the individual salesperson level. This is an intuitive way to obtain information at the detail level.
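
A minimal Python sketch of the same navigation over hypothetical sales records: each step re-queries the data at a finer grain.

    # Hypothetical sales records: (territory, salesperson, amount)
    sales = [("North", "Lee", 120), ("North", "Kim", 80), ("South", "Rao", 95)]

    print(sum(a for _, _, a in sales))                  # top level: 295
    print(sum(a for t, _, a in sales if t == "North"))  # territory: 200
    print(sum(a for t, p, a in sales                    # salesperson: 120
              if (t, p) == ("North", "Lee")))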

DSS (Decision Support Systems)

Business intelligence tools that utilise data to form the systems that support the organisation's business decision-making process.

EIS (Executive Information Systems)

Business intelligence tools aimed at less sophisticated users who want to look at complex information without needing complete manipulative control over how it is presented.

Extract

Selects data from various source-system platforms. Provides the facility to:
  1. specify which data is to be extracted
  2. access the physical database

Extraction

Select and copy data from a source database or file.
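
A minimal Python sketch, assuming a hypothetical source file orders.csv with order_id and amount columns: only the rows and columns of interest are selected and copied.

    import csv

    # Extract: select and copy data out of a source file.
    with open("orders.csv", newline="") as source:
        extracted = [(row["order_id"], row["amount"])
                     for row in csv.DictReader(source)]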

Filter (data)

Process to check data for adherence to standards, consistency, and valid domains, then either clean or reject invalid data.
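
A minimal Python sketch, using hypothetical records: rows failing a domain check are rejected rather than repaired.

    records = [{"qty": 5}, {"qty": -2}, {"qty": 12}]

    valid = [r for r in records if r["qty"] >= 0]    # conforming rows pass
    rejected = [r for r in records if r["qty"] < 0]  # set aside for review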

Load

Add or replace data in the designated database(s).

Metadata

Data describing other data, for example the column headers in a table. Sometimes referred to as 'data about data'.
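
A minimal Python sketch of metadata for a hypothetical warehouse table, held separately from the data it describes.

    table_metadata = {
        "table": "sales_fact",
        "columns": {"sale_date": "DATE", "amount": "DECIMAL(10,2)"},
        "source": "order-entry system, nightly extract",
    }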

Merge (data)

Combine two or more data sets, either their values or their structures. See Abstract
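
A minimal Python sketch of merging the values of two hypothetical data sets on a shared key.

    customers = {1: {"name": "Acme"}, 2: {"name": "Binary Ltd"}}
    balances = {1: {"balance": 250.0}, 2: {"balance": 0.0}}

    # One record per key, combining the attributes of both sets.
    merged = {key: {**customers[key], **balances[key]} for key in customers}
    print(merged[1])  # -> {'name': 'Acme', 'balance': 250.0}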

Middleware

Software designed to establish a permanent relationship (including filtering and transformation) between source systems and a logical model. The logical model is then available as a virtual database to end-user query tools or a data migration product.

Migration

See Transport

OLAP (On-Line Analytical Processing)

Describes the systems used not for application delivery but for analysing the business, e.g., sales forecasting, market trends analysis, etc. These systems are also more conducive to heuristic reporting and often involve multi-dimensional data analysis capabilities.

OLTP (On-Line Transactional Processing)

Describes the activities and systems associated with a company's day-to-day operational processing and data (order entry, invoicing, general ledger, etc.).

Parameters

A list, database, or other information that controls a process, for example check boxes or values. Contrast with a "script" or "program".

Pump

A data pump extracts data from several mainframe and client-server platforms, performs some filtering and transformation, and distributes and loads it into one or more other databases. The term pump is usually used rather than "replicator" to connote its applicability in a cross-platform environment.

Query Tools and Queries

An application that sends native database commands, usually SQL, to extract information from a database server. Queries can either browse the contents of a single table or, using the database's SQL engine, perform joins that produce result sets combining data from multiple tables that meet certain selection criteria.
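
A minimal Python sketch of a join query against an in-memory SQLite database; the tables and data are hypothetical.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (customer_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Acme');
        INSERT INTO orders VALUES (1, 99.5);
    """)

    # The join condition combines rows from both tables into one result set.
    query = ("SELECT c.name, o.amount FROM customer c "
             "JOIN orders o ON o.customer_id = c.id WHERE o.amount > 50")
    for row in db.execute(query):
        print(row)  # -> ('Acme', 99.5)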

Replication

Extract data from several platforms, perform some filtering and transformation, and distribute and load it to another database or databases. The term replication usually implies limited or no transformation and movement within a homogeneous environment. See Pump

Reverse

Reverse engineering derives a consistent set of metadata from the metadata of several potential source systems.

Star Schema

A method of database design used by relational databases to model multidimensional data. A star schema usually contains two types of tables: fact and dimension. The fact table contains the measurement data, for example the salary paid, vacation earned, etc. The dimension tables hold descriptive data, for example names, addresses, etc.
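
A minimal Python/SQL sketch of the structure, with hypothetical table and column names: the fact table holds the measurements, and its foreign keys point at the descriptive dimension tables.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employee_dim (employee_id INTEGER PRIMARY KEY,
                                   name TEXT, address TEXT);
        CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY,
                               month TEXT, year INTEGER);
        -- Fact table: one row per measurement, keyed by the dimensions.
        CREATE TABLE payroll_fact (
            employee_id INTEGER REFERENCES employee_dim,
            date_id INTEGER REFERENCES date_dim,
            salary_paid REAL,
            vacation_earned REAL);
    """)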

Scrub (data)

See Clean

Transform (data)

See Abstract

Transport

Extract data from the source, interface with the destination environment, and load the data to the destination.

Warehouse

One or more databases containing an architected collection of subject-oriented data originating from both the company's transaction systems and external sources.

 

 
