ISDS 2001: Test 3 Comprehensive Notes (with Highlight)

CHAPTER 2: DATA WAREHOUSING
Objectives: After completing this chapter, you should know:
1. DATA WAREHOUSE: a pool of data produced to support decision making enterprise-wide.
a. A repository of current and historical data of potential interest to managers.
b. A DW provides a single version of the truth.
c. Data warehousing is a discipline that results in applications that provide decision support capability, allow ready
access to business information, and create business insight.
True/False: A DW differs from an operational database in that most data warehouses have a product orientation and are
tuned to handle transactions that update the database. (False: that describes an operational database; a DW has a
subject orientation.)
2. The FOUR MAJOR CHARACTERISTICS of data warehousing and what each means:
a. Subject-oriented:
Data are organized by topic (sales, products, customers) and contain only information relevant to decision making.
Provides a comprehensive view of the organization: how and why the business is operating.
b. Integrated:
Data from different sources are stored in a consistent format.
Units of measure and the naming/labeling of attributes are also made consistent.
(The assumption is that the data warehouse is totally integrated.)
c. Time-variant:
Holds data at various points in time; historic and current data are used to analyze trends and deviations, make
comparisons, and forecast outcomes.
The data do not necessarily reflect current status, except in a real-time system.
Every data warehouse should have a time variable.
(Example: LSU enrollment, retention, graduation data)
d. Nonvolatile:
Users cannot change data once it has been entered into the data warehouse. This ensures that the data warehouse is
almost exclusively available for access. Obsolete data can be deleted, and changes are recorded as new
data.
"Almost exclusively available for access": on the test!
Test note: one of the four characteristics was given as an example; pretty sure it was Integrated.
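The time-variant and nonvolatile characteristics above can be illustrated with a small sketch. It is only a toy model, not how a real warehouse is built: the `EnrollmentFact` and `Warehouse` names and the enrollment figures are hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified (nonvolatile)
class EnrollmentFact:
    as_of: date    # the time variable every row carries (time-variant)
    campus: str
    enrolled: int  # hypothetical figures, for illustration only


class Warehouse:
    def __init__(self):
        self._rows = []

    def load(self, row):
        # Changes are recorded as new rows; existing rows are never updated.
        self._rows.append(row)

    def history(self, campus):
        # Return every snapshot for one campus, oldest first.
        rows = [r for r in self._rows if r.campus == campus]
        return sorted(rows, key=lambda r: r.as_of)


dw = Warehouse()
dw.load(EnrollmentFact(date(2022, 9, 1), "Main", 1000))
dw.load(EnrollmentFact(date(2023, 9, 1), "Main", 1100))

# Because old snapshots are kept, a trend can be read across time.
trend = [r.enrolled for r in dw.history("Main")]
print(trend)
```

Keeping both snapshots, instead of overwriting the older one, is what makes trend analysis and forecasting possible.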
3. Additional characteristics of data warehousing:
a. designed for web-based usage/application
b. has relational/multidimensional structure
c. uses client/server architecture to provide easy access to end-user
d. for newer DWs, allows for real-time, up-to-the-minute, and active data access and analysis
e. contains metadata which is information that describes your data; data about data
4. The THREE MAIN TYPES of Data Warehouses and their definitions.
a. Data Mart: a subset of a data warehouse, consisting of a single subject area (marketing, sales,
customer satisfaction, inventory, production, etc.)
1) Dependent Data Mart: created directly from the data warehouse. This ensures that the user is
using/viewing the same data available to all other users. The EDW must be constructed first. High cost.
2) Independent Data Mart: a small warehouse designed for a department or strategic business unit
(SBU). Its source is not an EDW. Low cost.
Be able to give an example of a data mart.
Test Q was a fill in the blank: the answer was either dependent or independent (literally know the part in the
book that says "dependent data mart is ___" and then "independent data mart is ___"); can't remember
which one it was.
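A dependent data mart can be pictured as a single-subject slice taken directly from the warehouse. The sketch below uses Python's built-in `sqlite3` module and models the mart as a view; this is a simplification, since real data marts are often separate physical databases. The table name `warehouse_facts`, the view name `sales_mart`, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny stand-in for the enterprise data warehouse (hypothetical data).
conn.execute("CREATE TABLE warehouse_facts (subject TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "widget", 120.0),
        ("sales", "gadget", 80.0),
        ("inventory", "widget", 40.0),
    ],
)

# Dependent data mart: one subject area, sourced directly from the warehouse,
# so its users see the same data as everyone else.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT item, amount FROM warehouse_facts WHERE subject = 'sales'"
)

sales_rows = conn.execute(
    "SELECT item, amount FROM sales_mart ORDER BY item"
).fetchall()
print(sales_rows)
```

An independent data mart, by contrast, would be loaded from its own sources with no warehouse table behind it.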
b. Operational Data Store (ODS): a type of database often used as an interim (or staging) area for a data
warehouse, especially for customer information files (CIF).
Data are updated frequently throughout the course of business operations, as opposed to the static contents of a
data warehouse.
(Think of it as short-term memory: it holds fairly recent data and provides a near-real-time, integrated view of volatile data.)
c. Enterprise Data Warehouse (EDW): a large-scale data warehouse that is used across the enterprise/
company for decision support. Being large-scale, the EDW integrates data from many sources into a standard
format.
(Examples: DirecTV and Enterprise Rental used an EDW.)
5. The definition of METADATA and be able to identify examples
Metadata: data about data; describes the contents of a data warehouse, its structure (such as field name, data
type, default value, length), meaning, syntax, and the manner of its use.
Ex: Design view in Access or SQL view.
CASE 2.2
Test Q: What is an example of metadata? Number, revenue….
Which of the following describes how data are organized and how to use them effectively?
a. Data directory
b. Data indexing
c. Data profiling
d. Metadata
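The difference between metadata and data can be seen in a small `sqlite3` sketch, offered as an illustration only. The `sales` table and its single row are hypothetical. `PRAGMA table_info` is SQLite's way of reporting a table's structure, roughly what Design view shows in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, "
    "region TEXT DEFAULT 'unknown', "
    "revenue REAL)"
)
conn.execute("INSERT INTO sales (region, revenue) VALUES ('south', 250.0)")

# Metadata: field name, data type, and default value for each column.
# Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
metadata = [
    (name, col_type, default)
    for _cid, name, col_type, _notnull, default, _pk
    in conn.execute("PRAGMA table_info(sales)")
]

# Data: the actual values stored in the table.
data = conn.execute("SELECT region, revenue FROM sales").fetchall()

print("metadata:", metadata)
print("data:", data)
```

The value 250.0 is data; the fact that `revenue` is a field of type REAL is metadata.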
6. The MAJOR COMPONENTS OF THE DATA WAREHOUSE PROCESS and describe those components:
1) Data Sources: transaction data (OLTP) such as CRM, ERP, Access and SQL data; web logs from the internet;
external data (ex: census data); and legacy systems (i.e., outdated computer systems).
2) ETL (Extraction, Transformation, Load) Process: data are extracted from the sources, transformed, and
loaded using ETL software.
(ex: MS SQL Server SSIS)
3) Comprehensive Database (Enterprise Database with Metadata): used to support all decisions. Metadata are
maintained so that IT professionals can use them.
4) Data Marts (optional)
5) Middleware Tools: tools that access the contents of the data warehouse. These are the front-end applications
that users rely on to interact with the data, including data mining, queries, OLAP, predictive analyses, reporting, and
visualization tools
(ex: MS SQL MS, MS Excel with PowerPivot, and others).
Test Q: Which of the following are the major components of the DW process? Answer: only the components listed above.
Test Q: What is one of the main processes?
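The ETL step in the component list above can be sketched in a few lines. This is a minimal illustration using only Python's standard library, not a substitute for a tool such as SSIS. The two source lists, their field names, and the `sales` table are all hypothetical.

```python
import sqlite3

# Two hypothetical source systems that store the same facts differently.
crm_rows = [{"cust": "Ann", "amount_usd": 12.5}]
erp_rows = [{"customer_name": "Bob", "amount_cents": 990}]


def extract():
    # Extraction: pull raw rows from each source, tagged with their origin.
    yield from (("crm", row) for row in crm_rows)
    yield from (("erp", row) for row in erp_rows)


def transform(source, row):
    # Transformation: one naming scheme and one unit of measure (dollars).
    if source == "crm":
        return (row["cust"], float(row["amount_usd"]))
    return (row["customer_name"], row["amount_cents"] / 100)


def load(conn, records):
    # Load: write the cleaned records into the warehouse table.
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount_usd REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)


conn = sqlite3.connect(":memory:")
load(conn, [transform(source, row) for source, row in extract()])

loaded = conn.execute(
    "SELECT customer, amount_usd FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

The transform step is also where the "integrated" characteristic is enforced: different field names and units end up in one consistent format.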
7. Describe a TWO-TIERED AND THREE-TIERED ARCHITECTURE and know the advantages and disadvantages
of each.
Three-Tiered Architecture:
a. Client Workstation (Tier 1): allows the end user to request application functions (ex: Access), which, in turn,
request data content (ex: Access database files)
b. Application Server (Tier 2): responsible for execution of programs
c. Database Server (Tier 3): houses the database or data warehouse
Advantage: separates the application and database functions, allowing for greater capacity and performance of
the respective servers.
Disadvantage: Increased cost due to more hardware requirements
Two-Tiered Architecture:
a. Client Workstation (Client Tier): allows the end user to request both the application functions and data content
from one server.
b. Application Server and Database/Data Warehouse: run on the same server (hardware platform)
Advantage: More economical than three-tiered
Disadvantage: Can have performance problems for large DW using data-intensive applications
8. Understand the architecture of web-based data warehousing.
Advantage: Ease of access; platform independence; lower cost
Page loading speed is the major consideration
9. Server: computer hardware that provides a specific service used by other computers.
Application Server: computer hardware responsible for the efficient execution of procedures (programs).
Database Server: sometimes referred to as the back end; holds the database or data warehouse.
Client: software that allows users to request a server's content or function; the front end.