FIT5195 Study Guide - Final Guide: First-Order Logic, Predicate Variable, Query Language

orangewildebeest422
2 Jul 2018
School: Monash University
Department: Information Technology
Course: FIT5195
Professor: All

1) Big Data; Data Warehouse

Big Data is a collection of data so large that it requires specialised database management systems to analyse it and extract useful insights. Analysing this data and drawing insights from it is known as Big Data Analytics.

A data warehouse is a centralised repository of organisational data sourced from a number of original data sources such as internal information systems, external data feeds and other supplementary data as required. The purpose is to support organisational decision-making by providing an integrated data source for business intelligence and other decision support applications.

When we compare big data to a data warehouse, we find that a big data solution is a technology, whereas data warehousing is an architecture. They are two very different things. A technology is just that: a means to store and manage large amounts of data. A data warehouse is a way of organising data so that there is corporate credibility and integrity. When someone takes data from a data warehouse, that person knows that other people are using the same data for other purposes.

2) Enterprise Data Warehouse; Independent Data Mart

An enterprise data warehouse is a unified database that holds all the business information of an organisation and makes it accessible across the company. This is the simplest of the data warehouse architectures: it follows a basic flow in which data from all internal sources goes through an ETL (extract, transform, load) process and is loaded into the data warehouse.
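As a rough illustration of that flow, the sketch below pushes two internal sources through extract, transform and load steps into one warehouse. The source systems, field names and the use of a plain dictionary as the "warehouse" are all assumptions made for the example, not part of the course material.

```python
# Hypothetical internal source systems; the record layouts are invented.
sales_system = [
    {"order_id": 1, "customer": "Alice", "amount": "120.50"},
    {"order_id": 2, "customer": "Bob", "amount": "80.00"},
]
hr_system = [
    {"emp_id": 10, "name": "Carol", "dept": "Sales"},
]


def extract(source):
    """Extract: read every record from one internal source."""
    return list(source)


def transform_sales(rows):
    """Transform: convert amounts stored as text into numbers."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


def load(warehouse, table, rows):
    """Load: append the rows to the named table in the warehouse."""
    warehouse.setdefault(table, []).extend(rows)


# A plain dict stands in for the single, unified warehouse (an assumption).
warehouse = {}
load(warehouse, "sales", transform_sales(extract(sales_system)))
load(warehouse, "employees", extract(hr_system))

# Every user now queries the same central store.
total_sales = sum(row["amount"] for row in warehouse["sales"])
print(total_sales)
```

The point of the sketch is only the shape of the flow: every source passes through ETL once and ends up in one shared repository.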

On the other hand, an independent data mart is a decentralised, stand-alone system built by drawing data directly from operational sources, external sources, or both. Each data mart is a miniature data warehouse that supports the requirements of a particular user group in the organisation.

In terms of complexity, both architectures are complex. An enterprise data warehouse looks architecturally simple, but it needs a very large warehouse, and dealing with a number of external sources and loading them into one repository makes it a complex development project. Independent data marts, by contrast, involve multiple ETL processes (at least one for each data mart), and handling data quality across them becomes very complex. However, the independent data mart is the most used architecture as well as the most successful one.

3) Federated Data Mart Architecture; Dependent Data Mart

A dependent data mart is a centralised data warehouse architecture that combines the enterprise data warehouse and the independent data mart. It extracts data from external sources into a single repository and then divides that data into data marts, one for each user group's needs.
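A minimal sketch of the second step, splitting the single repository into per-group data marts, might look like the following. The rows, the `dept` field and the `build_data_marts` helper are hypothetical and exist only to show the idea.

```python
from collections import defaultdict

# Hypothetical central repository, assumed to be already loaded by ETL.
warehouse = [
    {"dept": "sales", "item": "laptop", "amount": 1200},
    {"dept": "marketing", "item": "campaign", "amount": 300},
    {"dept": "sales", "item": "mouse", "amount": 25},
]


def build_data_marts(rows, key):
    """Split warehouse rows into one data mart per user group."""
    marts = defaultdict(list)
    for row in rows:
        marts[row[key]].append(row)
    return dict(marts)


# Each mart depends on the warehouse: it holds only rows taken from it.
data_marts = build_data_marts(warehouse, "dept")
print(sorted(data_marts))
print(len(data_marts["sales"]))
```

Because every mart is derived from the same repository, the marts cannot disagree with each other about the underlying data, which is the main contrast with independent data marts.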

A federated data mart architecture is similar to a dependent data mart except that there is no physical warehouse in the layer between the external sources and the individual data marts. Instead, there is a virtual temporary storage area where the data is placed after the ETL process before it is passed on to the data mart layer.

Dependent and federated architectures serve the same function, but the dependent data mart is expensive because it incurs the cost of both an enterprise data warehouse and independent data marts, whereas the federated architecture avoids the cost of a physical warehouse. Theoretically, dependent data marts are the most efficient architecture. However, if a federated architecture is successfully implemented (considering the technical possibilities of a virtual database), it is the architecture most recommended by Kimball.

4) Subject-Oriented; Integrated

Integrated Data Warehouse: An integrated data warehouse combines data from multiple sources, which is cleansed and integrated so that it is presented in a single form. Since the data comes from several operational systems, all the inconsistencies must be removed. These inconsistencies include naming conventions, measurement of variables, encoding structures, physical attributes of data and so forth. For example, data can be pulled from the sales and marketing departments and put in the data warehouse in order to get total yearly revenue. There will be a single definition of revenue for all departments.
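The revenue example can be sketched as follows. The two extracts, their field names, the region codes and the cents-versus-dollars difference are invented to show the kinds of inconsistency (naming, measurement, encoding) that integration has to remove.

```python
# Hypothetical extracts: each department uses its own field names,
# region encodings and currency units.
sales_rows = [{"rev_usd": 1500.0, "region": "N"}]
marketing_rows = [{"revenue_cents": 250000, "region": "north"}]

# One agreed encoding for regions (an assumption for this sketch).
REGION_CODES = {"N": "north", "S": "south", "north": "north", "south": "south"}


def integrate_sales(row):
    """Rename the field and normalise the region code."""
    return {"revenue": row["rev_usd"], "region": REGION_CODES[row["region"]]}


def integrate_marketing(row):
    """Convert cents to dollars, rename the field, normalise the region."""
    return {
        "revenue": row["revenue_cents"] / 100,
        "region": REGION_CODES[row["region"]],
    }


integrated = [integrate_sales(r) for r in sales_rows]
integrated += [integrate_marketing(r) for r in marketing_rows]

# A single definition of revenue now applies to every department.
total_yearly_revenue = sum(r["revenue"] for r in integrated)
print(total_yearly_revenue)
```

Without the conversion step, adding 1500.0 dollars to 250000 cents would give a meaningless total, which is why cleansing has to happen before the data is combined.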

Subject-Oriented Data Warehouse: This relates to the design of the data warehouse schema. A warehouse, as a tool for decision support, is designed to view the data from the perspective of a decision-making, managerial user. This is inherently a data-oriented or subject-oriented perspective, focused on concepts like customers, products and suppliers. This makes it easier to answer business questions like “How many customers do we have?”

In the same way that a subject-oriented view necessarily cuts across artificial data schema boundaries, it also requires an integrated view of data that cuts across system and business unit boundaries.
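To make the "How many customers do we have?" question concrete, here is a small sketch that builds a customer subject from two application-oriented systems. The billing and support records and the `build_customer_subject` helper are hypothetical.

```python
# Hypothetical application-oriented records from two separate systems.
billing_system = [
    {"invoice": 501, "customer_id": "C1"},
    {"invoice": 502, "customer_id": "C2"},
    {"invoice": 503, "customer_id": "C1"},
]
support_system = [
    {"ticket": 9001, "customer_id": "C2"},
    {"ticket": 9002, "customer_id": "C3"},
]


def build_customer_subject(*systems):
    """Collect one entry per customer, regardless of source system."""
    customers = {}
    for system_name, rows in systems:
        for row in rows:
            customers.setdefault(row["customer_id"], set()).add(system_name)
    return customers


customers = build_customer_subject(
    ("billing", billing_system),
    ("support", support_system),
)

# The business question is answered from the subject, not from any one system.
customer_count = len(customers)
print(customer_count)
```

Neither source system could answer the question alone: billing knows nothing about C3 and support knows nothing about C1.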

5) Time Variant; Non-Volatile

Time Variant: Most transaction databases are designed with little consideration given to the temporal aspects of the data contained within them. For example, if a customer changes an address, a transaction-processing database will often not keep track of the previous address and will only provide facilities for recording the new, current address. With a data warehouse, however, there is often a requirement to analyse how data changes over time: in other words, to preserve the “time-variant” nature of the data. This led to the emergence of the time-variant data warehouse. A key aspect of the ability to analyse the temporal aspect of data is the capability to reproduce historically accurate reports. Overwriting data with
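One common way to keep the previous address instead of overwriting it is to store each version with a validity period. The sketch below assumes that approach; the `valid_from` and `valid_to` fields, the sample addresses and the `address_as_of` helper are illustrative names, not something the guide prescribes.

```python
from datetime import date

# Hypothetical address history: a change of address adds a row
# rather than replacing the old one.
address_history = [
    {
        "customer_id": "C1",
        "address": "1 Old Street",
        "valid_from": date(2015, 1, 1),
        "valid_to": date(2017, 6, 30),
    },
    {
        "customer_id": "C1",
        "address": "9 New Road",
        "valid_from": date(2017, 7, 1),
        "valid_to": None,  # None marks the current address
    },
]


def address_as_of(history, customer_id, when):
    """Return the address that was valid for the customer on a given date."""
    for row in history:
        if row["customer_id"] != customer_id:
            continue
        started = row["valid_from"] <= when
        not_ended = row["valid_to"] is None or when <= row["valid_to"]
        if started and not_ended:
            return row["address"]
    return None


old_address = address_as_of(address_history, "C1", date(2016, 3, 15))
current_address = address_as_of(address_history, "C1", date(2018, 7, 2))
print(old_address, "|", current_address)
```

A report for March 2016 can then be reproduced with the address that applied at the time, which would not be possible if only the current address were stored.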
