Saturday, 22 May 2010

Data Warehouses and Transactional Databases

1.0 Introduction to Data Warehouses and Transactional Databases

A data warehouse is a kind of information system built on a particular theory of data storage. This theory emphasizes storing data in a special way so that the data can be collected, analyzed, and processed to produce valuable information and to support decisions based on it.

Data stored in a data warehouse are preserved: once stored, they do not change over time. The stored data must also carry a time attribute. A data warehouse usually contains a large amount of historical data, from which specific information is extracted using particular analysis methods.

A transactional database can be regarded as a collection of data that can be queried and updated automatically. Databases come in many models and store data in many forms, and they are widely used in every field, from the simplest data stores to very large-scale systems.

2.0 Data Warehouses

The data warehouse was proposed in 1990 by W.H. Inmon, the father of data warehousing. Its main function is to take the large volume of data that an organization has accumulated over the years through the on-line transaction processing (OLTP) of its information systems, store it in the structure specific to data warehouse theory, and organize it systematically so that it can be analyzed with methods such as on-line analytical processing (OLAP) and data mining. This in turn supports the building of decision support systems (DSS) and executive information systems (EIS). It helps decision makers obtain valuable information from large amounts of data quickly and effectively, so that they can set policy, react quickly to changes in the external environment, and build business intelligence (BI).

Generally speaking, a data warehouse can be built on a relational database or on a multidimensional database developed specially for data warehousing. If it is built on a multidimensional database, its structure can be either a star schema or a snowflake schema, consisting of several dimension tables and one fact table.
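As a rough illustration of the star schema described above, the following sketch uses Python's built-in sqlite3 module. The table and column names (dim_time, dim_department, fact_sales) and the sample rows are assumptions made up for this example, not taken from any particular product.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table that references them."""
    conn.executescript("""
        CREATE TABLE dim_time (
            time_id INTEGER PRIMARY KEY,
            year    INTEGER NOT NULL,
            month   INTEGER NOT NULL
        );
        CREATE TABLE dim_department (
            department_id INTEGER PRIMARY KEY,
            name          TEXT NOT NULL
        );
        -- The fact table holds the measure (amount) plus one ID per dimension.
        CREATE TABLE fact_sales (
            time_id       INTEGER NOT NULL REFERENCES dim_time(time_id),
            department_id INTEGER NOT NULL REFERENCES dim_department(department_id),
            amount        REAL NOT NULL
        );
    """)


def sales_by_department(conn):
    """Join the fact table to one dimension and total the measure per department."""
    rows = conn.execute("""
        SELECT d.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_department AS d ON d.department_id = f.department_id
        GROUP BY d.name
        ORDER BY d.name
    """).fetchall()
    return dict(rows)


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
build_star_schema(conn)
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)", [(1, 2010, 4), (2, 2010, 5)])
conn.executemany("INSERT INTO dim_department VALUES (?, ?)", [(1, "Retail"), (2, "Wholesale")])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 1, 100.0), (2, 1, 50.0), (2, 2, 75.0)],
)
totals = sales_by_department(conn)
```

A snowflake schema would differ only in that a dimension table such as dim_department would itself be split into further related tables.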

Building a data warehouse is not merely a matter of applying information tools and technology. Planning and implementing it also require a deep understanding of industry knowledge, marketing management, market orientation, and strategic planning. Only then can the data warehouse and the analysis tools built on it deliver their real value and improve the competitiveness of the organization.

2.1 Data Warehouse Characteristics

Subject-Oriented: Unlike a general OLTP system, the data model of a data warehouse classifies data by meaning into subject areas. This is called subject orientation. Examples of subject areas are Party, Arrangement, Event, and Product.

Integrated: Data from every OLTP system in the enterprise are combined and made consistent in the data warehouse.

Time-Variant: Changes to the data are recorded and tracked in the data warehouse, so that it can reflect how the data vary over time.

Nonvolatile: Once data have been confirmed and written, they are not replaced or deleted, even if they are wrong.
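The time-variant and nonvolatile characteristics can be sketched together as an append-only history table in which every row carries a date. This is a minimal sketch using Python's sqlite3 module; the balance_history table, its columns, and the account identifier are hypothetical names chosen for illustration.

```python
import sqlite3


def record_balance(conn, account, balance, recorded_on):
    """Append a new dated row; earlier rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO balance_history (account, balance, recorded_on) VALUES (?, ?, ?)",
        (account, balance, recorded_on),
    )


def balance_as_of(conn, account, date):
    """Return the latest balance recorded on or before the given ISO date, or None."""
    row = conn.execute(
        "SELECT balance FROM balance_history "
        "WHERE account = ? AND recorded_on <= ? "
        "ORDER BY recorded_on DESC LIMIT 1",
        (account, date),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data: two month-end snapshots of one account.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balance_history (account TEXT, balance REAL, recorded_on TEXT)"
)
record_balance(conn, "A-1", 100.0, "2010-01-31")
record_balance(conn, "A-1", 250.0, "2010-02-28")
```

Because dates are stored as ISO strings (YYYY-MM-DD), plain string comparison orders them correctly, so balance_as_of can answer questions about any past point in time.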

2.2 Data Warehouse Functions

A data warehouse can:
 Merge data from heterogeneous data sources into a single homogeneous structure.
 Organize data in simple structures for the efficiency of analytical queries rather than for transaction processing.
 Contain transformed data that are valid, consistent, consolidated, and formatted for analysis.
 Provide stable data that represent the history of the business.
 Be updated periodically with additional data rather than through frequent transactions.
 Simplify security requirements.
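To make the first function concrete, here is a small sketch in plain Python of merging records from two differently shaped sources into one common structure. Both source layouts (a branch system that stores amounts in cents with DD/MM/YYYY dates, and a web system that stores amounts as strings with ISO dates) and all field names are assumptions invented for the example.

```python
from datetime import datetime


def from_branch_system(record):
    """Convert a record from hypothetical source A (cents, DD/MM/YYYY dates)."""
    return {
        "customer": record["cust_name"].strip().title(),
        "amount": record["amount_cents"] / 100,
        "date": datetime.strptime(record["date"], "%d/%m/%Y").date().isoformat(),
    }


def from_web_system(record):
    """Convert a record from hypothetical source B (string amounts, ISO dates)."""
    return {
        "customer": record["customer"].strip().title(),
        "amount": float(record["total"]),
        "date": record["ordered_on"],
    }


def integrate(branch_records, web_records):
    """Merge both sources into one list with a single, consistent structure."""
    rows = [from_branch_system(r) for r in branch_records]
    rows += [from_web_system(r) for r in web_records]
    return sorted(rows, key=lambda r: r["date"])
```

After integration every row has the same keys, units, and date format, which is what allows later analytical queries to treat the data as one homogeneous set.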

3.0 Transactional Databases

Databases have many kinds of models. Examples used for large-scale data storage include the network database, relational database, object-oriented database, DAP, and hierarchical database.

A database is generally made up of a number of tables. In general, each column of a table represents one type of data, and the data themselves are stored in the rows of the table.

3.1 Advantages and Disadvantages
The advantages of a database:

 Reduces duplicated data (reduced redundancy)
 Combines data (integrated data)
 Maintains data integrity (integrity)
 Keeps data compatible
 Protects the security of data and personal information

The disadvantages of a database:

 Increases cost
 The system is complicated
 Needs frequent backups; otherwise an unexpected failure can cause serious damage
 Database administrators (DBAs) are not easy to train


3.2 Data Classification

Classified by data model:
 Hierarchical data model: Data are organized in a tree-like hierarchy. Put briefly, the data are treated as a set of tree structures.
 Network data model: A child element can be linked to one or more parent elements. Put briefly, the data are treated as a set of network structures (this model was developed first).
 Relational data model: Put briefly, the data are treated as a set of tables that are related to one another. Most databases at present are of this kind (probably because the idea is simple to implement in practice).
 Database: contains many tables.
 Table: contains many records.
 Record: contains many fields.

The advantages of integrating data:
1. Reduces duplicated data and improves sharing and integration.
2. Makes data independent of programs: for example, one database management system can jointly manage the data of several interconnected systems, and each program simply issues commands to the database management system.
3. Because the data are closely related and organized, user queries can be answered quickly.
4. Centralizes security and control: for example, centralized management of data improves control and security, and when the volume of data is large a database administrator can be appointed to maintain the database system.

3.3 Basic Functions of a Database Management System

Data definition: It must be able to define and manage data items of different types.

Data manipulation: It offers users the ability to operate on data (insert, update, query, and delete).

Data security: User accounts, passwords, and permissions can be set up to prevent data from being leaked or destroyed.

Data backup: It provides data backup, so that when damage occurs the database can be restored to the backed-up state and losses are reduced.

What the user interacts with is the application program; what actually touches the real data is the database management system. It hides the physical environment of the database and presents an abstract structure, so application programs can be designed without considering physical details such as disk tracks, pointers, and overflow locations. This simplifies design.

3.4 Database Management System (DBMS)

Separating the database management system (DBMS) from the application program has the following advantages.

First, it simplifies design. For example, in a distributed database, data storage is very complicated; without a DBMS, every program would have to know how all the data are stored.

Second, it strengthens consistency. All access to the database, including every restriction on access, is handled by a single DBMS, so a consistent view of the data is simple to maintain and to change.

Third, it provides data independence.

3.5 On-Line Transaction Processing System

In an On-Line Transaction Processing (OLTP) system, relational databases are best suited to managing data that change. Typically, these relational databases have several users performing transactions that change real-time data at the same time. Although an individual user's request for data usually references only a few records, many such requests are made at the same time.

OLTP databases are designed to let transactional applications write only the data needed to handle a single transaction, as quickly as possible. OLTP databases generally do the following:
 Support large numbers of concurrent users who are regularly adding and modifying data.
 Represent the constantly changing state of an organization, but do not store its history.
 Contain a lot of data, including extensive data used to verify transactions.
 Have complex structures.
 Respond quickly to transaction activity.
 Provide the technology infrastructure to support the day-to-day operations of an organization.
 Complete individual transactions quickly, each accessing a relatively small amount of data. OLTP systems are designed and tuned to process hundreds or thousands of transactions entering the database at the same time.

The data in OLTP systems are organized primarily to support transactions such as the following:

 Recording an order from a point-of-sale terminal or a Web site.
 Placing an order for more supplies when inventory drops to a specified level.
 Tracking components as they are assembled into a final product in a factory.
 Recording employee data.
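The first two kinds of transaction listed above can be sketched as a single short, atomic unit of work. The sketch below uses Python's sqlite3 module; the stock and orders tables, the widget product, and the REORDER_LEVEL threshold are hypothetical and chosen only for illustration.

```python
import sqlite3

REORDER_LEVEL = 5  # hypothetical stock level at which more supplies are ordered


def place_order(conn, product, quantity):
    """Record an order and reduce stock as one transaction.

    Returns True when the remaining stock has dropped to the reorder level.
    Raises ValueError (and rolls back both writes) if stock would go negative.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (product, quantity) VALUES (?, ?)",
            (product, quantity),
        )
        conn.execute(
            "UPDATE stock SET quantity = quantity - ? WHERE product = ?",
            (quantity, product),
        )
        (remaining,) = conn.execute(
            "SELECT quantity FROM stock WHERE product = ?", (product,)
        ).fetchone()
        if remaining < 0:
            raise ValueError("insufficient stock")
    return remaining <= REORDER_LEVEL


# Hypothetical starting state: ten widgets in stock, no orders yet.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, quantity INTEGER)")
conn.execute("CREATE TABLE orders (product TEXT, quantity INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()
```

Each call touches only a handful of rows and either completes fully or leaves no trace, which is the behavior an OLTP system is tuned for.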

4.0 Data Warehouses vs. Transactional Databases

In brief, a transactional database is designed around transactions, whereas a data warehouse is designed around subjects. A transactional database generally stores on-line transaction data; a data warehouse generally stores historical data.

Database design tries hard to avoid redundancy and generally follows the rules of the normal forms. Data warehouse design tends to introduce redundancy deliberately and is denormalized.
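The contrast between a normalized and a denormalized design can be sketched as follows with Python's sqlite3 module. The customers, orders, and order_facts tables and their contents are hypothetical examples, not a prescribed layout.

```python
import sqlite3


def build_denormalized(conn):
    """Copy the joined result into one wide table, repeating customer details per order."""
    conn.execute("""
        CREATE TABLE order_facts AS
        SELECT o.order_id,
               c.name AS customer_name,
               c.city AS customer_city,
               o.amount
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
    """)


# Normalized source tables: each customer's details are stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO customers VALUES (1, 'Alice', 'Taipei'), (2, 'Bob', 'Hong Kong');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 15.0);
""")
build_denormalized(conn)

# The analytical query needs no join because the city is already on every row.
city_totals = dict(conn.execute(
    "SELECT customer_city, SUM(amount) FROM order_facts GROUP BY customer_city"
).fetchall())
```

The redundant copy makes analysis simpler and faster to query, at the price of repeating the customer's name and city on every order row.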

A database is designed to capture data; a data warehouse is designed to analyze data. The two basic elements of a data warehouse are dimension tables and fact tables. A dimension is the angle from which a problem is viewed, such as time or department, and the dimension tables hold the definitions of these things. The data to be queried are held in the fact table, together with the IDs of the dimensions.

This is a little hard to understand from the concepts alone. Every technology exists to serve an application, and it becomes much easier to understand in the context of one. Take banking as an example. The database is the data platform of the transaction system: every transaction a customer makes at the bank is written to the database and recorded, so here the database can be understood simply as bookkeeping. The data warehouse is the data platform of an analytical system: it obtains data from the transaction system, then aggregates and processes them to give decision makers a basis for their decisions. For example, how many transactions took place at a given branch in a given month, and what is that branch's current deposit balance? If deposits are high and consumer transactions are numerous, it may be worth setting up an ATM in that area.
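The branch question in the banking example can be sketched as an aggregate query. The sketch below uses Python's sqlite3 module; the transactions table, the branch names, and the convention that deposits are positive and withdrawals negative are all assumptions made for this illustration.

```python
import sqlite3


def monthly_branch_summary(conn, month):
    """Return (branch, transaction count, net amount) rows for a 'YYYY-MM' month."""
    return conn.execute(
        "SELECT branch, COUNT(*), SUM(amount) FROM transactions "
        "WHERE substr(made_on, 1, 7) = ? "
        "GROUP BY branch ORDER BY branch",
        (month,),
    ).fetchall()


# Hypothetical transaction records copied from the transaction system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (branch TEXT, made_on TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("North", "2010-04-03", 200.0),
        ("North", "2010-04-20", -50.0),
        ("South", "2010-04-11", 500.0),
        ("North", "2010-05-02", 75.0),
    ],
)
april = monthly_branch_summary(conn, "2010-04")
```

A decision maker could compare such per-branch counts and totals across months to judge, for instance, where an additional ATM might be justified.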

Obviously, a bank's transaction volume is enormous, usually counted in the millions or even tens of millions. The transaction system is real-time, which demands timeliness: a customer cannot be expected to wait dozens of seconds to deposit a sum of money, so the database stores only data for a very short period. The analytical system works after the fact, and it must provide all the valid data within the time span of interest. These data are massive, and aggregate calculations run somewhat more slowly, but the goal is achieved as long as valid analytical data can be provided.

The data warehouse arose, once large numbers of databases already existed, from the need to exploit data resources further in order to make decisions. It is by no means a so-called "large-scale database". How, then, does a data warehouse differ from a traditional database? Let us first read W.H. Inmon's definition of the data warehouse: a subject-oriented, integrated, time-variant, and nonvolatile collection of data.


5.0 Conclusion

A data warehouse has multidimensional data structures, whereas an OLTP (on-line transaction processing) system has a 3NF (third normal form) data structure. The multidimensional design of the data warehouse enables end users with different interests to look at the data from different cross-sections. The 3NF data of OLTP enable efficient data manipulation and simple queries.

Database tables in a data warehouse have many more indexes than those in OLTP. Indexes enhance the efficiency of data analysis, which involves large amounts of data in a data warehouse, whereas too many indexes in OLTP may hurt the performance of data manipulation.

Data redundancy is common in a data warehouse, because it improves the efficiency of data analysis. On the other hand, data redundancy is rare in OLTP, which facilitates data manipulation.

Data in a data warehouse are historical data (see chapter 2) and include derived and aggregated values of raw data. Data in OLTP are current raw data and rarely include derived or aggregated values.






