Healthcare Databases
 
   
 

15 people have rated this page an average of 4.47 out of five stars.  See their suggestions and questions.

Audio Lecture Evolution of Databases Lecture SWF version
Start Access and make a Table in Healthcare Databases

Start Access and Make a Table( SWF version)

Create Table by Entering Data( SWF version)

Introduction
Flat databases
Relational databases
Hierarchical databases
Distributed databases
Data-less databases
What do you know?
Presentations
Recently Asked Questions

More
Minute evaluation
George Mason University

Recently asked:  "How does a computer differentiate between a flat, relational and hierarchical databases whey you query the server for a specific data? " See answer to this and other questions.

Introduction to Evolution of Databases

At the core of all information systems are databases. Information needs to be stored, and this is what databases are for. In fact, databases not need to be computer systems, although we typically talk about databases in terms of ones stored and processed by computers. It seems crucial to ask a basic question in the beginning? What is a database?

Consider the following questions: Is an Excel table with students’ grades a database? Is your notebook a database? Is a phonebook a database? Is the GMU schedule of classes a database? Is a medical record of a patient a database? Is a list of nurses working in a hospital a database?

The answer to almost all of the above questions is yes, with the exception of a notebook which allows free text, drawings, etc. to be written onside.  More precisely, database is a collection of data with defined structure and purpose.   Obviously, a computer database is a database stored on a computer. In order to use databases, computers need to be equipped with a Database Management System (DBMS). Many commercial and publically available DBMSs are available.


Flat Databases

Flat data models have the least amount of structure.  They typically take the form of one large Table, where the first row is the list of the variables and subsequent rows are data.  Each case has a row of data. Many statisticians use flat models. 

Here is an example of a flat file for students in a class:

Student ID Name Midterm grade Final grade Address Zip code ... ... ..
4561 Ali Safaie B A 1311 Manor Park 22101 ... ... ..
7878 Mike Smith C B 1619 Ozkan Street 44115 ... ... ..
8954 Mike Smith Jr. A C 2121 Euclid 563 22101 ... ... ..
Table 1:  Example of a flat file

Advantages

Most software include free access to flat data files.  For a small number of cases, flat databases do a reasonably fast job.

Disadvantages

Flat databases waste computer storage by requiring it to keep information on items that logically cannot be available.  For example, if we are keeping information on zip codes, flat files require us to enter a missing information for zip codes in foreign countries.  Hierarchical and structural databases avoid this problem by defining different tables for classes of countries in which zip code is not available.  Thus flat files keep large sparse data full of missing information.  When the size of database is large, search through the data takes a long time.

Flat databases are not conducive to complicated search queries that divide the database further.  For example, in a flat file about students it would be difficult to find all students that live in a certain zip code and have received grade of C.  Such a query will require repeated pass through the data.  First pass may identify all students who live in a zip code, second pass may identify all students who have grade of C and the third pass may find students in both groups.  Such multiple passes through data are inefficient and take long time periods.  Since every simple if statement (if zip code is equal to 22101 then ...) in a computer program takes a fraction of a second, efficient search methods are important for large databases.

Relational Databases

In a relational data base, one stores a record with related fields as data.  All items on the same record are said to be related to each other.  Thus, one may store the information that patients have medical records in the following fashion:

  • Data item:  Patient ID number
  • Data item:  Medical record number
  • Data item:  First name

This information in essence says that the patient ID number, name and medical record number belong to the same person.  To effectively store both information items and the relationships among these items, relational data are kept in table formats.  The first row of the table shows the name of the variables and subsequent rows are data.   All items in the same row usually belong to the same case and are related.   One column of the table is treated as the key to the table.  Numbers or characters in this column corresponds to items in other Tables.  Thus it is possible to move from one table to another.

In a relational database, tables do not need to be of the same size but they all need to have a key column that connects them to each other and that uniquely defines the elements inside the table.

Example

Here is an example of a Table of grades for the example introduced under flat files:

Student ID
Key column
Name Mid-term Final
4561 Ali Ghadiri B A
7878 Mike Smith C B
8954 Mike Smith Jr. A C
Table 2: Table for "Students grades" 

This is an example of an additional table for contact information:

Student ID
Key column
Address Zip
8954 2121 Euclid 563 22101
4561 1311 Manor Park 22101
7878 1619 Ozkan Street 44115

Table 3:  "Students' contact information"

When a query is made, the relational database searches through its tables to find the answer.  Often the answer involves information pooled together from different tables.   For example, a query for names of people with grade of A, can be answered from the Table of Grades.  The query for grades of people in zip code area 22101 must be answered from both Tables. 

Can you try to answer this question:  What is the average final grade of the persons in zip code area 22101?  Did you notice how you move from one table to another to get your answer.  You use the key shared between the two tables, to find the relevant information in the other table.

Advantages

A relational database makes life difficult for the designer of the database but easy for the user of the database.  It is more difficult for the designer because many possible relationships must be anticipated.  It is easier for the user because data can be examined from many different perspectives. 

In addition, if Tables are appropriately defined it is possible to avoid having to enter missing information for variables that are not logically possible.  This helps data entry and data processing speed.

A relational database is also easy to modify because adding new concepts involves adding new Tables, not altering old ones.


Hierarchical Databases

Hierarchical database models are one type of relational data models in which the relationship between any two adjacent item is similar to a father-child relationship.   Hierarchical database models resemble a-cyclical tree structures (directional tree structures that do not include circular paths).    

Example

The file directory in your desk top is an example of a hierarchical database system.   A folder may contains other folders which may contain other files, which contain data. 

Advantages

In hierarchical models, children inherit the relationships and characteristics of their fathers.  An operation on the father affects the children; if you tell the computer to do an operation on a folder, the operation affects all the folders and files that it contains.  For example, if you delete the top folder, you would delete all of the folders and files it contains.  This feature saves time for the person maintaining the files.

Sometimes, when the real world relationship that is being modeled is not hierarchical, it is difficult to fit relationships into a hierarchical database model.  In these circumstances, an operation on the father should not be executed on the children.

Distributed Databases

Most databases physically reside in one place and are managed by a single computer.  They may be backed up to another place but the elements of the database are not maintained in different places.   In a distributed database, data are kept in different settings and on different computers.  One or several central computers maintain indexes to where the data are.   Using the address of the data then computers can communicate and find the information needed.

Distributed databases need not only addresses for where the data are but also need an audit trail of who has updated data or retrieved it.  Audit data are needed in order to pinpoint errors in the system and in order to understand where confidentiality of the system breaks down.  When a computer requests data from another, an audit trail is created by storing who sent data where and when.   When this computer passes the data to another, the information needs to be updated in the original computer.  As the number of computers receiving the data increases the task of auditing becomes more difficult.  At some point (at least theoretically), it is necessary to cutoff the original computer from being updated about where the information has traveled.

Example

Imagine a healthcare organization with multiple geographically distributed sites. Each site manages its own patients, but sometimes patients may visit ther locations (i.e., when on vacation). One way of implementing information system in such as organization is through distributed database. Each site can maintain data (i.e. medical and financial records) cocrresponding to own patients, while the data are accessible from all other sites.

Advantages and disadvantages

The information system manager must make decisions on whether to use centralized or de-centralized databases based on a number of issues including the following:

  • Decentralized databases exchange files and therefore may exchange corrupted files or viruses that may affect the entire system.  Security of these databases are difficult to maintain.  Though the section on Data-less information systems discusses how these systems can be used to create more confidential information exchanges.
  • In decentralized databases the type of data to be exchanged, the process of addressing the data, and the protocol for updating the data must be agreed upon ahead of time and plans must be in place for updating the process.
  • In centralized databases lack of backup or inadequate back up may result to complete loss of data while in distributed data systems data loss is limited to nodes affected.
  • De-centralized databases are more flexible and allow different units to update and maintain their own data.  At the same time, this increased flexibility runs the risk that some units may institute changes that may make them less accessible by others.
  • When different groups and systems are involved in maintaining the data, then there is more opportunity for differences in quality of data to emerge.  A decentralized database needs to have procedures for determining the quality (accuracy, recency, reliability, etc.) of the data.

Federated Databases

Federated databases are an approach to solve some of the problems with distributed datbases. Their goal is to provide transparent interface to distributed datbases. Users are able to access federated databases without knowing where physically required data are stored and in what formats.

In a federated database, all local databases maintain their autonomy. It is local databases that decide if to joint the federation, execute incoming queries and how. Privacy and security are maintained locally.  Federated databases are, thus, ideal for multiple organizations, i.e. healthcare providers, that wish to share information.

What Do You Know?

Advanced learners like you, often need different ways of understanding a topic. Reading is just one way of understanding. Another way is through writing. When you write you not only recall what you have written but also may need to make inferences about what you have read. Please complete the following assessment:

  1. List the advantages of flat, relational and hierarchical databases.

  2. Using the tables presented in this lecture, what are the zip codes of the students who got the grade of A.  Describe, step by step, which table you look up what datum you use to find the answer to the question.

  3. How does relational databases avoid entering missing information for variables which are logically missing. 

  4. Enter the data in Table 2 and 3 into Access.  See how you can start Access and make a tableSee how you can make a table by entering data in Access.  To report your work, take a picture of the screen by pressing control and print screen buttons.  Once the picture has been taken, go to Word document and paste it.

  5. Describe advantages and disadvantages of distributed and federated databases. 

If you are taking the course online, submit your responses as a word document attached to an email to your instructor, otherwise bring your work to class.  Do not copy-and-paste. Write answers in own words. 

Presentations

Following resources are available to assist you:

  1. See slides (or updated version of slides) or listen to narrated lecture on evolution of databases.  Listen to an Arabic translation of the lecture.

  2. To understand how to create tables inside Access see:

Narrated lectures and videos require use of Flash.  

Recently Asked Questions

Ask a question and we will answer it within the next 48 hours.  If you have no questions, please review the answer to the questions asked by others: 
 

Question: How can i merge cells in a row without altering the other rows in a table. Answer: If your goal is to display specific row with merged cells, this can be done by specific queries as described in section 8. It is, however, not possible to permanently modify table to have different cells in different rows. In databases, each row in a table always have the same cells (fields).  This question was asked on 9/7/2009 7:47:00 PM and answered on 9/8/2009 10:23:03 AM.

Question: For question #2 am I using the data tables from the flat databases example, relational database example, or both? Also, am I telling you how I find out the answer to your question in my head or how I would type the code into Access to get my answer? Answer: You should use the data tables from the relational database. In this lecture you do not need to know how to write queries in Access yet, so it is sufficient to describe steps.  This question was asked on 8/25/2008 8:53:06 AM and answered on 8/26/2008 10:42:34 PM.

Question: Why can relational databases allow for missing data that is logically not possible? Answer: Because they do not force the collection of such data. Since data are collected in separate tables, only relevant data is collected and irrelevant and inappropriate data is not collected at all. In contrast, in flat files inapprorpriate data needs to be collected or marked as not available.  This question was asked on 2/12/2006 5:29:14 PM and answered on 2/13/2006 4:30:31 PM.

Question: what is an index adn a cluster? Answer: I do not see where we have used the word cluster, so I am not sure the context on which you are asking this question. Index is used often in the first lecture and it is not used to designate any specific technical meaning. It is intended to mean a pointer to content of another web page.  This question was asked on 9/26/2005 10:31:59 AM and answered on 9/26/2005 8:27:52 PM.

Question: ARe there currently any data-less information systems being utilized in any industry sector? If so, how long was the implementation phase? Answer: I am not aware of such system in practice, though with growth of HIPAA regulations more of these systems should be expected. Of course, Google is a good example of dataless system.   This question was asked on 9/6/2005 5:24:52 PM and answered on 9/6/2005 9:02:34 PM.

Question: Regarding question #3 from the "What do you know" section: I feel that the lecture didn't elaborate on this topic very much, other than to say that the Tables need to be appropriately defined. Answer: There is a video that shows how tables are organized and we will get into this topic at much length later, when we discuss entity relationship diagrams  This question was asked on 9/6/2005 4:10:40 PM and answered on 9/6/2005 9:01:33 PM.

Question: Is it possible to have a database that is a combination of two (ie, say a relational AND hierarchical Database)? Answer: No. The definition of one excludes the other. For example a hirearchhial database does not allow other types of relationships and therefore cannot be a relational database or a flat database.  This question was asked on 9/5/2005 2:41:12 PM and answered on 9/5/2005 11:58:06 PM.

Question: Would the concept of a patient's electronic medical record within a system such as the Kaiser Permanente of California be an example of a data-less system? Answer: No. These medical records acutally have data in them. The best example of a dataless system is your browser. When you do a google search the data is assembled for you. You do not have access to the data before then.   This question was asked on 2/10/2005 9:28:12 AM and answered on 2/10/2005 2:54:46 PM.

Suggested Changes

Add your own suggestions or read below suggestions made by others regarding how to improve this session:
 

Comment: The lecture was very clear and understandable.   Left on 2/12/2006 5:23:40 PM

Comment: The lecture material and explanation of Table creation was clear. Practice in creating the tables was very helpful in learning how to create the tables.  Left on 9/1/2006 10:15:27 PM

Comment: Lesson was well organized and easy to understand. It would be easier to understand the issue of data redundancy for flat files vs relational databases if the examples used included say two sets of grades for one student.  Left on 9/7/2006 11:13:14 AM

Comment: The lecture was easy and informative.  Left on 8/29/2006 5:33:36 PM

Comment: The overview was understandable.  Left on 2/13/2005 11:18:09 AM

Comment: Presentation of topics is well organized, easy to understand, and without fluff.  Left on 9/5/2007 2:53:16 PM

Comment: test  Left on 1/15/2005 1:36:51 PM

Comment: So far, I like the class and the lecture was clear and easy to follow.  Left on 8/26/2008 9:07:59 AM

Comment: Enjoyed the lecture and the background material. Looking forward to improving my knowledge about databases through the course.  Left on 9/2/2008 8:44:37 AM

Comment: This chapter is well organized as it fulfills its objectvive of introducing the concept of databases in general.  Left on 9/3/2008 8:58:37 PM

[1/2]

More

  1. Relational database: a practical foundation for productivity
  2. Use of relational databases in health care
  3. Informatics Resources to Support Health Care Quality Improvement in the Veterans Health Administration.
  4. Integrating Query of Relational and Textual Data in Clinical Databases
  5. Veteran Administration information services
  6. Overview of the Veterans Health Administration (VHA) Quality Enhancement Research Initiative (QUERI).
  7. Arabic translation of lecture and the narrated lecture:
     

This page is part of the course on Healthcare Databases, the lecture on Evolution of Databases.  It was last edited on Saturday August 30, 2008 by Farrokh Alemi, Ph.D.  © Copyright protected.