CIS 111

Week 12 Notes: Files and Databases


12.1) All Databases Great & Small
  
  Database - collection of related files, organized so the data can be used in
  ways that traditional filing systems cannot support.
  
  As we saw in Access, databases can be queried in ways that other systems do
  not allow (sorting, selecting by specific criteria, cross-referencing).
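
  A minimal sketch of those three kinds of query, using plain Python lists in
  place of Access tables.  The student and grade data are made up for
  illustration.

```python
# Hypothetical sample data: two small "tables" as lists of dicts.
students = [
    {"id": 1, "name": "Ana", "major": "CIS"},
    {"id": 2, "name": "Ben", "major": "Math"},
    {"id": 3, "name": "Cy", "major": "CIS"},
]
grades = [
    {"student_id": 1, "course": "CIS 111", "grade": "A"},
    {"student_id": 3, "course": "CIS 111", "grade": "B"},
]

# Sorted: order the records by a field (here, name in reverse order).
by_name_desc = sorted(students, key=lambda s: s["name"], reverse=True)

# Specific criteria: keep only the records that match a condition.
cis_majors = [s for s in students if s["major"] == "CIS"]

# Cross reference: match records in one table to records in another.
names_by_id = {s["id"]: s["name"] for s in students}
report = [(names_by_id[g["student_id"]], g["course"], g["grade"]) for g in grades]
```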
  
  Shared databases - used by several people
  
  Distributed database - stored on more than one computer
  
  
  The Database Administrator - person in charge of database.  Creates and
  maintains the database.
  
  Jobs of the database administrator
  
  Database Design 
  
  Coordinate with users
  
  System security
  
  Backup and recovery
  
  Performance monitoring
  
  
  
12.2) The Data Storage Hierarchy and the Key Field

  Data Storage Hierarchy

  bit - binary digit, 0 or 1
  
  byte - 8 bits
  
  character - a single letter, number, or special symbol.  Usually represented
  by one byte.
  
  Field - unit of data consisting of one or more characters
  
  Record - a collection of related fields
  
  file - collection of related records
  
  Database - collection of related files
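
  One way to picture the hierarchy is to build it from the bottom up.  This is
  only a sketch: the field names and values are hypothetical, and the last
  lines show the usual idea of a key field (a field whose value identifies
  exactly one record).

```python
# bit -> byte -> character
char = "A"
byte_value = char.encode("ascii")[0]   # the one byte that stores "A"
bits = format(byte_value, "08b")       # that byte written as eight bits

# field: one or more characters
last_name = "Smith"

# record: a collection of related fields (field names are hypothetical)
record = {"id": "1001", "last_name": last_name, "major": "CIS"}

# file: a collection of related records
student_file = [record, {"id": "1002", "last_name": "Jones", "major": "Math"}]

# database: a collection of related files
database = {
    "students": student_file,
    "courses": [{"code": "CIS 111", "title": "Intro"}],
}

# key field: its value must be different for every record in the file
ids = [r["id"] for r in student_file]
key_is_unique = len(ids) == len(set(ids))
```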
  

12.3) File Handling Basic Concepts

  Program files - contain instructions
  
  Data files - files that contain data.  Remember, never use a word to define a
  word.
  
  Two types of data files
  
  Master file - contains relatively permanent records 
  
  Transaction files - temporary holding file that holds changes made to the 
  master file
	 
  Batch vs. Online Processing
  
  Batch processing - data is collected over a period of time before it is
  processed.
  
  Online processing - real time processing, data is processed immediately
  
  Q) Pros and Cons??
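
  To make the difference concrete, here is a small sketch of the same change
  handled both ways.  The account numbers, amounts, and the apply_change
  helper are all made up for illustration.

```python
# Master file: relatively permanent records, keyed by account number.
master = {"A1": 100, "A2": 50}

def apply_change(master_file, account, amount):
    """Apply one change to the master file."""
    master_file[account] = master_file.get(account, 0) + amount

# Online processing: each change is applied immediately as it arrives.
online_master = dict(master)
apply_change(online_master, "A1", -30)

# Batch processing: changes collect in a transaction file first.
batch_master = dict(master)
transaction_file = []
transaction_file.append(("A1", -30))
transaction_file.append(("A2", 25))

# Until the batch runs, the master file is unchanged.
before_batch = dict(batch_master)

# The batch run: process every held transaction, then empty the holding file.
for account, amount in transaction_file:
    apply_change(batch_master, account, amount)
transaction_file.clear()
```

  Note that before_batch still holds the old balances: with batch processing
  the master file is out of date between runs.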
  
  Storage
  
  Offline storage - data is not directly accessible for processing; it is saved
  on disk or tape
  
  Online storage - data is available for direct processing, under the direct
  control of the CPU
  
  
  File Organization
  
  Sequential File Organization - records stored one after another, in the order
  in which they were saved
  
  Direct File Organization - records stored in no particular order. Retrieved by
  a unique key.
  
  Indexed-sequential file organization - combination of the above.  Records are
  stored in order and retrieved by a unique key.
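
  A rough sketch of how lookup differs under the three organizations.  The
  records are hypothetical, a Python dict stands in for the key-to-location
  mapping of a direct file, and a sorted list of keys stands in for the index
  (both are assumptions, not how any particular system does it).

```python
import bisect

# Hypothetical records: (key, data), in the order they were saved.
records = [(17, "Lee"), (4, "Ana"), (42, "Raj"), (8, "Kim")]

# Sequential: read records one after another until the key is found.
def sequential_find(file_records, key):
    for position, (k, data) in enumerate(file_records):
        if k == key:
            return data, position + 1   # data plus number of records read
    return None, len(file_records)

# Direct: the key leads straight to the record, whatever the storage order.
direct_file = dict(records)

# Indexed-sequential: records kept in key order, plus an index of the keys
# that can be searched to jump to the right position.
isam_file = sorted(records)
index = [k for k, _ in isam_file]

def indexed_find(key):
    i = bisect.bisect_left(index, key)
    if i < len(index) and index[i] == key:
        return isam_file[i][1]
    return None
```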
  
  
12.4) File Management Systems

  File Management System - creates, stores, and retrieves files one at a time.
  
  Disadvantages of a File Management System -
  
  Data redundancy - the same data stored in more than one file.
  
  Lack of data integrity - Data is inaccurate, inconsistent
  
  Lack of program independence - data files work with only one particular
  program
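
  A small sketch of how redundancy leads to an integrity problem.  The two
  offices, the student ID, and the addresses are invented for illustration.

```python
# Two separate files each keep their own copy of a student's address
# (data redundancy).
registrar_file = {"1001": {"name": "Ana", "address": "12 Oak St"}}
billing_file = {"1001": {"name": "Ana", "address": "12 Oak St"}}

# The student moves, but only one file gets updated.
registrar_file["1001"]["address"] = "9 Elm Ave"

# The two copies now disagree (lack of data integrity).
inconsistent = (
    registrar_file["1001"]["address"] != billing_file["1001"]["address"]
)

# Storing the address once, and having both offices look it up,
# removes the chance of disagreement.
addresses = {"1001": "9 Elm Ave"}
registrar_view = addresses["1001"]
billing_view = addresses["1001"]
```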
  
12.5) Database Management Systems

  DBMS - controls the structure of a database, allows files to be cross 
  referenced
  
  Advantages 
  
  Reduced data redundancy
  Improved data integrity
  More program independence
  Increased user productivity
  Increased security
  
  Disadvantages
  
  Increased cost
  Data vulnerability - if someone breaks in, they have access to ALL the files.
  Privacy - anyone granted access may be able to see all the files.
  
12.6) Types of Database Organization
  
  Hierarchical database - resembles a family tree, objects are 'related' to 
  each other
  
  Network database - child can have more than one parent
  
  Relational database - data is stored in tables of rows (records) and columns
  (fields); tables are 'related' to each other through shared key fields
  
  Object Oriented Database - can handle graphics, video, and audio in addition
  to text and numerical data
  
  Current applications - Medical Information Systems, Engineering Information
  Systems, geographic databases, training and education.
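
  The first three organizations above can be sketched with ordinary Python
  structures.  This is only an illustration: the departments, courses, and
  the dept_id field are hypothetical.

```python
# Hierarchical: a tree, like a family tree; each child has one parent.
hierarchical = {
    "College": {
        "CIS Dept": {"CIS 111": {}},
        "Math Dept": {"MATH 101": {}},
    },
}

# Network: a child can have more than one parent.
# Here a made-up course is shared by two departments.
network_parents = {
    "CIS 111": ["CIS Dept"],
    "DATA 200": ["CIS Dept", "Math Dept"],
}

# Relational: separate tables, tied together by a shared field (dept_id).
departments = [
    {"dept_id": 1, "name": "CIS Dept"},
    {"dept_id": 2, "name": "Math Dept"},
]
courses = [
    {"code": "CIS 111", "dept_id": 1},
    {"code": "MATH 101", "dept_id": 2},
]

# Cross-reference the two tables through dept_id.
dept_names = {d["dept_id"]: d["name"] for d in departments}
joined = [(c["code"], dept_names[c["dept_id"]]) for c in courses]
```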
  
12.7) Features of a DBMS

  Data dictionary - a small database that stores data definitions
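
  As an illustration, SQLite (used here through Python's sqlite3 module, not
  one of the systems covered in class) keeps its data definitions where they
  can be read back.  The students table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL, gpa REAL)"
)

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, key flag)
definitions = [
    {"field": name, "type": col_type, "required": bool(notnull), "key": bool(pk)}
    for _, name, col_type, notnull, _, pk in conn.execute(
        "PRAGMA table_info(students)"
    )
]
conn.close()
```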
  
  Utilities - allow you to maintain a database by creating, editing, deleting
  data, records, and files.
  
  Query Languages - easy to use computer language for making queries.  SQL
  stands for Structured Query Language.  It is the query language used by most
  major DBMSs including Oracle, Sybase, dBase, Paradox, and Access.
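
  A short sample of what an SQL query looks like.  It is run here through
  Python's built-in sqlite3 module (SQLite is a stand-in, not one of the
  systems listed above), and the table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO students (id, name, major) VALUES (?, ?, ?)",
    [(1, "Ana", "CIS"), (2, "Ben", "Math"), (3, "Cy", "CIS")],
)

# The query: which students are CIS majors, sorted by name?
rows = conn.execute(
    "SELECT name FROM students WHERE major = ? ORDER BY name", ("CIS",)
).fetchall()
conn.close()
```

  The SELECT ... FROM ... WHERE ... ORDER BY pattern is the core of most
  queries: pick the fields, name the table, give the criteria, set the order.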
  
  Report Generator - produces an on-screen or printed document from all or
  part of the database.
  
  Access security - allows different levels of access to classified data.
  
  System Recovery - attempt to recover lost data
  
  mirroring - frequent simultaneous copying of the database
  
  reprocessing - DBA goes back to a known point before the failure and
  reprocesses the work from there
  
  rollforward - recreating the current database from a previous database state
  by reapplying the changes made since then
  
  Rollback - undo unwanted changes to get the database to a point before
  corruption
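
  Rollforward and rollback can be sketched as running a record of changes in
  opposite directions.  The backup, the change log, and its (key, old value,
  new value) layout are assumptions made for this example.

```python
# A known good earlier state, and a log of the changes made since then.
backup = {"A1": 100, "A2": 50}
change_log = [("A1", 100, 70), ("A2", 50, 75)]   # (key, old value, new value)

def rollforward(backup_state, log):
    """Start from the backup and reapply each logged change, oldest first."""
    db = dict(backup_state)
    for key, _old, new in log:
        db[key] = new
    return db

def rollback(current_state, log):
    """Start from the current state and undo each change, newest first."""
    db = dict(current_state)
    for key, old, _new in reversed(log):
        db[key] = old
    return db

current = rollforward(backup, change_log)
restored = rollback(current, change_log)
```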
  
12.8) Data Mining and Data Warehouses

  Data Mining - computer-assisted process of sifting through and analyzing
  vast amounts of data
  
  ex: - buying a house
  
  Preparing data for the Data Warehouse
  
  1) Data sources - gather the data
  2) Data fusion and cleaning - checked for errors and consistency of formats
  3) Data and meta-data - the meta-data shows the origin of the data
  4) Data Warehouse - contains cleaned up data and meta-data
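
  The four steps above can be sketched as a small pipeline.  The two source
  systems, their records, and the clean helper are hypothetical; a real
  warehouse load would involve far more checks.

```python
# 1) Data sources: gather records from two hypothetical source systems.
sales_source = [{"customer": " ana ", "amount": "19.99"}]
web_source = [
    {"customer": "BEN", "amount": "n/a"},
    {"customer": "Cy", "amount": "5"},
]

def clean(record):
    """2) Fusion and cleaning: fix formats, reject records with errors."""
    try:
        amount = float(record["amount"])
    except ValueError:
        return None   # amount is not a number, so drop the record
    return {"customer": record["customer"].strip().title(), "amount": amount}

warehouse = []
for source_name, source in [("sales", sales_source), ("web", web_source)]:
    for raw in source:
        cleaned = clean(raw)
        if cleaned is not None:
            # 3) Meta-data: note where each piece of data came from.
            cleaned["meta"] = {"origin": source_name}
            # 4) Data warehouse: holds the cleaned data plus its meta-data.
            warehouse.append(cleaned)
```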
  
  Software for finding and analyzing
  
  Query and reporting tools (ex. Focus Reporter and Esperant) - used to verify
  hypotheses
  
  Multidimensional Analysis Tools - "data surf" to explore all dimensions of a 
  particular subset of data
  
  Intelligent agents - computer programs that roam through networks performing
  complex work tasks for people.  As data tools, they can find correlations in
  data that are not obvious.
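
  A very simplified stand-in for "finding correlations": check every pair of
  columns and report the ones that move together.  The columns and numbers
  are invented, and the 0.9 cutoff is an arbitrary choice; real mining tools
  do far more than this.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns pulled from a data warehouse.
columns = {
    "ad_spend": [1, 2, 3, 4, 5],
    "sales": [2, 4, 6, 8, 10],
    "rainfall": [3, 1, 4, 1, 5],
}

# Check every pair of columns and keep the strongly correlated ones.
names = sorted(columns)
strong = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(pearson(columns[a], columns[b])) > 0.9
]
```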
  
12.9) Ethics of databases - concerns of Accuracy and Privacy

  Can't get the whole story
  Not gospel
  Know the boundaries
  Find the right words
  History is limited
  
  Privacy
  
  fund-raisers, tele-marketers
  Finances, Health, Employment, Communication
  
  
   GAME
    
  
Homework 12:
Due: 4/17 
in Interactive computing book
PP Ch. 1
1.24 #1 
Checked out during lab