Massachusetts Institute of Technology - Department of Urban Studies and Planning


11.521
Spatial Database Management and Advanced Geographic Information Systems
11.523
Fundamentals of Spatial Database Management

Relational Database Design--from E-R Diagram to SQL Statements

March 15, 2018


Administrative

Today:


URISA proceedings database: Another example to practice relational thinking

ER diagram of URISA schema


Introduction to Relational Database Design (based on Lecture by Tom Grayson)


Mini WEBSIS -- A Step by Step Example (developed by Jinhua Zhao)

Let's design a successively more complicated database schema for managing teacher/student/course data

First, Let's look at the student records portion of MIT's Data Warehouse

Next, let's build our model of subject enrollment beginning with a simple data model and then adding complexity:

Version I:  Simplest Model - keeping track of classes, students, and teachers

Entities: Teachers, Students, Courses
Attributes:
Teachers: ID, Name, Email, Office, Phone...
Students: ID, Name, Department, Year, Email, Phone...
Courses: ID, CoureNumber, Name, Desription, WWW, Level, Credit...
Relationships:
Students vs. Courses: Many to Many Relationship
Teachers vs. Courses: Many to Many Relationship       
Matchup Tables
Students-Courses matchup table
Teachers-Courses matchup table

Constraints:
Primary Key
Foreign Key
Schema:
Schema Version I

Sample queries

1. How many graduate students are there in 11.521?
2. What classes does Joe Ferreira instruct?

Remark I: Always consider what we can do with the database (thinking about possible queries) even when we are still at the preliminary stage of the database design.


3. Which course has more than two instructors?
4. How many course credits has Julie taken? Among these, how many are high-level credits? Has she satified the graduate credits requirement for her MCP degree, which is defined as at least 124 total credits including at least 48 high-level credits?

Remark II: Usually only one to many relationships exist in the final E-R diagram.

-- For one to one, integrate the tables into one unless special cases such as 1) too big table, 2) permission issues like, table "teachers" vs. table "teachers_confidendial".
-- For many to many, use matchup table to break it to two one-to-many relationships.

Version II: Add Consideration of Classroom

New Entities: Place
Attributes:
Place: Building, Room, Type, Size
New Relationships:
Places vs. Courses: Many to Many Relationship
Schema:
Schema Version II

Sample queries

1. How many, and which courses, utilize computer labs?
2. Has Jaechoel ever been assigned to Room 37-312 for any coureses?
3. Joe is teaching another class--11.220 QR, can room 4-234 hold all the students in this course?

Remark III:  When constructing a query, first go through the logic flow using the E-R diagram.

Version III: Add Consideration of Student Grades

New Entities:  None
New Attributes: Grade Score, in which table?
New Relationships: None
Schema:
Schema Version III

Sample queries:
1. Has Jing Su taken course 11.201? If so, what is her score in it?
2. How many people have taken any courses tought by Joe and received an A? Has anybody ever failed in his classes?

Remark IV:  A new entity, or a new attribute? If it is attribute, what is it specific to, i.e., what table is it dependent on?


3. What is Jeff's GPA? --credit-weighted average grade points
This is a pretty hard question--in what table do we find scores? how do we transfer them to grade points? where do we find the credits for the course? how do we compute the weighted average?
Lookup Table: Score-Grade Point Transform
Schema Version III-2

Remark V:  Learn to use look up table otherwise the 3rd normal form of database design is violated.

By the way, what if we consider teachers' evaluation? where do we put this information? We can argue either way--evaluation is course specific or professor specific or course-professor specific?
Schema Version III-3

Version IV: How about TAs? Are they a special element of the course?

New Entities:  We might add in TA entity but not Necesssarily.
New Attributes: None
New Relationships: Yes
TA is also a student. It is just a special relation between courses and students.
Many-many relationship

Schema:
Schema Version IV

Remark VI: Again, what's necessary? A new entity or a new relationship
Sample queries
1. Is Jinhua a TA for Jane or Tom?  (self-join)

Remark VII:Is there going to require a self-join in our SQL query?

2. If it is TA's responsibility to reserve a computer lab for the class, does Jinhua have to do it for 11.220?
3. With whom (instructors) has Jinhua ever worked as a TA?
4. After 9/11, U.S. INS strengthened the requirement of international students being registered as full time students. ISO tells them that they have to work or study more than 24 hours per week to be regarded as a full time student. Does Jinhua satify this requirement? Suppose one course credit (either studying or TAing) accounts for one hour work per week.
5. Difficult question--is there such a case that student A works as TA for student B in one course while student B works as a TA for student A in another course? If so, who are they and what are the courses?

Remark VIII: Extend the database step by step.

Remark IX:  Quite often, the potential power of the database could go beyond what we expected when we designing it.

Further Enhancement--How to deal with time?

All the above questions lack any consideration of time.
1. How many credits of courses has Julie taken? --> What if we ask, how many credits of courses has Julie taken in year 2002?
2. Has Jaechoel ever been to Room 37-312 to take any coureses?--> Was Jaechoel there or supposed to be there in Room 37-312 on March 4th, 2003?
3. Did Joe offer any courses in Fall 2003 when he was on sabbatical?

Question1 : Where do we put "TIME" in? Is it an entity, attribute or relationship? To what extent, do we want the time to be specified?

Questino2: Up to now, the whole structure is centered on "COURSE"? What if when we want to consider the relationship of "Advisor and Advisee". A moment ago, we said there is not a direct link between teachers and students, is it true?

Question3: How about RAs? What links students to 
professors as RA? --Projects. We could create a new entity of projects and match up table of stu_project and tea_project.

Down to the earth--SQL Statement

1. Table Structure Setup

Create Tables

--create table
create table students

    (studentid     number(9,0),
     fname         varchar2(20),
     lname         varchar2(20),
     department    varchar2(30),
     year          number(1,0),
     email         varchar2(30),
     phone         varchar2(20)
    );

 
Remark X:  Think about the data type. Tradeoffs to be made.

Department:
Number: 1, 2, 3, ..., 24--easy to standandized but hard to inteprete, look up required
String: DUSP, CIVIL, TPP, ... --easy to understand but hard to standandize, problem of multiple, slightly different names

Phone:
Number with country code, area code and telephone number: easy to utilize but not flexible
String as a whole: full flexibility but unable to check the rules, hard to utililize the number.

Specify Primary Key

--specify primary key
ALTER TABLE students ADD CONSTRAINT pk_studentid PRIMARY KEY (studentid);

Create Table with Constraints

--or create table with constraints
drop table students;
create table students
    (studentid     number(9,0) CONSTRAINT pk_students PRIMARY KEY,
     fname         varchar2(20),
     lname         varchar2(20),
     department    varchar2(30),
     year          number(1,0),
     email         varchar2(30),
     phone         varchar2(20)
);

Specify Primary Key (Multi-columns)

ALTER TABLE stu_cou
    ADD CONSTRAINT pk_stu_cou PRIMARY KEY (studentid,courseid);

Specify Foreign Key

ALTER TABLE tea_cou
    ADD CONSTRAINT fk_teacherid
    FOREIGN KEY (teacherid)
    REFERENCES teachers (teacherid);

Drop Tables

--drop table
drop table students CASCADE CONSTRAINTS;


Complete SQL Statement See file "create.sql"

2.Records Input and Modification 

Insert Records

--insert data
insert into students
(studentid, fname, lname, department, year, email, phone)

values (912384234, 'Michael','Sable','DUSP',5,'msable@mit.edu','1-617-234-5678');

Delete Records

--Delete Records
delete from students
where studentid=912384234;

Modify Records

--Modify Records
UPDATE students

SET studentid = 912384233, phone='1-617-234-5679'
WHERE fname='Michael' and lname='Sable';

Complete SQL Statement See file "insert.sql"


Remark II:  Command Summary

Table Related:

Record Related:

3.Constraint Test

Insert Records with Duplicate Primary Key
SQL> insert into students
  2  (studentid, fname, lname, department, year, email, phone)
  3  values (912384234, 'Jacky','Smith','DUSP',3,'jacks@mit.edu','1-617-234-5623');
insert into students
*
ERROR at line 1:
ORA-00001: unique constraint (JINHUA.PK_STUDENTS) violated

Wrong Data Type or Data Length

SQL> insert into students
  2  (studentid, fname, lname, department, year, email, phone)
  3  values (912384233, 'Jaecheol','kim','DUSP',G,'jaecheol@mit.edu','1-617-234-5238');
values (912384233, 'Jaecheol','kim','DUSP',G,'jaecheol@mit.edu','1-617-234-5238')
                                           *
ERROR at line 3:
ORA-00984: column not allowed here

SQL> insert into students
  2  (studentid, fname, lname, department, year, email, phone)
  3  values (9123842275, 'Jaecheol','kim','DUSP',5,'jaecheol@mit.edu','1-617-234-5238');
values (9123842275, 'Jaecheol','kim','DUSP',5,'jaecheol@mit.edu','1-617-234-5238')
        *
ERROR at line 3:
ORA-01438: value larger than specified precision allows for this column

4. Using the Database to construct queries.

select count(*)
  from students s, stu_cou sc, courses c
 where s.studentid=sc.studentid and c.courseid=sc.courseid
   and s.year<=4
   and c.courseid=100100;

select s.fname, s.lname, c.coursenumber, c.name
  from students s, stu_cou sc, courses c
 where s.studentid=sc.studentid and c.courseid=sc.courseid
   and s.year<=4
   and c.courseid=100100;
    
select t.fname, t.lname, c.coursenumber, c.name
  from teachers t, tea_cou tc, courses c
 where t.teacherid=tc.teacherid and c.courseid=tc.courseid
   and t.fname='Joe' and t.lname='Ferreira';

select t.fname, t.lname, c.coursenumber, c.name
  from teachers t, tea_cou tc, courses c
 where t.teacherid=tc.teacherid and c.courseid=tc.courseid
   and t.fname='Joe' and t.lname='Ferreira'
   and c.courselevel='high';
    
select c.courseid, c.coursenumber, c.name, count(t.teacherid) intructors
  from teachers t, tea_cou tc, courses c
 where t.teacherid=tc.teacherid and c.courseid=tc.courseid
 group by c.courseid, c.coursenumber, c.name
having count(t.teacherid)>=2;
    
    
select sum(c.credit) totalcredits
  from students s, stu_cou sc, courses c
 where s.studentid=sc.studentid and c.courseid=sc.courseid
   and s.fname='Julie' and s.lname='Kirschbaum';

select s.fname, s.lname, sum(c.credit) totalcredits
  from students s, stu_cou sc, courses c
 where s.studentid=sc.studentid and c.courseid=sc.courseid
 group by s.fname, s.lname;

select sum(c.credit) totalcredits
  from students s, stu_cou sc, courses c
 where s.studentid=sc.studentid and c.courseid=sc.courseid
   and s.fname='Julie' and s.lname='Kirschbaum'
   and c.courselevel='high';

Additional SQL queries needed to answer Question #5 in Version IV above: Is there such a case that student A works as TA for student B in one course while student B works as a TA for student A in another course? If so, who are they and what are the courses?

The 'create.sql' and 'insert.sql' files linked above do not include the TA table. This file create_pt2.sql answers the above question after creating and populating the TA table by adding Jinhua Zhao as a student in 11.220 and as a TA for 11.521 along with adding Jacky Smith as a TA for 11.220 (she has already been inserted as a student in 11.521).

Theoretical Background Revisited

Entity
Relationship
E-R Modeling Process
From E-R Model to Database Design
Database Normalization

Database Design Rules of Thumb

Remark II:  They are important but too abstract. We never understand them until we actually do it.


Good luck applying these ideas to problem set B!


Home | Syllabus | Lectures | Labs | CRN |MIT

Created by Jinhua Zhao, March 5th, 2003; last modified by Joe Ferreira, March 14, 2018