11.521 |
Spatial
Database Management and Advanced Geographic Information
Systems
|
11.523 |
Fundamentals
of Spatial Database Management
|
CREATE view public.v_gispapers AS
SELECT m.code, keyword, m.paper
FROM keywords k, match m
WHERE m.code = k.code AND
(keyword LIKE '%GIS%' OR
keyword LIKE '%GEOGRAPHIC INFORMATION%' OR
keyword LIKE '%MAPPING%');
CREATE VIEW public.v_combo1 as
SELECT m1.paper, m1.code AS giscode, m2.code AS othercode
FROM match m1, match m2
WHERE m1.code IN (select distinct code from v_gispapers) AND
m2.code NOT IN (select distinct code from v_gispapers) AND
m1.paper = m2.paper;
CREATE VIEW public.v_combo2 AS
SELECT DISTINCT paper, othercode, keyword
FROM v_combo1 c, keywords k
WHERE c.othercode = k.code;
SELECT othercode, keyword, count(paper) as papers
FROM v_combo2
group by othercode, keyword
having count(paper) > 10
ORDER BY papers DESC, othercode;
First, Let's look at the student records portion of MIT's Data Warehouse
Entities: Teachers, Students, Courses
Attributes:
Teachers: ID, Name, Email, Office, Phone...
Students: ID, Name, Department, Year, Email, Phone...
Courses: ID, CoureNumber, Name, Desription, WWW, Level, Credit...
Relationships:
Students vs. Courses: Many to Many Relationship
Teachers vs. Courses: Many to Many Relationship
Matchup Tables
Students-Courses matchup table
Teachers-Courses matchup tableConstraints:
Primary Key
Foreign Key
Schema:
Schema Version I
Sample queries
1. How many graduate students are there in 11.521?
2. What classes does Joe Ferreira instruct?Remark I: Always consider what we can do with the database (thinking about possible queries) even when we are still at the preliminary stage of the database design.
3. Which course has more than two instructors?
4. How many course credits has Julie taken? Among these, how many are high-level credits? Has she satified the graduate credits requirement for her MCP degree, which is defined as at least 124 total credits including at least 48 high-level credits?
Remark II: Usually only one to many relationships exist in the final E-R diagram.
-- For one to one, integrate the tables into one unless special cases
such as 1) too big table, 2) permission issues like, table "teachers"
vs. table "teachers_confidendial".
-- For many to many, use matchup table to break it to two one-to-many
relationships.
New Entities: Place
Attributes:
Place: Building, Room, Type, Size
New Relationships:
Places vs. Courses: Many to Many Relationship
Schema:
Schema Version IISample queries
1. How many, and which courses, utilize computer labs?
2. Has Jaechoel ever been assigned to Room 37-312 for any coureses?
3. Joe is teaching another class--11.220 QR, can room 4-234 hold all the students in this course?Remark III: When constructing a query, first go through the logic flow using the E-R diagram.
New Entities: None
New Attributes: Grade Score, in which table?
New Relationships: None
Schema:
Schema Version IIISample queries:
1. Has Jing Su taken course 11.201? If so, what is her score in it?
2. How many people have taken any courses tought by Joe and received an A? Has anybody ever failed in his classes?
Remark IV: A new entity, or a new attribute? If it is attribute, what is it specific to, i.e., what table is it dependent on?
3. What is Jeff's GPA? --credit-weighted average grade points
This is a pretty hard question--in what table do we find scores? how do we transfer them to grade points? where do we find the credits for the course? how do we compute the weighted average?
Lookup Table: Score-Grade Point Transform
Schema Version III-2
Remark V: Learn to use look up table otherwise the 3rd normal form of database design is violated.By the way, what if we consider teachers' evaluation? where do we put this information? We can argue either way--evaluation is course specific or professor specific or course-professor specific?
Schema Version III-3
New Entities: We might add in TA entity but not Necesssarily.
New Attributes: None
New Relationships: Yes
TA is also a student. It is just a special relation between courses and students.
Many-many relationshipSchema:
Schema Version IV
Remark VI: Again, what's necessary? A new entity or a new relationshipSample queries
1. Is Jinhua a TA for Jane or Tom? (self-join)
Remark VII:Is there going to require a self-join in our SQL query?
2. If it is TA's responsibility to reserve a computer lab for the class, does Jinhua have to do it for 11.220?
3. With whom (instructors) has Jinhua ever worked as a TA?
4. After 9/11, U.S. INS strengthened the requirement of international students being registered as full time students. ISO tells them that they have to work or study more than 24 hours per week to be regarded as a full time student. Does Jinhua satify this requirement? Suppose one course credit (either studying or TAing) accounts for one hour work per week.
5. Difficult question--is there such a case that student A works as TA for student B in one course while student B works as a TA for student A in another course? If so, who are they and what are the courses?
Remark VIII: Extend the database step by step.
Remark
IX: Quite often, the potential power of the database could go
beyond what we expected when we designing it.
All the above questions lack any consideration of time.
1. How many credits of courses has Julie taken? --> What if we ask, how many credits of courses has Julie taken in year 2002?
2. Has Jaechoel ever been to Room 37-312 to take any coureses?--> Was Jaechoel there or supposed to be there in Room 37-312 on March 4th, 2003?
3. Did Joe offer any courses in Fall 2003 when he was on sabbatical?
Question1
: Where do we put "TIME" in? Is it an entity, attribute or
relationship? To what extent, do we want the time to be specified?
Questino2: Up to now, the whole structure is centered on "COURSE"?
What if when we want to consider the relationship of "Advisor and
Advisee". A moment ago, we said there is not a direct link between
teachers and students, is it true?
Question3: How about RAs? What links students to professors
as RA? --Projects. We could create a new entity of projects and match
up table of stu_project and tea_project.
1. Table Structure Setup
Create Tables
--create table
create table students
(studentid number(9,0),
fname varchar2(20),
lname varchar2(20),
department varchar2(30),
year number(1,0),
email varchar2(30),
phone varchar2(20)
);
Remark X: Think about the data type. Tradeoffs to be made.
Department:
Number: 1, 2, 3, ..., 24--easy to standandized but hard to inteprete, look up required
String: DUSP, CIVIL, TPP, ... --easy to understand but hard to standandize, problem of multiple, slightly different names
Phone:
Number with country code, area code and telephone number: easy to utilize but not flexible
String as a whole: full flexibility but unable to check the rules, hard to utililize the number.
Specify Primary Key
--specify primary key
ALTER TABLE students ADD CONSTRAINT pk_studentid PRIMARY KEY (studentid);
Create Table with Constraints
--or create table with constraints
drop table students;
create table students
(studentid number(9,0) CONSTRAINT pk_students PRIMARY KEY,
fname varchar2(20),
lname varchar2(20),
department varchar2(30),
year number(1,0),
email varchar2(30),
phone varchar2(20)
);
Specify Primary Key (Multi-columns)
ALTER TABLE stu_cou
ADD CONSTRAINT pk_stu_cou PRIMARY KEY (studentid,courseid);
Specify Foreign Key
ALTER TABLE tea_cou
ADD CONSTRAINT fk_teacherid
FOREIGN KEY (teacherid)
REFERENCES teachers (teacherid);
Drop Tables
--drop table
drop table students CASCADE CONSTRAINTS;
Complete SQL Statement See file "create.sql"
2.Records Input and Modification
Insert Records
--insert data
insert into students
(studentid, fname, lname, department, year, email, phone)
values (912384234, 'Michael','Sable','DUSP',5,'msable@mit.edu','1-617-234-5678');Delete Records
--Delete Records
delete from students
where studentid=912384234;
Modify Records
--Modify Records
UPDATE students
SET studentid = 912384233, phone='1-617-234-5679'
WHERE fname='Michael' and lname='Sable';
Complete SQL Statement See file "insert.sql"
Remark II: Command Summary
Table Related:
Insert Records with Duplicate Primary Key
SQL> insert into students
2 (studentid, fname, lname, department, year, email, phone)
3 values (912384234, 'Jacky','Smith','DUSP',3,'jacks@mit.edu','1-617-234-5623');
insert into students
*
ERROR at line 1:
ORA-00001: unique constraint (JINHUA.PK_STUDENTS) violated
Wrong Data Type or Data Length
SQL> insert into students
2 (studentid, fname, lname, department, year, email, phone)
3 values (912384233, 'Jaecheol','kim','DUSP',G,'jaecheol@mit.edu','1-617-234-5238');
values (912384233, 'Jaecheol','kim','DUSP',G,'jaecheol@mit.edu','1-617-234-5238')
*
ERROR at line 3:
ORA-00984: column not allowed here
SQL> insert into students
2 (studentid, fname, lname, department, year, email, phone)
3 values (9123842275, 'Jaecheol','kim','DUSP',5,'jaecheol@mit.edu','1-617-234-5238');
values (9123842275, 'Jaecheol','kim','DUSP',5,'jaecheol@mit.edu','1-617-234-5238')
*
ERROR at line 3:
ORA-01438: value larger than specified precision allows for this column
select count(*)
from students s, stu_cou sc, courses c
where s.studentid=sc.studentid and c.courseid=sc.courseid
and s.year<=4
and c.courseid=100100;
select s.fname, s.lname, c.coursenumber, c.name
from students s, stu_cou sc, courses c
where s.studentid=sc.studentid and c.courseid=sc.courseid
and s.year<=4
and c.courseid=100100;
select t.fname, t.lname, c.coursenumber, c.name
from teachers t, tea_cou tc, courses c
where t.teacherid=tc.teacherid and c.courseid=tc.courseid
and t.fname='Joe' and t.lname='Ferreira';
select t.fname, t.lname, c.coursenumber, c.name
from teachers t, tea_cou tc, courses c
where t.teacherid=tc.teacherid and c.courseid=tc.courseid
and t.fname='Joe' and t.lname='Ferreira'
and c.courselevel='high';
select c.courseid, c.coursenumber, c.name, count(t.teacherid) intructors from teachers t, tea_cou tc, courses c where t.teacherid=tc.teacherid and c.courseid=tc.courseid group by c.courseid, c.coursenumber, c.name having count(t.teacherid)>=2;
select sum(c.credit) totalcredits
from students s, stu_cou sc, courses c
where s.studentid=sc.studentid and c.courseid=sc.courseid
and s.fname='Julie' and s.lname='Kirschbaum';
select s.fname, s.lname, sum(c.credit) totalcredits
from students s, stu_cou sc, courses c
where s.studentid=sc.studentid and c.courseid=sc.courseid
group by s.fname, s.lname;
select sum(c.credit) totalcredits
from students s, stu_cou sc, courses c
where s.studentid=sc.studentid and c.courseid=sc.courseid
and s.fname='Julie' and s.lname='Kirschbaum'
and c.courselevel='high';
The 'create.sql' and 'insert.sql' files linked above do not include the TA table. This file create_pt2.sql answers the above question after creating and populating the TA table by adding Jinhua Zhao as a student in 11.220 and as a TA for 11.521 along with adding Jacky Smith as a TA for 11.220 (she has already been inserted as a student in 11.521).
Entity
Relationship
E-R Modeling Process
From E-R Model to Database Design
Database Normalization
- 1st normal form: Atomic Information
- 2nd normal form: Non-key fields are dependent on the entire primary key
- 3rd normal form: Non-key fields are dependent only on the primary key
Database Design Rules of Thumb
Remark II: They are important but too abstract. We never understand them until we actually do it.
Good luck applying these ideas to problem set B!
Created by Jinhua Zhao, March 5th,
2003; last modified by Joe Ferreira, March 14, 2018