Identification of Entities from Given Case Study: Database Exam Questions Assessment Answer
Question 1: (Total 10 marks)
VIT offers many course streams (e.g. MITS, BITS, MBA and many short courses), and each course has a course coordinator who belongs to the faculty. Each course stream has a specific course code, course name and fee, and students belong to that course. VIT students can enrol in only one course at a time, and each course can have one or no students.
VIT needs to record the student id, name, phone number, address, gender and date of birth for each student. Students study different units offered by VIT. Each unit is taught by a faculty member. The faculty member id, name, gender, address, date of birth and salary are recorded. Every semester, each student can enrol in more than one unit and a faculty member can teach more than one unit. A faculty member can teach in multiple course streams. Also, a unit can be taught by one or many faculty members.
Faculty members may work on multiple projects. For each project, the name, project id, area, duration and associated faculty member ids are maintained.
For the case study given above, identify all entities, the attributes of each entity, primary keys, foreign keys, relationships between entities and cardinalities. For example, if you identify the entities student and course, they should be written as
Entity, attributes and keys: Student (StudentId(PK), Name,………..,CourseCode(FK) )
Relationship and cardinalities: Student(0,M) ---<belongs to>--- (1,1)Course
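Entity, attributes and keys:
Faculty (FacultyId(PK), Name, Gender, Address, DateOfBirth, Salary)
Course (CourseCode(PK), CourseName, Fee, CoordinatorId(FK))
Student (StudentId(PK), Name, PhoneNumber, Address, Gender, DateOfBirth, CourseCode(FK))
Unit (UnitCode(PK), UnitName)
Enrolment (StudentId(PK, FK), UnitCode(PK, FK), Semester(PK))
UnitFaculty (UnitCode(PK, FK), FacultyId(PK, FK))
Project (ProjectId(PK), Name, Area, Duration)
ProjectFaculty (ProjectId(PK, FK), FacultyId(PK, FK))
Assumptions: the case study lists no attributes for a unit, so UnitCode and UnitName are assumed. Minimum cardinalities that the case study does not state are also assumed.
Relationship and cardinalities:
Student(0,M) ---<belongs to>--- (1,1)Course
Course(0,M) ---<coordinated by>--- (1,1)Faculty
Student(0,M) ---<enrols in>--- (0,M)Unit
Faculty(1,M) ---<teaches>--- (0,M)Unit
Faculty(1,M) ---<works on>--- (0,M)Project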
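One way to sanity-check an entity list like this is to turn it into tables. The sketch below uses Python's built-in sqlite3 module. All table and column names are assumptions chosen for illustration (the case study names the attributes but not the tables, and gives no attributes for units), and the junction tables are only one possible way to hold the many-to-many relationships.

```python
import sqlite3

# Hypothetical schema for the VIT case study; every name here is an assumption.
SCHEMA = """
CREATE TABLE faculty (
    faculty_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    gender TEXT, address TEXT, dob TEXT, salary REAL
);
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    fee REAL,
    coordinator_id TEXT NOT NULL REFERENCES faculty(faculty_id)
);
CREATE TABLE student (
    student_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    phone TEXT, address TEXT, gender TEXT, dob TEXT,
    -- one course at a time, so a single foreign key column is enough
    course_code TEXT NOT NULL REFERENCES course(course_code)
);
CREATE TABLE unit (
    unit_code TEXT PRIMARY KEY,  -- assumed attribute
    name TEXT                    -- assumed attribute
);
-- Junction tables for the many-to-many relationships.
CREATE TABLE enrolment (
    student_id TEXT REFERENCES student(student_id),
    unit_code TEXT REFERENCES unit(unit_code),
    semester TEXT,
    PRIMARY KEY (student_id, unit_code, semester)
);
CREATE TABLE unit_faculty (
    unit_code TEXT REFERENCES unit(unit_code),
    faculty_id TEXT REFERENCES faculty(faculty_id),
    PRIMARY KEY (unit_code, faculty_id)
);
CREATE TABLE project (
    project_id TEXT PRIMARY KEY,
    name TEXT, area TEXT, duration TEXT
);
CREATE TABLE project_faculty (
    project_id TEXT REFERENCES project(project_id),
    faculty_id TEXT REFERENCES faculty(faculty_id),
    PRIMARY KEY (project_id, faculty_id)
);
"""

def build_schema():
    """Create the illustrative schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys switched on, inserting a student whose course_code does not exist in course is rejected, which is the behaviour the CourseCode(FK) notation in the question describes.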
Question 2: (Total 3+3+4=10 marks)
VIT wants to store records of its research projects. For each project, VIT records the project details (Project Code, Project Title, Project Manager, Project Budget), the faculty members involved (Faculty ID, Name, Hourly Rate for the project) and the course (Course Code, Course Name) to which each faculty member belongs. Below is a sample of the data for one project.
| Project Code: | P-2000 | Project Manager: | Sara |
| Project Title: | Student at risk identification system | Project Budget: | $20,000 |
| Faculty ID | Faculty Name | Course Code | Department Name | Project Hourly Rate |
| F100 | Roger | MITS | Master of IT and System | $45.00 |
| F125 | Gerson | BITS | Bachelor of IT and System | $32.00 |
| F234 | Aaron | MITS | Master of IT and System | $50.00 |
| F111 | Jones | BITS | Bachelor of IT and System | $39.00 |
The un-normalised table (called Project) that corresponds to the above format is as follows:
Project(ProjectCode, ProjectTitle, ProjectBudget, ProjectManager, FacultyID, FacultyName, CourseCode, CourseName, ProjectHourlyRate)
Where required, you may make assumptions. However, your assumptions should not contradict the above situation and all assumptions should be stated in your answer.
- Identify the repeating group of attributes and transform the above un-normalised table into tables that are in 1st Normal Form.
- Identify any partial dependencies and transform into tables that are in 2nd Normal Form.
- Identify any transitive dependencies and transform into tables that are in 3rd Normal Form.
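Assumptions: a faculty member appears at most once per project, the hourly rate is set per faculty member per project, and the Department Name column in the sample holds the CourseName.
A) The repeating group is (FacultyID, FacultyName, CourseCode, CourseName, ProjectHourlyRate), which occurs once for every faculty member on a project. Repeating the project details on each faculty row gives a 1NF table with the composite key (ProjectCode, FacultyID):
Project(ProjectCode(PK), FacultyID(PK), ProjectTitle, ProjectBudget, ProjectManager, FacultyName, CourseCode, CourseName, ProjectHourlyRate)
B) Partial dependencies: ProjectTitle, ProjectBudget and ProjectManager depend on ProjectCode alone, and FacultyName, CourseCode and CourseName depend on FacultyID alone. ProjectHourlyRate depends on the whole key. The 2NF tables are as follows:
Project(ProjectCode(PK), ProjectTitle, ProjectBudget, ProjectManager)
Faculty(FacultyID(PK), FacultyName, CourseCode, CourseName)
ProjectFaculty(ProjectCode(PK, FK), FacultyID(PK, FK), ProjectHourlyRate)
C) In Faculty, the non-key column CourseCode determines another non-key column, CourseName: a change to a course name would have to be repeated on every faculty row for that course. This transitive dependency (FacultyID -> CourseCode -> CourseName) is removed in 3NF:
Project(ProjectCode(PK), ProjectTitle, ProjectBudget, ProjectManager)
Course(CourseCode(PK), CourseName)
Faculty(FacultyID(PK), FacultyName, CourseCode(FK))
ProjectFaculty(ProjectCode(PK, FK), FacultyID(PK, FK), ProjectHourlyRate)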
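A decomposition into 3NF can be checked by loading the sample rows into separate tables and joining them back. The sketch below uses Python's built-in sqlite3 module with the sample data from the question. The table names `project`, `course`, `faculty` and `project_faculty` are assumptions, and it assumes that the hourly rate is set per faculty member per project and that the sample's Department Name column holds the course name.

```python
import sqlite3

# Sample rows from the question: (FacultyID, FacultyName, CourseCode, CourseName, HourlyRate)
SAMPLE = [
    ("F100", "Roger", "MITS", "Master of IT and System", 45.00),
    ("F125", "Gerson", "BITS", "Bachelor of IT and System", 32.00),
    ("F234", "Aaron", "MITS", "Master of IT and System", 50.00),
    ("F111", "Jones", "BITS", "Bachelor of IT and System", 39.00),
]

def load_3nf(conn):
    """Create the assumed 3NF tables and load the sample project into them."""
    conn.executescript("""
        CREATE TABLE project (project_code TEXT PRIMARY KEY, title TEXT,
                              budget REAL, manager TEXT);
        CREATE TABLE course (course_code TEXT PRIMARY KEY, course_name TEXT);
        CREATE TABLE faculty (faculty_id TEXT PRIMARY KEY, faculty_name TEXT,
                              course_code TEXT REFERENCES course(course_code));
        CREATE TABLE project_faculty (
            project_code TEXT REFERENCES project(project_code),
            faculty_id TEXT REFERENCES faculty(faculty_id),
            hourly_rate REAL,
            PRIMARY KEY (project_code, faculty_id));
    """)
    conn.execute("INSERT INTO project VALUES (?, ?, ?, ?)",
                 ("P-2000", "Student at risk identification system", 20000.0, "Sara"))
    for fid, fname, ccode, cname, rate in SAMPLE:
        # Each course name is stored once, however many faculty members share it.
        conn.execute("INSERT OR IGNORE INTO course VALUES (?, ?)", (ccode, cname))
        conn.execute("INSERT INTO faculty VALUES (?, ?, ?)", (fid, fname, ccode))
        conn.execute("INSERT INTO project_faculty VALUES (?, ?, ?)", ("P-2000", fid, rate))

def rebuild_flat(conn):
    """Join the 3NF tables back into the original one-row-per-faculty layout."""
    return conn.execute("""
        SELECT p.project_code, f.faculty_id, f.faculty_name,
               c.course_code, c.course_name, pf.hourly_rate
        FROM project_faculty AS pf
        JOIN project AS p ON p.project_code = pf.project_code
        JOIN faculty AS f ON f.faculty_id = pf.faculty_id
        JOIN course AS c ON c.course_code = f.course_code
        ORDER BY f.faculty_id
    """).fetchall()

conn = sqlite3.connect(":memory:")
load_3nf(conn)
rows = rebuild_flat(conn)
```

The join returns the four faculty rows of the sample, while the course table holds only two rows, so each course name is stored in exactly one place.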
Question 3: (Total 3x4=12 marks)
Consider the tables below and then attempt the queries that follow
a. Write down a query to display the employee name and city for all employees who work for 'Marketing' and earn more than $74,000, and give output based on the sample data in the above tables
b. Write down a query to display information about employees, and their departments, who receive a salary of more than $50,000, and give output based on the sample data in the above tables.
c. Write down an insert statement to add a record for “Insurance” department in department table and then write down a query to update the department of “Roger Clarke” to the “Insurance” department.
d. Write a query to restrict user employee from updating and inserting data into the above table. However, they are allowed to read the information from the database.
Assumption: Yearly_Salary is stored as a numeric column.
a. select Fist_name, Last_name, City from Employee where Employee_id in (select Employee_id from Works where Yearly_Salary > 74000 and Department_id in (select Department_id from Department where Department_name = 'Marketing'));
b. select t1.Employee_id, t1.Fist_name, t1.Last_name, t1.City, t3.Department_id, t3.Department_name from Employee as t1 join Works as t2 on t2.Employee_id = t1.Employee_id join Department as t3 on t3.Department_id = t2.Department_id where t2.Yearly_Salary > 50000;
c. insert into Department (Department_id, Department_name) values ('D2003', 'Insurance');
update Works set Department_id = 'D2003' where Employee_id = (select Employee_id from Employee where Fist_name = 'Roger' and Last_name = 'Clarke');
d. revoke insert, update on Employee from employee;
grant select on Employee to employee;
The same two statements are repeated for the Works and Department tables. A foreign key with ON UPDATE RESTRICT only protects referenced rows; it does not limit what a user is allowed to do, so privileges are the appropriate mechanism here.
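The queries for parts a to c can be tried on a small in-memory database. The tables the question refers to are not reproduced here, so the sketch below assumes the column names used in the answers, assumes Yearly_Salary is stored as a number, and uses invented sample rows and department ids. The results therefore illustrate how the queries behave, not the exam's expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names follow the answers (including the spelling Fist_name).
    CREATE TABLE Employee (Employee_id TEXT PRIMARY KEY, Fist_name TEXT,
                           Last_name TEXT, City TEXT);
    CREATE TABLE Department (Department_id TEXT PRIMARY KEY, Department_name TEXT);
    CREATE TABLE Works (Employee_id TEXT REFERENCES Employee(Employee_id),
                        Department_id TEXT REFERENCES Department(Department_id),
                        Yearly_Salary INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO Employee VALUES ('E1', 'Roger', 'Clarke', 'Melbourne'),
                                ('E2', 'Ann', 'Lee', 'Sydney'),
                                ('E3', 'Tom', 'Ray', 'Perth');
    INSERT INTO Department VALUES ('D1', 'Marketing'), ('D2', 'Finance');
    INSERT INTO Works VALUES ('E1', 'D1', 60000), ('E2', 'D1', 80000), ('E3', 'D2', 45000);
""")

# a. Marketing employees earning more than 74000 (numeric comparison, not a '$' string).
part_a = conn.execute("""
    SELECT Fist_name, Last_name, City FROM Employee
    WHERE Employee_id IN (
        SELECT Employee_id FROM Works
        WHERE Yearly_Salary > 74000
          AND Department_id IN (SELECT Department_id FROM Department
                                WHERE Department_name = 'Marketing'))
""").fetchall()

# b. Employees earning more than 50000, together with their department.
part_b = conn.execute("""
    SELECT e.Employee_id, e.Fist_name, e.Last_name, e.City,
           d.Department_id, d.Department_name
    FROM Employee AS e
    JOIN Works AS w ON w.Employee_id = e.Employee_id
    JOIN Department AS d ON d.Department_id = w.Department_id
    WHERE w.Yearly_Salary > 50000
    ORDER BY e.Employee_id
""").fetchall()

# c. Add the Insurance department, then move Roger Clarke into it via the Works table.
conn.execute("INSERT INTO Department (Department_id, Department_name) "
             "VALUES ('D2003', 'Insurance')")
conn.execute("""
    UPDATE Works SET Department_id = 'D2003'
    WHERE Employee_id = (SELECT Employee_id FROM Employee
                         WHERE Fist_name = 'Roger' AND Last_name = 'Clarke')
""")
roger_department = conn.execute("""
    SELECT d.Department_name FROM Works AS w
    JOIN Department AS d ON d.Department_id = w.Department_id
    JOIN Employee AS e ON e.Employee_id = w.Employee_id
    WHERE e.Fist_name = 'Roger' AND e.Last_name = 'Clarke'
""").fetchone()[0]
```

Part d is left out of the sketch because SQLite has no user accounts; GRANT and REVOKE need a server database such as MySQL or PostgreSQL.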
Question 4: (Total 10 marks)
In the modern era, information has become an integral part of our daily life. Public and private sector businesses and companies store huge amounts of data in databases and apply data analysis and data mining techniques to find hidden patterns in that data and make critical decisions. Consider the current scenario of COVID-19 and assume you have information about all the patients, their symptoms, treatment, current condition (critical, recovered or deceased) and their travel information. In this situation, discuss the importance of database management systems, data analysis and data mining techniques, and suggest a technique you may use to identify new potential patients.
First, we have to decide where and how to store the data points. A variety of cloud storage services is available for self-managed big data handling (Google Cloud, Amazon Web Services and Azure are the major providers).
Once the data is stored, the heavy lifting is reading it and matching records against various conditions, which is handled using ML/AI techniques. Because this needs high computation power and memory, serverless computation services such as AWS Lambda or GCP Cloud Functions are commonly used today to extract, transform and load the data for meaningful analysis or hypothesis testing.
In the COVID-19 scenario, each user entity is linked to a large number of locations (GPS tracking) over time, so we need to identify location collisions between infected and non-infected users and raise a red or green alert accordingly.
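The location-collision idea can be sketched as a simple matching step: flag every non-infected user who was at the same location as an infected user within a chosen time window. The record layout, the function name `find_exposed` and the 30-minute window below are assumptions for illustration, and a real system would compare GPS distances and not exact location labels.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_exposed(visits, infected, window=timedelta(minutes=30)):
    """Return the non-infected users who shared a location with an infected
    user within `window` of each other.

    visits: iterable of (user_id, location, timestamp) tuples.
    infected: set of user ids known to be infected.
    """
    # Group visits by location so only co-located visits are compared.
    by_location = defaultdict(list)
    for user, location, ts in visits:
        by_location[location].append((ts, user))

    exposed = set()
    for entries in by_location.values():
        infected_times = [ts for ts, user in entries if user in infected]
        for ts, user in entries:
            if user in infected:
                continue
            # A collision: same location, timestamps close enough together.
            if any(abs(ts - t) <= window for t in infected_times):
                exposed.add(user)  # would be shown as a red alert
    return exposed

# Hypothetical visit log for illustration.
visits = [
    ("u1", "cafe", datetime(2020, 4, 1, 9, 0)),
    ("u2", "cafe", datetime(2020, 4, 1, 9, 20)),  # 20 minutes after u1
    ("u3", "cafe", datetime(2020, 4, 1, 13, 0)),  # same place, hours later
    ("u4", "gym", datetime(2020, 4, 1, 9, 5)),    # different location
]
exposed = find_exposed(visits, infected={"u1"})
```

Here only u2 is flagged: u3 visited the same location outside the window and u4 was somewhere else, so both stay on a green alert.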
Question 5: (Total 8 marks)
Consider a scenario where you are hired to improve the performance of database searches. When you analysed the records stored in the database, you found that the records are unordered and a linear search is used to retrieve a record. Discuss the techniques you would use to reduce the search time of such a database. Justify your answer using one example.
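We need to analyse the queries and add the required indexes to speed up the slow queries and optimise performance. For example, a linear search over 1,000,000 unordered records examines about 500,000 records on average, whereas a binary search over an index sorted on the search key needs only about 20 comparisons, because log2 of 1,000,000 is roughly 20.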
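As a small illustration of what an index changes, the sketch below uses Python's built-in sqlite3 module and asks SQLite for its query plan before and after an index is created. The table `records`, its columns and the index name `idx_records_key` are hypothetical. The exact wording of the plan varies between SQLite versions, but it typically reports a full scan without the index and an index search with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (record_key INTEGER, payload TEXT)")
# Insert 10,000 rows with the keys in a scrambled (unordered) sequence.
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [((i * 7919) % 10000, f"row {i}") for i in range(10000)],
)

QUERY = "SELECT * FROM records WHERE record_key = 4242"

def plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(conn, QUERY)  # no index yet: every row is examined
conn.execute("CREATE INDEX idx_records_key ON records (record_key)")
plan_after = plan(conn, QUERY)   # the index lets SQLite jump to the matching key

result = conn.execute(QUERY).fetchall()
```

Printing `plan_before` and `plan_after` shows the change from a table scan to a search that uses `idx_records_key`, while `result` holds the same single row in both cases.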