- The schema for the LibraryDB database is given below. LibraryDB is a database that keeps track of information about the books and their circulation in a library.
The primary keys are underlined. The foreign keys are denoted by asterisks (*).
Borrow (transactionID:text, personID*:text, borrowdate:date, returndate:date)
Book (ISBN:number, title:text, Volnumber:number, year_published:number, NoofCopies:number, Price:number, PublisherID*:text)
Person (personID:text, firstname:text, lastname:text, address:text, city:text)
Publisher (publisherID:text, publisherfullname:text)
Description of the schema
borrow -- keeps track of the check-ins and check-outs of the books.
person -- keeps track of the people who borrow books from the library.
book -- contains information about the books that are available in the library.
publisher -- keeps track of the publisher information.
- Implement the database described above in MS Access. You are required to create the 4 tables with the given attributes. Name the database 'LibraryDB'. (5 Marks)
- Populate all tables with suitable data (at least 3 records per table, suitable for answering the given queries) using the data sheet view in MS Access or SQL statements of the form:
INSERT INTO TableName
VALUES ('..', '..', .....) (5 Marks)
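As a sketch, the same INSERT pattern can be tried outside Access using SQLite from Python (the table chosen and the sample publisher rows here are illustrative assumptions, not assignment data):

```python
import sqlite3

# In-memory stand-in for LibraryDB (SQLite here, not MS Access).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Publisher (publisherID TEXT PRIMARY KEY, "
            "publisherfullname TEXT)")

# INSERT INTO TableName VALUES (...), one row per statement.
con.execute("INSERT INTO Publisher VALUES ('P1', 'ACER Press')")
con.execute("INSERT INTO Publisher VALUES ('P2', 'Pearson')")
con.execute("INSERT INTO Publisher VALUES ('P3', 'Springer')")

print(con.execute("SELECT COUNT(*) FROM Publisher").fetchone()[0])  # 3
```

String literals use single quotes here; Access also accepts double quotes, but the curly quotes shown in the template must be replaced with straight quotes before the statement will run.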
Write SQL queries for the following questions and execute them on the 'LibraryDB' database you created in MS Access.
Prepare a list of the ISBN, title, Volnumber and NoofCopies of all books in the library. (5 Marks)
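A minimal sketch of this query, run against SQLite rather than Access (a plain SELECT with a column list is the same in both; the sample rows are assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Book (ISBN INTEGER PRIMARY KEY, title TEXT, "
            "Volnumber INTEGER, year_published INTEGER, NoofCopies INTEGER, "
            "Price REAL, PublisherID TEXT)")
con.executemany("INSERT INTO Book VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1111, "Database Systems", 1, 2017, 4, 85.0, "P1"),
    (2222, "Big Data Basics", 2, 2005, 2, 60.0, "P2"),
    (3333, "SQL in Practice", 1, 2010, 3, 95.0, "P1"),
])

# Project only the four requested columns.
rows = con.execute(
    "SELECT ISBN, title, Volnumber, NoofCopies FROM Book").fetchall()
for row in rows:
    print(row)
```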
- Assume that you want to count how many books in the library are older than 10 years (i.e. year_published is before 2008). Write a query to find this number. (5 Marks)
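One way this can be sketched is with COUNT(*) and a WHERE filter, shown here against SQLite (sample rows are assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Book (ISBN INTEGER, title TEXT, "
            "year_published INTEGER)")
con.executemany("INSERT INTO Book VALUES (?, ?, ?)", [
    (1111, "Database Systems", 2017),
    (2222, "Big Data Basics", 2005),
    (3333, "SQL in Practice", 1999),
])

# COUNT(*) performs the aggregation inside the database engine.
old_books = con.execute(
    "SELECT COUNT(*) FROM Book WHERE year_published < 2008").fetchone()[0]
print(old_books)  # 2
```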
- Display details of all books that cost more than the average price of the books in the library.
(Hint: use sub-query). (5 Marks)
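Following the hint, a scalar sub-query computes the average once and the outer query filters on it; a SQLite sketch with assumed sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Book (ISBN INTEGER, title TEXT, Price REAL)")
con.executemany("INSERT INTO Book VALUES (?, ?, ?)", [
    (1111, "Database Systems", 85.0),
    (2222, "Big Data Basics", 60.0),
    (3333, "SQL in Practice", 95.0),
])

# The sub-query evaluates AVG(Price) over all books (80.0 here);
# the outer query keeps only books priced above that average.
rows = con.execute(
    "SELECT * FROM Book WHERE Price > (SELECT AVG(Price) FROM Book)"
).fetchall()
print(rows)
```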
- Display details of all books that were published in 2017 and published by "ACER Press" (publisherfullname is "ACER Press").
(Hint: you may need to join the Book and Publisher tables.) (10 Marks)
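As hinted, the foreign key links the two tables; an INNER JOIN plus the two filters can be sketched like this in SQLite (sample rows are assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Book (ISBN INTEGER, title TEXT, "
            "year_published INTEGER, PublisherID TEXT)")
con.execute("CREATE TABLE Publisher (publisherID TEXT, "
            "publisherfullname TEXT)")
con.executemany("INSERT INTO Book VALUES (?, ?, ?, ?)", [
    (1111, "Database Systems", 2017, "P1"),
    (2222, "Big Data Basics", 2017, "P2"),
    (3333, "SQL in Practice", 2010, "P1"),
])
con.executemany("INSERT INTO Publisher VALUES (?, ?)", [
    ("P1", "ACER Press"), ("P2", "Pearson"),
])

# INNER JOIN matches each book to its publisher via the foreign key,
# then the WHERE clause applies both conditions.
rows = con.execute(
    "SELECT Book.* FROM Book "
    "INNER JOIN Publisher ON Book.PublisherID = Publisher.publisherID "
    "WHERE Book.year_published = 2017 "
    "AND Publisher.publisherfullname = 'ACER Press'"
).fetchall()
print(rows)  # [(1111, 'Database Systems', 2017, 'P1')]
```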
Create a new table named "Borrower" that includes the personID, firstname, lastname and address of the borrower, plus borrowdate and returndate.
Hint: you may need to join the Person and Borrow tables. (5 Marks)
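A sketch of creating the new table from the join. SQLite supports CREATE TABLE ... AS SELECT; in Access the equivalent is a make-table query, SELECT ... INTO Borrower. The sample rows are assumptions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Person (personID TEXT, firstname TEXT, "
            "lastname TEXT, address TEXT)")
con.execute("CREATE TABLE Borrow (transactionID TEXT, personID TEXT, "
            "borrowdate TEXT, returndate TEXT)")
con.execute("INSERT INTO Person VALUES ('U1', 'Ann', 'Lee', '1 High St')")
con.execute("INSERT INTO Borrow VALUES "
            "('T1', 'U1', '2018-03-01', '2018-03-15')")

# Materialise the Person-Borrow join as a new Borrower table.
con.execute(
    "CREATE TABLE Borrower AS "
    "SELECT Person.personID, firstname, lastname, address, "
    "       borrowdate, returndate "
    "FROM Person INNER JOIN Borrow "
    "ON Person.personID = Borrow.personID")

borrowers = con.execute("SELECT * FROM Borrower").fetchall()
print(borrowers)
```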
Sales tracking is important in many online stores. Figure 1 shows an ER diagram that captures the important information in the tracking of sales in a bookstore. Specifically, the ER diagram captures the relationships between authors, books, customers, publishers, warehouses and the shopping basket.
Figure 1: Bookstore ER Diagram
- Convert the ER diagram into a relational database schema. Be certain to indicate primary keys (underline).
Eg Book (ISBN: NUMBER; title: text,.....) (5 Marks)
- Identify 2 attributes in the ER diagram of figure 1 that might represent a composite attribute, and explain why/how it might represent a composite attribute. (4 Marks)
- Determine the cardinality of the relationships (listed below) in the above ER diagram. The cardinality shows how many instances on one side of the relationship can be associated with how many instances on the other side. Write the business rule you would use to mark the cardinality, as shown in the example given in figure 2.
- Book – publisher
Relationship between shopping-basket --- Customer
A customer may submit any number of orders on to the shopping-basket. However, each order must be submitted by exactly one customer.
Figure 2: Example of cardinality of business rule for the ‘shopping-basket --- Customer’ relationship
- Prepare a brief report on the following “Big data technologies”. (6 Marks)
Use the following heading for writing the report for each topic:
- Capabilities and Limitations
Marks are allocated for each part as below:
Marking Rubric for Assignment 1 - Total Marks 100
Q1. Model organisational information requirements using conceptual data modelling techniques and Query Writing
Evidence of accurate and well-
Evidence of good query writing skills.
Demonstrated reasonable query writing skills.
Did not demonstrate evidence of understanding the topic.
Q2. Convert the conceptual data model into relational data model.
Demonstrated excellent ability to think critically.
Demonstrated an ability to think critically.
Demonstrated reasonable ability to think.
Demonstrated some ability to think critically but not complete.
Did not demonstrate ability to think critically.
Q3. Model organisational information requirements (with Big data technologies)
Demonstrated excellent knowledge on the topic.
Demonstrated good knowledge on the topic.
Demonstrated reasonable knowledge on the topic.
Demonstrated some knowledge on the topic.
Did not demonstrate any knowledge on the topic.
2 a) Relational Database Schema
TblPublisher (Fldname: text, Fldaddress: text, Fldphone: text, FldURL: text)
TblAuthor (Fldname: text, Fldaddress: text, FldURL: text)
TblBook (FldISBN: number, Fldtitle: text, Fldprice: currency, Fldyear: number, FldauthorName*: text, FldpublisherName*: text)
TblCustomer (Fldemail: text, Fldname: text, Fldaddress: text, Fldphone: text)
TblShoppingBasket (FldbasketID: number, FldcustomerEmail*: text, FldISBN*: number, FldNoOfBooks: number)
TblWarehouse (Fldcode: text, Fldaddress: text, Fldphone: text)
TblStocks (FldwarehouseCode*: text, FldISBN*: number, FldNoOfBooks: number)
(Janalta Interactive Inc 2015)
2 b) Composite Attributes
A composite attribute is one that can be divided into two or more meaningful sub-attributes.
The BookStore database has the following composite attributes:
- Address – an Address attribute may be divided into StreetAddress, City, State and Postcode attributes, so Address is a composite attribute.
- Name – a Name attribute may be divided into FirstName, MiddleName and LastName attributes, so Name is a composite attribute.
3 a) Hadoop
Hadoop is an open-source framework used to store and process Big Data. It runs in a distributed environment, across a number of machines that form a cluster. Hadoop is based on the "MapReduce" programming model, which was developed by Google for processing very large data sets.
Processing Big Data on a single machine is very difficult, and Hadoop resolves this problem. It splits a big job into smaller tasks and distributes them to a number of machines for processing; this is the Map phase. After the processing finishes, it collects the results from all the machines and combines them into a final result for the user; this is the Reduce phase.
Hadoop is very capable in the following respects:
- It is portable across platforms.
- It is very flexible: servers can be added to or removed from the cluster dynamically without stopping Hadoop, which keeps working in the same manner.
- The Hadoop library is very fault-tolerant. It detects and handles failures at the application layer rather than relying on hardware to do so.
- It is inexpensive, as it runs on commodity hardware.
Hadoop has some limitations:
- It is not appropriate for small data sets; it works well only on Big Data.
- It supports only batch processing.
- Processing is slow, because of the volume of the data and the overhead of the MapReduce model.
- It is not suitable for real-time applications, since it does not handle small, low-latency workloads.
- It is not efficient for iterative processing.
In summary, Hadoop is a strong technique for processing Big Data. It does not suit small data, but it works very well on Big Data, since it was designed specifically to resolve the problems that occur in Big Data processing.
3 b) MapReduce
"MapReduce" is a software framework used to build applications that process Big Data; Hadoop is based on it. Applications written with MapReduce run in parallel in a distributed environment.
MapReduce divides a large job into smaller tasks and distributes them to a number of machines for processing. It assigns the small tasks to machines that run in parallel in the distributed environment. After each machine finishes, the intermediate results are shuffled, sorted and combined to produce the final result; the Reduce side works in the following steps: shuffle, sort and reduce. Throughout, the data flows as <key, value> pairs.
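The map, shuffle/sort and reduce steps above can be illustrated with a toy word count in plain Python (this imitates the data flow only; it is not the Hadoop framework itself):

```python
from collections import defaultdict

docs = ["big data is big", "hadoop processes big data"]

# Map: emit a <key, value> pair (word, 1) for every word.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle and sort: group all values that share the same key.
groups = defaultdict(list)
for key, value in sorted(mapped):
    groups[key].append(value)

# Reduce: combine each key's values into one final result.
counts = {key: sum(values) for key, values in groups.items()}
print(counts)  # {'big': 3, 'data': 2, 'hadoop': 1, 'is': 1, 'processes': 1}
```

In real Hadoop the mapped pairs for each key may come from different machines; the shuffle step is what routes all pairs with the same key to the same reducer.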
MapReduce is very capable in the following respects:
- It is very scalable and independent of any particular environment.
- It is not very expensive.
- It can work on both structured and unstructured data.
- It offers security features and works with HBase security.
- Input data is replicated before it is processed at the main node, and MapReduce relies on this replication, which makes it very helpful as a backup technique.
MapReduce has some limitations:
- It is appropriate only for Big Data, not for small data.
- It produces output slowly; the processing latency is high.
- It does not work well for iterative processing.
- It is somewhat hard to understand.
- It is not appropriate for real-time applications.
In summary, MapReduce is the base of the technology used to process and store Big Data, and it is in high demand among big companies.
Janalta Interactive Inc 2015, Functional Dependency.
teach-ict.com n.d., Third Normal Form.
TechTarget 2018, Hadoop.
IBM n.d., What is MapReduce.