Overview
For your first graded team project, you will be updating a database table with data imported from an excel spreadsheet. In addition to updating the database table, you will be performing data transformation and data clean-up activities.
While there are several approaches to performing the updates and data clean-up, you must do all via SQL Server Express using SQL.
Grading: This project is 10% of your course grade and will be graded according to the following:
1. The updated table results
2. SQL queries used (save the Step-by-step queries in a word document).
3. Output Report of List of HIT Companies (see list of columns in Appendix A).
Instructions: Save your Step-by-step queries as you work through the project. You will then copy your queries in your Word document that will be submitted
1. Create database and table: You will be updating the HIT Companies information (in your Test database) from which you imported data from the Excel spreadsheet (HIT companies.xlsx) at the beginning of this semester (week 2).
a. Copy the table that you created in the Test database per to the following:
b. Add a “Company_Id” if one does not already exist, data type of Integer 4-digits. This will also be the Primary Key (PK).
c. Add an attribute (“Co_Updated”) to the HIT_Companies table and insert a value of “2019-08- 28’ date in ‘yyyy-mm-dd’ format to all companies (each row). No need to include time zone, or seconds.
d. Modify the values of the “Status” field/attribute to the HIT_Companies table and insert a value of “Active” to all companies for each row.
2. Data transformations and clean-up:
a. You will need to perform data clean-up activities to ensure that all data is in its correct format and each attribute has valid and consist data.
i. If the data field (attribute) is numeric, default value is zero (0).
New Database name: Healthcare_Industry
Table name: HIT_Companies
CIS3730- Fall 2019 Project #1: DML using SQL Page 2 of 4
ii. If the attribute is Character, default value is “blank” unless otherwise stated in the data
list and type.
iii. If the attribute is a date field, format should be “yyyy-mm-dd”
b. Eliminate data redundancies if applicable.
3. Update the HIT_Companies table with the most current information: In the Project #1
(iCollege) folder, you will find an Excel spreadsheet, HIT_Companies Updated v12 10-09-
2019.xlsx, with more current data.
a. This will be used to update the HIT_Companies table in the Healthcare_Industry Database.
See “3. Table Updates:” below for instructions.
b. There may/may not be additional data fields in this Excel file to which you will need to include
(add columns) to the HIT_Companies table.
4. Table Updates: The updates may be done in a number of ways, so you will need to think
through how you want to do this. I recommend testing this using a small sample before you
execute using the entire list and updating your master table.
a. If a company is listed in the most current list (HIT_Companies v12 10-09-2019.xlsx), you will
need to
i. Update the company information in the HIT_Companies table with the most current
information.
ii. Update the “Co_Updated” field with ’2019-10-09’
b. If a company is Not listed in the most current list (HIT_Companies v12 10-09-2019.xlsx), you will
need to
i. Change the status of that company to “Inactive” in the HIT_Companies table, (use “Status”
field/attribute)
c. If there is a new company listed from the HIT_Companies v12 10-09-2019.xlsx file,
i. Add the new company to the list,
ii. Include the “Co_Updated” field with ’2019-10-09’
iii. Be sure to add the value of “Active” to the “Status” column
d. Do not remove any companies from the table or list. By using the “Status” column, we will
know whether or not it is still applicable/ relevant.
5. Data transformations and clean-up:
a. You will need to perform data clean-up activities to ensure that all data is in its correct format
and each attribute has valid and consist data.
i. If the data field (attribute) is numeric, default value is zero (0).
ii. If the attribute is Character, default value is “blank” unless otherwise stated in the data
list and type.
iii. If the attribute is a date field, format should be “yyyy-mm-dd”
iv. All fields should be “NOT NULL”