World Development Indicators (WDI) are the primary World Bank collection of development indicators, compiled from officially recognized international sources.
WDIs present the most current and accurate global development data available and include national, regional, and global estimates.
The dataset for this assignment can be downloaded from world-development-indicators. It contains indicator data for countries around the world, organised into categories of activity and metrics.
The data also include yearly snapshots for these countries, which can be treated as time series.
Download the dataset by selecting the countries and variables you want to work with.
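Because the yearly snapshots are to be treated as time series, it often helps to reshape the download so that each row holds one country, one indicator, and one year. The sketch below is illustrative only: it assumes the export has one column per year and marks missing values with "..", which you should check against your own file. The column names, the function name wide_to_long, and the sample values are placeholders, not real WDI figures.

```python
import csv
import io

# Hypothetical extract: the column names and values are placeholders, not real WDI data.
SAMPLE = """Country Name,Country Code,Series Name,2010,2011,2012
Alpha,AAA,GDP growth (annual %),1.0,..,3.0
Beta,BBB,GDP growth (annual %),2.5,2.0,..
"""

ID_COLS = ("Country Name", "Country Code", "Series Name")


def wide_to_long(text, id_cols=ID_COLS):
    """Turn one-column-per-year rows into (country, code, series, year, value) tuples."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for col, raw in row.items():
            if col in id_cols:
                continue
            # Assumption: ".." or an empty cell means the value is missing.
            value = None if raw in ("..", "") else float(raw)
            records.append((*(row[c] for c in id_cols), int(col), value))
    return records


long_rows = wide_to_long(SAMPLE)
for record in long_rows:
    print(record)
```

Keeping missing values as explicit None entries, rather than dropping them, makes the later decisions about imputation or exclusion visible and easier to justify in the report.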
This assignment asks you to propose a considered and justifiable scheme for an information dashboard that presents a country’s economic health in an information-rich, intuitively comprehensible, single-screen format.
Your reasoned thinking, research, and critical evaluation of both the problem and the proposed solution form a substantive part of this work. Present the rationale for your proposal in the form of a report.
A data set comprising wide-ranging economic data for many countries is provided for exploration, evaluation, and experimentation to justify the approaches and decisions made.
This assessment requires you to present a complete data analysis and a working dashboard prototype.
The submission should set out a feasible, justifiable, worked schema that demonstrates critical evaluation of the processes of data preparation, validation, anonymization, ethics, algorithmic fairness, analysis and/or modelling and prediction, together with justifiable research into composition, layout, function, and form. Justification of the approaches taken for statistical analysis and visualisation is expected, and worked examples should be provided.
The requirements for the proposed dashboard are:
1. A single-screen presentation of the economic health and economic activity profile of at least 10 countries.
2. Clear, effective representational presentation of all factors in a coherent, intuitively comprehensible form.
3. A schema applicable to the full range of countries in the dataset without modification to the dashboard form or structure (i.e., the dashboard should support a side-by-side comparison of multiple countries and/or financial years).
4. Presentation of future prediction/trend for economic profile based on historical data.
5. Relational modelling showing relative performance against stochastically defined groups of countries within the data set.
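As one possible reading of requirement 4, a trend can be projected by fitting a least-squares straight line to the yearly values and extending it forward. This is a minimal sketch under the assumption that a linear trend is adequate for the chosen indicator, which the brief does not prescribe. The function names and the sample values are hypothetical.

```python
def linear_trend(years, values):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(years, values, target_year):
    """Extend the fitted line to a future year."""
    slope, intercept = linear_trend(years, values)
    return intercept + slope * target_year


# Placeholder series, not real indicator values.
years = [2010, 2011, 2012, 2013, 2014]
values = [2.0, 2.5, 3.0, 3.5, 4.0]

slope, intercept = linear_trend(years, values)
predicted_2016 = forecast(years, values, 2016)
print(slope, predicted_2016)
```

A straight line is the simplest defensible choice; if the indicator shows cycles or structural breaks, a different model would need to be justified in the report.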
The requirements for the proposed statistical analysis are:
1. Define a research objective based on the dataset. For instance, to compare the trade situation of the least developed countries with developed countries.
2. Based on the objective, select at least 10 suitable countries of your choice.
3. Choose a set of indicators according to the objective with at least 10 years of data.
4. Complete the following tasks. Present and interpret your findings and results in the report as fully as you can, and show the SAS analytics steps in detail.
4.1. Do a comprehensive descriptive statistical analysis (e.g., Mean, Median, Mode, Standard deviation, Skewness and Kurtosis) on the data.
4.2. Do a correlation analysis of the indicators, to the extent needed for the defined objective.
4.3. Do a regression analysis. Explain why the selected regression technique is better for the defined objective, and cite any similar research you have found in the literature.
4.4. As a researcher, do a comparative analysis of the hypothesis testing approaches and explain when and why you would use each of them. Then define two hypotheses related to the objective and test them.
5. In general, describe the steps that you’ve taken for data preparation, outlier detection, dealing with missing data, and data privacy protection.
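The quantities named in tasks 4.1 and 4.2 can be made concrete with a short sketch. It is written in Python purely for illustration and is not a substitute for the SAS steps the assignment requires. Skewness and kurtosis are computed here from simple population moments; statistical packages often apply sample-size corrections, so their output may differ slightly. The function names and sample values are hypothetical.

```python
import math
import statistics


def describe(values):
    """Summarise one indicator series with the measures listed in task 4.1."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population central moments (no small-sample correction).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, not real indicator data.
summary = describe([1, 2, 2, 3, 7])
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])
print(summary)
print(r)
```

A positive skewness, as in the placeholder series, indicates a long right tail, which is worth checking before choosing a regression or hypothesis test that assumes approximate normality.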
You must use SAS programming for the statistical analysis part. You can use Tableau, Power BI or MS Excel technologies/applications to experiment with analysis and develop the dashboard (other tools could be discussed with the instructor).
The submission of the assignment will be in the form of a report (guideline 40 pages) that presents the proposal, explains the rationale for the approaches used to analyse and display the data components, and critically evaluates subject domain research (data ethics and data visualisation) and the final implemented prototype.
Assessed intended learning outcomes
On successful completion of this assessment, you will be able to:
A- Knowledge and Understanding
1. Analyse a data science project to devise a structure for its implementation, analysis, and evaluation, justifying any decisions made.
2. Critically assess the relative strengths and uses of a range of statistical analysis techniques (including t-tests, ANOVA, linear regression, multiple regression models and categorical data analysis, test of hypothesis).
3. Present and visualise the statistical results, analysing key findings.
4. Evaluate the quality of graphs according to their expressiveness and effectiveness.
B- Practical, Professional or Subject Specific Skills
1. Understand the history and context of data science, including the ethics, skills, challenges, and methodologies the term implies.
2. Learn how to work with a dataset that may lie outside your domain expertise, in a field where you have no prior knowledge or understanding.
3. Develop skills in presenting quantitative data using appropriate displays, tabulations, and summaries.
4. Understand the nature of sampling variation and the role of statistical methods in developing and testing hypotheses.
5. Select and use appropriate statistical/ML methods in the analysis of complex datasets.
6. Present findings based on statistical analysis in a clear, concise, and understandable manner.
7. Select the proper visualization methods for a given data analysis and presentation problem.
C- Transferable Skills and other Attributes
1. Technical report writing.
2. Ability to use tools and techniques for statistical analysis.
3. Presenting data in a manner accessible to non-technical stakeholders.
4. Data science ethics, information governance, information literacy, and data protection.