myassignmenthelp.com

Defining Income Inequality

Income inequality is the disparity in how income is distributed among individuals, groups, social classes, populations, or countries. It is used to categorize socioeconomic status: the working class, the middle class, and the upper class are defined by their income levels. Income inequality is affected by other inequalities, such as those in wealth, political power, and social status. Though it may seem a negligible matter, income is a vital factor in quality of life, because most essential services can currently be accessed only according to one's income. For instance, access to health care, education, and housing is determined by income level. Income varies with factors such as gender, sexual identity, age, and race or ethnicity, which in turn widens the gap between the upper and working classes. Race or ethnicity is the factor most affected by income inequality. In the US, income inequality is measured using household income, which is then compared by quintile. Another common measure is the Gini index, which summarizes the distribution of income in a single number (Saavedra and Twinam, 2020; Neñer et al., 2022; Brynjolfsson and Mitchell, 2017).

The marked disparity in income is one of the most pressing concerns in many countries. The prospect of reducing poverty is one of the well-founded reasons to lower the world's current level of economic inequality. Universal equal opportunity supports sustainable growth and improves the stability of a country's economy. While governments in several nations have been at the forefront of addressing inequality, they still face challenges in their efforts to solve it. In this study, we will use machine learning techniques to address the income inequality issue (Piketty and Saez, 2003; Dincer and Gunalp, 2012; Tsui, Enderle, and Jiang, 2018; Hoffmann, Lee, and Lemieux, 2020).

Machine learning has made data essential. Data is now highly valued, since most businesses, institutions, and governments want to use it to make decisions, and machine learning models have proved essential in making this possible. In other words, data is key to the machine learning ecosystem: it frames problems, determines the direction of a business, organizes communities, and so on. In the future, it will be next to impossible for any business to succeed without data. The predictive power of machine learning algorithms makes this direction visible to everyone.

In the United States, the difference in median household income between white and black Americans grew from $23,800 in 1970 to $33,000 in 2018. Median black household income was 61 % of median white household income in 2018, up from 56 % in 1970 but down from 63 % in 2007. In 2010, the top 5 % of US households received more than 24 % of total after-tax income, while the lowest 20 % received 6 %. Income inequality in the US has increased since 1979: from 1979 to 2007, mean after-tax income rose by 18 % for the bottom fifth of the population and by 275 % for the top 1 %. There are also sharp imbalances by age, race, and sex. The gender wage gap has been decreasing, driven by shrinking wages for men, but the disparity still exists. Black, Hispanic, and female-headed families have the highest probability of falling in the lowest income group.
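The quintile comparison behind figures like these can be sketched in a few lines. This is only an illustration: the function name quintile_shares and the income values are made up and are not taken from the study or from any official statistics.

```python
def quintile_shares(incomes):
    """Share of total income received by each fifth of households,
    ordered from the poorest fifth to the richest fifth."""
    values = sorted(incomes)
    n = len(values)
    if n < 5:
        raise ValueError("need at least five incomes to form quintiles")
    total = sum(values)
    shares = []
    for k in range(5):
        # Integer slicing keeps every household in exactly one quintile.
        lo, hi = k * n // 5, (k + 1) * n // 5
        shares.append(sum(values[lo:hi]) / total)
    return shares


# Hypothetical household incomes (arbitrary units), for illustration only.
example_incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 550]
shares = quintile_shares(example_incomes)
```

With these invented values the poorest fifth holds 3 % of total income and the richest fifth 64 %, which is the kind of contrast the quintile comparison is meant to expose.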

The Importance of Income and Quality of Life

Another factor that reveals the disparity is the type of job, where the gap is widest by gender. The occupations concerned are financial specialists; securities, commodities, and financial services sales agents; financial clerks; and credit authorizers, checkers, and clerks.

The erosion of wages for less-educated workers has been one of the principal causes of the rise in income inequality in the US. Tax cuts that benefit the rich more than the poor are another cause. In 2015, the US had the greatest level of income inequality among the top 20 countries in the world. In 2013, the US was ranked 63rd in income inequality, with a Gini coefficient of more than 41 (on a 0 to 100 scale). A country with perfectly equal incomes would have a Gini coefficient of 0.
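As a rough illustration of how a Gini coefficient condenses an income distribution into one number, the sketch below uses the standard rank-weighted formula on a 0 to 1 scale (multiply by 100 for the scale used above). The function name gini and the sample incomes are assumptions made for this example, not part of the study.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes.

    0 means everyone has the same income; values near 1 mean
    almost all income goes to a single recipient.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical distributions, for illustration only.
equal = gini([30_000] * 5)              # everyone earns the same
skewed = gini([0, 0, 0, 0, 100_000])    # one person earns everything
```

For a finite sample of n people the maximum value is (n - 1) / n rather than exactly 1, which is why the fully skewed five-person example stays below 1.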

Six main groups of factors influence income inequality: economic, demographic, cultural, political, macroeconomic, and environmental factors.

The economic factors that influence income inequality are a country's wealth, technological growth, economic development, and the growth of its economic structure. Growth in GDP can initially increase inequality, which then falls later. Income inequality rises when laborers begin to move into the industrial sector and declines once the majority of the labor force is already there. When GDP increases, wealthy people, who are mostly entrepreneurs and resource owners, have more opportunities to expand their income, so the disparity in income becomes noticeable at this stage. After some time, however, the income obtained by the wealthy is distributed more widely, which decreases income inequality. This pattern is known as the inverted U hypothesis. Some studies support it, while others find that GDP is an insignificant influence on income inequality, which leaves the effect of GDP uncertain. While some studies have used energy consumption per capita as an indicator of a country's wealth, the most widely used indicator is GDP per capita. The studies that used energy consumption found that a country's wealth was an insignificant influence on income inequality. Labor force movements between economic sectors play a major role in the formation of inequality. In addition, the economic structure, such as the shares of the industrial, agricultural, and service sectors, has to be considered as a factor influencing income inequality. The ratio of manufacturing workers to service workers also affects income inequality (Silber, 2012; Chu and Hoang, 2020; Shin, 2012).

The demographic factors that influence income inequality include the age structure of the population, urbanization, education level, household composition, and education expenditure. Higher population density is associated with lower income inequality, which is attributed to more advanced social organization. However, some studies have shown that increasing population density and urbanization raise the level of income inequality; in this scenario, the disparity in income is greater between rural and urban areas. Urbanization is therefore a significant factor in income inequality.

Measurement of Income Inequality in the US

The influence of age structure on income inequality is unclear. Some studies have shown that older people have a larger dispersion of income, so the proportion of older people in a country affects its level of income inequality: a country with a larger proportion of older people has higher income inequality. On the other hand, a larger share of people aged 40-59 within the population aged 15-69 has been found to reduce income inequality. The explanation is that a larger proportion of older, more experienced individuals lowers the demand for such individuals and the wage premium for experience, which in turn lowers inequality. A larger share of children aged 0-14 years increases income inequality.

This is because the birth rate is higher in families with smaller incomes. The more household types differ, the higher the income inequality, because each type has a different income. Single-female-headed households have low incomes, which leads to higher income inequality compared to other families.

Education level and education inequality are among the factors that influence income inequality. Education level is measured by the number of years in school. Studies have found contradictory results on its influence: some found that an increase in education level reduces income inequality, while others found that it increases income inequality. Theoretically, however, higher education inequality should lead to higher income inequality. A population with large shares of both highly educated and poorly educated people is normally associated with higher income inequality. We can therefore state that education level influences income inequality (Rougoor and Van Marrewijk, 2015; Huber and Stephens, 2014; Scheviakov, 2011).

The political factors that influence income inequality include the government's share of the economy, democratization, and liberalization. A large proportion of government spending consists of transfers such as pensions, grants, and subsidies, which are redistributed equally across society. A higher government share therefore reduces income inequality. Income inequality is also lower in the public sector than in the private sector.

In a democratic society, people in need have more political rights and therefore a greater chance of achieving a wider distribution of income. Expansion of the franchise and improvement of civil liberties lower income inequality. Redistribution is also easier to manage in an authoritarian society, where greater centralization offers more opportunities to reduce income differences between regions. Income inequality is significantly lower in communist countries (Jacobs and Dirlam, 2016; Patel et al., 2018).

The data used in this study was obtained from the Kaggle website (https://www.kaggle.com/datasets/wenruliu/adult-income-dataset). It contains information on US adult income collected in 1994 and is widely used for machine learning tasks. It suits our study because it was designed for understanding the factors that influence income level in the US. The data has 15 columns and 48842 rows. The target variable is labeled income and indicates whether a participant earned less than $50K or more than $50K.

Factors Contributing to Income Inequality

In this study, we seek to understand feature importance using machine learning algorithms and to predict an individual's income level from the factors in the dataset. The models we will use are logistic regression, K-nearest neighbors, random forest, support vector machine, decision tree, and neural network. We will first clean the data by checking for missing values and handling them with an appropriate method. The methods we will consider are replacing missing values with the median or mode, or deleting the rows that contain them; the choice will depend on the percentage of rows with missing values. The next step will be to check the data types and convert each column to the right type. We will then check for inconsistent values and replace them with missing values; for instance, values such as ‘?’ will be treated as missing. We will compute descriptive statistics, which will give us a rough idea of what to expect in the final output, and present them using data visualization.
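The cleaning steps described above could look roughly like the following sketch. It is written in plain Python on a handful of invented rows; the helper names mark_missing and handle_missing, the 5 % drop threshold, and the sample values are all assumptions for illustration, since the study does not state its tooling or its cut-off.

```python
from statistics import median, mode


def mark_missing(rows, placeholder="?"):
    """Turn placeholder strings such as '?' into None (treated as missing)."""
    return [
        {key: None if isinstance(value, str) and value.strip() == placeholder else value
         for key, value in row.items()}
        for row in rows
    ]


def handle_missing(rows, numeric_cols, drop_threshold=0.05):
    """Drop incomplete rows when they are rare, otherwise impute them.

    drop_threshold is an assumed cut-off; the study does not state one.
    """
    complete = [r for r in rows if all(v is not None for v in r.values())]
    missing_share = 1 - len(complete) / len(rows)
    if missing_share <= drop_threshold:
        return complete
    # Median for numeric columns, mode (most frequent value) for the rest.
    fill = {}
    for col in rows[0]:
        observed = [r[col] for r in rows if r[col] is not None]
        fill[col] = median(observed) if col in numeric_cols else mode(observed)
    return [{k: fill[k] if v is None else v for k, v in r.items()} for r in rows]


# Invented rows for illustration; only the 'income' label comes from the text.
raw_rows = [
    {"age": 39, "workclass": "Private", "income": "<=50K"},
    {"age": 50, "workclass": "?", "income": ">50K"},
    {"age": None, "workclass": "Private", "income": "<=50K"},
    {"age": 28, "workclass": "Self-emp", "income": "<=50K"},
]
cleaned = handle_missing(mark_missing(raw_rows), numeric_cols={"age"})
```

Here half of the rows are incomplete, so the sketch imputes rather than drops: the missing age becomes the median of the observed ages and the ‘?’ work class becomes the most frequent category.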

Once the data is cleaned, we will define the dependent and independent variables. We will then partition the data into training and testing sets. The models will be built on the training set and validated on the testing set.
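A minimal sketch of such a partition is shown below. The 80/20 ratio, the fixed seed, and the function name train_test_split are assumptions for illustration; the study does not report the split it used.

```python
import random


def train_test_split(features, labels, test_share=0.2, seed=0):
    """Shuffle row indices once and cut them into train and test parts."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(indices) * test_share)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Toy data: ten rows whose label is simply the parity of the feature.
toy_features = list(range(10))
toy_labels = [value % 2 for value in toy_features]
x_train, x_test, y_train, y_test = train_test_split(toy_features, toy_labels)
```

Shuffling indices rather than the rows themselves keeps each feature row paired with its own label on both sides of the split.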

The validation metrics will be accuracy, recall, precision, and F1 score. We will introduce feature reduction techniques to determine whether they improve the accuracy of the models, comparing the initial model, which uses all the independent variables, with the model that uses reduced features. We will also check whether resampling techniques improve model accuracy, following the same procedure: the resampled model will be compared to the initial model on the validation metrics identified above. Finally, we will conclude the research by identifying the best model and the best approach.
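For reference, the four validation metrics can be computed from the counts of true and false positives and negatives, as in the sketch below. Treating the higher income class as the positive class, and the label strings themselves, are assumptions made for this example; the predictions are invented.

```python
def classification_metrics(y_true, y_pred, positive=">50K"):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented labels and predictions, for illustration only.
actual = [">50K", ">50K", "<=50K", "<=50K", "<=50K", ">50K"]
predicted = [">50K", "<=50K", "<=50K", ">50K", "<=50K", ">50K"]
metrics = classification_metrics(actual, predicted)
```

Because F1 is a harmonic mean, a model with very high recall but low precision still ends up with a low F1 score, which is worth keeping in mind when reading the tables of results.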

Most of the participants worked in the private sector, which accounted for more than 90 % of the workers. The largest education group was those with a high school education, at about 50 % of the total workers, followed by those who went to college. Most participants were married and living with their partners, followed by those who had never married. By relationship, husbands were the largest group, followed closely by those who were single (not in a family). More than 80 % of the participants were white, followed by black participants and others. More than 50 % of the participants were male. The majority of the workers earned less than $50K.

The median age of those earning less than $50K was 35 years. The first quartile was about 27 years, the third quartile was 45 years, and the upper end of the range was 75 years.

Economic Factors

The median age of those earning more than $50K was about 42 years. The first quartile was about 36 years, the third quartile was 50 years, and the upper end of the range was 71 years.

Those who earned less than $50K worked a median of about 38 hours per week. The lowest quarter worked up to 38 hours, the third quarter (between the 50th and 75th percentiles) worked 38-40 hours, and the top quarter worked 40-42 hours.

Those who earned more than $50K worked a median of about 40 hours per week. The lowest quarter worked up to 40 hours, the third quarter (between the 50th and 75th percentiles) worked 40-48 hours, and the top quarter worked 48-66 hours.

The median age of all participants was 37 years. The youngest quarter were 18-29 years old, the third quarter (between the 50th and 75th percentiles) were 37-48 years old, and the oldest quarter were 48-75 years old.

The median hours worked per week across all participants was 40 hours. The lowest quarter worked 30-40 hours per week, the third quarter (between the 50th and 75th percentiles) worked 40-45 hours, and the top quarter worked 45-52 hours.
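Quartile summaries like the ones above can be produced with Python's standard statistics module. The sketch below is illustrative only: the function name five_number_summary and the ages are invented, and the inclusive quantile method is an assumption, since the study does not say how its quartiles were computed.

```python
from statistics import quantiles


def five_number_summary(values):
    """Minimum, quartiles and maximum of a numeric sample."""
    # n=4 returns the three cut points that split the data into quarters.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Invented ages, for illustration only.
sample_ages = [18, 22, 27, 31, 35, 39, 45, 52, 75]
summary = five_number_summary(sample_ages)
```

Running the same function separately on each income group would give the per-group medians and quartiles that the descriptive statistics report.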

Table 1: Model validations

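| Algorithm | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%) |
| --- | --- | --- | --- | --- |
| SVM | 78.96 | 95.7 | 14.8 | 25.7 |
| Logistic Regression | 79.21 | 70.7 | 25.95 | 37.86 |
| Random Forest | 85.13 | 73.3 | 61.6 | 67 |
| Neural Network | 76.25 | 51.9 | 42.3 | 46.7 |
| Decision Tree | 80.56 | 60 | 61.9 | 60.95 |
| KNN | 76.69 | 54.13 | 32.1 | 40.3 |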

The results above show that random forest had the highest accuracy (85.13 %), SVM had the highest recall (95.7 %), decision tree had the highest precision (61.9 %), and random forest had the highest F1 score (67 %).

Table 2: Model validation for Reduced model 

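| Algorithm | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%) |
| --- | --- | --- | --- | --- |
| SVM | 79.11 | 85.5 | 17.8 | 29.4 |
| Logistic Regression | 76.99 | 54.7 | 35.8 | 43.3 |
| Random Forest | 80.96 | 82.04 | 28.57 | 42.4 |
| Neural Network | 78.08 | 72.02 | 17.26 | 27.85 |
| Decision Tree | 68.04 | 38.41 | 50.4 | 43.6 |
| KNN | 75.63 | 50.5 | 27.7 | 35.8 |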

The results above show that random forest had the highest accuracy (80.96 %), SVM had the highest recall (85.5 %), and decision tree had both the highest precision (50.4 %) and the highest F1 score (43.6 %).

Table 3: Model validation for Resampled data

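| Algorithm | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%) |
| --- | --- | --- | --- | --- |
| SVM | 59.15 | 94.5 | 19.87 | 32.84 |
| Logistic Regression | 61.95 | 82.06 | 31.09 | 45.09 |
| Random Forest | 92.4 | 89.05 | 96.8 | 92.7 |
| Neural Network | 63.03 | 78.62 | 36.3 | 49.69 |
| Decision Tree | 90.54 | 86.76 | 95.8 | 91.05 |
| KNN | 72.54 | 69.8 | 80 | 74.54 |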

The results above show that random forest had the highest accuracy (92.4 %), precision (96.8 %), and F1 score (92.7 %), while SVM had the highest recall (94.5 %).
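The study does not say which resampling technique was applied. As one common possibility, the sketch below shows random oversampling, which duplicates randomly chosen rows of the smaller class until the classes are the same size. The function name random_oversample, the seed, and the toy labels are assumptions for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until every class is equally large."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, label in zip(rows, labels) if label == cls]
        # Sample with replacement to fill the gap to the largest class.
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy imbalanced data: six low-income rows and two high-income rows.
toy_rows = list(range(8))
toy_classes = ["<=50K"] * 6 + [">50K"] * 2
balanced_rows, balanced_classes = random_oversample(toy_rows, toy_classes)
```

One caution with this kind of resampling: if it is applied before the train/test split, copies of the same row can land in both sets and inflate the scores, so it is usually applied to the training portion only.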

Table 4: Model validation for Resampled data + Reduced features

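| Algorithm | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%) |
| --- | --- | --- | --- | --- |
| SVM | 59.22 | 93.14 | 20.37 | 33.34 |
| Logistic Regression | 68.04 | 74.61 | 55.18 | 63.44 |
| Random Forest | 70.42 | 89.25 | 46.8 | 61.4 |
| Neural Network | 65.15 | 73.5 | 47.94 | 58.03 |
| Decision Tree | 61.72 | 65.2 | 51.16 | 57.33 |
| KNN | 60.31 | 61.77 | 55.2 | 58.3 |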

The results above show that random forest had the highest accuracy (70.42 %), SVM had the highest recall (93.14 %), KNN had the highest precision (55.2 %), and logistic regression had the highest F1 score (63.44 %).

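From the analysis, random forest produced the highest accuracy of all the models in every setting. Feature reduction lowered accuracy for most models (random forest fell from 85.13 % to 80.96 %), while resampling raised accuracy for the tree-based models (random forest to 92.4 % and decision tree to 90.54 %) but lowered it for the others.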

Conclusion

In this project, we aimed to create a model that can predict income disparity in the job industry. We produced descriptive statistics to provide a high-level analysis of the data and built several models to predict income disparity. Random forest was the best model, since it produced the highest accuracy (92.4 %, on the resampled data). Given its high accuracy, this model can be relied on to predict income disparity.

References

Brynjolfsson, E. and Mitchell, T., 2017. What can machine learning do? Workforce implications. Science, 358(6370), pp.1530-1534.

Chu, L.K. and Hoang, D.P., 2020. How does economic complexity influence income inequality? New evidence from international data. Economic Analysis and Policy, 68, pp.44-57.

Dincer, O.C. and Gunalp, B., 2012. Corruption and income inequality in the United States. Contemporary Economic Policy, 30(2), pp.283-292.

Hoffmann, F., Lee, D.S. and Lemieux, T., 2020. Growing income inequality in the United States and other advanced economies. Journal of Economic Perspectives, 34(4), pp.52-78.

Huber, E. and Stephens, J.D., 2014. Income inequality and redistribution in post-industrial democracies: demographic, economic and political determinants. Socio-Economic Review, 12(2), pp.245-267.

Jacobs, D. and Dirlam, J.C., 2016. Politics and economic stratification: Power resources and income inequality in the United States. American Journal of Sociology, 122(2), pp.469-500.

Neñer, J., Cardoso, B.H.F., Laguna, M.F., Gonçalves, S. and Iglesias, J.R., 2022. Study of taxes, regulations and inequality using machine learning algorithms. Philosophical Transactions of the Royal Society A, 380(2224), p.20210165.

Patel, V., Burns, J.K., Dhingra, M., Tarver, L., Kohrt, B.A. and Lund, C., 2018. Income inequality and depression: a systematic review and meta-analysis of the association and a scoping review of mechanisms. World Psychiatry, 17(1), pp.76-89.

Piketty, T. and Saez, E., 2003. Income inequality in the United States, 1913–1998. The Quarterly Journal of Economics, 118(1), pp.1-41.

Rougoor, W. and Van Marrewijk, C., 2015. Demography, growth, and global income inequality. World Development, 74, pp.220-232.

Saavedra, M. and Twinam, T., 2020. A machine learning approach to improving occupational income scores. Explorations in Economic History, 75, p.101304.

Scheviakov, A., 2011. Income inequality as factor of economic and demographic growth.

Shin, I., 2012. Income inequality and economic growth. Economic Modelling, 29(5), pp.2049-2057.

Silber, J. ed., 2012. Handbook of income inequality measurement (Vol. 71). Springer Science & Business Media.

Tsui, A.S., Enderle, G. and Jiang, K., 2018. Income inequality in the United States: Reflections on the role of corporations.

Cite This Work

My Assignment Help. (2022). Understanding Income Inequality: Causes And Impacts - An Essay.. Retrieved from https://myassignmenthelp.com/free-samples/comp3340-data-mining/machine-learning-ecosystem-file-A1E60B8.html.