Conduct independent research and compile a review report on the use of word embeddings in business and its possible ethical issues. Your report should include the following requirements in order:
a) Describe two possible applications of word embedding in business.
Hint: For each application, mention the motivations/benefits, how it works, what datasets are involved, and its results (if known), etc.
b) Discuss two popular implicit biases that usually occur in word embedding applications and their possible ethical issues.
Hint: Describe each bias, give examples and explain why and how biases occur and may lead to ethical issues.
c) Suggest the two most important measures/best practices that you think can be used to alleviate the ethically significant harms of these bias problems. Provide justification for your choices and discuss the challenges of implementing these measures.
Hint: Your suggestions should align with the harms that you have discussed in the previous section (question 1b). You may review the lecture slides and select the relevant knowledge points. You may also need to perform research on literature to explain and support your points.
There is a case study provided and you are required to analyse and provide answers to the questions outlined below. You can use lecture material and literature to support your responses.
Fred and Tamara, a married couple in their 30’s, are applying for a business loan to help them realize their long-held dream of owning and operating their own restaurant. Fred is a highly promising graduate of a prestigious culinary school, and Tamara is an accomplished accountant. They share a strong entrepreneurial desire to be ‘their own bosses’ and to bring something new and wonderful to their local culinary scene; outside consultants have reviewed their business plan and assured them that they have a very promising and creative restaurant concept and the skills needed to implement it successfully. The consultants tell them they should have no problem getting a loan to get the business off the ground. For evaluating loan applications, Fred and Tamara’s local bank loan officer relies on an off-the-shelf software package that synthesizes a wide range of data profiles purchased from hundreds of private data brokers. As a result, it has access to information about Fred and Tamara’s lives that goes well beyond what they were asked to disclose on their loan application. Some of this information is clearly relevant to the application, such as their on-time bill payment history. But a lot of the data used by the system’s algorithms is of the sort that no human loan officer would normally think to look at, or have access to—including inferences from their drugstore purchases about their likely medical histories, information from online genetic registries about health risk factors in their extended families, data about the books they read and the movies they watch, and inferences about their racial background. Much of the information is accurate, but some of it is not. A few days after they apply, Fred and Tamara get a call from the loan officer saying their loan was not approved. 
When they ask why, they are told simply that the loan system rated them as ‘moderate-to-high risk.’ When they ask for more information, the loan officer says he doesn’t have any, and that the software company that built their loan system will not reveal any specifics about the proprietary algorithm or the data sources it draws from, or whether that data was even validated. In fact, they are told, not even the system’s designers know what data led it to reach any particular result; all they can say is that statistically speaking, the system is ‘generally’ reliable. Fred and Tamara ask if they can appeal the decision, but they are told that there is no means of appeal, since the system will simply process their application again using the same algorithm and data, and will reach the same result.
Provide answers to the questions below based on what we have learnt in the lecture. You may also need to perform research on literature to explain and support your points.
a) What sort of ethically significant benefits could come from banks using a big-data driven system to evaluate loan applications?
b) What ethically significant harms might Fred and Tamara have suffered as a result of their loan denial? Discuss at least three possible ethically significant harms that you think are most important to their significant life interests.
c) Beyond the impacts on Fred and Tamara’s lives, what broader harms to society could result from the widespread use of this loan evaluation process?
d) Describe the three measures/best practices that you think are most important and/or effective for lessening or preventing those harms. Provide justification for your choices and discuss the challenges of implementing these measures.
Hint: your suggestion should align with the harms that you have discussed in the previous sections (questions 2-b and 2-c). You may review the lecture slides and select the relevant knowledge points. You may also need to perform research on literature to explain and support your points.
a) Two possible applications of word embedding in business
Word embedding is widely used in Natural Language Processing (NLP), a sub-field of machine learning.
One application of word embedding is the analysis of survey responses. Word embedding addresses a common business problem: organisations often lack the tools and time to analyse free-text survey responses effectively and efficiently. Vector representations of words, adapted to survey datasets, can capture the complex relationship between the responses under review and the context in which they were made (Zhong et al. 2017).
Word embedding also applies to the analysis of verbatim comments, which are especially important for customer-centric organisations. Word2Vec, a popular word embedding algorithm, helps identify the specific context in which a verbatim comment was made, and can be very useful for understanding customer sentiment towards a business (Zhong et al. 2017).
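As a rough illustration of how vector representations support comment analysis, the sketch below averages per-word vectors into a comment embedding and compares comments by cosine similarity. The word vectors here are hypothetical toy values chosen for the example; a real system would load embeddings trained with Word2Vec or a similar algorithm.

```python
from math import sqrt

# Toy word vectors (hypothetical values; a real system would load
# pretrained embeddings such as Word2Vec or GloVe vectors).
vectors = {
    "great":    [0.9, 0.1, 0.0],
    "awesome":  [0.8, 0.2, 0.1],
    "terrible": [-0.7, 0.1, 0.2],
    "service":  [0.1, 0.9, 0.0],
    "food":     [0.0, 0.8, 0.3],
}

def embed(comment):
    """Represent a verbatim comment as the average of its word vectors."""
    words = [w for w in comment.lower().split() if w in vectors]
    dim = len(next(iter(vectors.values())))
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# A comment about "awesome food" sits closer to a positive reference
# than to a negative one, even with no word overlap required.
comment = embed("awesome food")
print(cosine(embed("great awesome"), comment) > cosine(embed("terrible service"), comment))  # → True
```

In practice the same averaging-and-similarity idea underlies many sentiment and clustering pipelines over customer comments, with the toy dictionary replaced by embeddings learned from a large corpus.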
b) Two popular implicit biases
One implicit bias is technical bias, which emerges from mathematical, software and hardware constraints. An overfitted algorithm is considered biased because it performs perfectly on its training data but fails to generalise to new cases. Different mathematical models trained on the same data may achieve similar prediction accuracies yet carry different amounts of bias, largely because they optimise different cost functions (Ruder, Vulić and Søgaard 2019).
Emergent bias arises when algorithmic results are applied in a real-world context. Using such results to make decisions can create ethical problems when the resulting decisions contradict the normative values found in society. Algorithmic decision-making can thus lead to unfair practices, such as unfavourable treatment of a particular group or individual (Ruder, Vulić and Søgaard 2019).
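A concrete way to see implicit bias in embeddings is the well-known analogy test, where occupation words align with a gender direction in the vector space. The vectors below are hypothetical toy numbers constructed to mimic a bias pattern widely reported for embeddings trained on web text; they are an illustration, not measurements from any real model.

```python
# Toy vectors constructed so occupations correlate with a gender
# direction (hypothetical numbers mimicking a widely reported bias
# pattern in embeddings trained on web text).
vecs = {
    "man":      [1.0, 0.0],
    "woman":    [-1.0, 0.0],
    "doctor":   [0.9, 0.5],
    "nurse":    [-0.9, 0.5],
    "engineer": [0.8, 0.4],
    "teacher":  [-0.5, 0.6],
}

def nearest(target, exclude):
    """Return the vocabulary word most similar to `target` by cosine."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = sum(a * a for a in u) ** 0.5
        nv = sum(b * b for b in v) ** 0.5
        return dot / (nu * nv)
    return max((w for w in vecs if w not in exclude),
               key=lambda w: cos(vecs[w], target))

# Analogy arithmetic: "man is to doctor as woman is to ?"
query = [d - m + w for d, m, w in zip(vecs["doctor"], vecs["man"], vecs["woman"])]
print(nearest(query, exclude={"doctor", "man", "woman"}))  # → nurse
```

The ethical concern is that when such a space drives downstream decisions (screening CVs, ranking candidates), the gendered geometry silently becomes gendered treatment.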
c) Two most important measures/best practices
One way to control word embedding bias is to involve data scientists in identifying the best model for a specific situational context. No single model suits every context, so the right model should be selected in consultation with data scientists, who possess the expertise needed to identify the most suitable one (Bender and Friedman 2018). They weigh different strategies before building a model: it is better to troubleshoot different ideas up front to identify the best model than to wait for vulnerabilities to surface and fix them afterwards.
Where there is insufficient data for one group, weighting can be used to increase that group's importance in training. However, this must be done with the utmost caution, as it can introduce new, unexpected biases. A sample containing only a small group of people may force the chosen model to pick up trends exhibited by just those few; if the weight multiplier applied to them is too large, the model runs a higher risk of learning such spurious trends. It is therefore crucial to select the weights carefully, particularly the large ones (Bender and Friedman 2018).
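A minimal sketch of the reweighting idea, assuming hypothetical group labels: inverse-frequency weights make each group contribute equal total mass to training, which is exactly the mechanism that must be handled carefully when one group is very small.

```python
from collections import Counter

# Hypothetical group labels for a training set; group B is
# under-represented, so its examples receive a larger weight.
groups = ["A"] * 90 + ["B"] * 10
counts = Counter(groups)
n, k = len(groups), len(counts)

# Inverse-frequency weighting: each group contributes equal total
# weight (here 50 each), regardless of how many examples it has.
weights = {g: n / (k * c) for g, c in counts.items()}
print(weights)  # → {'A': 0.555..., 'B': 5.0}
```

Note the caution from the text in the numbers themselves: each of the ten group-B examples now counts nine times as much as a group-A example, so any noise or quirk in those ten examples is amplified accordingly.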
a) Ethically significant benefits
Banks need to assess loan applications on a scale from low to high risk to determine whether their investment is in safe hands. Big Data can help banks understand customer behaviour based on inputs received from various channels, such as investment patterns, financial backgrounds, motivation to invest and shopping trends (Jagtiani, Vermilyea and Wall 2018). Big Data can therefore help identify which loans to approve and which to decline.
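As a rough illustration of how such a system scores risk, the sketch below applies a logistic model to two applicant features. The feature names and weights are hypothetical values chosen for the example; a real system would learn its weights from historical repayment data across many more features.

```python
from math import exp

# Hypothetical feature weights for a toy credit-risk score; a real
# system would learn these from historical repayment data.
weights = {"on_time_payment_rate": -2.0, "debt_to_income": 3.0}
bias = 0.5

def risk_score(applicant):
    """Logistic model: returns an estimated probability of default."""
    z = bias + sum(weights[f] * v for f, v in applicant.items())
    return 1 / (1 + exp(-z))

# An applicant with a strong payment history and low debt load
# receives a low-to-moderate default probability.
applicant = {"on_time_payment_rate": 0.95, "debt_to_income": 0.2}
print(round(risk_score(applicant), 2))  # → 0.31
```

The ethically significant benefit claimed for such systems is that a consistent, data-driven score like this can be applied uniformly to every applicant; the case study shows the corresponding risk when the inputs and weights are opaque and unvalidated.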
b) Ethically significant harms
As a result of the loan denial, Fred and Tamara might have suffered three ethically significant harms (Calem, Correa and Lee 2019):
- Damage to credit rating – A rejected loan does not itself lower the credit score; however, creditors reviewing a later application will see it, and the hard inquiries they run can hurt the applicants' credit scores.
- Longer wait for the next loan application – Because the loan was rejected, Fred and Tamara may have to wait longer than six months before they can apply for another loan at a different bank. An excellent credit score makes it far easier to obtain a loan than a score that is not stellar.
- Fear of further damage to credit rating – Because their application was rejected, Fred and Tamara will have to be extremely cautious before applying for their next loan; failing to do so could damage their credit rating further. A credit rating indicates whether a customer is eligible for credit and how much they can borrow; a rejected loan may also raise the interest rate charged on a subsequent loan.
c) The broader harms to society
The loan evaluation process at the bank where Fred and Tamara applied depends heavily on a single algorithm, one that may well contain flaws and lacks meaningful checks on its validity. Widespread use of such a system would not only frustrate the genuine ambitions of many potential entrepreneurs like Fred and Tamara, but would also expose details of applicants' lifestyles without their permission (Calem, Correa and Lee 2019).
d) Three measures/best practices
The harms that Fred and Tamara may have suffered can be reduced by the following measures (Lenz 2016):
- Evolving attitudes need to be closely monitored – Banks should ensure that the individuals responsible for machine learning development receive appropriate training on fair lending practices. This encourages them to act in the best interests of the bank, the software vendor and the customers.
- Potential bias should be put through repeated test trials – Promising technological solutions are emerging that help companies test and correct for bias in their algorithmic systems on an ongoing basis. Banks should continuously monitor their algorithmic systems to identify potential bias problems and move towards more accurate algorithmic solutions.
- The rationale for algorithmic features needs to be documented – As part of a risk management strategy, banks should prepare defences for claims before those claims are raised. Institutions using algorithmic solutions in credit transactions need to consider, for example, how best to comply with legal requirements for providing statements of specific reasons and for honouring record-retention and information requests.
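The second measure above, continuous bias testing, can be sketched as a simple audit that compares approval rates across groups and flags the system when the gap exceeds a tolerance. The decisions, group labels and threshold below are hypothetical values for illustration; a real audit would use production decision logs and a tolerance set by policy.

```python
# Hypothetical audit log: (group, decision) pairs, where decision 1
# means the loan was approved. A real audit would read production logs.
decisions = [
    ("A", 1), ("A", 1), ("A", 0), ("A", 1),
    ("B", 0), ("B", 1), ("B", 0), ("B", 0),
]

def approval_rate(group):
    """Fraction of applications from `group` that were approved."""
    outcomes = [d for g, d in decisions if g == group]
    return sum(outcomes) / len(outcomes)

# Demographic-parity gap: a large difference in approval rates across
# groups flags the system for human review.
TOLERANCE = 0.2  # hypothetical policy threshold
gap = abs(approval_rate("A") - approval_rate("B"))
print(f"parity gap = {gap:.2f}, flagged = {gap > TOLERANCE}")  # → parity gap = 0.50, flagged = True
```

Running a check like this on every retraining cycle operationalises the "consecutive test trials" the measure calls for, though choosing the groups to compare and the tolerance remains a policy decision, not a technical one.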
Bender, E.M. and Friedman, B., 2018. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, pp.587-604.
Calem, P., Correa, R. and Lee, S.J., 2019. Prudential policies and their impact on credit in the United States. Journal of Financial Intermediation, p.100826.
Jagtiani, J., Vermilyea, T. and Wall, L.D., 2018. The roles of big data and machine learning in bank supervision. Forthcoming, Banking Perspectives.
Lenz, R., 2016. Peer-to-peer lending: opportunities and risks. European Journal of Risk Regulation, 7(4), pp.688-700.
Ruder, S., Vulić, I. and Søgaard, A., 2019. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65, pp.569-631.
Zhong, J., Ogata, T., Cangelosi, A. and Yang, C., 2017, September. Understanding natural language sentences with word embedding and multi-modal interaction. In 2017 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 184-189). IEEE.