CS50 Introduction to Artificial Intelligence with Python

Question:
Reinforcement Learning
You will program an RL agent that plays Blackjack against a dealer. If you are not familiar with the game, please search the internet to learn the rules. In summary, a player attempts to beat the dealer by getting a count as close to 21 as possible without going over 21. Each player decides whether an ace is worth 1 or 11. Face cards are worth 10, and any other card is worth its pip value. The game will be played between a player and the dealer. The objective is to develop an agent that will act as the player and try to win against the dealer.
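The counting rule above can be sketched as a small helper. This is only an illustration: the starter code is not shown here, so the function name `hand_value` and the representation of cards as rank strings are assumptions, not part of the provided code.

```python
def hand_value(cards):
    """Return (total, usable_ace) for a list of card ranks.

    Assumption: ranks are strings '2'..'10', 'J', 'Q', 'K', 'A'.
    """
    total = 0
    aces = 0
    for rank in cards:
        if rank == "A":
            aces += 1
            total += 1           # count every ace as 1 first
        elif rank in ("J", "Q", "K"):
            total += 10          # face cards are worth 10
        else:
            total += int(rank)   # pip value
    # Promote at most one ace from 1 to 11 if that does not bust the hand.
    usable_ace = aces > 0 and total + 10 <= 21
    if usable_ace:
        total += 10
    return total, usable_ace
```

Counting every ace as 1 and then promoting a single ace to 11 is enough, because two aces counted as 11 would already exceed 21.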
 
We have provided starter code that has everything needed to run the game, except that a human user plays against the dealer. You will need to develop an agent that can play the game automatically (replacing the human player's input) and win money in the long run (after many hands). If you run the starter code, you should get an output like the image below. You have to modify the code so that, instead of taking input from the user, the agent plays based on a policy.
 
[Image: starter code output, a table with the human player's input]
Part 1
In this part, you need to modify the provided code to integrate it with an agent, so that your agent can play against the computer dealer for as many hands as needed. You should implement two agents:
 
Agent 1 - always selects “hit” or “stay” randomly
Agent 2 - follows the same rule as the dealer (hit if the count is less than 17)
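The two agents can be sketched as simple decision functions. The names `random_agent` and `dealer_rule_agent`, and the idea that each agent receives the player's current count and returns an action string, are assumptions made for illustration; the starter code may expose the game state differently.

```python
import random


def random_agent(player_total, rng=random):
    """Agent 1: ignore the count and pick an action uniformly at random."""
    return rng.choice(["hit", "stay"])


def dealer_rule_agent(player_total):
    """Agent 2: mimic the dealer, hitting while the count is below 17."""
    return "hit" if player_total < 17 else "stay"
```

Either function could replace the call that reads the human player's input, provided the action strings match what the game loop expects.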

You should simulate 1,000 hands and report the overall win or loss for each agent.
Submission
You should submit the code and a text file containing the results of the 1,000 hands for each agent.
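A 1,000-hand simulation might look like the sketch below. It does not use the starter code, which is not shown here. Instead it assumes a simplified game: cards drawn from an infinite deck, a flat stake of one unit per hand, and no special Blackjack payout, splitting, or doubling. All function names are hypothetical.

```python
import random


def draw(rng):
    # Assumption: infinite deck. Ranks 1..13, face cards count 10, ace stored as 1.
    return min(rng.randint(1, 13), 10)


def total(cards):
    s = sum(cards)
    # Count one ace as 11 when that does not bust the hand.
    return s + 10 if 1 in cards and s + 10 <= 21 else s


def random_policy(player_total, rng):
    return rng.choice(("hit", "stay"))


def dealer_rule_policy(player_total, rng):
    return "hit" if player_total < 17 else "stay"


def play_hand(policy, rng):
    """Play one hand; return +1 for a win, -1 for a loss, 0 for a push."""
    player = [draw(rng), draw(rng)]
    dealer = [draw(rng), draw(rng)]
    while total(player) <= 21 and policy(total(player), rng) == "hit":
        player.append(draw(rng))
    if total(player) > 21:
        return -1                      # player busts
    while total(dealer) < 17:          # dealer hits below 17
        dealer.append(draw(rng))
    if total(dealer) > 21 or total(player) > total(dealer):
        return 1
    return -1 if total(player) < total(dealer) else 0


def simulate(policy, hands=1000, seed=0):
    rng = random.Random(seed)
    results = [play_hand(policy, rng) for _ in range(hands)]
    return {
        "wins": results.count(1),
        "losses": results.count(-1),
        "pushes": results.count(0),
        "net": sum(results),
    }


if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("dealer rule", dealer_rule_policy)]:
        print(name, simulate(policy))
```

The printed summaries could be redirected to a text file for the submission. Fixing the seed makes the run reproducible.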
Part 2
In this part you should develop a reinforcement learning approach to learn a policy for playing against the dealer that wins the maximum amount of money (or loses as little as possible). You should model the Blackjack game as an MDP (Markov Decision Process) problem and develop a Q-Learning (DT) approach to learn a policy.
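One possible way to set this up is sketched below. The assignment does not fix the MDP formulation, so the choices here are assumptions: the state is (player count, dealer's visible card, whether the player holds an ace counted as 11), the reward is +1, -1, or 0 at the end of a hand, cards come from an infinite deck, and exploration is epsilon-greedy. The hyperparameter values are placeholders, not recommendations.

```python
import random
from collections import defaultdict

ACTIONS = ("hit", "stay")


def draw(rng):
    # Assumption: infinite deck; face cards count 10, ace stored as 1.
    return min(rng.randint(1, 13), 10)


def total_and_ace(cards):
    s = sum(cards)
    usable = 1 in cards and s + 10 <= 21
    return (s + 10 if usable else s), usable


def state_of(player, dealer_upcard):
    t, usable = total_and_ace(player)
    return (t, dealer_upcard, usable)


def step(player, dealer, action, rng):
    """Apply one action in place; return (reward, done)."""
    if action == "hit":
        player.append(draw(rng))
        if total_and_ace(player)[0] > 21:
            return -1, True
        return 0, False
    while total_and_ace(dealer)[0] < 17:   # "stay": the dealer plays out
        dealer.append(draw(rng))
    p, d = total_and_ace(player)[0], total_and_ace(dealer)[0]
    if d > 21 or p > d:
        return 1, True
    return (-1 if p < d else 0), True


def choose(Q, state, epsilon, rng):
    if rng.random() < epsilon:             # explore
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])   # exploit


def train(episodes=10000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)                 # unseen (state, action) pairs start at 0
    for _ in range(episodes):
        player = [draw(rng), draw(rng)]
        dealer = [draw(rng), draw(rng)]
        done = False
        while not done:
            s = state_of(player, dealer[0])
            a = choose(Q, s, epsilon, rng)
            r, done = step(player, dealer, a, rng)
            if done:
                target = r
            else:
                s2 = state_of(player, dealer[0])
                target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
            # Q-learning update: move Q(s, a) toward the bootstrapped target.
            Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q
```

After training, the learned policy picks the action with the larger Q-value in each state, for example `max(ACTIONS, key=lambda a: Q[(state, a)])`.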
Submission
- Code for Q-Learning
- Evaluation of your Q-Learning (reward curve during learning, reported every 50 hands) in a PDF file
- Learned Q-table
- Complete code of playing with your learned policy.
- Result summary of 1,000 hands
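For the reward curve and the Q-table deliverables, a sketch of two helpers is given below. It assumes the per-hand rewards were collected in a list during learning, and that the Q-table is a dictionary keyed by `((player_total, dealer_upcard, usable_ace), action)`. Both the key layout and the function names are illustrative assumptions.

```python
import json


def reward_curve(rewards, window=50):
    """Average reward of each consecutive block of `window` hands."""
    curve = []
    for i in range(0, len(rewards), window):
        block = rewards[i:i + window]
        curve.append(sum(block) / len(block))
    return curve


def q_table_to_json(Q):
    """Serialize a Q-table keyed by ((total, upcard, usable_ace), action)."""
    rows = [
        {
            "player_total": state[0],
            "dealer_upcard": state[1],
            "usable_ace": state[2],
            "action": action,
            "q": value,
        }
        for (state, action), value in sorted(Q.items())
    ]
    return json.dumps(rows, indent=2)
```

The list returned by `reward_curve` can be plotted with any charting tool and exported to PDF, and the JSON string can be written to a file as the learned Q-table.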
