CE314 Natural Language Engineering

Task
Regular expression (40%)
(You can store your code in a file named part1_regularexpression_studentID.py.)
 
1: Write a regular expression that can find all amounts of money in a text. Your expression should be able to deal with different formats and currencies, for example £50,000 and £117.3m as well as 30p, 500m euro, 338bn euros, $15bn and $92.88. Make sure that you can at least detect amounts in Pounds, Dollars and Euros. (20pts)

For full marks: include the output of a Python program that applies your regular expression to the following BBC News web page:

https://www.bbc.co.uk/news/business-41779341
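
As a starting point for question 1, the following is a minimal sketch of one possible pattern, not a complete answer. It assumes amounts are written either with a leading currency symbol, with a trailing currency word, or in pence. The names MONEY_RE, find_money and fetch_page are illustrative only, and the pattern has been checked against the examples in the question rather than against the BBC page itself.

```python
import re

# Hypothetical pattern: covers the example formats in the task, nothing more.
NUMBER = r"\d+(?:,\d{3})*(?:\.\d+)?"
MAGNITUDE = r"(?:\s?(?:billion|million|bn|m))?"

MONEY_RE = re.compile(
    rf"""
      [£$€]\s?{NUMBER}{MAGNITUDE}\b
    | \b{NUMBER}{MAGNITUDE}\s?(?:euros?|pounds?|dollars?)\b
    | \b\d+(?:\.\d+)?p\b
    """,
    re.VERBOSE | re.IGNORECASE,
)


def find_money(text):
    """Return every substring of text that looks like an amount of money."""
    return MONEY_RE.findall(text)


def fetch_page(url):
    """Download a page as text (assumes it is reachable and UTF-8 encoded).

    HTML tags are not stripped here; that step is left to the reader.
    """
    from urllib.request import urlopen

    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


SAMPLE = ("for example £50,000 and £117.3m as well as 30p, 500m euro, "
          "338bn euros, $15bn and $92.88.")
print(find_money(SAMPLE))
```

To produce the output asked for above, the text returned by fetch_page for the BBC URL could be passed to find_money. Real pages are likely to contain formats this sketch misses, so the pattern will probably need extending.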

2: Write a regular expression that matches all of the phone numbers listed below. (You can write a Python program to check the matching results.)
 
555.123.4565
+1-(800)-545-2468
2-(800)-545-2468
3-800-545-2468
555-123-3456
555 222 3342
(234) 234 2442
(243)-234-2342
1234567890
123.456.7890
123.4567
123-4567
1234567900
12345678900
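
One way to approach the list above is to treat each number as an optional one-digit country prefix, an optional three-digit area code (with or without brackets), and a seven-digit local number, with `-`, `.` or a space allowed between the parts. The sketch below follows that assumption. PHONE_RE and NUMBERS are illustrative names, and the pattern is only checked against the numbers given in the question.

```python
import re

# Hypothetical pattern: prefix? + area code? + 3 digits + 4 digits.
PHONE_RE = re.compile(
    r"(?:\+?\d[-. ]?)?"                  # optional country prefix, e.g. "+1-" or "3-"
    r"(?:(?:\(\d{3}\)|\d{3})[-. ]?)?"    # optional area code, e.g. "(800)-" or "555."
    r"\d{3}[-. ]?\d{4}"                  # local number, e.g. "545-2468"
)

NUMBERS = [
    "555.123.4565", "+1-(800)-545-2468", "2-(800)-545-2468",
    "3-800-545-2468", "555-123-3456", "555 222 3342",
    "(234) 234 2442", "(243)-234-2342", "1234567890",
    "123.456.7890", "123.4567", "123-4567",
    "1234567900", "12345678900",
]

for number in NUMBERS:
    # fullmatch requires the whole string to be a phone number
    print(number, bool(PHONE_RE.fullmatch(number)))
```

Using fullmatch rather than search means a string such as "12345" is rejected instead of being partially matched.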
NLTK (10%)
1. Find the 50 highest-frequency words in the Wall Street Journal corpus in nltk.book (text7), with all punctuation removed and all words lowercased. Submit your code as part2_NLTK_studentID.py.
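
The counting step can be sketched with the standard library alone. This sketch interprets "all punctuation removed" as keeping only purely alphabetic tokens, which is an assumption: it also drops numbers and the trace tokens present in the corpus. top_words and wsj_top_50 are illustrative names, and wsj_top_50 assumes NLTK and its book data are installed.

```python
from collections import Counter


def top_words(tokens, n=50):
    """Return the n most frequent lowercased alphabetic tokens with counts."""
    words = [tok.lower() for tok in tokens if tok.isalpha()]
    return Counter(words).most_common(n)


def wsj_top_50():
    """Apply top_words to text7 (requires NLTK and its book data)."""
    from nltk.book import text7

    return top_words(text7.tokens, 50)


# Small demonstration on a hand-made token list
print(top_words(["The", "the", ",", "cat", "."], 2))
```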
Language modelling:
1. Build an n-gram language model based on NLTK's Brown corpus and provide the code. (You can build a language model in a few lines of code using the NLTK package. You can use bigrams, trigrams or higher-order n-grams.) (20 pts)

2. After step 1, make simple predictions with the language model you built in question 1. Start with two simple words: “I am”. Use your n-gram model to predict the next word, and show both the code and the model-generated results. (15 pts)

3. Based on the work in questions 1 and 2, generate a few sentences that start with “I am”. (15 pts)
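
The three steps above can be sketched together as a count-based trigram model. This is a hand-rolled illustration in plain Python, not NLTK's own language-model classes, and every function name in it is hypothetical. It lowercases all words, so the seed becomes "i am", and load_brown_sentences assumes NLTK and the Brown corpus data are installed.

```python
import random
from collections import Counter, defaultdict


def build_trigram_model(sentences):
    """Map each two-word context to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>", "<s>"] + [w.lower() for w in sent] + ["</s>"]
        for w1, w2, w3 in zip(padded, padded[1:], padded[2:]):
            model[(w1, w2)][w3] += 1
    return model


def predict_next(model, w1, w2, k=5):
    """Return up to k likely next words with maximum-likelihood probabilities."""
    counts = model.get((w1, w2))
    if not counts:
        return []
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(k)]


def generate(model, seed=("i", "am"), max_words=20, rng=None):
    """Sample words one at a time until an end marker or max_words is reached."""
    rng = rng or random.Random()
    words = list(seed)
    while len(words) < max_words:
        counts = model.get(tuple(words[-2:]))
        if not counts:
            break
        choices, weights = zip(*counts.items())
        next_word = rng.choices(choices, weights=weights)[0]
        if next_word == "</s>":
            break
        words.append(next_word)
    return " ".join(words)


def load_brown_sentences():
    """Return the Brown corpus sentences (requires NLTK and the corpus data)."""
    from nltk.corpus import brown

    return brown.sents()


# Toy corpus so the sketch runs without any downloads
toy = [["I", "am", "happy"], ["I", "am", "happy"], ["I", "am", "here"]]
toy_model = build_trigram_model(toy)
print(predict_next(toy_model, "i", "am"))
print(generate(toy_model))
```

With the Brown corpus, build_trigram_model(load_brown_sentences()) would replace the toy model. Calling generate several times then gives different sentences, because each next word is sampled in proportion to its count.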
