A Critical Assessment of the Big Data Approach to Violent Crime Analysis During Lockdowns

Learning Outcomes for this Module

LO1 Apply big data analytic algorithms, including those for visualization, and cloud computing techniques to multi-terabyte datasets.

LO2 Critically assess data analytic and machine learning algorithms to identify those that satisfy given big data problem requirements.

LO3 Critically evaluate and select appropriate big data analytic algorithms to solve a given problem, considering the processing time available and other aspects of the problem.

LO4 Design and develop advanced big data applications that integrate with third-party cloud computing services.

LO5 Critically assess and interpret primary research to identify its applicability to a given big data problem scenario.

Academic Integrity Statement: You must adhere to the university regulations on academic conduct. Formal inquiry proceedings will be instigated if there is any suspicion of plagiarism or any other form of misconduct in your work. Refer to the University’s Assessment Regulations for Northumbria Awards if you are unclear as to the meaning of these terms. The latest copy is available on the University website.

  • Do NOT submit code from other people or web sources as your own. This is plagiarism.
  • Do NOT work with other students and submit identical code. This is collusion.
  • Do NOT buy your assignment on the Internet, or have your work done by someone else. This is ghosting.
  • Ghosting, plagiarism, and collusion are all academic misconduct, which is not allowed and may result in you being asked to leave the University and losing any fees paid.

The aim of this assignment is to introduce a practical application of Big Data and Cloud Computing using a realistic big data problem. Students will implement a solution using an industry-leading cloud computing provider together with the distributed processing environment Apache Spark. This will involve selecting algorithms and methods appropriate to the problem.

During the coronavirus pandemic, the UK experienced several periods of legally restricted movement (‘Lockdowns’) where most of the population were ordered to stay at home.

According to several authors, lockdowns resulted in increased domestic violence, but did this result in an overall increase in recorded violent crime?

In this assignment you will use the UK Police official street-level crime data (Police 2021), which has been available under the Open Government Licence (OGL v3) since 2010 and has been placed on an appropriate cloud server. Use it to address the following questions:

  1. What crime categories does the Police data use?
  2. Have the same categories been used consistently?
  3. Are levels of violent crime constant, increasing, or decreasing?
  4. Using data prior to the first lockdown, predict violent crime levels for April 2020.
  5. Compare predicted crime levels with actuals, and so determine whether significant changes to violent crime occurred.
  6. The Leicester region was put in a local lockdown in July 2020. Verify your findings (if any) with respect to that region only.
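
Questions 4 and 5 can be approached in many ways. As one minimal sketch, not a prescribed method, the plain-Python example below fits a straight-line trend to monthly counts, extrapolates it, and expresses the gap between an actual and a predicted value as a multiple of the residual standard deviation. The function names and the monthly counts are hypothetical and chosen only for illustration. The sketch ignores seasonality, and for the full dataset the same arithmetic would need to be expressed with Spark operations.

```python
from math import sqrt


def fit_linear_trend(counts):
    """Least-squares fit of counts against month index 0..n-1.

    Returns (slope, intercept).
    """
    n = len(counts)
    if n < 3:
        raise ValueError("need at least three monthly counts")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(counts, steps_ahead=1):
    """Extrapolate the fitted trend steps_ahead months past the last count.

    Returns (predicted, residual_sd), where residual_sd uses n - 2
    degrees of freedom.
    """
    slope, intercept = fit_linear_trend(counts)
    n = len(counts)
    residuals = [y - (intercept + slope * x) for x, y in enumerate(counts)]
    residual_sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    predicted = intercept + slope * (n - 1 + steps_ahead)
    return predicted, residual_sd


def deviation_in_sd(actual, predicted, residual_sd):
    """How many residual standard deviations the actual lies from the forecast."""
    if residual_sd == 0:
        raise ValueError("residual standard deviation is zero")
    return (actual - predicted) / residual_sd


# Hypothetical monthly violent-crime counts ending February 2020
# (made-up numbers, NOT taken from the police dataset).
history = [1510, 1490, 1535, 1560, 1548, 1580, 1602, 1575]
hypothetical_april_actual = 1400

# April 2020 is two months after the last observation in `history`.
predicted, sd = forecast(history, steps_ahead=2)
print("predicted:", round(predicted, 1))
print("deviation (in residual SDs):",
      round(deviation_in_sd(hypothetical_april_actual, predicted, sd), 2))
```

A deviation of several standard deviations would suggest a change worth investigating, but the threshold you treat as "significant", and whether a linear trend is adequate at all, are choices you should justify.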

You will investigate these claims using real, publicly available data set(s), which will be provided to you on an appropriate cloud server. These include:

Street Level Crime Data published by the UK Home Office. This dataset contains 19 million rows, each giving a crime type together with its location as a latitude and longitude.

  1. Process the given data efficiently using Apache Spark on a cloud Infrastructure as a Service (IaaS) platform.
  2. Filter the dataset so that crimes refer to relevant events only.
  3. Using appropriate visualizations, statistics, or machine learning, determine whether violent crime increased, decreased, or remained static during the pandemic lockdown.
  4. Using Markdown cells, explain the reasoning behind your code so that it is clear what each block is intended to achieve, and why.
  5. Output data should be written to IaaS cloud storage.
  6. Using Blackboard, submit a Python programme that is executable by Apache Spark's spark-submit using the module container.
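
The filtering and counting in tasks 2 and 3 can be sketched on a handful of rows before being scaled up. The plain-Python example below is an illustration under stated assumptions, not a model answer: the column names (`Month`, `Crime type`, `Latitude`, `Longitude`) are assumed to match the CSV headers, the two violent-crime labels are assumed category names that you should verify against the data, the Leicester bounding box is a rough guess, and all function names are hypothetical. In the submitted work the equivalent logic would be written with Spark DataFrame operations.

```python
import csv
import io
from collections import Counter

# ASSUMPTION: labels used for violent crime; verify against the data.
VIOLENT_CATEGORIES = {"Violent crime", "Violence and sexual offences"}

# ASSUMPTION: rough, illustrative bounding box around Leicester.
LEICESTER_BOX = {"min_lat": 52.58, "max_lat": 52.70,
                 "min_lon": -1.22, "max_lon": -1.04}


def is_violent(row):
    """True if the row's crime type is one of the assumed violent labels."""
    return row.get("Crime type", "").strip() in VIOLENT_CATEGORIES


def in_box(row, box):
    """True if the row has a parseable location inside the bounding box."""
    try:
        lat = float(row["Latitude"])
        lon = float(row["Longitude"])
    except (KeyError, TypeError, ValueError):
        return False  # missing or malformed coordinates
    return (box["min_lat"] <= lat <= box["max_lat"]
            and box["min_lon"] <= lon <= box["max_lon"])


def monthly_violent_counts(rows, box=None):
    """Count violent-crime rows per month, optionally within a bounding box."""
    counts = Counter()
    for row in rows:
        if not is_violent(row):
            continue
        if box is not None and not in_box(row, box):
            continue
        counts[row["Month"]] += 1
    return dict(sorted(counts.items()))


def category_month_ranges(rows):
    """First and last month (YYYY-MM strings) in which each category appears."""
    spans = {}
    for row in rows:
        category, month = row["Crime type"], row["Month"]
        first, last = spans.get(category, (month, month))
        spans[category] = (min(first, month), max(last, month))
    return spans


# Made-up sample rows for illustration only.
SAMPLE_CSV = """Month,Crime type,Latitude,Longitude
2020-03,Violence and sexual offences,52.63,-1.13
2020-04,Violence and sexual offences,52.64,-1.12
2020-04,Burglary,52.63,-1.13
2020-04,Violence and sexual offences,54.97,-1.61
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(monthly_violent_counts(sample_rows))
print(monthly_violent_counts(sample_rows, LEICESTER_BOX))
print(category_month_ranges(sample_rows))
```

Comparing the first and last month for each category is one simple way to check whether the same categories have been used consistently over the life of the dataset.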

Individual work (20%) – Report: “A Critical Assessment of the Big Data Approach to Violent Crime Analysis during Lockdowns.”

You are each required to write a critical report with the title given above.

The purpose of this research report is to analyse critically the Big Data approach to crime analysis undertaken in the practical work. The report should identify advantages and disadvantages of the approach from technical, social, and ethical perspectives. It should not describe the technical work or the programming, nor repeat the complete content of the code submission. (Reproducing graphs and results is permitted.)

A very poor contribution showing little awareness of the subject area. Lack of clarity. Communication of knowledge is inarticulate and/or irrelevant.

Code fragments from the Internet may have replaced student-written content to the extent that it is not possible to determine what the student has understood. Only partial functionality has been achieved.

Knowledge is limited or superficial. Some awareness of concepts and critical appreciation are apparent, but there are major omissions or misunderstandings. Writing is not clear and there is no argument. Incorrect solutions or non-functioning software solutions have been given.

Knowledge is barely adequate. Writing is fluent, but description and/or assertion are mostly used rather than argument or logical reasoning. A basic understanding of the key issues may have been demonstrated, but insufficient focus is evident in the work presented. Source code is functional, but poorly structured and commented. There may be some validation errors or security flaws.

Knowledge base is up to date and relevant, with breadth and depth appropriate for level 7. The student has demonstrated the ability to apply theory and concepts across domains and to identify their interrelationships. A critical appreciation is demonstrated, which is supported by appropriate references. Writing is clear, if a little uneven. Source code is functional, structured, and commented. Code is valid and mostly secure.

As above, but there is clear evidence of independent thought and reasoned conclusions. Literature is fully supported by citation using appropriate references and there is development of a critical appreciation of opposing arguments. Presentation of work is fluent, focused, and accurate. Source code is fully object-oriented, secure, and validates completely without being verbose.

Exceptional scholarship is demonstrated. There is a sustained ability to confront the current limits of knowledge in a relevant area or applied ‘real world’ contexts where demands of theory and practice may conflict. Argument is fluent, sustained, and convincing. Source code is of a professional standard. Clearly exceeds taught material.

