2. How did the authors claim they were going to evaluate their work and compare it to others?
1. The main issue the authors wanted to discuss is when to use Big Data. According to the authors, Big Data is an emerging technology for handling large and complex data sets that old, conventional tools cannot manage. Many organizations are unsure whether they should use Big Data at all, which projects they should use it on, and what threshold justifies selecting it for a given project, even though most large organizations already use the technology. Several factors contribute to the failure of Big Data initiatives. According to a study by Lavastorm, the main reason Big Data projects fail is that Big Data is treated as a project whose beginning and end are already known (Lavastorm, 2014). An organization should not see Big Data as an extension of pre-existing technologies; rather, it should be used as an exploration tool for finding new ideas and for targeted research. Organizations also expect that using Big Data as a tool will give them a high return on investment (Chen, Chiang & Storey, 2012). According to the authors, there should therefore be a complexity level against which an organization assesses a problem before adopting Big Data.
2. To address this problem, the authors propose a method they call BigDAF (Big Data Project Assessment Framework). In this method, complexity levels are defined from CL1 (least complex) to CL5 (most complex), and the authors give a formula for calculating the overall complexity level of a project. The data involved is viewed along three dimensions: volume, velocity and variety (Labrinidis & Jagadish, 2012), all assessed against their current values. Volume refers to how much data is received in a given time, velocity to the time it takes to analyse that data, and variety to the types of data that are received. Each dimension is assigned a complexity rating from 1 (CL1) to 5 (CL5), and the three dimensions are weighted on a scale of 100: volume receives the highest weight (60) because it is considered more important than any other parameter, variety receives 30, and velocity receives 10. The formula for the complexity level is then:
CL = 60*(complexity of volume) + 10*(complexity of velocity) + 30*(complexity of variety)
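For example, hypothetical ratings of 4 for volume complexity, 2 for velocity and 3 for variety give CL = 60*4 + 10*2 + 30*3 = 350.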
Based on this formula, the complexity level is categorized as follows (a short code sketch of the full calculation appears after the list):
- Between 100 and 200: a basic Business Intelligence problem that can be solved with conventional tools (Vercellis, 2009).
- Between 200 and 300: a problem that can be solved with more advanced conventional tools.
- Between 300 and 400: a problem that needs to be solved with the help of Big Data.
- Between 400 and 500: a very complex problem for which immediate investment in Big Data is needed.
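As an illustration, the scoring and band lookup described above can be written as a minimal Python sketch. This is only one reading of the framework as summarized here, not the authors' implementation: the integer 1-5 rating per dimension, the function names, and the treatment of band boundaries (upper bound inclusive) are assumptions made for this example.

```python
# Minimal sketch of a BigDAF-style complexity score (assumptions: each
# dimension is rated 1-5, upper band boundaries are inclusive).

WEIGHTS = {"volume": 60, "velocity": 10, "variety": 30}

# (upper bound of band, interpretation), following the list above.
BANDS = [
    (200, "basic Business Intelligence problem; conventional tools suffice"),
    (300, "solvable with more advanced conventional tools"),
    (400, "needs to be solved with Big Data"),
    (500, "very complex; immediate investment in Big Data is needed"),
]


def complexity_level(volume: int, velocity: int, variety: int) -> int:
    """CL = 60*volume + 10*velocity + 30*variety, with each rating in 1..5."""
    ratings = {"volume": volume, "velocity": velocity, "variety": variety}
    for name, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{name} rating must be between 1 and 5, got {rating}")
    return sum(WEIGHTS[name] * rating for name, rating in ratings.items())


def categorize(cl: int) -> str:
    """Map a complexity level (100-500) onto one of the bands listed above."""
    for upper, description in BANDS:
        if cl <= upper:
            return description
    raise ValueError(f"complexity level {cl} is outside the expected 100-500 range")


if __name__ == "__main__":
    cl = complexity_level(volume=4, velocity=2, variety=3)  # hypothetical ratings
    print(cl, "->", categorize(cl))  # prints: 350 -> needs to be solved with Big Data
```

With the hypothetical ratings used earlier (4, 2 and 3), the sketch returns 350, which falls in the 300-400 band and would therefore call for Big Data.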
References
Chen, H., Chiang, R.H.L. and Storey, V.C. (2012) Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly, 36(4), 1165-1188.
Labrinidis, A. and Jagadish, H.V. (2012) Challenges and opportunities with big data. Proceedings of the VLDB Endowment, 5(12), 2032-2033.
Lavastorm Analytics (2014) Why Most Big Data Projects Fail [online]. Available from: https://www.lavastorm.com/assets/Why-Most-Big-Data-Projects-Fail-White-Paper.pdf [accessed 18/06/2018].
Vercellis, C. (2009) Business intelligence: data mining and optimization for decision making. West Sussex: John Wiley & Sons Ltd.