Assessment 1: Literature review
Background information
Introduction to the topic, key terms, and context relevant to the study
Big data technology is defined as an extremely large set of data that an organization can analyze using powerful computing in order to expose relationships, patterns, and trends. Companies use big data to reveal information about the communication and behavior of individuals. The adoption of big data is accelerating as rivalry among organizations grows, and a large number of corporations apply big data technology in order to gain a competitive advantage in their industry. Companies apply the technology across several business functions, such as understanding target customers, optimizing business processes, improving healthcare, research, and security, processing monetary transactions, and optimizing machine performance (Assunção et al., 2015).
This report will emphasize understanding how big data works and will assess its benefits and limitations. It will evaluate the use of big data in business by examining its impact on Google LLC. Moreover, examples of other companies will be used in this report to show how they apply big data technology in their organizations to obtain competitive advantage.
Scope and organization of the literature review
The scope of this research is wide, as it covers how big data works together with its various pros and cons. The report will evaluate the example of Google LLC to understand how it practices big data in its business functions. It will not be limited to Google: companies from different fields will also be assessed in this report in order to develop knowledge about the uses of big data in business organizations.
Review of past and present literature
Previous studies found that data management was a major issue for companies, as it created high costs and consumed considerable time. In contrast, present research shows that using big data in business brings cost savings, time reductions, new product development, control of online reputation, and a better understanding of market conditions. Due to these benefits, a growing number of companies depend on big data. At the same time, other companies avoid using big data in their business because of its limitations: complex procedures, misleading information, security concerns, and inconsistency in data collection and updates that can distort the real figures (Lafuente, 2015).
Purpose of the study
The goal of this research is to understand how big data works and how companies implement it to enhance their business operations. The report will assess the benefits and limitations associated with the use of big data in business companies, and will evaluate examples of companies operating in different fields to understand how each of them uses big data in its operations.
Theory of big data
Massively Parallel Computation (MPC) theory
According to Hilbert (2016), the Massively Parallel Computation (MPC) theory uses a parallel computation model to analyze the parallel complexity of different multi-way join algorithms and to prove lower bounds on the amount of communication and the number of rounds. Under the MPC model, computation is carried out by a cluster of p machines using a shared-nothing architecture, the approach most widely implemented in contemporary big data management systems. The computation proceeds in rounds, where each round consists of some local computation followed by a global exchange of information among the machines. At the end of each round the machines synchronize, i.e., each waits for all machines to finish before proceeding to the next round. The input data of size m (in tuples) is initially evenly distributed among the p machines, so each machine stores m/p tuples; this mirrors how an input relation is typically distributed in a distributed file system such as HDFS. When the computation finishes, the output is the union of the outputs of the p machines.
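The round structure described above can be sketched in toy form. The snippet below simulates p shared-nothing machines joining two relations: round one hash-partitions both inputs on the join key so that matching tuples land on the same machine, and round two performs purely local joins whose union is the output. The relations, machine count, and function names are invented for illustration and are not part of Hilbert's formulation.

```python
# Toy simulation of an MPC-style join: hash-partition, then local work.
def hash_partition(tuples, p, key):
    """Distribute tuples across p machines by hashing the join key."""
    machines = [[] for _ in range(p)]
    for t in tuples:
        machines[hash(t[key]) % p].append(t)
    return machines

def mpc_join(relation_r, relation_s, p=4):
    # Round 1: repartition both relations on the join key (column 0),
    # so tuples that join end up on the same machine.
    r_parts = hash_partition(relation_r, p, 0)
    s_parts = hash_partition(relation_s, p, 0)
    # Round 2: each machine joins locally; the result is the union
    # of the p local outputs.
    output = []
    for machine in range(p):
        s_index = {}
        for s in s_parts[machine]:
            s_index.setdefault(s[0], []).append(s)
        for r in r_parts[machine]:
            for s in s_index.get(r[0], []):
                output.append(r + s[1:])
    return output

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(1, "x"), (3, "y")]
print(sorted(mpc_join(R, S)))  # [(1, 'a', 'x'), (3, 'c', 'y')]
```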
MapReduce model
In contrast, Hartmann et al. (2016) state that MapReduce is another programming model and software framework, initially developed by Google, that is designed to process high volumes of data in parallel on large clusters of commodity hardware in a reliable and fault-tolerant way. A MapReduce job splits the input data set into independent blocks, which are processed by map tasks in parallel. The framework sorts the outputs of the maps, which then become the input to the reduce tasks. Both the input and the output of a job are stored in a file system. The framework takes care of scheduling tasks, monitoring them, and re-executing failed tasks.
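The map-shuffle-reduce pipeline just described can be illustrated with a minimal single-process sketch. The classic word-count task below is a stock teaching example, not Google's implementation: input blocks feed map tasks that emit key/value pairs, the pairs are grouped by key (the "shuffle"), and reduce tasks aggregate each group.

```python
# Minimal single-process sketch of the MapReduce programming model.
from itertools import groupby

def map_task(block):
    # Map phase: emit (word, 1) for every word in one input block.
    for word in block.split():
        yield (word.lower(), 1)

def reduce_task(word, counts):
    # Reduce phase: aggregate all counts emitted for one key.
    return (word, sum(counts))

def map_reduce(blocks):
    # Run map tasks over the input splits.
    pairs = [kv for block in blocks for kv in map_task(block)]
    # Shuffle: group the intermediate pairs by key.
    pairs.sort(key=lambda kv: kv[0])
    # Run one reduce task per key group.
    return [reduce_task(k, (c for _, c in group))
            for k, group in groupby(pairs, key=lambda kv: kv[0])]

blocks = ["big data big value", "data at scale"]
print(dict(map_reduce(blocks)))
# {'at': 1, 'big': 2, 'data': 2, 'scale': 1, 'value': 1}
```

In a real Hadoop deployment the map and reduce tasks run on different cluster nodes and the framework handles the shuffle, scheduling, and failure recovery described above.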
Review of literature on instruments
According to Power (2014), big data can be defined as a collection of data sets that are so large and complex that corporations cannot process them using conventional data-processing technologies. Big data can exist in both structured and unstructured forms; structured information is easier for a corporation to process, while unstructured information is complex to process. The concept itself is not new, as small and large companies alike apply it to gain more data in order to better support the organization and deal with their customers more effectively.
In contrast, Kitchin (2014) evaluated the 3V model for characterizing big data. This model states that big data is information of high volume, high velocity, and high variety that requires new types of processing technologies. Processing such data yields benefits like process optimization, insight discovery, and improved decision making. Moreover, some corporations add further "V"s to describe big data technology.
In the view of De Mauro, Greco and Grimaldi (2016), billions of gigabytes of information are created every day by technologies and individuals worldwide, and this information can be gathered by companies. However, these data sets are so large and complex that corporations cannot handle them with conventional technologies. In contrast, Viceconti, Hunter and Hose (2015) illustrated that the structure and size of big data make it special and unique, as it opens new doors for how data can be used. To make sense of big data, new skills and technologies are necessary to analyze the flow of information and draw conclusions. Apache Hadoop, for example, is a technology commonly used by businesses to process big data using simple programming models.
Tan et al. (2015) present Netflix as a good illustration of how a corporation can implement big data in order to gain competitive advantage. Netflix uses big data to analyze detailed information on its customers' traffic, to address issues, and to predict future demand. The corporation gains insight into its customers and understands which kind of content they prefer, which enables it to provide accurate recommendations matching each subscriber's preferences.
In contrast, Auffray et al. (2016) evaluated how Google LLC exercises big data in its business operations to gain competitive advantage. The corporation operates in the computer hardware, software, and internet industries, provides its services all over the world, and its parent company Alphabet employs more than 88,110 people internationally. Google handles big data every day: the company processes 3.5 billion search requests, and each request made by its users queries a database of some 20 billion web pages.
In the view of Mittelstadt and Floridi (2016), big data gives corporations insight into competitive advantage and into emerging and fading customer trends, which helps them make faster business judgments. However, there are several disadvantages of implementing big data in business companies. The key disadvantage is that it needs special computing power to work adequately; the standard version of Hadoop, for instance, cannot analyze real-time information. There are also ethical concerns associated with big data, as it can violate the principle of privacy and can result in social stratification.
In contrast, Couldry and Powell (2014) stated that there is uncertainty regarding the trustworthiness of information, which can adversely affect the effectiveness of a company's business strategies. Like other technological activities, big data technology also faces uncertainty regarding data breaches. Despite these limitations, the number of companies using big data keeps growing. For illustration, Woolworths Limited is a leading supermarket chain in Australia with stores in over 995 locations across the nation. The corporation uses business analytics to assess the buying behavior of its customers: it has spent more than $20 million to purchase a stake in an analytics company, and around $1 billion in total to analyze consumer spending habits through big data. Effective utilization of big data enables the enterprise to generate offers and discounts for its customers based on their shopping patterns, which has increased the corporation's revenue.
Reliability and validity estimates
Reliability and validity estimates are key to assessing the consistency and accuracy of the gathered data. For this research, reliability will be measured through the questionnaire survey, and validity will be supported by the use of peer-reviewed journal articles (Phillips-Wren and Hoskisson, 2015).
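Since reliability will be estimated from questionnaire responses, one common statistic for this purpose is Cronbach's alpha, which measures the internal consistency of a set of survey items. The sketch below computes alpha as k/(k-1) * (1 - sum of item variances / variance of respondent totals); the formula is standard, but the 5-point sample responses and function name are invented for illustration.

```python
# Cronbach's alpha for questionnaire reliability (internal consistency).
from statistics import pvariance

def cronbach_alpha(responses):
    """responses: list of rows, one row of item scores per respondent."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # scores grouped per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical 5-point Likert responses: 4 respondents x 4 items.
survey = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
]
print(round(cronbach_alpha(survey), 2))  # 0.93
```

Values above roughly 0.7 are conventionally read as acceptable reliability for a questionnaire scale.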
Description of samples
For gathering the secondary data, a convenience (non-probability) sampling technique was selected, in which the sample was chosen at the investigator's convenience. For the literature review, 10 journal articles and 10 books related to the research issues were selected from different sources.
Summary
From the above discussion, it can be summarized that big data technology is a prominent technique in the current business scenario. Big data is a collection of data sets that are large, often unstructured, and cannot be evaluated with conventional processing techniques; special processing programs are necessary to make sense of such data. A large number of corporations exercise big data technology because of its advantages, such as marketing insight, development of business strategies, forecasting of customer demand, real-time analytics, competition assessment, and understanding of consumer trends. However, certain companies avoid using big data in their operations because of its drawbacks, such as untrustworthy data, complex management, difficult processing, high investment, and the need for special programs. Many corporations use big data despite these drawbacks, and researchers are developing new techniques to enhance the efficiency of big data analytics. Companies such as Netflix, Woolworths, and Google use big data to shape business strategies and improve their business operations, and the technology also enables corporations to gain competitive advantage. It can be concluded that big data technology has a favorable impact on the business environment; organizations are implementing it to provide personalized plans for customers and to reduce their carbon footprint. This shows how business companies use big data to enhance productivity and sustain future growth.
Assessment 2: Research proposal
Research Methodology
Research methodology is the way the research is completed in a systematic manner. This part covers different elements such as the research approach, research instruments, research design, research questions, sampling approach, data collection method, data analysis method, research limitations, and research schedule (Vera-Baquero, Colomo-Palacios and Molloy, 2016).
Research approach
This is a significant element of the research methodology, as it states the reasons for selecting a particular procedure of investigation. Research approaches can be categorized into inductive and deductive. An inductive approach is used to make in-depth observations of the facts and to build theory related to the research issue. In contrast, a deductive approach is used to develop hypotheses about the research concern; based on the formulation of a hypothesis, the investigator will accept or reject it during the investigation (Ji-fan Ren et al., 2017).
For the purpose of this research, the deductive approach will be used to complete the research in a systematic manner. A deductive approach is concerned with building a hypothesis from existing theory and then designing a research strategy to test it. This approach is suitable here because the study aims to develop hypotheses and test them against theory (Borgman, 2015).
Research design
A research design describes the whole structure of the investigation. Through the research design, an investigator establishes how to investigate the facts in order to obtain useful and reliable data; it aids the planning of the research study and helps answer the different questions associated with the investigation. There are two kinds of research design: qualitative and quantitative. Quantitative investigation focuses on the analysis of numerical data using statistical techniques and models. In contrast, qualitative investigation enables an investigator to build hypotheses and ideas that form the basis for quantitative investigation, and to explore the problem in depth and identify possible solutions to the research issues (Manogaran and Lopez, 2017).
For this research, a mixed research design will be used to fulfil the main aim of the research. Qualitative research will be used to gather conceptual information about the research concern, while quantitative research will be used to gather numeric information. Quantitative data will be gathered through a questionnaire survey, while qualitative data will be pooled through the literature review. This will strengthen the authenticity of the research outcome (Taylor and Schroeder, 2015).
Research questions
Primary questions
What is the role of using big data in a business organization?
Secondary questions
What is the theoretical perspective of using big data in an organization?
What are the benefits and limitations of using big data in an organization?
Research Hypothesis
H0: There is no significant association between big data and business performance of the company.
H1: There is a significant association between big data and business performance of the company.
Research instruments
The research instrument is significant for an investigator in determining how the data will be collected and how the investigation will be conducted. There are several research strategies that enable an investigator to produce reliable and feasible research outcomes; these include a survey, a focus group, a literature review, and a case study (Sheng, Amankwah-Amoah and Wang, 2017).
In this research, the investigator will use a questionnaire survey in order to gather reliable facts and figures. Along with this, a literature review will be used to collect conceptual understanding of the research concern.
Sample size and sampling approach
A sampling technique is used to choose a sample from a large population in order to obtain feasible data. There are two families of sampling techniques: probability and non-probability sampling. Probability sampling gives each member of the population an equal chance of being selected as a respondent in the investigation. In contrast, non-probability sampling lets the investigator select participants at the investigator's discretion (Opresnik and Taisch, 2015).
A probability sampling method will be used to choose the sample from the large population, giving each research participant an equal chance of selection. Under this method, simple random sampling will be used to select respondents on a random basis. In this research, 50 employees will be selected from the IT department of Google, drawn from different geographical locations in Australia (Sheng, Amankwah-Amoah and Wang, 2017).
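The simple random draw described above can be sketched as follows. The roster of names is synthetic; a real study would sample from the actual staff list supplied by the company.

```python
# Simple random sampling: 50 respondents drawn with equal probability.
import random

# Hypothetical roster of 400 IT employees.
population = [f"employee_{i:03d}" for i in range(1, 401)]

random.seed(42)                            # fixed seed for a reproducible draw
sample = random.sample(population, k=50)   # equal chance, no replacement

print(len(sample), len(set(sample)))  # 50 50
```

Because `random.sample` draws without replacement, no employee can be selected twice, which matches the equal-chance requirement of probability sampling.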
Data collection method
The data collection technique is significant for attaining the aim and objectives of the investigation, as it determines how the data is gathered. There are two broad techniques for collecting data: primary and secondary data collection (Sheng, Amankwah-Amoah and Wang, 2017).
In this research, both primary and secondary data collection techniques will be used to pool information about the research issues. Primary data is beneficial for gathering fresh, original information; hence, the researcher will conduct a questionnaire survey for this purpose. Secondary data will be gathered from different sources such as journal articles, textbooks, academic publications, and online sources (Opresnik and Taisch, 2015).
The expected outcome of the project
This research will be beneficial for gaining awareness of different theories of big data, such as the Massively Parallel Computation (MPC) theory and the MapReduce model. It will also increase understanding of the use of big data in a business organization, develop knowledge of the benefits and limitations of using big data, and demonstrate different examples of its use.
Data analysis method
Data analysis is a significant part of the research methodology, as it produces feasible and consistent results. There are several techniques for data analysis, such as SPSS, Excel, and content analysis. Content analysis is used to assess qualitative data; it incorporates different approaches (conventional, summative, and directed) that are used to interpret meaning from the content of text data.
After gathering the data, the investigator will use a statistical data analysis method in this investigation. This method will be beneficial for assessing the pooled data in a comprehensible way. Along with this, MS Excel will be used to present charts, graphs, and tables in a meaningful manner (Taylor and Schroeder, 2015).
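The statistical step could begin with simple descriptive statistics of the questionnaire responses before they are charted in Excel. The sketch below summarizes a set of invented 5-point Likert scores into a mean, a standard deviation, and a frequency table suitable for a bar chart.

```python
# Descriptive statistics of survey responses before charting.
from collections import Counter
from statistics import mean, stdev

# Hypothetical 5-point Likert responses from 15 participants.
responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4, 4, 5, 1, 4, 3]

freq = Counter(responses)            # frequency table for a bar chart
print("mean =", round(mean(responses), 2))   # mean = 3.73
print("stdev =", round(stdev(responses), 2))
for score in sorted(freq):
    print(score, "#" * freq[score])  # a quick text histogram
```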
Research limitation
There are different limitations to completing the investigation. This research will face limitations related to time, cost, and resources. Inadequate time may make it difficult to complete each activity on schedule; a limited budget may make it harder to gather the primary and secondary data; and a lack of resources may create issues in obtaining reliable outcomes and in-depth information about the research issue (Manogaran and Lopez, 2017).
Ethical consideration
The investigator will comply with the Data Protection Act 1988, under which the investigator has no right to disclose confidential information to others during or after completion of the investigation. The investigator will not manipulate the data, and will take prior permission from a manager at Google in order to conduct the research on its IT employees ethically. These steps will ensure the research is completed in an ethical manner (Opresnik and Taisch, 2015).
Research schedule
A research schedule presents the activities and the time that will be taken by the investigator during the research. The following schedule will be used for this investigation in order to complete the main aim of the research:
| Research milestone | Initial Day | Last Day | Duration (Days) |
|---|---|---|---|
| Identification of research topic | 19-08-18 | 23-08-18 | 5 |
| Review of literature | 24-08-18 | 30-08-18 | 7 |
| Work on research proposal | 31-08-18 | 12-09-18 | 13 |
| Data collection | 13-09-18 | 13-11-18 | 60 |
| Data analysis | 14-11-18 | 14-12-18 | 30 |
| Final report submission | 15-12-18 | 31-12-18 | 17 |
From the above schedule, it can be seen that gathering data through primary and secondary sources will take more time than any other activity performed in this research.
Conclusion
From the above research proposal, it can be concluded that a deductive approach will be used in this research in order to formulate the hypotheses. It can also be summarised that a mixed research design will be practiced to collect both qualitative and quantitative data, and that both primary and secondary sources will be used to pool the facts and figures regarding the research issues. Finally, it can be concluded that this research will take 132 days to complete its main aim and objectives.
References
Assunção, M.D., Calheiros, R.N., Bianchi, S., Netto, M.A. and Buyya, R., 2015. Big Data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing, 79, pp.3-15.
Auffray, C., Balling, R., Barroso, I., Bencze, L., Benson, M., Bergeron, J., Bernal-Delgado, E., Blomberg, N., Bock, C., Conesa, A. and Del Signore, S., 2016. Making sense of big data in health research: towards an EU action plan. Genome medicine, 8(1), p.71.
Borgman, C.L., 2015. Big data, little data, no data: Scholarship in the networked world. USA: MIT Press.
Couldry, N. and Powell, A., 2014. Big data from the bottom up. Big Data & Society, 1(2), p.2053951714539277.
De Mauro, A., Greco, M. and Grimaldi, M., 2016. A formal definition of Big Data based on its essential features. Library Review, 65(3), pp.122-135.
Hartmann, P.M., Zaki, M., Feldmann, N. and Neely, A., 2016. Capturing value from big data–a taxonomy of data-driven business models used by start-up firms. International Journal of Operations & Production Management, 36(10), pp.1382-1406.
Hilbert, M., 2016. Big data for development: A review of promises and challenges. Development Policy Review, 34(1), pp.135-174.
Ji-fan Ren, S., Fosso Wamba, S., Akter, S., Dubey, R. and Childe, S.J., 2017. Modeling quality dynamics, business value and firm performance in a big data analytics environment. International Journal of Production Research, 55(17), pp.5011-5026.
Kitchin, R., 2014. The data revolution: Big data, open data, data infrastructures and their consequences. USA: Sage.
Lafuente, G., 2015. The big data security challenge. Network security, 2015(1), pp.12-14.
Manogaran, G. and Lopez, D., 2017. A survey of big data architectures and machine learning algorithms in healthcare. International Journal of Biomedical Engineering and Technology, 25(2-4), pp.182-211.
Mittelstadt, B.D., and Floridi, L., 2016. The ethics of big data: current and foreseeable issues in biomedical contexts. Science and Engineering Ethics, 22(2), pp.303-341.
Opresnik, D. and Taisch, M., 2015. The value of big data in servitization. International Journal of Production Economics, 165, pp.174-184.
Phillips-Wren, G. and Hoskisson, A., 2015. An analytical journey towards big data. Journal of Decision Systems, 24(1), pp.87-102.
Power, D.J., 2014. Using 'Big Data' for analytics and decision support. Journal of Decision Systems, 23(2), pp.222-228.
Sheng, J., Amankwah-Amoah, J. and Wang, X., 2017. A multidisciplinary perspective of big data in management research. International Journal of Production Economics, 191, pp.97-112.
Tan, K.H., Zhan, Y., Ji, G., Ye, F. and Chang, C., 2015. Harvesting big data to enhance supply chain innovation capabilities: An analytic infrastructure based on the deduction graph. International Journal of Production Economics, 165, pp.223-233.
Taylor, L. and Schroeder, R., 2015. Is bigger better? The emergence of big data as a tool for international development policy. GeoJournal, 80(4), pp.503-518.
Vera-Baquero, A., Colomo-Palacios, R., and Molloy, O., 2016. Real-time business activity monitoring and analysis of process performance on big-data domains. Telematics and Informatics, 33(3), pp.793-807.
Viceconti, M., Hunter, P.J. and Hose, R.D., 2015. Big data, big knowledge: big data for personalized healthcare. IEEE J. Biomedical and Health Informatics, 19(4), pp.1209-1215.