Big data is an important technology for modern organizations, because the growth of web services generates more data than many companies can handle with conventional tools. Its main advantage is that it can store very large amounts of data at once, and it is regarded as more secure than other storage approaches (Alrawais, Alhothaily, Hu, & Cheng, 2017). The main objective of this report is to describe the threats and issues faced in big data and the methods available to reduce the security problems of this technology. Three main concepts are associated with big data: volume, velocity, and variety. It supports analysis methods such as predictive analytics and user behavior analytics (Hashem, et al., 2015).
Overview of big data
Big data refers to the collection and processing of large and complex data sets. Database systems built on big data technology are an advanced innovation in the field of communication systems (Inukollu, Arsi, & Ravuri, 2014). These information systems are being enhanced by web services and data management systems, and they are deployed on many web servers. As web-based services grow, it becomes very difficult for an organization to handle large amounts of data, and big data technology has been observed to reduce this data management problem (Li, Gai, Qiu, Qiu, & Zhao, 2017). According to MGI, big data refers to datasets whose size exceeds what typical database software and storage systems can handle. Google is one of the best-known examples: it collects data from its own services and operations and uses that data in services such as voice recognition, location-based services, and translation (Matturdi, Xianwei, Shuai, & Fuhong, 2014).
Technologies used in big data
Many technologies are used in big data analysis; the main ones are described below.
Distributed storage is a way to guard against the failure of independent nodes and the loss of big data by keeping replicated copies of a data source (Perera, et al., 2015). Such stores are generally non-relational database systems, and they do not depend on any single node. Big data systems use this technology to distribute files across multiple storage elements.
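As an illustration of storing replicated data across independent nodes, the following minimal sketch places copies of one data block on several nodes and checks whether the block is still readable after a node fails. The function names, the node names, and the replication factor of three are assumptions made for this example and do not describe any particular product.

```python
import hashlib
from typing import List, Set


def place_replicas(block_id: str, nodes: List[str], replication_factor: int = 3) -> List[str]:
    """Pick distinct nodes to hold copies of one data block (hypothetical helper)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    # Hash the block id so that placement is deterministic and spread over the nodes.
    digest = hashlib.sha256(block_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Take consecutive nodes from the starting position, wrapping around the list.
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def readable_after_failure(replicas: List[str], failed: Set[str]) -> bool:
    """A block stays readable while at least one replica is on a healthy node."""
    return any(node not in failed for node in replicas)


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
replicas = place_replicas("file-001/block-0", nodes)
print(replicas)
print(readable_after_failure(replicas, {replicas[0]}))
```

Because each block is held on three different nodes in this sketch, losing any single node does not make the block unavailable.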
Data virtualization is a very common technology in big data. Its main purpose is to let applications retrieve data without imposing technical restrictions. It is used with Apache Hadoop and other real-time data stores to hold and access the data of an organization (Perera, et al., 2015).
Data integration is the process of combining data from different sources, and many organizations use it because it reduces time and cost. A key operational challenge for most organizations handling big data is to process terabytes (or petabytes) of data in a way that meets customer expectations. Data integration tools allow businesses to streamline data across a number of big data solutions, for example Amazon EMR, Apache Hive, Apache Pig, Apache Spark, Hadoop, MapReduce, MongoDB, and Couchbase (Weber, 2015).
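The idea of data integration can be sketched on a very small scale as follows. The example merges two hypothetical sources, a customer list and an order log, into one unified record per customer. The field names (`customer_id`, `full_name`, `cust`, `amount`) are assumptions made for illustration; real integration tools work on far larger data and many more formats.

```python
def integrate(crm_rows, order_rows):
    """Join two sources on the customer id into one unified record per customer."""
    unified = {}
    # Source 1: customer master data.
    for row in crm_rows:
        unified[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "name": row["full_name"],
            "order_total": 0.0,
        }
    # Source 2: orders, which use a different name for the same key.
    for row in order_rows:
        cid = row["cust"]
        record = unified.setdefault(
            cid, {"customer_id": cid, "name": None, "order_total": 0.0}
        )
        record["order_total"] += row["amount"]
    return sorted(unified.values(), key=lambda r: r["customer_id"])


crm = [{"customer_id": 1, "full_name": "Asha Rao"}, {"customer_id": 2, "full_name": "Li Wei"}]
orders = [{"cust": 1, "amount": 10.0}, {"cust": 1, "amount": 5.5}, {"cust": 3, "amount": 2.0}]
merged = integrate(crm, orders)
for record in merged:
    print(record)
```

Customers that appear in only one source are kept in the result, so gaps between the sources remain visible instead of being silently dropped.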
Data pre-processing refers to a preliminary step in which programs prepare data before it is stored or analyzed. The main objective of this technology is to transform the data into a given format that can be used for further analysis. It uses various tools that speed up information sharing by working on unstructured data sets (Yang, Li, & Niu, 2015). The main limitation of this process is that it is not fully automatic and needs user oversight, which takes additional time.
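A minimal sketch of this kind of pre-processing is shown below: one raw record with inconsistent spacing, capitalization, and date format is converted into a fixed target format. The record fields and the two accepted date formats are assumptions chosen for the example.

```python
from datetime import datetime


def preprocess(raw):
    """Normalize one raw record into the format {name, signup_date, age}."""
    # Collapse repeated whitespace and use a consistent capitalization.
    name = " ".join(str(raw.get("name") or "").split()).title()

    # Accept two assumed input date formats and emit ISO 8601 (YYYY-MM-DD).
    signup_date = None
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            parsed = datetime.strptime(str(raw.get("signup") or ""), fmt)
            signup_date = parsed.date().isoformat()
            break
        except ValueError:
            continue

    # Keep the age only when it is a plain non-negative integer.
    age_text = str(raw.get("age") or "").strip()
    age = int(age_text) if age_text.isdigit() else None

    return {"name": name, "signup_date": signup_date, "age": age}


print(preprocess({"name": "  alice   SMITH ", "signup": "03/02/2017", "age": " 31 "}))
```

Values that cannot be parsed are set to `None` instead of being guessed, which leaves them for the manual review that this step still requires.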
An important parameter for big data processing is data quality. Data quality software can cleanse and enrich large data sets by using parallel processing (Yang, Li, & Niu, 2015). Such software is widely used to obtain consistent and reliable outputs from big data processing.
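The following sketch illustrates cleansing a data set in parallel. The records are split into chunks, each chunk is cleaned by a worker, and the results are merged with duplicates removed. A thread pool from the Python standard library stands in for a real parallel processing framework, and the cleaning rule (a normalized `email` field that contains `@`) is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_chunk(chunk):
    """Normalize e-mail addresses and drop records without a usable one."""
    cleaned = []
    for record in chunk:
        email = str(record.get("email") or "").strip().lower()
        if "@" not in email:
            continue  # missing or malformed value: discard the record
        cleaned.append({**record, "email": email})
    return cleaned


def clean_dataset(records, chunk_size=2, workers=4):
    """Clean chunks in parallel, then merge them and remove duplicates."""
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        cleaned_chunks = list(pool.map(clean_chunk, chunks))
    seen, result = set(), []
    for chunk in cleaned_chunks:
        for record in chunk:
            if record["email"] not in seen:
                seen.add(record["email"])
                result.append(record)
    return result


raw_records = [
    {"email": " A@x.org "},
    {"email": "a@x.org"},
    {"email": None},
    {"email": "b@x.org"},
    {"email": "bad"},
]
cleaned_records = clean_dataset(raw_records)
print(cleaned_records)
```

Duplicates are removed after the parallel step because two copies of the same record may land in different chunks.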
While traditional SQL can be used effectively to handle large amounts of structured data, NoSQL (Not Only SQL) is needed to handle unstructured data. NoSQL databases store unstructured data with no particular schema. Each row can have its own set of column values. NoSQL gives better performance when storing massive amounts of data. Many open-source NoSQL databases are available for analyzing big data (Wu, Zhu, Wu, & Ding, 2014).
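The schema-less idea can be illustrated with a small in-memory sketch in which every stored document carries its own set of fields. The class `DocumentCollection` and its methods are hypothetical and exist only for this example; they are not the API of any real NoSQL database.

```python
class DocumentCollection:
    """A toy schema-less store: each document is a dict with its own fields."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        """Store a copy of the document and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = dict(document)
        return doc_id

    def find(self, **criteria):
        """Return documents that contain every given field with the given value."""
        return [
            doc
            for doc in self._docs.values()
            if all(key in doc and doc[key] == value for key, value in criteria.items())
        ]


users = DocumentCollection()
users.insert({"name": "Asha", "city": "Pune"})
users.insert({"name": "Li", "city": "Pune", "languages": ["zh", "en"]})
users.insert({"name": "Tom", "device": {"os": "android"}})
print(users.find(city="Pune"))
```

No table definition is needed before inserting, and documents that lack a queried field are simply not matched.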
Challenges faced in big data
Security and privacy are the most important problem faced in big data analysis, and this technology has been observed to increase several kinds of security risk. When the private information of a consumer is combined with other databases, this can create new privacy risks (Xu, Jiang, Wang, Yuan, & Ren, 2014). In such analysis systems, users can lose their personal information because it is stored across many software systems and networks, and hackers can easily block their peripheral devices.
Data access and sharing problem
Many users rely on security systems to protect their data from other people, but attackers use flooding attacks to take control of their communication systems (Yang, Li, & Niu, 2015). As a result, managing data and sharing information become very difficult, and data access is also a common problem for big data.
Analyzing data or information by using big data processes is very difficult because the results can be less accurate than those of other technologies. The analysis is performed on large volumes of data that may be structured or unstructured, and big data systems often rely on less efficient software, so people do not receive a proper analysis (Yang, Wu, Yin, Li, & Zhao, 2017).
Scale and complexity
Research indicates that as the rate of data generation rises, the volume of the database also increases, which makes the system more complex. Traditional software and tools are not sufficient to monitor and control data sets of this volume (Yang, Li, & Niu, 2015). Retrieval and modeling are also common problems because of the scale and complexity of the information that needs to be analyzed.
Gaps in the literature
Many authors have researched the issues of big data, and they observed that the main problem with big data processing is the lack of privacy (Hashem, et al., 2015). Big data is a recent analysis technology that can store large amounts of data and allows many organizations to analyze their data sets. The technology faces many security threats and challenges, for example data collection problems, security and privacy issues, analytics issues, and complexity problems (Perera, et al., 2015). Many writers have addressed the issues of analysis and complexity, but the problems of data breaches and security have not been addressed (Inukollu, Arsi, & Ravuri, 2014). To improve the security of users' personal data files, various technologies have been developed, such as encryption, cryptography, and virtualization. In my opinion, the lack of security is the biggest problem for any modern technology, and I found that many users choose weak passwords, which puts their privacy at risk. Hackers use botnets, and they can compromise consumers' personal accounts by sending a large volume of traffic. These challenges can be reduced by adopting biometric systems and firewall software for security purposes (Yang, Li, & Niu, 2015).
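The weak password problem mentioned above can be illustrated with a simple policy check. This is only a sketch: the minimum length of 12 characters, the required character classes, and the short list of common passwords are assumptions chosen for the example, not an established standard.

```python
import string

# A tiny illustrative list; a real check would use a much larger one.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}


def password_problems(password, min_length=12):
    """Return a list of reasons why the password is considered weak."""
    problems = []
    if password.lower() in COMMON_PASSWORDS:
        problems.append("commonly used password")
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no punctuation character")
    return problems


print(password_problems("qwerty"))
print(password_problems("Tr1cky-Horse-Battery"))
```

An empty list means that the password passed every rule in this sketch. It does not prove that the password is safe.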
Information is a key asset for any organization, and many users communicate with each other over communication networks and social media, which also increases the data rate (Hashem, et al., 2015). Big data has reduced the problem of data storage because it provides database systems that can store users' personal information. In the future, information and communication technology is expected to reduce the potential threats and issues of big data, and organizations can develop strategies to avoid data breaches (Hashem, et al., 2015). The information of users can therefore be secured by using encryption techniques, and in the future organizations will be able to manage their own security plans.
Big data technology is a combination of processing steps and techniques used to analyze the data of any community. The technology has grown with the help of traditional security solutions, public clouds, and DMZs. Many technologies are used in big data, such as data virtualization, data integration, distributed storage, and pre-processing, and these are explained in this report. Many organizations cannot manage their data files, and big data analytics can be used to manage these files. This report described the challenges and issues of big data and the gaps in the literature. Users should adopt security measures such as cryptography, password-based systems, and firewall software to reduce these problems.
Al Nuaimi, E., Al Neyadi, H., Mohamed, N., & Al-Jaroodi, J. (2015). Applications of big data to smart cities. Journal of Internet Services and Applications, 6(1), 25.
Alrawais, A., Alhothaily, A., Hu, C., & Cheng, X. (2017). Fog computing for the internet of things: Security and privacy issues. IEEE Internet Computing, 21(2), 34-42.
Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Khan, S. U. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47(2), 98-115.
Inukollu, V. N., Arsi, S., & Ravuri, S. R. (2014). Security issues associated with big data in cloud computing. International Journal of Network Security & Its Applications, 6(3), 45.
Li, Y., Gai, K., Qiu, L., Qiu, M., & Zhao, H. (2017). Intelligent cryptography approach for secure distributed big data storage in cloud computing. Information Sciences, 387(7), 103-115.
Matturdi, B., Xianwei, Z., Shuai, L., & Fuhong, L. (2014). Big Data security and privacy: A review. China Communications, 11(14), 135-145.
Perera, C., Ranjan, R., Wang, L., Khan, S. U., & Zomaya, A. Y. (2015). Big data privacy in the internet of things era. IT Professional, 17(3), 32-39.
Weber, R. H. (2015). Internet of things: Privacy issues revisited. Computer Law & Security Review, 31(5), 618-627.
Wu, X., Zhu, X., Wu, G. Q., & Ding, W. (2014). Data mining with big data. IEEE Transactions on Knowledge and Data Engineering, 26(1), 97-107.
Xu, L., Jiang, C., Wang, J., Yuan, J., & Ren, Y. (2014). Information security in big data: Privacy and data mining. IEEE Access, 2(1), 1149-1176.
Yang, J. J., Li, J. Q., & Niu, Y. (2015). A hybrid solution for privacy-preserving medical data sharing in the cloud environment. Future Generation Computer Systems, 43(5), 74-86.
Yang, Y., Wu, L., Yin, G., Li, L., & Zhao, H. (2017). A survey on security and privacy issues in internet-of-things. IEEE Internet of Things Journal, 4(5), 1250-1258.