The Process of Replication in the Database
Discuss the Benchmark Scalability of Distributed Databases
Replication in a distributed database means keeping copies of the data at multiple sites so that the data remains available at all times. Either a single fragment or a whole relation can be replicated at a site. The main benefit is that the system can continue to operate for as long as at least one site holding the data is running (Coronel & Morris, 2016). Replication also makes retrieval of the data easier and more reliable, because a query can be served from any copy that is reachable. A distributed database can also reflect the organizational structure of the company concerned. Lastly, it gives the individual organizational centers local, site-level autonomy, which helps them control their own data.
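The availability benefit can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the site names, the `invoice-17` key, and the helper `read_from_any_replica` are hypothetical, and a site that is down is modelled simply as `None`.

```python
def read_from_any_replica(replicas, key):
    """Return (site, value) from the first running site that holds the key.

    `replicas` maps a site name to its local store (a dict), or to None
    when that site is down. Both conventions are assumptions of this sketch.
    """
    for site, store in replicas.items():
        if store is not None and key in store:
            return site, store[key]
    raise LookupError(f"no running site holds {key!r}")


# Hypothetical deployment: the headquarters is down, two regions are up.
replicas = {
    "headquarters": None,
    "north": {"invoice-17": 250},
    "south": {"invoice-17": 250},
}
site, value = read_from_any_replica(replicas, "invoice-17")
```

The read still succeeds although one site is down; it fails only when no running site holds a copy.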
The first problem with replication is that whenever one copy of the data is modified, the same change has to be applied to every other copy, which is difficult to do. Until that happens the replicas are inconsistent, which reduces the efficiency of the process. Continuous consistency is violated when the numerical values held by the replicas deviate from one another. The next disadvantage concerns the security of the system (Liu & Özsu, 2018): every replicated copy has to be secured while still remaining available. Lastly, replication can take a long time, depending on the size of the data to be copied. This can cause significant downtime, which ultimately disrupts business processes.
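The numerical deviation between replicas can be made concrete with a short Python sketch. The function names, the stock-count example, and the choice of "largest value minus smallest value" as the deviation measure are assumptions made for illustration, not a definition taken from the cited sources.

```python
def numerical_deviation(replica_values):
    """Largest difference between the values that the replicas hold."""
    values = list(replica_values.values())
    if not values:
        raise ValueError("at least one replica is required")
    return max(values) - min(values)


def within_bound(replica_values, bound):
    """True when the replicas deviate by no more than `bound`."""
    return numerical_deviation(replica_values) <= bound


# Hypothetical stock count for one item as seen at three sites.
stock = {"headquarters": 100, "north": 97, "south": 100}
deviation = numerical_deviation(stock)
```

A system could tolerate a small deviation and force synchronization only when `within_bound` returns `False`.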
Two kinds of information are considered when deciding whether fragmentation is needed: quantitative information and qualitative information. Quantitative information covers the frequency of the queries, the site at which each query is issued, and the selectivity of the queries. Qualitative information covers the types of data and the type of operation performed on it, i.e. read or write (Abadi, Madden & Lindner, 2016). This information is necessary for deciding whether fragmentation is required. In addition, the way a relation is partitioned determines the type of fragmentation to follow. If a relation is partitioned by tuples, horizontal fragmentation is required. If it is partitioned by attributes, vertical fragmentation is adopted.
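The difference between the two kinds of fragmentation can be sketched in Python. The `accounts` relation, its attribute names, and the helper functions are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical relation: each dict is one tuple of an ACCOUNT relation.
accounts = [
    {"id": 1, "region": "north", "owner": "Ada", "balance": 120},
    {"id": 2, "region": "south", "owner": "Bo", "balance": 75},
    {"id": 3, "region": "north", "owner": "Cy", "balance": 310},
]


def horizontal_fragment(rows, predicate):
    """Keep whole tuples that satisfy the predicate (a subset of the rows)."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Keep only the chosen attributes, plus the key so fragments can be rejoined."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def rejoin(left, right, key):
    """Rebuild the relation from two vertical fragments by joining on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal: partition by tuples, here one fragment per region.
north = horizontal_fragment(accounts, lambda row: row["region"] == "north")
south = horizontal_fragment(accounts, lambda row: row["region"] == "south")

# Vertical: partition by attributes, here financial versus profile data.
financial = vertical_fragment(accounts, "id", ["balance"])
profile = vertical_fragment(accounts, "id", ["region", "owner"])
```

Repeating the key in every vertical fragment is what allows the original relation to be reconstructed with `rejoin`.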
Considerations for Fragmentation
Allocation design determines the sites at which the fragments are stored. There are two fragment storage options: centralized and partitioned. In a centralized allocation design, a single database and database management system are placed at one site (Bailis et al., 2014), while the users of the system are distributed across the various sites.
The other allocation design is the partitioned database. In this design, the database is divided into disjoint fragments, and each fragment is assigned to one specific site.
Lastly, further information is considered in the allocation design of a fragment: information about the database, the applications, the sites, and the network.
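The two allocation designs can be contrasted in a minimal Python sketch. The fragment and site names are hypothetical, and the round-robin rule used for the partitioned design is an assumption chosen for simplicity; a real design would weigh the database, application, site, and network information.

```python
def allocate_centralized(fragments, site):
    """Centralized design: every fragment lives at the same single site."""
    return {fragment: site for fragment in fragments}


def allocate_partitioned(fragments, sites):
    """Partitioned design: each fragment is assigned to exactly one site.

    Round-robin placement is an illustrative assumption of this sketch.
    """
    if not sites:
        raise ValueError("at least one site is required")
    return {
        fragment: sites[index % len(sites)]
        for index, fragment in enumerate(fragments)
    }


fragments = ["F1", "F2", "F3"]
centralized = allocate_centralized(fragments, "headquarters")
partitioned = allocate_partitioned(fragments, ["north", "south"])
```

Because the result maps each fragment to a single site, no fragment is stored twice, which is what distinguishes partitioning from replication.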
Several considerations apply when designing replication. The first is the selection of the fragments that will be stored as multiple copies. There are two types of replication, and the design depends on which one is used (Cellary, Morzy & Gelenbe, 2014): complete replication and selective replication. In complete replication, a full copy of the database is maintained at each site. In selective replication, only selected fragments are replicated at each site.
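The two replication types can be sketched as placement maps in Python. The function names, the fragment names, and the `wanted` argument are hypothetical and serve only as an illustration.

```python
def complete_replication(fragments, sites):
    """Complete replication: every site holds a copy of every fragment."""
    return {site: set(fragments) for site in sites}


def selective_replication(fragments, sites, wanted):
    """Selective replication: each site holds only the fragments chosen for it.

    `wanted` maps a site to the fragment names to replicate there
    (a hypothetical input format for this sketch).
    """
    known = set(fragments)
    placement = {}
    for site in sites:
        chosen = set(wanted.get(site, ()))
        unknown = chosen - known
        if unknown:
            raise ValueError(f"unknown fragments for {site}: {sorted(unknown)}")
        placement[site] = chosen
    return placement


fragments = ["F1", "F2", "F3"]
sites = ["headquarters", "north", "south"]
complete = complete_replication(fragments, sites)
selective = selective_replication(
    fragments,
    sites,
    {"headquarters": ["F1", "F2", "F3"], "north": ["F1"], "south": ["F2"]},
)
```

In the selective case, the regional sites store only the fragments they were given, while the headquarters still holds everything.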
Another consideration is the ratio of read-only queries to update queries. If this ratio is greater than or equal to one, replication is considered beneficial; otherwise it may cause problems, since every update has to be applied to each copy.
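This rule of thumb can be written as a small Python check. The function name and the query counts are hypothetical, and the handling of a workload with no updates at all is an assumption of this sketch, since the ratio is undefined in that case.

```python
def replication_is_beneficial(read_only_queries, update_queries):
    """Apply the rule: read-only queries / update queries >= 1."""
    if read_only_queries < 0 or update_queries < 0:
        raise ValueError("query counts must be non-negative")
    if update_queries == 0:
        # Assumption: with no updates, any read workload favors replication.
        return read_only_queries > 0
    return read_only_queries / update_queries >= 1


# Hypothetical workloads, given as (read-only queries, update queries).
mostly_reads = replication_is_beneficial(900, 100)
mostly_writes = replication_is_beneficial(100, 900)
```

Here `mostly_reads` is `True` and `mostly_writes` is `False`, matching the intuition that copies pay off when they are read far more often than they have to be kept in sync.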
The design of a distributed database involves three aspects: fragmentation, allocation, and replication. These three aspects are related to one another whether a bottom-up or a top-down approach is followed. According to distributed database design specifications, fragmentation divides the database so that it can be shared across the sites (Widom, 2016). Allocation then assigns these fragments to the required sites. Lastly, replication copies the fragments so that further copies are available for future reference. These are the main steps of the process.
In the diagram depicted above, the database is fragmented into smaller components, which are then allocated to and replicated at the required sites.
Allocation Designs for Fragmentation
Adopting horizontal fragmentation at the regional offices has both advantages and disadvantages. In this scenario, financial data is transferred from the regional offices to the headquarters. The main advantage is that processing is fast. Horizontal fragmentation here means that relations are partitioned by tuples.
However, this approach also has disadvantages. Emergencies cannot be handled properly: if the server crashes, the lack of replication makes it hard to recover the data. As a result, the fragmentation process has to be started all over again, which consumes resources and time (Foster & Godbole, 2016). Although horizontal fragmentation is considered fast, failures such as a server crash or network problems can disrupt data transfer.
In this scenario, horizontal fragmentation is applied at a geographical level, followed by replication of the data. The resulting system is considered very efficient because fragmentation and replication are done at the same level. This keeps the system effective and keeps the regional offices working efficiently. The main advantage is that the system can be used at a geographical level as the business operations of the company grow.
However, the main disadvantage of this system is that it requires a network connection. In this scenario, the data is replicated so that it can be accessed from the various geographical locations (Kuhlenkamp, Klems & Röss, 2014). Without a network connection, however, it becomes difficult to synchronize the data, which leads to uncontrolled and uncoordinated business processes.
In this scenario, horizontal fragmentation is again applied, so relations are partitioned by tuples. For this reason, it is considered faster in processing than vertical fragmentation. Here, however, the replicated fragments are stored in the regional centers of the office. The main advantage is that much less effort is needed to maintain the data (Jukic, Vrbsky & Nestorov, 2016), mainly because a connection to the regional offices is not required. As a result, synchronization among the regional offices is maintained effectively.
Design Considerations for Replication
However, the main disadvantage is that the headquarters of the company will not be able to control normal operations. This is mainly because the regional offices operate on their data without first synchronizing with the headquarters (Coronel & Morris, 2016). As such, the headquarters will lag behind the regional offices, which in turn creates a need for a better and faster connection and therefore for more resources.
In vertical fragmentation, relations are partitioned by attributes. In this scenario, the financial tables are kept at the headquarters and the other tables are kept at the regional offices. This process is considered effective and efficient, which is an advantage (Liu & Özsu, 2018). Another advantage is that the cost of fragmentation is very low, which reduces the resources that need to be allocated for this purpose.
The main disadvantage is that the data still needs to be replicated. Replication is needed to store the data at all of the locations so that operations run efficiently and a backup of the system is available at all times.
Of all the scenarios discussed, vertical fragmentation is considered the most effective, mainly because the fragmented data fits the business processes well. For a global enterprise of this kind, it is a good solution.
This distributed database design is considered effective in addressing the requirements of the organization. In this network, vertical fragmentation is applied, and data is replicated whenever the network is available.
References
Abadi, D., Madden, S., & Lindner, W. (2016). Sensor Network Integration with Streaming Database Systems. In Data Stream Management (pp. 409-428). Springer, Berlin, Heidelberg.
Bailis, P., Fekete, A., Franklin, M. J., Ghodsi, A., Hellerstein, J. M., & Stoica, I. (2014). Coordination avoidance in database systems. Proceedings of the VLDB Endowment, 8(3), 185-196.
Cellary, W., Morzy, T., & Gelenbe, E. (2014). Concurrency control in distributed database systems (Vol. 3). Elsevier.
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education.
Coronel, C., & Morris, S. (2016). Database systems: design, implementation, & management. Cengage Learning.
Faerber, F., Kemper, A., Larson, P. Å., Levandoski, J., Neumann, T., & Pavlo, A. (2017). Main Memory Database Systems. Foundations and Trends® in Databases, 8(1-2), 1-130.
Foster, E. C., & Godbole, S. (2016). Distributed database systems. In Database Systems (pp. 361-370). Apress, Berkeley, CA.
Jukic, N., Vrbsky, S., & Nestorov, S. (2016). Database systems: Introduction to databases and data warehouses. Prospect Press.
Kuhlenkamp, J., Klems, M., & Röss, O. (2014). Benchmarking scalability and elasticity of distributed database systems. Proceedings of the VLDB Endowment, 7(12), 1219-1230.
Liu, L., & Özsu, M. T. (2018). Encyclopedia of database systems. Springer.
Widom, J. (2016, September). Research in database systems: Challenges, principles, prototypes, and results. In Advances in ICT for Emerging Regions (ICTer), 2016 Sixteenth International Conference on (pp. 3-3). IEEE.