A report must be written about the implementation and results of analyses.
• MongoDB server setup options:
You can either set up a MongoDB server on your own computer and perform all operations there, or use the cloud server provided by us.
We will provide a MongoDB server in the cloud that each student can use to set up their own database. Each student who needs it will receive their own MongoDB login and a separate database.
• Database setup requirements:
• The dataset must be stored in more than one collection (a collection is equivalent to a table in a relational database). This means you will need to store descriptive information about the data in a separate collection from the primary data (e.g. the meaning of the information in the Vic Roads dataset, such as the units of measurement and the meaning of “density” and “flow”).
• Use the mongoimport command to import CSV or other supported data formats, e.g.: mongoimport -h 144.6.224.55 -d fit5141 -c vicroads --type csv --headerline < 3003.csv
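The multi-collection requirement can be sketched in Python without a running server. This is only an illustration under stated assumptions: the field names (`site_id`, `timestamp`, `flow`, `density`), the units, the sample values, and the collection name `vicroads_meta` are hypothetical and are not taken from the actual Vic Roads dataset.

```python
import json

# Hypothetical primary readings; field names and values are illustrative only.
primary_docs = [
    {"site_id": 3003, "timestamp": "2015-03-01T08:00:00", "flow": 420, "density": 18.5},
    {"site_id": 3003, "timestamp": "2015-03-01T08:15:00", "flow": 465, "density": 21.0},
]

# Hypothetical descriptive documents, one per measured field,
# intended for a separate collection from the primary data.
metadata_docs = [
    {"field": "flow", "unit": "vehicles/hour", "meaning": "vehicles passing the site per hour"},
    {"field": "density", "unit": "vehicles/km", "meaning": "vehicles per kilometre of road"},
]


def undocumented_fields(primary, metadata, ignore=("site_id", "timestamp")):
    """Return measured fields in the primary data that have no metadata entry."""
    described = {doc["field"] for doc in metadata}
    used = {key for doc in primary for key in doc if key not in ignore}
    return sorted(used - described)


# Each list would be written to its own file and imported into its own collection.
collections = {"vicroads": primary_docs, "vicroads_meta": metadata_docs}
exports = {name: json.dumps(docs, indent=2) for name, docs in collections.items()}
missing = undocumented_fields(primary_docs, metadata_docs)
```

Keeping one metadata document per field makes it easy to check that every measured field in the primary collection is described somewhere.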
• Connecting and analyzing data using R
You will need to query MongoDB using R and therefore need to install the R package rmongodb from here: http://cran.r-project.org/web/packages/rmongodb/index.html. Refer to the R documentation for how to install a package.
Produce simple statistical reports in the form of tables and charts on the data, as appropriate for the business.
• Connect to MongoDB using Tableau
There are several packages available that provide an ODBC or other interface through which Tableau can access MongoDB.
Using Tableau, generate visualizations of the data appropriate to the business question, as described in the R section above. These can be similar summary statistics or related summary information to that provided in the R section.
Overview of MongoDB as a NoSQL database
MongoDB is a NoSQL database; more specifically, it is a document database. NoSQL databases differ considerably from relational databases such as MySQL. In MySQL everything has to be mapped in advance: one needs to work out the exact schema, including the tables, their fields, and the type of each field, whether string or integer. With NoSQL one does not need to define the structure of the database before building the application. One of the major advantages of NoSQL databases is scaling, since they are relatively easy to scale. They can also be faster at performing operations, especially when dealing with bulk data. NoSQL is usually a good alternative as long as the data does not contain a large number of interconnected relationships.
All database operations were performed on the cloud server provided, that is, the MongoDB cloud service at https://cloud.mongodb.com. Only the Project Administrator holds the login and password that give read and write access to the database (Banker, 2011).
Fig 1.1: Login and Security Features.
The data set is stored on cloud-based remote servers. As noted earlier, one of the major advantages of a NoSQL database is that, even when handling many kinds of data, it does not require an exact schema tied to the data types in use.
In this case, however, the primary data (the characteristic data) has a unique identifier, the customer identifier, which corresponds to the same field in the MoreDSPDataEncoded table. The two tables can be related because they share the customer identifier as a unique field. This lays a foundation for linking attributes across the entire database (Dede et al., 2013).
Fig 2.1: Primary Data
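The way a shared customer identifier links the two data sets can be sketched as a small in-memory join. The documents below are hypothetical: only the idea of a shared customer identifier comes from the text, while the field names `customer_id`, `dwelling_type`, `occupants`, `reading_date`, and `kwh` are assumptions made for illustration.

```python
# Hypothetical documents; only the shared customer identifier mirrors the text.
characteristics = [
    {"customer_id": 101, "dwelling_type": "house", "occupants": 4},
    {"customer_id": 102, "dwelling_type": "unit", "occupants": 2},
]
consumption = [
    {"customer_id": 101, "reading_date": "2014-01-01", "kwh": 12.4},
    {"customer_id": 101, "reading_date": "2014-01-02", "kwh": 10.9},
    {"customer_id": 102, "reading_date": "2014-01-01", "kwh": 6.3},
]


def join_on_customer(characteristics, consumption):
    """Attach each customer's characteristics to their consumption readings."""
    by_id = {doc["customer_id"]: doc for doc in characteristics}
    joined = []
    for reading in consumption:
        profile = by_id.get(reading["customer_id"])
        if profile is None:
            continue  # skip readings with no matching characteristics document
        joined.append({**profile, **reading})
    return joined


joined = join_on_customer(characteristics, consumption)
```

On the server side, MongoDB's `$lookup` aggregation stage performs a comparable join between two collections; the sketch only shows the underlying idea.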
The mongoimport command was used to import the CSV file: mongoimport -h 144.6.224.55 -d fit5141 -c vicroads --type csv --headerline < 3003.csv. The data was imported to the cloud in array form, which makes it easier to load into the database and to process. With many data items and fields, it is advisable to import the data in bulk and then update the database, for the sake of efficiency and reliability.
Fig 2.2: Data Entry and Manipulation
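The import step can be sketched in Python in two parts: assembling the mongoimport arguments, and previewing how a header line turns CSV rows into documents. The host, database, collection, and file name are those given in the text; the use of `--file` instead of shell redirection, the helper names, and the sample columns are assumptions, since the real columns of 3003.csv are not shown.

```python
import csv
import io
import shlex


def build_mongoimport_args(host, database, collection, csv_path):
    """Build a mongoimport argument list (file passed with --file, not stdin)."""
    return [
        "mongoimport", "-h", host, "-d", database, "-c", collection,
        "--type", "csv", "--headerline", "--file", csv_path,
    ]


def preview_documents(csv_text):
    """Show how a header line maps each CSV row to named document fields."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


args = build_mongoimport_args("144.6.224.55", "fit5141", "vicroads", "3003.csv")
command = shlex.join(args)  # a single string suitable for pasting into a shell

# Hypothetical two-row sample; values stay strings in this preview.
sample = "site_id,flow,density\n3003,420,18.5\n3003,465,21.0\n"
docs = preview_documents(sample)
```

Building the command as a list keeps each option separate, so it can be passed to `subprocess.run` without shell quoting issues once a server is actually reachable.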
R is an open-source programming language used mainly for statistical computing and visualization. It is free, runs on most computing platforms, and contains contributions from leading computational statisticians. The analysis here follows a data analysis sequence that may be applied to an environmental dataset, using a small but typical data set of multivariate point observations (Ihaka & Gentleman, 2011).
The analysis in R focuses on five aspects:
- placing statistical analysis in the framework of the research question,
- moving from simple to complex methods (first exploration, then selection of promising modelling approaches),
- visualizing as well as computing,
- making correct inferences,
- statistical computation and visualization.
For the statistical summary in this report, a cluster was deployed for the database, and two query results are displayed on it. The first is the characteristic data from the smart-metered households, which was originally imported from an Excel data-entry sheet. The second is the MoreDSPDataEncoded sheet, a typical spreadsheet product with several inadequacies for processing in R; it contains the customer consumption data and corresponds to the characteristic data sheet.
The data is complete and summarized, contains more than 20,000 characters and fields from the research area, and is representative of the lighting situation in that area. It comes from two sources; each source records a fixed layer, the computed lighting, the average for each building, and the type of AC for each customer. The field MtrRGgActNetEnergyMaxDlyKwh is the sum of all power consumption from the active reading date, without any estimates. Each data record comprises the Energy01Kwh consumption of each building and each customer, together with the number of people consuming the energy. Both data sets were selected purposively and subjectively to represent the energy consumption rate and the active energy use (Chodorow, 2013).
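A per-customer statistical summary of the kind described can be sketched as follows. The sketch is written in Python rather than R, and the readings, the field names `customer_id` and `kwh`, and the chosen statistics are assumptions for illustration; they are not values or results from the actual data set.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical daily readings; values are illustrative, not from the real data.
readings = [
    {"customer_id": 101, "kwh": 12.0},
    {"customer_id": 101, "kwh": 10.0},
    {"customer_id": 102, "kwh": 6.0},
    {"customer_id": 102, "kwh": 8.0},
    {"customer_id": 102, "kwh": 7.0},
]


def summarise_by_customer(readings):
    """Return count, total, mean, median and maximum kWh for each customer."""
    grouped = defaultdict(list)
    for doc in readings:
        grouped[doc["customer_id"]].append(doc["kwh"])
    return {
        customer: {
            "n": len(values),
            "total": sum(values),
            "mean": mean(values),
            "median": median(values),
            "max": max(values),
        }
        for customer, values in grouped.items()
    }


summary = summarise_by_customer(readings)
for customer, stats in sorted(summary.items()):
    print(customer, stats)
```

The resulting dictionary maps directly onto a summary table with one row per customer, which is the form a report table or bar chart would start from.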
Setting up MongoDB server on the Cloud
For the final setup, covering the summary information and a complete, seamless integration of the database with R, an innovative approach is required. Because a cloud server is being used, a database must first be established on the server (Angles, 2012). Generating a cluster establishes a read or write database; the creator may add other users but remains the sole administrator (Agrawal et al., 2014).
After setting up the cluster, the administrator needs to add the client's IP address to the cluster's IP whitelist in order to gain access to the database services with a unique identifier and login credentials.
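The effect of an IP whitelist can be sketched with Python's standard `ipaddress` module. This models the idea only and is not the cloud service's own implementation; the whitelist entries are hypothetical addresses from the ranges reserved for documentation, and `is_allowed` is an illustrative helper name.

```python
import ipaddress


def is_allowed(client_ip, whitelist):
    """Return True when client_ip falls inside any whitelisted address or CIDR block."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False) for entry in whitelist
    )


# Hypothetical whitelist entries (documentation address ranges, not real hosts).
whitelist = ["203.0.113.25/32", "198.51.100.0/24"]

allowed = is_allowed("198.51.100.7", whitelist)  # inside the /24 block
blocked = is_allowed("192.0.2.10", whitelist)    # not covered by any entry
```

A single address is written as a `/32` block, while a wider block such as `/24` admits every address in that range, which is why whitelist entries are best kept as narrow as practical.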
Next, the cloud database needs to be connected to the Mongo shell, which runs on the local machine, stores a backup there, and acts as root storage while the database is generated. The main purpose of this step is to establish backup storage on the local machine and to finalize its communication with the remotely located cloud database (Ranjan, 2014).
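Connecting a shell or driver to a cloud-hosted cluster normally goes through a connection string. The sketch below assembles one in the `mongodb+srv://` form and percent-encodes the credentials, which matters whenever a password contains characters such as `%` or `@`. The user name, password, and host name are hypothetical placeholders, and `build_srv_uri` is an illustrative helper rather than part of any MongoDB library.

```python
from urllib.parse import quote_plus


def build_srv_uri(username, password, host, database):
    """Build a mongodb+srv connection string with percent-encoded credentials."""
    return "mongodb+srv://{}:{}@{}/{}".format(
        quote_plus(username), quote_plus(password), host, database
    )


# Hypothetical user name, password, and cluster host name.
uri = build_srv_uri(
    "report_admin", "p@ss%word", "cluster0.example.mongodb.net", "fit5141"
)
```

Without the encoding step, the `@` in the password would be read as the separator between credentials and host, and the connection string would be parsed incorrectly.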
After setting up both the cluster and the IP address whitelist, one final step completes the cloud database setup: administering roles, logins, and permitted actions for the people who are going to use the database. As an administrator, one needs to identify the users and their roles in the database development. Each identified user then has access to the cloud database according to the role assigned. In this case I created two user logins under the name Emmanuel Nyamanya, with the email addresses [email protected] and [email protected]. Both use the same password, E1%friend. https://cloud.mongodb.com/user .
The following sites were used when developing the database and carrying out the MongoDB analysis (Wei et al., 2011):
https://cloud.mongodb.com/v2#/account/organizations, which is dedicated to the specific organization's storage
https://cloud.mongodb.com/v2#/account/publicApi
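Role assignment can be modelled as building the document for MongoDB's `createUser` database command, restricted here to three built-in roles (`read`, `readWrite`, `dbAdmin`). This is a sketch under assumptions: the user names and passwords are placeholders, the helper `create_user_command` is hypothetical, and a hosted service may require users to be managed through its own web console instead of this command.

```python
def create_user_command(username, password, role, database):
    """Build the document for MongoDB's createUser database command."""
    allowed_roles = {"read", "readWrite", "dbAdmin"}
    if role not in allowed_roles:
        raise ValueError("unsupported role: " + role)
    return {
        "createUser": username,
        "pwd": password,
        "roles": [{"role": role, "db": database}],
    }


# Hypothetical users with placeholder passwords.
commands = [
    create_user_command("analyst", "change-me-1", "read", "fit5141"),
    create_user_command("project_admin", "change-me-2", "dbAdmin", "fit5141"),
]
```

Giving each person the narrowest role that covers their task, and a distinct password, limits what any single compromised login can do.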
References
Agrawal, R., Imran, A., Seay, C., & Walker, J. (2014, October). A layer based architecture for provenance in big data. In Big Data (Big Data), 2014 IEEE International Conference on (pp. 1-7). IEEE.
Angles, R. (2012, April). A comparison of current graph database models. In Data Engineering Workshops (ICDEW), 2012 IEEE 28th International Conference on (pp. 171-177). IEEE.
Banker, K. (2011). MongoDB in action. Manning Publications Co.
Chodorow, K. (2013). MongoDB: The Definitive Guide: Powerful and Scalable Data Storage. O'Reilly Media, Inc.
Dede, E., Govindaraju, M., Gunter, D., Canon, R. S., & Ramakrishnan, L. (2013, June). Performance evaluation of a MongoDB and Hadoop platform for scientific data analysis. In Proceedings of the 4th ACM workshop on Scientific cloud computing (pp. 13-20). ACM.
Ihaka, R., & Gentleman, R. (2011). R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5(3), 299-314.
Ranjan, R. (2014). Streaming big data processing in datacenter clouds. IEEE Cloud Computing, 1(1), 78-83.
Wei, J., Zhao, Y., Jiang, K., Xie, R., & Jin, Y. (2011). Analysis farm: A cloud-based scalable aggregation and query platform for network log analysis.
My Assignment Help. (2021). Essay On Deployment Of Dataset On MongoDB And R Based Simple Analysis Of Data. Retrieved from https://myassignmenthelp.com/free-samples/fit5141-advanced-topics-in-information-technology/deployment-of-dataset-on-mongodb.html.