Advances in M-Commerce and E-Commerce have increased dependence on technology. With the number of online users growing exponentially and more types of service becoming available online at a faster rate than before, serious threats to privacy and security have grown as well. With the automation of services such as telephone bill payment, mobile recharge, insurance and electricity bill payment, and with the emergence of E-Governance projects, almost every individual depends on mobile and online banking. This has led to an increase in illegal activities such as money laundering, banking fraud, insurance fraud, credit card fraud and online identity theft.
Other technological advances have made the detection and identification of such frauds possible in the first place. In many scenarios the potential risks can be identified because huge amounts of data can now be collected, stored and processed in real time. Thus, the main aim is to concentrate on the prevention and detection of fraudulent activities at the Australian bank with the help of advanced data analytics. The challenges range from real-time collection and storage of data to its pre-processing, processing and visualization, all of which help in identifying fraudulent patterns.
Another main challenge that prevents existing information and existing techniques or algorithms from being used directly is data imbalance. The ratio of fraudulent transactions to genuine transactions is heavily skewed: in credit card fraud detection, the proportion of fraud is negligible. The unstructured nature of the data complicates matters further, as data mining techniques and algorithms need structured data. (Rahm, 2000)
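The effect of this imbalance can be illustrated with a small sketch. The transaction counts below are hypothetical and chosen only for illustration; the point is that a classifier labelling every transaction as genuine scores a very high accuracy while detecting no fraud at all.

```python
def majority_baseline(n_genuine: int, n_fraud: int) -> dict:
    """Score a classifier that labels every transaction as genuine."""
    total = n_genuine + n_fraud
    return {
        "fraud_ratio": n_fraud / total,
        "accuracy": n_genuine / total,  # every genuine case right, every fraud wrong
        "fraud_recall": 0.0,            # no fraudulent transaction is ever flagged
    }


# Hypothetical counts, not taken from any real bank dataset
result = majority_baseline(n_genuine=99_900, n_fraud=100)
print(result)
```

This is why accuracy alone is a poor yardstick on such data, and why measures that focus on the fraud class are usually reported instead.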
The earliest fraud detection systems used at the Australian bank relied on statistics and data mining to extract fraudulent records from the available information. This led to the emergence of advanced architectures and detection mechanisms that improve system performance. One analysis reaches the surprising conclusion that aggregated models work much better than personalized models. (Liu, 2009)
Statistical modeling, neural networks, adaptive pattern recognition and machine learning have been employed to develop predictive models that give a measure of certainty about whether a particular transaction is fraudulent. Credit card fraud detection methodologies seek to discover spending patterns based on historical information about a particular client's past activities. This is not suitable for online banking, because clients' online banking activities are diverse and only limited historical information is available for a particular client.
Data mining methods for intrusion detection can also be applied. A classification model using association rules and frequent episodes algorithms has been created for anomaly-based intrusion detection. This method automatically creates concise and accurate detection models from a huge amount of audit information. Its drawback is that it needs this huge amount of audit information in order to compute the rule sets for the profile.
Fraud detection systems based on data mining– A cost-sensitive credit card fraud detection methodology uses the Bayes minimum risk classifier. Bahnsen (2013) claims that it gives a realistic view of the monetary gains and losses arising from fraud detection at the Australian bank. The method concentrates on efficient fraud detection with imbalanced data.
A rule-based fraud detection method gives large improvements in the detection procedure. It also claims that metrics such as FPR and TPR are not suitable for some classes of problems, and so it presents its own performance metric. The method detects fraud at the Australian bank by analyzing the transaction patterns of normal users.
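The decision rule behind a Bayes minimum risk classifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited author's implementation: the fraud probability is taken as given from some upstream model, and the cost values (a fixed review cost and a loss equal to the transaction amount) are hypothetical.

```python
def bayes_minimum_risk(p_fraud, cost_fp, cost_fn, cost_tp=0.0, cost_tn=0.0):
    """Return True when flagging the transaction has the lower expected cost.

    p_fraud : estimated probability that the transaction is fraudulent
    cost_fp : cost of flagging a genuine transaction
    cost_fn : cost of letting a fraudulent transaction through
    cost_tp : cost of flagging a fraudulent transaction
    cost_tn : cost of letting a genuine transaction through
    """
    risk_if_flagged = cost_tp * p_fraud + cost_fp * (1.0 - p_fraud)
    risk_if_passed = cost_fn * p_fraud + cost_tn * (1.0 - p_fraud)
    return risk_if_flagged < risk_if_passed


# Hypothetical costs: reviewing an alert costs 5, a missed fraud costs the amount
REVIEW_COST = 5.0
large = bayes_minimum_risk(p_fraud=0.05, cost_fp=REVIEW_COST,
                           cost_fn=1000.0, cost_tp=REVIEW_COST)
small = bayes_minimum_risk(p_fraud=0.05, cost_fp=REVIEW_COST,
                           cost_fn=20.0, cost_tp=REVIEW_COST)
print(large, small)
```

With the same estimated probability, the large transaction is flagged and the small one is not, which is the sense in which the approach reflects monetary gains and losses rather than error counts.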
Outlier detection based on heuristics– A transaction scoring method uses a genetic algorithm and scatter search to detect credit card fraud. By scoring each transaction, the method claims to minimize the number of wrongly classified transactions. The reported improvement in performance with this method is 200%. (Jia-jie, 2012)
Fraud detection systems based on graphs– The benefit of this method is that it considers an average mutual criterion and, unlike other mechanisms, tends to perform effectively on clusters of various shapes. The main principle is distance-based outlier detection. The method uses a lightweight probing procedure to identify outliers.
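The general idea of distance-based outlier detection can be sketched in a few lines. This is the plain textbook definition (a point is an outlier if too few other points lie within a given radius), not the mutual-nearness or lightweight probing procedure described above; the feature values, radius and neighbour threshold are hypothetical.

```python
import math


def distance_based_outliers(points, radius, min_neighbors):
    """Return indices of points with fewer than min_neighbors within radius."""
    outliers = []
    for i, p in enumerate(points):
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical transactions as (amount, hour of day)
transactions = [(10.0, 1.0), (12.0, 1.0), (11.0, 2.0), (500.0, 23.0)]
flagged = distance_based_outliers(transactions, radius=5.0, min_neighbors=1)
print(flagged)
```

The pairwise comparison is quadratic in the number of points, which is one reason the literature looks for lighter procedures on high-volume data.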
Fraud detection systems based on big data– Hormozi (2013) presented an accuracy evaluation of credit card fraud detection in a Hadoop MapReduce environment. The measuring criteria considered are true positive rate, detection rate, false negative rate and cost. The method uses the Hadoop environment to perform its operations.
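The rate-based criteria named above follow directly from the confusion counts. The sketch below uses the standard definitions of true positive rate, false negative rate and false positive rate; the labels are hypothetical, and the cost and detection rate criteria are left out because their exact definitions in the cited work are not given here.

```python
def detection_rates(y_true, y_pred):
    """Compute TPR, FNR and FPR, where 1 means fraud and 0 means genuine."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    positives = tp + fn
    negatives = fp + tn
    return {
        "tpr": tp / positives if positives else 0.0,
        "fnr": fn / positives if positives else 0.0,
        "fpr": fp / negatives if negatives else 0.0,
    }


# Hypothetical ground truth and predictions
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
rates = detection_rates(y_true, y_pred)
print(rates)
```

True positive rate and false negative rate always sum to one, so reporting both is a matter of emphasis rather than extra information.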
Issues faced by the fraud detection system in the Australian bank
Below are the challenges faced by the fraud detection system:
Incomplete information and imbalanced dataset– The main challenge is the imbalanced nature of the dataset. To give accurate outcomes, fraud detection applications need real data. Preparing that data has three general phases: extraction, transformation and loading.
Diversity of transactions– The main challenge in this fraud detection scenario comes from the client. The major drawback is that human interaction is involved. Because of the client's involvement, it becomes crucial to design the system with false positives in mind. (Elias, 2011)
Big data in real scenarios– Developing and deploying a fraud detection system in a real scenario proves difficult because of the high velocity and huge volume of the data, known as big data. A fraud detection system built for the present scenario should therefore be capable of processing huge numbers of records in a short time span while still giving accurate outcomes.
Emerging new fraud patterns– As technology advances and sophisticated techniques for preventing and detecting fraud emerge, fraudsters fight back by using advanced techniques of their own to carry out fraudulent activities, which maintains an equilibrium. The initial detection mechanisms needed only data mining or statistical techniques, whereas the present scenario demands heuristic methods and sophisticated machine learning. (Alowais, 2012)
Visualization– Algorithm-based analysis gives effective outcomes. However, without visualization techniques certain patterns go unnoticed.
Fear of false positives– The biggest issue the Australian bank encounters in fraud detection is misclassification. For most firms dealing with fraud detection applications, the major concern is reducing false positives. (Rong-Chang, 2005)
Online prediction requirement– The main hurdle for these systems is the sheer velocity and volume at which transactions are committed. Predicting in time while still producing accurate outcomes is a big challenge for detection models.
Case studies (based on the Australian bank)
Below are some case studies based on various types of fraud:
Insurance fraud: Insurance fraud is an emerging type of fraud. (Hadi, 2013)
Financial fraud: In this scenario, suppose the Australian bank requests two or three forms of identification from the user before opening an account, such as the PAN card number, phone number and address. A variety of frauds can be identified by finding fraud rings.
Credit card fraud: This is a typical fraud that differs from the other frauds. The figure below represents the transaction scenario for three users. (Duman, 2013)
Aggregated vs personalized models– A fraud detection system usually performs its operations on the data as a whole. In fraud detection, however, there is a need for a solution that predicts fraud from the pattern of a single user. Hence a personalized model, which analyzes individual records and gives user-level predictions, is an attractive choice. (Gurjar, 2014)
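The difference between the two kinds of model can be shown with a deliberately simple sketch. It uses a z-score threshold as a stand-in for a real model, and the user histories, the new amount and the threshold of 3 are all hypothetical; it only illustrates how the same transaction can look ordinary against pooled data and unusual against one user's own history.

```python
from statistics import mean, stdev


def z_score(amount, history):
    """Distance of amount from the history mean, in standard deviations."""
    spread = stdev(history)
    return 0.0 if spread == 0 else abs(amount - mean(history)) / spread


def is_suspicious(amount, history, threshold=3.0):
    return z_score(amount, history) > threshold


# Hypothetical spending histories for two users
histories = {
    "user_1": [20.0, 25.0, 22.0, 18.0, 24.0],
    "user_2": [900.0, 1100.0, 950.0, 1050.0, 1000.0],
}
new_amount = 600.0  # a new transaction by user_1

# Personalized: compare against user_1's own history only
personalized_flag = is_suspicious(new_amount, histories["user_1"])

# Aggregated: compare against every user's history pooled together
pooled = [amount for history in histories.values() for amount in history]
aggregated_flag = is_suspicious(new_amount, pooled)

print(personalized_flag, aggregated_flag)
```

Which approach performs better in practice depends on how much history each user has, which is the limitation noted earlier for online banking.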
Finding fraud rings– This process finds connections among entities, which serve as the starting point for identifying various frauds. It is usually carried out on graph databases, where it works more efficiently than on an RDBMS.
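The idea of linking entities through shared identifiers can be sketched without a graph database. The example below is an in-memory stand-in that groups accounts sharing a phone number or address using a union-find structure; the account names and identifier values are hypothetical, and a production system would express the same query in its graph database.

```python
from collections import defaultdict


def find_fraud_rings(accounts, min_size=2):
    """Group accounts that are linked, directly or transitively, by a shared identifier."""
    parent = {name: name for name in accounts}

    def find(x):
        # Follow parent links to the group representative, shortening the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}
    for name, identifiers in accounts.items():
        for identifier in identifiers:
            if identifier in first_owner:
                union(first_owner[identifier], name)
            else:
                first_owner[identifier] = name

    groups = defaultdict(set)
    for name in accounts:
        groups[find(name)].add(name)
    return [group for group in groups.values() if len(group) >= min_size]


# Hypothetical accounts with the identifiers supplied at account opening
accounts = {
    "acct_a": {("phone", "555-0101"), ("address", "1 Example St")},
    "acct_b": {("phone", "555-0101"), ("address", "2 Sample Rd")},
    "acct_c": {("phone", "555-0199"), ("address", "2 Sample Rd")},
    "acct_d": {("phone", "555-0142"), ("address", "9 Other Ave")},
}
rings = find_fraud_rings(accounts)
print(rings)
```

Here acct_a and acct_c share nothing directly but end up in the same group through acct_b, which is the kind of indirect connection that is awkward to express with relational joins.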
Analysis based on GPU– The explosion in the number of transactions leads to a rise in the number of transactions generated in a single time slot. With the help of a GPU, many tasks can be performed in parallel. (Cao, 2014)
Analysis based on Hadoop– The use of this kind of analysis has increased hugely in recent years because of its parallel processing. Because of the huge data volumes associated with present applications, they are the best candidates for Hadoopable problems.
Visualization– Although algorithmic analysis gives efficient solutions, a few hidden patterns can only be identified with the support of human analysis and intelligence. Thus, a supervised visualization methodology using graphs could be developed to visualize the outcomes and reveal patterns that were not deciphered initially by data-based analysis. (Sahin, 2013)
Model based on a search engine– A recently growing technology used in the Australian bank is the concept of elastic search. It works on the basic nature of a search engine, which “counts” everything that is given to it. The frequency of occurrence of the transactions belonging to a particular individual is mapped using the TF-IDF approach to determine outliers effectively. Here, the search scenario is mapped directly onto the fraud detection system by considering transactions and transaction sources. Overlaying the transactions exposes the outliers in the system; in other words, the transactions that do not fit can be identified in the top left corner, which depicts the uncommon scenario. (Duman, 2011)
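The TF-IDF weighting mentioned above can be sketched by treating each customer as a document and each transaction source as a term. This mapping, the customer names and the source labels are assumptions made for illustration; the sketch computes the standard weight, term frequency times the logarithm of inverse document frequency, and does not reproduce any particular search engine's scoring.

```python
import math
from collections import Counter


def tf_idf(histories):
    """Return a TF-IDF weight for every (customer, transaction source) pair."""
    n_customers = len(histories)

    # Document frequency: number of customers who used each source at least once
    doc_freq = Counter()
    for sources in histories.values():
        doc_freq.update(set(sources))

    weights = {}
    for customer, sources in histories.items():
        counts = Counter(sources)
        total = len(sources)
        weights[customer] = {
            source: (count / total) * math.log(n_customers / doc_freq[source])
            for source, count in counts.items()
        }
    return weights


# Hypothetical transaction sources per customer
histories = {
    "cust_1": ["atm", "atm", "web", "atm"],
    "cust_2": ["atm", "web", "web"],
    "cust_3": ["atm", "web", "overseas_pos"],
}
weights = tf_idf(histories)
print(weights["cust_3"])
```

Sources that every customer uses receive a weight of zero, while a source seen for only one customer stands out with a positive weight, which is what makes the measure usable as a rarity signal.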
Description of the dataset
Because of the generic nature of the problem, this research does not consider a single dataset; instead, it concentrates on developing a generic architecture that can be trained and used for all fraud detection procedures. Thus, a variety of datasets are used to complete the analysis. The CMS dataset is used for analyzing insurance and health care data. The Enron email dataset is used for link analysis. Fraud detection is carried out mainly on the German and Australian bank datasets, which are available in the UCI repository. (Wei, 2013)
Most Australian banks use neural network software known as Falcon, built by the American company FICO. Just as online shopping is a growth area, and credit card fraud is a growth area, so is the manufacture of systems that let financial institutions keep track of the bad guys. Some malware hides in the web browser until banking or shopping sites are opened.
Recommendations and suggestions
Because of the massive amount of information in the problem being analyzed, there are two viable choices for arriving at efficient solutions. The first choice is GPU-based processing and the second is Hadoop. The options available in the current scenario are either to port conventional algorithms to these architectures or to develop algorithms tailored to the environments. The most appropriate option is the one that is cost-efficient and effective given the tradeoffs incurred by the various methods. Visualization, in turn, gives deep insight into hidden patterns. Each graph database product has its own benefits, and the one that corresponds to the problem should be used to manage it effectively. Each database requires mastering queries that are specific to it; thus portability is not an option when considering graph databases. Finally, the probability output by the model can be used as the risk score of a transaction. The higher the risk score, the more likely the transaction is fraudulent. Transactions are then ranked by this score in descending order. (Kim, 2014)
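The final ranking step can be sketched as follows. The transaction identifiers and probabilities are hypothetical, and the model that would produce the probabilities is assumed to exist upstream; the optional cut-off for the number of transactions to review is an addition made for illustration.

```python
def rank_by_risk(scored_transactions, top_k=None):
    """Sort (transaction_id, risk_score) pairs by score, highest risk first."""
    ranked = sorted(scored_transactions, key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical model outputs: probability that each transaction is fraudulent
scored = [("txn_1", 0.02), ("txn_2", 0.40), ("txn_3", 0.91), ("txn_4", 0.15)]
ranked = rank_by_risk(scored)
review_queue = rank_by_risk(scored, top_k=2)
print(ranked)
print(review_queue)
```

Ranking rather than applying a fixed threshold lets the number of transactions sent for review match the capacity of the team that investigates them.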
The fraud detection methods used by the Australian bank, from the initial phases to the present technologies, were analyzed and categorized. This paper presents their pros and cons and the present needs of the current fraud scenario. The challenges faced while designing a fraud detection system were discussed in detail, and directions for research were provided. These directions lead to various research areas that are disjoint from one another. Future work will deal with solutions to the challenges posed, using the environments described in the directions above. In the next stage, analysis will be carried out to provide unified methods rather than separate methods for each sub-category of fraud. Research will be directed towards developing a unified framework that is trained on a particular category of fraud and provides effective detection and, if possible, prevention. Sophisticated online banking fraud involves multiple resources, including online business systems, web technology, computing tools and human wisdom. The framework takes advantage of domain knowledge, mixed features, multiple data mining methods and a multiple-layer structure for a systematic solution.
Rahm, Erhard, and Hong Hai Do. (2000). Data cleaning: Problems and current approaches. IEEE Data Eng. Bull. 23.4: 3-13.
Liu, Ou, et al. (2009). On an ant colony-based approach for business fraud detection. Emerging Intelligent Computing Technology and Applications, Springer Berlin Heidelberg, 1104-1111.
Jia-jie, Shen. (2012). Electronic transaction fraud detection based on improved PSO algorithm. Computer Science and Network Technology (ICCSNT), 2012 2nd International Conference on. IEEE.
Elías, Arturo, et al. (2011). Outlier analysis for plastic card fraud detection: a hybridized and multi-objective approach. Hybrid Artificial Intelligent Systems, Springer Berlin Heidelberg, 1-9.
Alowais, Mohammed Ibrahim, and Lay-Ki Soon. (2012). Credit card fraud detection: Personalized or aggregated model. Mobile, Ubiquitous, and Intelligent Computing (MUSIC), 2012 Third FTRA International Conference on. IEEE.
Rong-Chang Chen, Shu-Ting Luo, Xun Liang, and V.C.S. Lee. (2005). Personalized approach based on SVM and ANN for detecting credit card fraud. Neural Networks and Brain (ICNN&B '05), International Conference on, vol. 2, pp. 810-815.
Hormozi, Elham, et al. (2013). Accuracy evaluation of a credit card fraud detection system on Hadoop MapReduce. Information and Knowledge Technology (IKT), 2013 5th Conference on. IEEE.
Hadi, et al. (2013). Credit cards fraud detection by negative selection algorithm on Hadoop (to reduce the training time). Information and Knowledge Technology (IKT), 2013 5th Conference on. IEEE.
Duman, Ekrem, Ayse Buyukkaya, and Ilker Elikucuk. (2013). A novel and successful credit card fraud detection system implemented in a Turkish bank. Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on. IEEE.
Bahnsen, Alejandro Correa, et al. (2013). Cost sensitive credit card fraud detection using Bayes minimum risk. Machine Learning and Applications (ICMLA), 2013 12th International Conference on, Vol. 1. IEEE.
Gurjar, Ram Niwas, Neeraj Sharma, and Manoj Wadhwa. (2014). Finding outliers using mutual nearness based ranks detection algorithm. Optimization, Reliability, and Information Technology (ICROIT), 2014 International Conference on. IEEE.
Cao, Lei, et al. (2014). Scalable distance-based outlier detection over high-volume data streams. Data Engineering (ICDE), 2014 IEEE 30th International Conference on. IEEE.
Sahin, Yusuf, Serol Bulkan, and Ekrem Duman. (2013). A cost-sensitive decision tree approach for fraud detection. Expert Systems with Applications 40.15: 5916-5923.
Duman, Ekrem, and M. Hamdi Ozcelik. (2011). Detecting credit card fraud by genetic algorithm and scatter search. Expert Systems with Applications 38.10: 13057-13063.
Wei, Wei, et al. (2013). Effective detection of sophisticated online banking fraud on extremely imbalanced data. World Wide Web 16.4: 449-475.
Kim, Ae Chan, et al. (2014). Fraud and financial crime detection model using malware forensics. Multimedia Tools and Applications 68.2: 479-496.