PERFORMANCE EVALUATION OF WEKA CLUSTERING ALGORITHMS ON LARGE DATASETS
Department of Computer Science, Himachal Pradesh University, Shimla, India.
Data mining is the process of analyzing data from different viewpoints and summarizing it into useful information. Using a data mining tool, the user can analyze data from different dimensions or angles, categorize it, and process the relationships identified. Clustering is one of the most widely used techniques in data mining. It is the process of grouping data by finding similarities between data items based on their features: similar items are grouped into the same cluster, and dissimilar items are placed in different clusters. In this paper, a comparative study of nine clustering algorithms is performed using three datasets. The main objective of the study is to observe the effect of dataset size on the data mining tool and on the clustering algorithms. The datasets chosen for comparison are diverse in terms of the number of attributes and instances. All nine algorithms are compared according to factors such as the size of the dataset, the number of clusters, and the time taken to form clusters. The data mining tool Weka is used to perform the comparison, and the performance of Weka in handling large datasets is also analyzed.
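The comparison criterion of time taken to form clusters as dataset size grows can be illustrated with a small sketch. This is not the Weka setup used in the study: it is a plain k-means written in Python and run on synthetic random data. The function names `kmeans` and `time_clustering`, the dataset sizes, the four attributes, and the choice of three clusters are all assumptions made for illustration only.

```python
import random
import time


def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance over all attributes).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def time_clustering(n_instances, n_attributes, k, seed=0):
    """Cluster a synthetic dataset and return the elapsed time in seconds."""
    rng = random.Random(seed)
    data = [
        tuple(rng.random() for _ in range(n_attributes))
        for _ in range(n_instances)
    ]
    start = time.perf_counter()
    kmeans(data, k)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical dataset sizes; the study's three datasets are not reproduced here.
    for n in (1_000, 2_000, 4_000):
        elapsed = time_clustering(n, n_attributes=4, k=3)
        print(f"{n:>6} instances: {elapsed:.3f} s")
```

Because the number of iterations is fixed, the measured time in this sketch grows roughly in proportion to the number of instances. Timings from a real tool such as Weka also depend on the algorithm, its parameters, and the memory available to it.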
[Anju Parmar, Divya Chauhan and K.L. Bansal. (2017); PERFORMANCE EVALUATION OF WEKA CLUSTERING ALGORITHMS ON LARGE DATASETS. Int. J. of Adv. Res. 5 (Jun). 2209-2216] (ISSN 2320-5407). www.journalijar.com