Comparative Analysis of Web Usage Data using SOM and K-means algorithms
- PG Scholar, Dept. of CSE, BMSCE, Bangalore, INDIA.
- Associate Professor, Dept. of CSE, BIT, Bangalore, INDIA.
- Professor and Head, Dept. of CSE, BMSCE, Bangalore, INDIA.
The Web has become a popular medium for the transformation and distribution of valuable information that users can access freely, and as a result it has grown large and diverse. Organizing Web data for efficient access has become a major challenge for Web site administrators, so data mining and neural network techniques are needed to extract information that supports better organization of that data. Web usage mining is one of the main research areas focused on extracting valuable information from Web usage data. It is the part of data mining that finds patterns or clusters from users' sessions and behaviour. The Web usage mining process starts with pre-processing, continues with clustering of the data, and ends with visualization of the clusters. In this work we apply Web usage mining to analyse Web usage data with two methods, the Self Organizing Map (SOM) and K-means. To compare the two methods on clustering Web usage data, the Web navigational data must first be prepared for clustering. We therefore begin by pre-processing the log file: unwanted data is removed first, then redundant data, and the remaining entries are separated into users and sessions based on a time interval. In the pre-processing phase, sessions are formed based on the time interval taken by each user. Once all the sessions are formed, the SOM algorithm segregates them into clusters using the weight matrix. For comparison, clusters are also formed from the same sessions using the K-means algorithm. The clusters formed by both algorithms are visualized using the JFree tool. Finally, the charts obtained from both algorithms are compared to analyse the clusters.
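The pipeline described above (session formation by time interval, then clustering with SOM and K-means) can be illustrated with a minimal Python sketch. It is not the implementation used in the paper. The record format `(user, timestamp, page)`, the 30-minute timeout, the binary page-visit vectors, the naive initialisation from the first inputs, and every function name are assumptions made only for illustration. The SOM shown is a small one-dimensional variant with a rectangular neighbourhood and linearly decaying learning rate and radius.

```python
from collections import defaultdict

def sessionize(records, timeout=1800):
    """Split (user, timestamp, page) records into sessions.
    A new session starts when the gap between two consecutive requests of
    the same user exceeds `timeout` seconds (assumed value: 30 minutes)."""
    by_user = defaultdict(list)
    for user, ts, page in records:
        by_user[user].append((ts, page))
    sessions = []
    for hits in by_user.values():
        hits.sort()
        current = [hits[0][1]]
        for (prev_ts, _), (ts, page) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions

def to_vectors(sessions):
    """Encode each session as a binary page-visit vector (assumed encoding)."""
    pages = sorted({p for s in sessions for p in s})
    return pages, [[1.0 if p in s else 0.0 for p in pages] for s in sessions]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the center (or SOM unit) closest to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))

def kmeans(vectors, k, iters=20):
    """Plain K-means; centers start at the first k vectors (naive init)."""
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        labels = [nearest(v, centers) for v in vectors]
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(v, centers) for v in vectors]

def som(vectors, units, epochs=50, lr0=0.5):
    """1-D SOM: the winning unit and its neighbours within `radius`
    move toward each input; learning rate and radius decay linearly."""
    weights = [list(v) for v in vectors[:units]]  # naive init
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = int((units - 1) * frac)
        for v in vectors:
            win = nearest(v, weights)
            for j in range(units):
                if abs(j - win) <= radius:
                    weights[j] = [w + lr * (x - w) for w, x in zip(weights[j], v)]
    return [nearest(v, weights) for v in vectors]

# Hypothetical toy log: three users, one of whom returns after a long gap.
records = [
    ("u1", 0, "/home"), ("u1", 60, "/news"),
    ("u1", 5000, "/shop"), ("u1", 5060, "/cart"),
    ("u2", 10, "/home"), ("u2", 100, "/news"),
    ("u3", 20, "/shop"), ("u3", 90, "/cart"),
]
sessions = sessionize(records)
pages, vectors = to_vectors(sessions)
print("sessions:", sessions)
print("K-means labels:", kmeans(vectors, k=2))
print("SOM labels:", som(vectors, units=2))
```

On a toy log like this, both methods should place sessions with the same pages in the same cluster. The cluster numbers themselves are arbitrary, so a comparison should look at which sessions are grouped together rather than at the label values.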
[Shilpa M Patil, T Vijaya Kumar, H S Guruprasad (2015); Comparative Analysis of Web Usage Data using SOM and K-means algorithms. Int. J. of Adv. Res. 3 (Oct), 486-493] (ISSN 2320-5407). www.journalijar.com