Support Vector Machines in Apache Spark
- Department of Information Technology, Rajagiri School of Engineering & Technology, Kerala
- Abstract
Large and complex volumes of data are produced every second, and these data need to be categorized. Text categorization is the task of assigning text documents to one or more predefined categories or classes. A number of methods have been proposed for text categorization. Among them, the Support Vector Machine (SVM) is the most popular because it is more efficient than the other proposed methods. The objective of this paper is to implement an SVM classifier using Apache Spark and to report the classifier's accuracy and confusion matrix. Apache Spark is an open-source framework for processing large datasets efficiently, and it is often much faster than Hadoop MapReduce.
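To make the two reported quantities concrete, the following is a minimal sketch of how accuracy and a binary confusion matrix are derived from a classifier's predictions. It is plain Python rather than Spark code, and it is not the paper's implementation: the function names `confusion_matrix` and `accuracy` and the label lists `y_true` and `y_pred` are hypothetical, chosen only for illustration. In a Spark job the same counts would normally come from the framework's own evaluation utilities.

```python
from collections import Counter


def confusion_matrix(labels, predictions):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    # Each (actual, predicted) pair falls into exactly one of four cells.
    counts = Counter(zip(labels, predictions))
    return {
        "tp": counts[(1, 1)],  # actual 1, predicted 1
        "fn": counts[(1, 0)],  # actual 1, predicted 0
        "fp": counts[(0, 1)],  # actual 0, predicted 1
        "tn": counts[(0, 0)],  # actual 0, predicted 0
    }


def accuracy(cm):
    """Fraction of documents whose predicted class matches the actual class."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


# Hypothetical data: actual classes and the classes an SVM might predict.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
print(cm)
print(acc)
```

The confusion matrix is kept as a dictionary of named cells so that accuracy, and any other metric built from the same four counts, can be computed without re-scanning the predictions.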
[Ann Paul (2016); Support Vector Machines in Apache Spark. Int. J. of Adv. Res. 4 (Aug). 76-80] (ISSN 2320-5407). www.journalijar.com