22 Aug 2016

Support Vector Machines in Apache Spark

Ann Paul

Department of Information Technology, Rajagiri School of Engineering & Technology, Kerala

Abstract

Large and complex volumes of data are produced every second, and these data need to be categorized. Text categorization is the task of assigning text documents to one or more predefined categories or classes. A number of methods have been proposed for text categorization. The most popular is the Support Vector Machine (SVM), because it is more efficient than the other methods proposed for this task. The objective of this paper is to implement an SVM using Apache Spark and to report the accuracy and confusion matrix of the resulting classifier. Apache Spark is an open-source framework for efficient computation over large datasets, and it is much faster than Hadoop. In this work, Spark serves as the computational engine that processes the data.
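The abstract does not show the paper's Spark code, so the following is only a minimal plain-Python sketch of the general technique it names: a linear SVM trained by stochastic subgradient descent on the L2-regularized hinge loss, followed by the accuracy and confusion matrix on held-out points. It does not call Spark. The toy 2-D feature vectors, the 0/1 label convention, the absence of an intercept term, and every function name and hyperparameter value are assumptions made for illustration.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(features, labels, epochs=50, step=0.1, reg=0.01, seed=0):
    """Stochastic subgradient descent on the L2-regularized hinge loss.

    Labels are 0/1 and are mapped to -1/+1 internally. No intercept is
    learned (an assumption of this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    order = list(range(len(features)))
    for epoch in range(1, epochs + 1):
        rng.shuffle(order)
        lr = step / epoch ** 0.5  # step size decays with the epoch number
        for i in order:
            x = features[i]
            y = 2 * labels[i] - 1  # map {0, 1} to {-1, +1}
            violated = y * dot(w, x) < 1  # inside the margin or misclassified
            # L2 shrinkage is applied on every step; the hinge term only
            # contributes when the margin constraint is violated.
            w = [wj * (1 - lr * reg) for wj in w]
            if violated:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return class 1 if the point lies on the positive side, else class 0."""
    return 1 if dot(w, x) > 0 else 0


def confusion_matrix(actual, predicted):
    """2x2 matrix: rows are actual classes, columns are predicted classes."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
train_x = [[2.0, 2.0], [3.0, 1.0], [1.0, 3.0], [2.5, 2.5],
           [-2.0, -2.0], [-3.0, -1.0], [-1.0, -3.0], [-2.5, -2.5]]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]
test_x = [[1.5, 2.0], [2.0, 0.5], [-1.5, -2.0], [-0.5, -2.0]]
test_y = [1, 1, 0, 0]

weights = train_linear_svm(train_x, train_y)
predictions = [predict(weights, x) for x in test_x]
matrix = confusion_matrix(test_y, predictions)
accuracy = sum(p == a for p, a in zip(predictions, test_y)) / len(test_y)

print("accuracy:", accuracy)
print("confusion matrix (rows = actual, columns = predicted):", matrix)
```

The diagonal of the confusion matrix counts correct predictions per class, and the accuracy is the diagonal sum divided by the number of test points. On a real text corpus the feature vectors would come from a document representation such as term frequencies, and the training loop would be distributed by the framework instead of run on a single machine.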

[Ann Paul (2016); Support Vector Machines in Apache Spark. Int. J. of Adv. Res. 4 (Aug). 76-80] (ISSN 2320-5407). www.journalijar.com



This work is licensed under a Creative Commons Attribution 4.0 International License.
