LP SVM with a Novel Similarity Function Outperforms Powerful LP-QP-Kernel SVM Considering Efficient Classification

Rezaul Karim, Mahmudul Hasan, Amit Kumar Kundu, Ali Ahmed Ave

Abstract


While the core strength of the SVM lies in its ability to reach a global optimum, its classification performance also depends on the kernel it computes. However, the same kernel complexity that gives the machine its power is also responsible for the computational load of evaluating that kernel. Moreover, requiring a similarity function to be a positive definite kernel imposes properties that can seem unproductive, raising the question of which similarity measures should be used in a classifier. We model Vapnik’s LPSVM, proposing a new similarity function that replaces the kernel function. Following the strategy of “accuracy first, speed second”, we design a similarity function that is mathematically well defined, grounded in both analysis and geometry, and complex enough to train the machine to a solid generalization ability. Consistent with the theory of learning with similarity functions by Balcan and Blum [1], our similarity function need not be a valid kernel, and it is cheaper to evaluate than counterparts such as the RBF or other kernels while providing the classifier with sufficient power at optimal complexity. Benchmarking shows that our similarity-function-based LPSVM attains a test error 0.86 times that of the most powerful RBF-based QP SVM while demanding only 0.40 times its computational cost.
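To make the construction concrete, here is a minimal sketch of how a Vapnik-style LPSVM can be trained with an arbitrary similarity function in place of a kernel: minimize the 1-norm objective sum_j alpha_j + C * sum_i xi_i subject to y_i (sum_j alpha_j y_j S(x_i, x_j) + b) >= 1 - xi_i with alpha, xi >= 0, which is a linear program. The similarity function below is a generic RBF-style placeholder, since the paper’s proposed similarity function is not specified in this abstract, and the names train_lpsvm, C, and gamma are illustrative assumptions, not the authors’ code.

```python
import numpy as np
from scipy.optimize import linprog

def similarity(A, B, gamma=1.0):
    # Placeholder similarity: an RBF-style measure. The paper's actual
    # similarity function is not given in this abstract; any similarity,
    # positive definite or not, can be substituted here.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def train_lpsvm(X, y, C=1.0, gamma=1.0):
    # Vapnik-style LPSVM:  min  sum(alpha) + C * sum(xi)
    # s.t.  y_i * (sum_j alpha_j y_j S(x_i, x_j) + b) >= 1 - xi_i,
    #       alpha_j >= 0, xi_i >= 0; b is free (split as b_pos - b_neg).
    y = np.asarray(y, dtype=float)          # labels in {-1, +1}
    n = len(y)
    S = similarity(X, X, gamma)
    # Decision vector z = [alpha (n), xi (n), b_pos, b_neg].
    c = np.concatenate([np.ones(n), C * np.ones(n), [0.0, 0.0]])
    # Margin constraints rewritten in linprog's A_ub @ z <= b_ub form.
    A_ub = np.zeros((n, 2 * n + 2))
    A_ub[:, :n] = -np.outer(y, y) * S       # -y_i y_j S_ij on alpha_j
    A_ub[:, n:2 * n] = -np.eye(n)           # -xi_i
    A_ub[:, 2 * n] = -y                     # -y_i on b_pos
    A_ub[:, 2 * n + 1] = y                  # +y_i on b_neg
    res = linprog(c, A_ub=A_ub, b_ub=-np.ones(n),
                  bounds=[(0, None)] * (2 * n + 2), method="highs")
    alpha = res.x[:n]
    b = res.x[2 * n] - res.x[2 * n + 1]
    return alpha, b

def predict(X_train, y_train, alpha, b, X_new, gamma=1.0):
    # Sparse expansion: only training points with alpha_j > 0 contribute.
    S = similarity(X_new, np.asarray(X_train), gamma)
    return np.sign(S @ (alpha * np.asarray(y_train, dtype=float)) + b)
```

Because the 1-norm objective drives many alpha_j to exactly zero, the resulting decision function evaluates the similarity against only a small subset of the training points; this sparsity is the source of the evaluation-cost advantage the abstract reports over the denser RBF-based QP SVM.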


References


[1] M.-F. Balcan, A. Blum, and N. Srebro, “A theory of learning with similarity functions,” Machine Learning, vol. 72, no. 1, pp. 89–112, 2008.

[2] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.

[3] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (Adaptive Computation and Machine Learning Series), 2002.

[4] V. Vapnik, Statistical Learning Theory. Wiley, 1998.

[5] R. Karim, M. Bergtholdt, J. H. Kappes, and C. Schnörr, “Greedy-based design of sparse two-stage SVMs for fast classification,” in Proc. of the 29th DAGM Symposium on Pattern Recognition, 2007, pp. 395–404.

[6] R. Karim and A. K. Kundu, “Efficiency and performance analysis of a sparse and powerful second order SVM based on LP and QP,” International Journal of Advanced Computer Science and Applications (IJACSA), vol. 9, no. 2, pp. 311–318, 2018.

[7] R. Karim and A. K. Kundu, “Computational analysis to reduce classification cost keeping high accuracy of the sparser LPSVM,” International Journal of Machine Learning and Computing, vol. 9, no. 6, pp. 728–733, 2019.

[8] T. Joachims and C. N. J. Yu, “Sparse kernel SVMs via cutting-plane training,” in Proc. of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML), 2009, p. 8.

[9] S. S. Keerthi, O. Chapelle, and D. DeCoste, “Building support vector machines with reduced classifier complexity,” Journal of Machine Learning Research (JMLR), vol. 7, pp. 1493–1515, 2006.

[10] C. J. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.

[11] Y. Ying, C. Campbell, and M. Girolami, “Analysis of SVM with indefinite kernels,” Advances in Neural Information Processing Systems, vol. 22, pp. 2205–2213, 2009.

[12] J. Chen and J. Ye, “Training SVM with indefinite kernels,” in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 136–143.

[13] T. Graepel, R. Herbrich, P. Bollmann-Sdorra, and K. Obermayer, “Classification on pairwise proximity data,” Advances in Neural Information Processing Systems, pp. 438–444, 1999.

[14] B. Haasdonk, “Feature space interpretation of SVMs with indefinite kernels,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 482–492, 2005.

[15] H.-T. Lin and C.-J. Lin, “A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods,” submitted to Neural Computation, 2003.

[16] R. Luss and A. d’Aspremont, “Support vector machine classification with indefinite kernels,” Mathematical Programming Computation, vol. 1, no. 2, pp. 97–118, 2009.

[17] C. S. Ong, X. Mary, S. Canu, and A. J. Smola, “Learning with non-positive kernels,” in Proceedings of the Twenty-First International Conference on Machine Learning, 2004, p. 81.

[18] E. Pekalska, P. Paclik, and R. P. Duin, “A generalized kernel approach to dissimilarity-based classification,” Journal of Machine Learning Research, vol. 2, pp. 175–211, 2001.

[19] G. Wu, E. Y. Chang, and Z. Zhang, “An analysis of transformation on non-positive semidefinite similarity matrix for kernel machines,” in Proceedings of the 22nd International Conference on Machine Learning, 2005.

[20] I. Alabdulmohsin, X. Gao, and X. Z. Zhang, “Support vector machines with indefinite kernels,” in Asian Conference on Machine Learning. PMLR, 2015, pp. 32–47.

[21] L. Wang, C. Yang, and J. Feng, “On learning with dissimilarity functions,” in Proceedings of the 24th International Conference on Machine Learning, 2007, pp. 991–998.

[22] M.-F. Balcan, A. Blum, and N. Srebro, “Improved guarantees for learning via similarity functions,” in Conference on Learning Theory (COLT), 2008.

[23] A. Nefedov, J. Ye, C. Kulikowski, I. Muchnik, and K. Morgan, “Experimental study of support vector machines based on linear and quadratic optimization criteria,” DIMACS Technical Report 2009-18, 2009.

[24] J. Kools, “6 functions for generating artificial datasets,” 2013. [Online]. Available: https://www.mathworks.com/matlabcentral/fileexchange/-6-functions-for-generating-artificial-datasets. [Online; accessed 22-September-2020].

[25] H. A. Abu Alfeilat, A. B. Hassanat, O. Lasassmeh, A. S. Tarawneh, M. B. Alhasanat, H. S. Eyal Salman, and V. S. Prasath, “Effects of distance measure choice on k-nearest neighbor classifier performance: a review,” Big Data, vol. 7, no. 4, pp. 221–248, 2019.

[26] G. Rätsch, “Benchmark data sets.” [Online]. Available: http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm

[27] T. Diethe, “13 benchmark datasets derived from the UCI, DELVE and STATLOG repositories,” https://github.com/tdiethe/gunnar_raetsch_benchmark_datasets/, 2015.

[28] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

[29] A. S. Shirkhorshidi, S. Aghabozorgi, and T. Y. Wah, “A comparison study on similarity and dissimilarity measures in clustering continuous data,” PLoS ONE, vol. 10, no. 12, p. e0144059, 2015.

[30] V. Prasath, H. A. A. Alfeilat, A. Hassanat, O. Lasassmeh, A. S. Tarawneh, M. B. Alhasanat, and H. S. E. Salman, “Distance and similarity measures effect on the performance of k-nearest neighbor classifier – a review,” arXiv preprint arXiv:1708.04321, 2017.

[31] D. R. Wilson and T. R. Martinez, “Improved heterogeneous distance functions,” Journal of Artificial Intelligence Research, vol. 6, pp. 1–34, 1997.

[32] C. C. Aggarwal, A. Hinneburg, and D. A. Keim, “On the surprising behavior of distance metrics in high dimensional space,” in International Conference on Database Theory. Springer, 2001, pp. 420–434.




DOI: https://doi.org/10.31449/inf.v47i8.4767

This work is licensed under a Creative Commons Attribution 3.0 License.