EBSVM: A Sparsification Method from LP-Based SVM via Stochastic Functional ξ-Analysis
Abstract
Although the SVM is a mathematically solid, state-of-the-art learning machine that combines sparsity with top accuracy, Vapnik’s LP-based SVM is sparser than its QP-generated counterpart while offering similar accuracy. Yet further sparsity is essential for working efficiently with very large and complicated datasets. In this work we propose EBSVM, a novel and efficient method that imposes additional sparsification on this already sparser LPSVM: by analyzing the stochastic auxiliary variable ξ after a compact deterministic mapping of the data, it further reduces classification cost while maintaining the optimal complexity of the machine and preserving similar accuracy. Experiments on the Gunnar Rätsch benchmark datasets as well as synthetic data reveal that our model demands a much smaller classification cost than prominent cutting-edge classifiers while delivering similar accuracy, despite being simpler to implement. On average, it costs 12.4% of the standard QPSVM, 50.3% of the VLPSVM, 68.1% of the EESVM [1], and 98.1% of the greedy SpSVM-2 [2]. Among these machines it also attains the lowest MAC (Machine Accuracy Cost) and GFR (Generalization Failure Rate) values. A numerical-calculus-based noise-sensitivity analysis demonstrates the model’s potential for handling noisy data. The machine is also expected to work efficiently on very large and complex datasets under real-time constraints.
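To make the described pipeline concrete, the following sketch illustrates the two ingredients the abstract names: training Vapnik’s LP-based SVM (a 1-norm objective on the kernel-expansion coefficients plus slack penalties, solved as a linear program) and then sparsifying the expansion by analyzing the slack variables ξ. It is a minimal sketch under stated assumptions, not the authors’ EBSVM: the RBF kernel, the thresholds a_tol and xi_cut, and the pruning rule itself are hypothetical stand-ins introduced here for illustration.

# Minimal sketch only (not the authors' EBSVM): a Vapnik-style LP-SVM trained
# with scipy.optimize.linprog, followed by a hypothetical xi-based pruning of
# the kernel expansion. Labels y must be +/-1; all thresholds are assumptions.
import numpy as np
from scipy.optimize import linprog

def rbf_kernel(X1, X2, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X1 and X2.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def train_lp_svm(X, y, C=1.0, gamma=1.0):
    # LP-SVM:  minimize   sum_j a_j + C * sum_i xi_i
    #          subject to y_i (sum_j a_j y_j K(x_i, x_j) + b) >= 1 - xi_i,
    #                     a_j >= 0, xi_i >= 0.
    # Variables packed as [a (n), xi (n), b_plus, b_minus], b = b_plus - b_minus.
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    c = np.concatenate([np.ones(n), C * np.ones(n), [0.0, 0.0]])
    Y = y.reshape(-1, 1)
    # linprog expects A_ub @ x <= b_ub, so the margin constraints are negated.
    A_ub = np.hstack([-(Y * K * y[None, :]), -np.eye(n), -Y, Y])
    res = linprog(c, A_ub=A_ub, b_ub=-np.ones(n),
                  bounds=[(0, None)] * (2 * n + 2), method="highs")
    assert res.status == 0, res.message
    a, xi = res.x[:n], res.x[n:2 * n]
    b = res.x[2 * n] - res.x[2 * n + 1]
    return a, xi, b

def prune_by_xi(a, xi, a_tol=1e-6, xi_cut=1.0):
    # Hypothetical xi-analysis step: keep only expansion terms with a
    # non-negligible weight whose own slack is small; points with xi >= 1
    # already violate the margin and are dropped from the expansion.
    return (a > a_tol) & (xi < xi_cut)

def decision_function(Xq, X, y, a, b, keep, gamma=1.0):
    # Sparser classifier: f(x) = sum_{j in keep} a_j y_j K(x, x_j) + b.
    return rbf_kernel(Xq, X[keep], gamma) @ (a[keep] * y[keep]) + b

Because the classification cost of a kernel machine scales with the number of retained expansion terms, a ξ-guided reduction of this kind translates directly into fewer kernel evaluations at test time, the dominant component of classification cost.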
References
[1] A. K. Kundu, R. Karim, and A. A. Ave, “Escalating SVM-efficiency by sequential QP LP and slack variable analysis,” Journal of Computer Science, vol. 19, no. 10, pp. 1253–1262, 2023. [Online]. Available: https://thescipub.com/abstract/jcssp.2023.1253.1262
[2] S. S. Keerthi, O. Chapelle, and D. DeCoste, “Building support vector machines with reduced classifier complexity,” Journal of Machine Learning Research, vol. 7, pp. 1493–1515, 2006. [Online]. Available: https://www.jmlr.org/papers/v7/keerthi06a.html
[3] I. Steinwart, “Sparseness of support vector machines—some asymptotically sharp bounds,” in Advances in Neural Information Processing Systems 16, S. Thrun, L. K. Saul, and B. Schölkopf, Eds., 2003, pp. 1069–1076. [Online]. Available: https://proceedings.neurips.cc/paper/2003/file/4c8c76b39d294759a9000cbda3a6571a-Paper.pdf
[4] T. Downs, K. E. Gates, and A. Masters, “Exact simplification of support vector solutions,” Journal of Machine Learning Research, vol. 2, pp. 293–297, 2001. [Online]. Available: https://dl.acm.org/doi/10.5555/944790.944814
[5] T. Joachims and C.-N. J. Yu, “Sparse kernel SVMs via cutting-plane training,” in Machine Learning and Knowledge Discovery in Databases, ser. Lecture Notes in Computer Science, vol. 5781. Springer, 2009, pp. 47–62. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-642-04180-8_8
[6] A. Cotter, S. Shalev-Shwartz, and N. Srebro, “Learning optimally sparse support vector machines,” in Proceedings of the 30th International Conference on Machine Learning, 2013, pp. 266–274. [Online]. Available: https://proceedings.mlr.press/v28/cotter13.html
[7] S. Romdhani, P. Torr, B. Schölkopf, and A. Blake, “Efficient face detection by a cascaded support–vector machine expansion,” Proceedings of the Royal Society of London. Series A, vol. 460, no. 2051, pp. 3283–3297, 2004. [Online]. Available: https://royalsocietypublishing.org/doi/10.1098/rspa.2004.1333
[8] M. Wu, B. Schölkopf, and G. Bakır, “A direct method for building sparse kernel learning algorithms,” Journal of Machine Learning Research, vol. 7, pp. 603–624, 2006. [Online]. Available: https://jmlr.org/papers/volume7/wu06a/wu06a.pdf
[9] M. Rätsch, S. Romdhani, G. Teschke, and T. Vetter, “Over-complete wavelet approximation of a support vector machine for efficient classification,” in Pattern Recognition: 27th DAGM Symposium, ser. Lecture Notes in Computer Science, vol. 3663. Springer, 2005, pp. 351–360. [Online]. Available: https://link.springer.com/chapter/10.1007/11550518_44
[10] B. Heisele, T. Serre, S. Prentice, and T. Poggio, “Hierarchical classification and feature reduction for fast face detection with support vector machines,” Pattern Recognition, vol. 36, no. 9, pp. 2007–2017, 2003. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0031320303000621
[11] R. Karim, M. Bergtholdt, J. H. Kappes, and C. Schnörr, “Greedy-based design of sparse two-stage SVMs for fast classification,” in Pattern Recognition: 29th DAGM Symposium, ser. Lecture Notes in Computer Science, vol. 4713. Springer, 2007, pp. 395–404. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-540-74936-3_40
[12] K. Z. Arreola, J. Fehr, and H. Burkhardt, “Fast support vector machine classification using linear SVMs,” in Proceedings of the 18th International Conference on Pattern Recognition (ICPR), 2006, pp. 366–369.
[13] H. Sahbi and D. Geman, “A hierarchy of support vector machines for pattern detection,” Journal of Machine Learning Research, vol. 7, pp. –2123, 2006. [Online]. Available: https://jmlr.org/papers/v7/sahbi06a.html
[14] S. Maji, A. C. Berg, and J. Malik, “Efficient classification for additive kernel SVMs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 66–77, 2013. [Online]. Available: https://acberg.com/papers/mbm2012pami.pdf
[15] Ľ. Ladický and P. H. S. Torr, “Locally linear support vector machines,” in Proceedings of the 28th International Conference on Machine Learning (ICML), 2011, pp. 985–992. [Online]. Available: https://icml.cc/Conferences/2011/papers/508_icmlpaper.pdf
[16] Z. Xu, J. Gardner, S. Tyree, and K. Weinberger, “Compressed support vector machines,” arXiv preprint arXiv:1501.06478, 2015. [Online]. Available: https://arxiv.org/abs/1501.06478
[17] J. Li and Y. Zhang, “Learning SURF cascade for fast and accurate object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 3468–3475.
[18] V. C. Raykar, B. Krishnapuram, and S. Yu, “Designing efficient cascaded classifiers: Tradeoff between accuracy and cost,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’10. ACM, 2010, pp. 853–860. [Online]. Available: https://dl.acm.org/doi/10.1145/1835804.
[19] R. Karim, A. K. Kundu, and A. A. Ave, “Sequential mathematical programming with ζ-analysis for HESVM,” in Mathematics and Computer Science: Contemporary Developments, vol. 7. BP International, pp. 144–169. [Online]. Available: https://stm.bookpi.org/MCSCD-V7/article/view/16162
[20] Z. Fu, A. Robles-Kelly, and J. Zhou, “Mixing linear SVMs for nonlinear classification,” IEEE Transactions on Neural Networks, vol. 21, no. 12, pp. –1975, 2010. [Online]. Available: https://doi.org/10.1109/TNN.2010.2080319
[21] R. Karim and A. K. Kundu, “Efficiency and performance analysis of a sparse and powerful second order SVM based on LP and QP,” International Journal of Advanced Computer Science and Applications, vol. 9, no. 2, pp. 311–318, 2018. [Online]. Available: https://thesai.org/Publications/ViewPaper?Volume=9&Issue=2&Code=IJACSA&SerialNo=44
[22] ——, “Computational analysis to reduce classification cost keeping high accuracy of the sparser LPSVM,” International Journal of Machine Learning and Computing, vol. 9, no. 6, pp. 728–733, 2019. [Online]. Available: https://www.ijmlc.org/vol9/865-L0294.pdf
[23] V. Franc and V. Hlaváč, “Greedy algorithm for a training set reduction in the kernel methods,” in Computer Analysis of Images and Patterns (CAIP), ser. Lecture Notes in Computer Science, vol. 2756. Springer, 2003, pp. 426–433. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-540-45179-2_53
[24] Q. Gu and J. Han, “Clustered support vector machines,” in Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS), 2013, pp. 307–315. [Online]. Available: https://proceedings.mlr.press/v31/gu13b.html
[25] R. Karim, M. Hasan, A. K. Kundu, and A. A. Ave, “LP SVM with a novel similarity function outperforms powerful LP-QP-kernel-SVM considering efficient classification,” Informatica, vol. 47, no. 8, pp. 1–12, 2023. [Online]. Available: https://www.informatica.si/index.php/informatica/article/view/4767
[26] S. Zhou, “Sparse SVM for sufficient data reduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 9, pp. –4958, 2022. [Online]. Available: https://ieeexplore.ieee.org/document/9415153
[27] D. Geebelen, J. A. K. Suykens, and J. Vandewalle, “Reducing the number of support vectors of SVM classifiers using the smoothed separable case approximation,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 4, pp. 682–688, 2012.
[28] H. Wang, Y. Shao, S. Zhou, C. Zhang, and N. Xiu, “Support vector machine classifier via L0/1 soft-margin loss,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 10, pp. 7253–7265, 2022.
[29] J. Liu, L.-W. Huang, Y.-H. Shao, W.-J. Chen, and C.-N. Li, “Nonlinear kernel support vector machine with 0-1 soft margin loss,” arXiv preprint arXiv:2203.00399, 2022. [Online]. Available: https://arxiv.org/abs/2203.00399
[30] M. Wu, Z. Yang, and J. Ye, “Nonlinear kernel-free quadratic hyper-surface support vector machine with 0-1 loss function,” arXiv preprint arXiv:2404.10559, 2024. [Online]. Available: https://arxiv.org/abs/2404.10559
[31] M. Akhtar, M. Tanveer, and M. Arshad, “RoBoSS: A robust, bounded, sparse, and smooth loss function for supervised learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
[32] R. Lin, Y. Yao, and Y. Liu, “Kernel support vector machine classifiers with ℓ0-norm hinge loss,” Neurocomputing, vol. 589, p. 127669, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0925231224004405
[33] H. Wang and Y. Shao, “Sparse and robust SVM classifier for large scale classification,” Applied Intelligence, vol. 53, no. 16, pp. 19647–19671, 2023. [Online]. Available: https://link.springer.com/article/10.1007/s10489-023-04511-w
[34] ——, “Fast truncated Huber loss SVM for large scale classification,” Knowledge-Based Systems, vol. 260, p. 110074, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0950705122011674
[35] H. Wang, Z. Zhu, and Y. Shao, “Fast support vector machine with low-computational complexity for large-scale classification,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, pp. 4151–4163, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:
[36] H. Wang, Y. Shao, and S. Zhou, “Fast SVM classifier for large-scale classification problems,” Information Sciences, vol. 625, 2023.
[37] L. K. Hansen, “Stochastic linear learning: Exact test and training error averages,” Neural Networks, vol. 6, no. 3, pp. 393–396, 1993. [Online]. Available: https://www.sciencedirect.com/science/article/pii/089360809390006I
[38] R. Karim, “Theoretical analysis for exact results in stochastic linear learning,” Master’s thesis, Citeseer, 2004.
[39] V. Kecman, “Support vector machines—an introduction,” in Support Vector Machines: Theory and Applications, ser. Studies in Fuzziness and Soft Computing, L. Wang, Ed. Springer, 2005, vol. 177, pp. 1–47. [Online]. Available: https://link.springer.com/chapter/10.1007/10984697_1
[40] S. Geman, E. Bienenstock, and R. Doursat, “Neural networks and the bias/variance dilemma,” Neural Computation, vol. 4, no. 1, pp. 1–58, 1992. [Online]. Available: https://doi.org/10.1162/neco.1992.4.1.1
[41] I. Guyon, V. N. Vapnik, B. E. Boser, L. Bottou, and S. A. Solla, “Structural risk minimization for character recognition,” Advances in Neural Information Processing Systems, pp. –479, 1992. [Online]. Available: https://proceedings.neurips.cc/paper/1991/file/10a7cdd970fe135cf4f7bb55c0e3b59f-Paper.pdf
[42] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to Linear Regression Analysis, 6th ed. John Wiley & Sons, 2021.
[43] C. J. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998. [Online]. Available: https://doi.org/10.1023/A:1009715923555
[44] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford: Clarendon Press, 1995. [Online]. Available: https://dl.acm.org/doi/10.5555/525960
[45] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995. [Online]. Available: https://doi.org/10.1007/978-1-4757-2440-0
[46] V. N. Vapnik, Estimation of Dependences Based on Empirical Data. Springer, 1979.
[47] V. Vapnik and A. Chervonenkis, “A note on one class of perceptrons,” Automation and Remote Control, vol. 25, 1964.
[48] M. Awad and R. Khanna, “Support vector machines for classification,” in Efficient Learning Machines. Berkeley, CA: Apress, 2015, pp. 39–66. [Online]. Available: https://doi.org/10.1007/978-1-4302-5990-9_3
[49] T. M. Cover, “Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition,” IEEE Transactions on Electronic Computers, vol. EC-14, no. 3, pp. 326–334, 1965. [Online]. Available: https://doi.org/10.1109/PGEC.1965.264137
[50] M. Aizerman, E. Braverman, and L. Rozonoer, “Theoretical foundations of the potential function method in pattern recognition learning,” Automation and Remote Control, vol. 25, pp. 821–837, 1964. [Online]. Available: https://cs.uwaterloo.ca/~y328yu/classics/kernel.pdf
[51] K.-L. Du, B. Jiang, J. Lu, J. Hua, and M. N. S. Swamy, “Exploring kernel machines and support vector machines: Principles, techniques, and future directions,” Mathematics, vol. 12, no. 24, p. 3935, 2024. [Online]. Available: https://doi.org/10.3390/math12243935
[52] J. Mercer, “Functions of positive and negative type, and their connection with the theory of integral equations,” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 209, pp. 415–446, 1909. [Online]. Available: https://royalsocietypublishing.org/doi/10.1098/rsta.1909.0016
[53] R. Bellman, Dynamic Programming. Princeton, NJ: Princeton University Press, 1957.
[54] N. Aronszajn, “Theory of reproducing kernels,” Transactions of the American Mathematical Society, vol. 68, pp. 337–404, 1950.
[55] F. Girosi, “An equivalence between sparse approximation and support vector machines,” Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Tech. Rep. AIM-1606, 1997. [Online]. Available: https://dspace.mit.edu/handle/1721.1/7289
[56] N. E. Heckman, “The theory and application of penalized least squares methods or reproducing kernel Hilbert spaces made easy,” Department of Statistics, University of British Columbia, Tech. Rep. 216, 1997. [Online]. Available: https://www.stat.ubc.ca/technical-reports-archive/doc/216.pdf
[57] G. Wahba, Spline Models for Observational Data, ser. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 59. Philadelphia: Society for Industrial and Applied Mathematics, 1990. [Online]. Available: https://epubs.siam.org/doi/book/10.1137/1.9781611970128
[58] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004. [Online]. Available: https://doi.org/10.1017/CBO9780511804441
[59] V. N. Vapnik, Statistical Learning Theory. Wiley, 1998. [Online]. Available: https://www.wiley.com/en-us/Statistical+Learning+Theory-p-
[60] A. Nefedov, J. Ye, C. Kulikowski, I. Muchnik, and K. Morgan, “Experimental study of support vector machines based on linear and quadratic optimization criteria,” DIMACS Technical Report 2009-18, 2009. [Online]. Available: https://doi.org/10.1109/icmla.2009.52
[61] T. Suttorp and C. Igel, “Multi-objective optimization of support vector machines,” pp. 199–220, 2006. [Online]. Available: https://doi.org/10.1007/11399346_9
[62] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge: Cambridge University Press, 2000. [Online]. Available: https://doi.org/10.1017/CBO9780511801389
[63] A. Tharwat, A. E. Hassanien, and B. E. Elnaghi, “A BA-based algorithm for parameter optimization of support vector machine,” Pattern Recognition Letters, vol. 93, pp. 13–, 2016.
[64] S. R. Gunn, “Support vector machines for classification and regression,” School of Electronics and Computer Science, University of Southampton, ISIS Technical Report, 1998. [Online]. Available: https://eprints.soton.ac.uk/256459/
[65] G. Rätsch, Benchmark datasets, re-packaged and available on GitHub. [Online]. Available: https://github.com/tdiethe/gunnar_raetsch_benchmark_datasets
[66] J. Kools, “6 functions for generating artificial datasets,” 2013. [Online]. Available: https://www.mathworks.com/matlabcentral/fileexchange/41459-6-functions-for-generating-artificial-datasets
DOI: https://doi.org/10.31449/inf.v49i28.8742

This work is licensed under a Creative Commons Attribution 3.0 License.