A tutorial on support vector regression

Abstract

In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
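
To make the objects behind this summary concrete: the standard ε-SV regression primal, which the tutorial develops, trades the flatness of the regressor against deviations from the targets larger than ε,

$$
\min_{w,\,b,\,\xi,\,\xi^*} \quad \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{\ell} (\xi_i + \xi_i^*)
\qquad \text{subject to} \qquad
\begin{cases}
y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \\
\langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*, \\
\xi_i,\ \xi_i^* \ge 0,
\end{cases}
$$

and the snippet below is a minimal sketch of fitting such a machine on a one-dimensional toy problem. It assumes scikit-learn's SVR as an off-the-shelf solver for the underlying quadratic program (an SMO-style method of the kind the tutorial surveys); the data, kernel, and parameter values are illustrative choices, not anything prescribed by the paper.

```python
# A minimal epsilon-SV regression sketch on noisy 1-D data.
# Assumes scikit-learn's SVR as the QP solver; illustrative only.
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3.0, 3.0, size=(50, 1)), axis=0)
y = np.sinc(X).ravel() + 0.1 * rng.randn(50)  # sinc target plus Gaussian noise

# RBF kernel; C trades flatness against deviations beyond the epsilon-tube,
# epsilon sets the width of the insensitive zone around the regressor.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=0.5)
svr.fit(X, y)

# Training points on or outside the tube become support vectors;
# typically only a fraction of the data ends up in this set.
print("number of support vectors:", len(svr.support_))
print("first five predictions:", svr.predict(X[:5]))
```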

Cite this article

Smola, A.J., Schölkopf, B. A tutorial on support vector regression. Statistics and Computing 14, 199–222 (2004). https://doi.org/10.1023/B:STCO.0000035301.49549.88
