ISSN: 2167-0501
Review Article - (2015) Volume 4, Issue 5
In bioinformatics and chemoinformatics, large databases are routinely mined with support vector machines (SVMs). Kernels play a central role in SVMs, so it is useful to list the SVM kernels currently in use.
Keywords: Bio/Chemoinformatics data; Data mining; Support vector machines (SVMs); Kernels; Hilbert space
With basic mathematical knowledge of Hilbert spaces and reproducing kernel Hilbert spaces (RKHS), it is not difficult to understand the following (online-listed) kernels used in SVMs for data mining [1-5]:
(01). Polynomial (homogeneous) [1]: $k(x_i, x_j) = (x_i \cdot x_j)^d$, where · denotes the dot product, an algebraic operation that takes two equal-length sequences of numbers and returns a single number, and d is an integer.
(02). Polynomial (inhomogeneous) [1]: $k(x_i, x_j) = (x_i \cdot x_j + c)^d$, where c is a constant.
(03). Gaussian radial basis function (RBF) [1]: $k(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, for γ > 0. Sometimes parametrized using $\gamma = 1/(2\sigma^2)$.
(04). Hyperbolic tangent (Sigmoid kernel) [1]: $k(x_i, x_j) = \tanh(\kappa \, x_i \cdot x_j + c)$, for some (not every) κ > 0 and c < 0.
(05). Fisher kernel [1]: $K(X_i, X_j) = U_{X_i}^{T} I^{-1} U_{X_j}$, where I is the Fisher information matrix and $U_X$ is the Fisher score.
(06). Graph kernel [1]: a kernel function that computes an inner product on graphs.
(07). String kernel [1]: a kernel function that operates on strings, i.e. finite sequences of symbols that need not be of the same length.
(08). Tree kernel [1]: the application of the more general concept of positive-definite kernel (a generalization of a positive-definite matrix) to tree structures.
(09). Path kernel [1].
(11). Fourier kernel [1]:
(12). B-spline kernel [4]:
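(13). Cosine kernel [4]: commonly written as $k(x, y) = \dfrac{x \cdot y}{\|x\| \, \|y\|}$.
(14). Multiquadric kernel [4]: commonly written as $k(x, y) = \sqrt{\|x - y\|^2 + c^2}$, where c is a constant.
(15). Wave kernel [4]: commonly written as $k(x, y) = \dfrac{\theta}{\|x - y\|} \sin\!\left(\dfrac{\|x - y\|}{\theta}\right)$, for θ > 0.
(16). Log kernel [4]: commonly written as $k(x, y) = -\log\left(\|x - y\|^d + 1\right)$.
(17). Cauchy kernel [4]: commonly written as $k(x, y) = \dfrac{1}{1 + \|x - y\|^2 / \sigma^2}$.
(18). Tstudent kernel [4]: commonly written as $k(x, y) = \dfrac{1}{1 + \|x - y\|^d}$.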
(19). Thin-plate kernel [4]:
(20). combination of some kernels of the above, e.g. [2,5].
(21). Wavelet-SVM kernels: Haar kernel, Daubechies kernel, Coiflet kernel, Symlet kernel [3].
(22). In summary, [6,7] give the following list of kernels:
• Definition 9.1 Polynomial kernel 286
• Computation 9.6 All-subsets kernel 289
• Computation 9.8 Gaussian kernel 290
• Computation 9.12 ANOVA kernel 293
• Computation 9.18 Alternative recursion for ANOVA kernel 296
• Computation 9.24 General graph kernels 301
• Definition 9.33 Exponential diffusion kernel 307
• Definition 9.34 von Neumann diffusion kernel 307
• Computation 9.35 Evaluating diffusion kernels 308
• Computation 9.46 Evaluating randomised kernels 315
• Definition 9.37 Intersection kernel 309
• Definition 9.38 Union-complement kernel 310
• Remark 9.40 Agreement kernel 310
• Section 9.6 Kernels on real numbers 311
• Remark 9.42 Spline kernels 313
• Definition 9.43 Derived subsets kernel 313
• Definition 10.5 Vector space kernel 325
• Computation 10.8 Latent semantic kernels 332
• Definition 11.7 The p-spectrum kernel 342
• Computation 11.10 The p-spectrum recursion 343
• Remark 11.13 Blended spectrum kernel 344
• Computation 11.17 All-subsequences kernel 347
• Computation 11.24 Fixed length subsequences kernel 352
• Computation 11.33 Naive recursion for gap-weighted subsequences kernel 358
• Computation 11.36 Gap-weighted subsequences kernel 360
• Computation 11.45 Trie-based string kernels 367
• Algorithm 9.14 ANOVA kernel 294
• Algorithm 9.25 Simple graph kernels 302
• Algorithm 11.20 All non-contiguous subsequences kernel 350
• Algorithm 11.25 Fixed length subsequences kernel 352
• Algorithm 11.38 Gap-weighted subsequences kernel 361
• Algorithm 11.40 Character weighting string kernel 364
• Algorithm 11.41 Soft matching string kernel 365
• Algorithm 11.42 Gap number weighting string kernel 366
• Algorithm 11.46 Trie-based p-spectrum kernel 368
• Algorithm 11.51 Trie-based mismatch kernel 371
• Algorithm 11.54 Trie-based restricted gap-weighted kernel 374
• Algorithm 11.62 Co-rooted subtree kernel 380
• Algorithm 11.65 All-subtree kernel 383
• Algorithm 12.14 Pair HMM kernel 407
• Algorithm 12.17 Hidden tree model kernel 411
• Algorithm 12.34 Fixed length Markov model Fisher kernel 427.
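As a concrete illustration of a few entries in the list above, the following is a minimal sketch in plain Python of the polynomial, Gaussian RBF and sigmoid kernels of items (01)-(04), a simple p-spectrum string kernel (counting shared length-p substrings), and a sum of two kernels as one possible "combination" in the sense of item (20). All function and parameter names (`polynomial_kernel`, `rbf_kernel`, `sigmoid_kernel`, `p_spectrum_kernel`, `sum_kernel`, `gamma`, `kappa`) are chosen here for illustration only; they are assumptions and are not taken from [1-7] or from any particular SVM library.

```python
import math
from collections import Counter


def dot(x, y):
    """Dot product of two equal-length sequences of numbers."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, d=2, c=0.0):
    """(x . y + c)^d; c = 0 gives the homogeneous polynomial kernel."""
    return (dot(x, y) + c) ** d


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2), assuming gamma > 0."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, kappa=1.0, c=-1.0):
    """tanh(kappa * x . y + c); not positive definite for every kappa, c."""
    return math.tanh(kappa * dot(x, y) + c)


def p_spectrum_kernel(s, t, p=2):
    """Inner product of the length-p substring counts of strings s and t."""
    count_s = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    count_t = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    return sum(n * count_t[sub] for sub, n in count_s.items())


def sum_kernel(k1, k2):
    """Combine two kernels on the same input type by pointwise addition."""
    return lambda x, y: k1(x, y) + k2(x, y)
```

For example, `polynomial_kernel([1, 2], [3, 4], d=2)` evaluates (1·3 + 2·4)^2 = 121, and `p_spectrum_kernel("ACGT", "CGTA", p=2)` returns 2 because the two strings share the substrings "CG" and "GT" once each. The sum is used for the combined kernel because the sum of two positive semi-definite kernels is again positive semi-definite; other combinations (products, weighted sums) would be equally valid choices.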
This research was supported by a Victorian Life Sciences Computation Initiative (VLSCI) grant numbered VR0063 on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government (Australia).