In "Robust Formulations for Handling Uncertainty in Kernel Matrices" [1], we study the problem of uncertainty in the entries of the kernel matrix that arises in the SVM formulation. Using chance-constrained programming and a novel large deviation inequality, we derive a formulation that is robust to such noise. The resulting formulation applies when the noise is Gaussian or has finite support. In general the formulation is non-convex, but in several cases of interest it reduces to a convex program. The problem of uncertainty in the kernel matrix is motivated by the real-world problem of classifying proteins when their structures are provided with some uncertainty. The formulation derived here incorporates such uncertainty in a principled manner, leading to significant improvements over the state of the art.
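For orientation, the chance-constrained idea can be sketched as follows. This is a minimal sketch that assumes the standard soft-margin SVM dual as the starting point; the symbols $\alpha$ (dual variables), $y$ (labels), $Y = \mathrm{diag}(y)$, $C$ (regularization constant), $t$ (epigraph variable), and $\epsilon$ (allowed violation probability) are notation introduced here for illustration and may differ from the notation used in [1].

```latex
\begin{aligned}
\min_{\alpha,\, t} \quad & \tfrac{1}{2}\, t - \mathbf{1}^\top \alpha \\
\text{s.t.} \quad & \Pr\!\left( \alpha^\top Y K Y \alpha \le t \right) \ge 1 - \epsilon, \\
& 0 \le \alpha \le C\,\mathbf{1}, \qquad y^\top \alpha = 0 .
\end{aligned}
```

When the kernel matrix $K$ is known exactly, the probabilistic constraint becomes $\alpha^\top Y K Y \alpha \le t$ and the problem is the usual SVM dual written in epigraph form. When the entries of $K$ are random, the constraint asks that the quadratic term be bounded by $t$ with probability at least $1 - \epsilon$.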
RSVM is a robust classifier that can handle uncertainty in the kernel matrix. The package contains three solvers:
Quadratic Program : RSVM_QP
Second-order Cone Program : RSVM_SOCP
Modified Quasi-Newton Method : RSVM_QN
We implemented the above formulations in Matlab using SeDuMi [2]. The code is available here. See the README file for details.
The synthetic dataset, SynD, used in our synthetic experiments [1] is available here. The scripts used to generate the synthetic data can be downloaded from here.
The protein structure dataset, ProD, used in our experiments [1] is available here.
References:
Sahely Bhadra, Sourangshu Bhattacharya, Chiranjib Bhattacharyya, and Aharon Ben-Tal. Robust Formulations for Handling Uncertainty in Kernel Matrices. ICML 2010.
Jos F. Sturm. Using SeDuMi 1.02, a Matlab Toolbox for Optimization over Symmetric Cones. Available at http://www.optimization-online.org/DB_HTML/2001/10/395.html. Software can be downloaded from http://sedumi.mcmaster.ca/.