SVM_pAUC

A Structural SVM Based Approach for Optimizing Partial AUC

Harikrishna Narasimhan
Shivani Agarwal

Machine Learning and Learning Theory Group
Department of Computer Science and Automation
Indian Insititute of Science, Bangalore

A paper on this method was presented at ICML 2013: [paper] [supplementary material]

Upload Date: 22/12/12

Abstract

The area under the Receiver Operating Characteristic (ROC) curve (AUC) is a widely used performance measure in machine learning. Increasingly, however, in several applications, ranging from ranking and biometric screening to medical diagnosis, performance is measured not in terms of the full area under the ROC curve, but instead, in terms of the partial area under the ROC curve between two specified false positive rates. In this paper, we develop a structural SVM framework for directly optimizing the partial AUC between any two false positive rates. Our approach makes use of a cutting plane solver along the lines of the structural SVM based approach for optimizing the full AUC developed by Joachims (2005). Unlike the full AUC, where the combinatorial optimization problem needed to find the most violated constraint in the cutting plane solver can be decomposed easily to yield an efficient algorithm, the corresponding optimization problem in the case of partial AUC is harder to decompose. One of our key technical contributions is an efficient algorithm for solving this combinatorial optimization problem that has the same computational complexity as Joachims' algorithm for optimizing the usual AUC. The effectiveness of our approach is demonstrated on a variety of real-world tasks.

Implementation

A MATLAB implementation of the proposed method (termed SVMpAUC) for optimizing partial AUC in an FPR interval [alpha, beta] can be downloaded here. This implementation works with a linear kernel. We will soon provide an implementation that can work with non-linear kernels.

In the code provided, we have used Andrea Vedaldi's MATLAB API for implementing structural SVMs; this in turn is a wrapper over Thorsten Joachims' C API. Hence, before using the code, you will have to download and install svm-struct-matlab-1.0 (MATLAB wrapper for SVM^struct) from Andrea Vedaldi's page.

Usage: The svmpauc function provided contains the implementation of the proposed method and can be called from the MATLAB terminal as follows:

w = svmpauc(X, y, alpha, beta, C, epsilon);

where

Input:

X is a n x d matrix containing 'n' training examples of dimension 'd'
y is a n x 1 matrix containing labels (-1/+1) for the 'n' training examples
[alpha, beta] is a user-specified false positive rate (FPR) interval over which the algorithm optimizes partial AUC.
C is the regularization paramter that controls the trade-off between the model complexity and training error.
epsilon is a tolerance parameter that determines convergence of the algorithm.

Output:

w is the d x 1 weight vector learnt by the algorithm.

Examples

To be uploaded soon.

Disclaimer

This code is free to use for scientific purpose, but should not be distributed without permission from the authors. The authors are not responsible for any implications arising out of the usage of this code.

Contact

For any queries or suggestions, please feel free to contact us at Harikrishna Narasimhan .

References

Joachims, T. A support vector method for multivariate performancemeasures. In Proceedings of the 22nd International Conference on Machine Learning (ICML), pp. 377-384, 2005.
Joachims, T. Training linear svms in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 217-226, 2006.
Narasimhan, H. and Agarwal, S. A Structural SVM Based Approach for Optimizing Partial AUC. In Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
Tsochantaridis, I., Joachims, T., Hofmann, T., and Altun, Y. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6:1453-1484, 2005.