Which search range should be used to determine the optimal C and gamma parameters for an SVM?


32

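I am using an SVM for classification and I am trying to determine the optimal parameters for the linear and RBF kernels. For the linear kernel I use cross-validated parameter selection to determine C, and for the RBF kernel I use a grid search to determine C and gamma.

I have 20 (numeric) features and 70 training examples that need to be classified into 7 classes.

Which search range should I use to determine the optimal values for the C and gamma parameters?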

Answers:


31

Check out A Practical Guide to SVM Classification for some pointers, particularly page 5.

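The guide recommends a grid search on $C$ and $\gamma$ using cross-validation: various pairs of $(C, \gamma)$ values are tried and the one with the best cross-validation accuracy is picked. Trying exponentially growing sequences of $C$ and $\gamma$ is a practical method to identify good parameters (for example, $C = 2^{-5}, 2^{-3}, \ldots, 2^{15}$; $\gamma = 2^{-15}, 2^{-13}, \ldots, 2^{3}$).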

Remember to normalize your data first and if you can, gather more data because from the looks of it, your problem might be heavily underdetermined.
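The grid search does not have to be hand-written. As a minimal sketch, assuming scikit-learn is available, `GridSearchCV` can try every $(C, \gamma)$ pair from the exponential sequences above with cross-validation. The data below is a synthetic stand-in with the shape described in the question (70 samples, 20 features, 7 classes), and the choice of 5 stratified folds is an assumption, not something the guide prescribes.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data: 70 samples, 20 features, 7 balanced classes.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(7), 10)
X = rng.normal(size=(70, 20))
X[:, :5] += y[:, None]  # shift the first five features by the class index

# Exponentially growing sequences for C and gamma.
param_grid = {
    "svc__C": 2.0 ** np.arange(-5, 16, 2),      # 2^-5, 2^-3, ..., 2^15
    "svc__gamma": 2.0 ** np.arange(-15, 4, 2),  # 2^-15, 2^-13, ..., 2^3
}

# Scaling lives inside the pipeline so it is refit on each training fold
# and the held-out fold never influences the normalization.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # assumed fold count

search = GridSearchCV(pipeline, param_grid, cv=cv)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, best_score)
```

With only 70 examples the cross-validation scores will be noisy, so several neighbouring grid cells may perform about equally well; a finer grid around the best cell can be searched afterwards.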


Should peer testing be done manually? Isn't there a library to achieve it?
x-rw

11

Check out section 2.3.2 of this paper by Chapelle and Zien. They have a nice heuristic to select a good search range for σ of the RBF kernel and C for the SVM. I quote

To determine good values of the remaining free parameters (e.g., by CV), it is important to search on the right scale. We therefore fix default values for $C$ and $\sigma$ that have the right order of magnitude. In a $c$-class problem we use the $1/c$ quantile of the pairwise distances $D_{ij}^{\rho}$ of all data-points as a default for $\sigma$. The default for $C$ is the inverse of the empirical variance $s^2$ in feature space, which can be calculated by $s^2 = \frac{1}{n}\sum_i K_{ii} - \frac{1}{n^2}\sum_{i,j} K_{ij}$ from a $n \times n$ kernel matrix $K$.

Afterwards, they use multiples (e.g. $2^k$ for $k \in \{-2, \ldots, 2\}$) of the default value as search range in a grid-search using cross-validation. That always worked very well for me.
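The heuristic can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the paper's implementation: it uses plain Euclidean pairwise distances, a simple index-based quantile without interpolation, and the kernel form $\exp(-\lVert x - x' \rVert^2 / (2\sigma^2))$. The function names and the random data (70 points, 20 features, 7 classes) are hypothetical.

```python
import math
import random
from itertools import combinations


def rbf_kernel_matrix(points, sigma):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    gamma = 1.0 / (2.0 * sigma ** 2)
    return [[math.exp(-gamma * math.dist(a, b) ** 2) for b in points]
            for a in points]


def default_sigma_and_c(points, n_classes):
    """Default sigma and C following the quoted heuristic."""
    n = len(points)
    # Assumption: Euclidean distances over all distinct pairs of points.
    dists = sorted(math.dist(a, b) for a, b in combinations(points, 2))
    # Assumption: index-based 1/c quantile, no interpolation.
    sigma = dists[int((len(dists) - 1) / n_classes)]
    K = rbf_kernel_matrix(points, sigma)
    # s^2 = (1/n) * sum_i K_ii - (1/n^2) * sum_{i,j} K_ij
    s2 = sum(K[i][i] for i in range(n)) / n - sum(map(sum, K)) / n ** 2
    return sigma, 1.0 / s2


def multiples(default, exponents=range(-2, 3)):
    """Search range: default * 2^k for k in {-2, ..., 2}."""
    return [default * 2.0 ** k for k in exponents]


# Hypothetical data with the question's shape: 70 points, 20 features.
random.seed(0)
points = [[random.gauss(0.0, 1.0) for _ in range(20)] for _ in range(70)]

sigma, c_default = default_sigma_and_c(points, n_classes=7)
sigma_grid = multiples(sigma)
c_grid = multiples(c_default)
# Libraries that parameterize the kernel by gamma need gamma = 1 / (2 * sigma^2).
gamma_grid = [1.0 / (2.0 * s ** 2) for s in sigma_grid]
print(sigma_grid, c_grid, gamma_grid)
```

The resulting `c_grid` and `gamma_grid` (or `sigma_grid`) are then the candidate values for the cross-validated grid search, giving 25 combinations instead of a much wider sweep.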

Of course, as @ciri said, normalizing the data etc. is always a good idea.


I think there are several equivalent RBF kernel formulations: one with gamma and another with sigma, i.e. gamma = 1/(2 sigma^2). Does the gamma in the above heuristic correspond to gamma, sigma or sigma^2? I have found other descriptions of the same heuristic which are for gamma.
machinery

If you check the linked paper, it is $\frac{1}{2\sigma^2}$.
fabee

@fabee Should peer testing be done manually? Isn't there a library to achieve it?
x-rw
Licensed under cc by-sa 3.0 with attribution required.