
python - LinearSVC and LogisticRegression are equivalent?

Question:

I am comparing the performance of LinearSVC and LogisticRegression from scikit-learn on textual data. I am using LinearSVC with the 'l2' penalty and 'squared_hinge' loss, and LogisticRegression with the 'l1' penalty. However, I find that their performance is nearly identical on my data sets (in terms of classification accuracy, precision/recall, etc., as well as running time). This can't be sheer coincidence, which leads me to suspect that these are in fact identical implementations. Is that the case?
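For context, the objectives the two estimators minimize are not the same even when results look similar. Per the scikit-learn documentation, LinearSVC with 'squared_hinge' minimizes the squared hinge loss, while LogisticRegression minimizes the logistic (log) loss. A minimal sketch of the two losses as a function of the margin (the variable m below is my own notation, not from the question's code):

```python
import math

def squared_hinge(m):
    # LinearSVC(loss='squared_hinge') penalizes max(0, 1 - m)^2,
    # where m = y * f(x) is the signed margin.
    return max(0.0, 1.0 - m) ** 2

def log_loss(m):
    # LogisticRegression penalizes log(1 + exp(-m)).
    return math.log(1.0 + math.exp(-m))

for m in (-1.0, 0.0, 0.5, 2.0):
    print(m, squared_hinge(m), log_loss(m))
```

Note that the squared hinge is exactly zero for margins beyond 1, while the log loss is strictly positive everywhere, so the fitted coefficients generally differ even if the resulting decision boundaries classify your data almost identically.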

If I want to compare LogisticRegression against a support-vector-based implementation that can handle multiclass data, which class in scikit-learn should I use?

Here's my code (imports added for completeness; this uses the pre-0.18 scikit-learn API):

from sklearn.metrics import make_scorer, fbeta_score
from sklearn.cross_validation import StratifiedShuffleSplit
from sklearn.grid_search import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

scorefunc = make_scorer(fbeta_score, beta=1, pos_label=None)
splits = StratifiedShuffleSplit(data.labels, n_iter=5, test_size=0.2)

params1 = {'penalty': ['l1'], 'C': [0.0001, 0.001, 0.01, 0.1, 1.0]}
lr = GridSearchCV(LogisticRegression(), param_grid=params1, cv=splits,
                  n_jobs=5, scoring=scorefunc)
lr.fit(data.combined_mat, data.labels)
print lr.best_params_, lr.best_score_
# >> {'penalty': 'l1', 'C': 1.0} 0.91015049974

params2 = {'C': [0.0001, 0.001, 0.01, 0.1, 1.0], 'penalty': ['l2']}
svm = GridSearchCV(LinearSVC(), param_grid=params2, cv=splits,
                   n_jobs=5, scoring=scorefunc)
svm.fit(data.combined_mat, data.labels)
print svm.best_params_, svm.best_score_
# >> {'penalty': 'l2', 'C': 1.0} 0.968320989681
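For readers on scikit-learn 0.18 or later, the same grid search can be sketched with the sklearn.model_selection API, where the splitter no longer takes the labels at construction and n_iter became n_splits. The data below is a synthetic stand-in for the question's data.combined_mat / data.labels, and solver='liblinear' is one choice that supports the 'l1' penalty:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, fbeta_score

# Synthetic stand-in for the question's feature matrix and labels.
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = (X[:, 0] + 0.5 * rng.randn(200) > 0).astype(int)

scorefunc = make_scorer(fbeta_score, beta=1)
splits = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=0)

params = {'penalty': ['l1'], 'C': [0.01, 0.1, 1.0]}
lr = GridSearchCV(LogisticRegression(solver='liblinear'), param_grid=params,
                  cv=splits, scoring=scorefunc)
lr.fit(X, y)
print(lr.best_params_, lr.best_score_)
```

The splitter object is passed to cv= as before; only its construction changed between API versions.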
