# 3. Algorithm Idea

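1. Find a plane (a hyperplane) that separates points of different classes well, i.e. one that makes the classifier's training error small. When the data are linearly separable, the training error is required to be 0.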

2. Classify samples of unknown class well, i.e. consider how far the classifier's results on unseen samples can be trusted.
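
The two goals above can be made concrete with a small numeric sketch. It is only an illustration in base R: the toy points, the hand-picked hyperplanes (`w`, `b`) and the helper `evaluate_hyperplane` are assumptions, not part of the original example. It uses the geometric margin as one common proxy for how much a separating plane can be trusted on unseen samples.

```r
# Toy, linearly separable 2-D data (hypothetical values, for illustration only)
X <- matrix(c(1, 1,
              2, 1,
              1, 2,
              4, 4,
              5, 4,
              4, 5), ncol = 2, byrow = TRUE)
y <- c(-1, -1, -1, 1, 1, 1)

# Hypothetical helper: evaluate the plane w . x + b = 0 on (X, y)
evaluate_hyperplane <- function(w, b, X, y) {
  f <- as.vector(X %*% w + b)        # signed decision values
  pred <- ifelse(f >= 0, 1, -1)      # predicted labels
  list(
    train_error = mean(pred != y),          # goal 1: small training error
    margin = min(y * f) / sqrt(sum(w^2))    # goal 2: distance to the nearest point
  )
}

# Two hand-picked planes with the same direction but different offsets
res_a <- evaluate_hyperplane(c(1, 1), -6.0, X, y)
res_b <- evaluate_hyperplane(c(1, 1), -5.5, X, y)

res_a$train_error  # 0
res_b$train_error  # 0
res_a$margin       # about 1.414
res_b$margin       # about 1.768
```

Both planes reach zero training error, but the second one leaves a larger gap to its nearest point, which is the kind of plane a maximum-margin classifier prefers.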

# 5. Implementation (Case Study)

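```r
# Data: the cats data set from the MASS package.
# Goal: predict a cat's sex from its body weight and heart weight.
library(e1071)
data(cats, package = "MASS")
summary(cats)

inputData <- data.frame(cats[, c(2, 3)], Sex = as.factor(cats$Sex))
train <- inputData[1:108, ]    # training set (split by row order, not randomized)
test  <- inputData[109:144, ]  # test set

# Initial model
x <- train[, -3]
y <- train[, 3]
# Gaussian (radial) kernel
model1 <- svm(x, y, kernel = "radial",
              gamma = if (is.vector(x)) 1 else 1 / ncol(x))

# Training error: the result shows 14 misclassified samples
pred1 <- predict(model1, x)
table(pred1, y)

# Test features and labels (labels as integer codes)
z  <- test[, -3]
zy <- as.integer(test[, 3])

# Model tuning: try every combination of SVM type and kernel
type   <- c("C-classification", "nu-classification", "one-classification")
kernel <- c("linear", "polynomial", "radial", "sigmoid")
yy <- as.integer(y)

# Predictions of the 12 combinations on the training set
pred2 <- array(0, dim = c(nrow(x), 3, 4))
n_errors <- matrix(0, 3, 4)  # misclassification counts
for (i in 1:3) {
  for (j in 1:4) {
    fit <- svm(x, y, type = type[i], kernel = kernel[j])
    pred2[, i, j] <- as.integer(predict(fit, x))
    # one-classification returns TRUE/FALSE, so it is compared with 1
    n_errors[i, j] <- if (i > 2) sum(pred2[, i, j] != 1) else sum(pred2[, i, j] != yy)
  }
}

# Error rate of the 12 combinations on the training set
wrong <- matrix(0, 3, 4)
for (i in 1:3) {
  for (j in 1:4) {
    wrong[i, j] <- mean(yy != pred2[, i, j])
  }
}

# Take the three combinations with the smallest training error and evaluate
# them on the test set. Their training error rates are 0.241, 0.259 and 0.278;
# they are nu-classification + radial, C-classification + linear and
# C-classification + radial.
pred3 <- array(0, dim = c(nrow(z), 3, 4))  # sized to the test set, not the training set
for (i in 1:3) {
  for (j in 1:4) {
    fit <- svm(x, y, type = type[i], kernel = kernel[j])
    pred3[, i, j] <- as.integer(predict(fit, z))
    # compare with the test labels zy, not the training labels yy
    n_errors[i, j] <- if (i > 2) sum(pred3[, i, j] != 1) else sum(pred3[, i, j] != zy)
  }
}
mean(zy != pred3[, 2, 3])
mean(zy != pred3[, 1, 1])
mean(zy != pred3[, 1, 3])
# Results: 0.417, 0 and 0.
# The two combinations with a test error rate of 0 are
# C-classification + linear and C-classification + radial.
```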
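
The listing above splits the data by row order. If the rows happen to be grouped by class, such a test set may contain only one class, which can make the test error look better than it is. A minimal sketch of a stratified random split in base R follows; the helper `stratified_split`, its parameters `test_frac` and `seed`, and the toy data frame are all hypothetical and only reuse the column names `Bwt`, `Hwt` and `Sex` for readability.

```r
# Hypothetical helper: random train/test split that keeps class proportions
stratified_split <- function(data, label, test_frac = 0.25, seed = 1) {
  set.seed(seed)  # make the split reproducible
  # Row indices grouped by class
  idx_by_class <- split(seq_len(nrow(data)), data[[label]])
  # Draw the same fraction of rows from every class
  test_idx <- unlist(lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = round(length(idx) * test_frac))]
  }), use.names = FALSE)
  train_idx <- setdiff(seq_len(nrow(data)), test_idx)
  list(train = data[train_idx, , drop = FALSE],
       test  = data[test_idx, , drop = FALSE])
}

# Toy data with made-up values, sorted by class like a grouped data set
toy <- data.frame(
  Bwt = seq(2.0, 3.9, length.out = 20),
  Hwt = seq(7.0, 16.5, length.out = 20),
  Sex = factor(rep(c("F", "M"), times = c(8, 12)))
)

parts <- stratified_split(toy, "Sex")
table(parts$test$Sex)   # both classes appear in the test set
nrow(parts$train)       # remaining rows form the training set
```

Applied to the case study, the same idea would replace the fixed row ranges, at the cost that the error rates quoted in the comments would no longer be reproduced exactly.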
