
[ plotting linear SVM ]

I tried following the example here, but I am having trouble applying it when I have 16 features. lin_svc is already trained on those 16 features (I deleted the line from the example that re-trains it). The trained model works: I have tried it and also extracted its .coef_ earlier.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

#features is an array of 16
#lin_svc variable is available 
#train is a pandas DF

X = train[features].as_matrix()
y = train.outcome

h = .02 # step size in the mesh

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# title for the plots
titles = ['SVC with linear kernel']

for i, clf in enumerate([lin_svc]):
    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    plt.subplot(2, 2, i + 1)
    plt.subplots_adjust(wspace=0.4, hspace=0.4)

    Z = clf.predict(X)

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
    plt.xlabel('Sepal length')
    plt.ylabel('Sepal width')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())


The error I am getting is:

ValueError                                Traceback (most recent call last)
<ipython-input-8-d52ca252fc3a> in <module>()
     25     # Put the result into a color plot
---> 26     Z = Z.reshape(xx.shape)
     27     plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)

ValueError: total size of new array must be unchanged

Answer 1

I've encountered this same issue myself. Since you want to plot Z as a function of xx and yy, you should pass the mesh points to clf.predict() rather than X. predict(X) returns one label per training sample, so Z cannot be reshaped to xx.shape. Try replacing

Z = clf.predict(X)

with

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

and the plot should show nicely (assuming no other bugs).
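One caveat worth spelling out: a classifier trained on 16 features expects 16 columns in predict(), so passing the two-column mesh to the original lin_svc will raise a feature-count error. One way around that is to fit a separate model on just the two features you are plotting. Here is a minimal sketch of that idea. The random data is only a stand-in for train[features], and the names X2 and clf2 are made up for illustration, so adapt them to your own setup.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
from sklearn import svm

# Hypothetical stand-in for train[features] / train.outcome: 200 samples, 16 features.
rng = np.random.RandomState(0)
X = rng.randn(200, 16)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Assumption: we only want to visualise the first two features,
# so we fit a separate 2-feature model instead of reusing the 16-feature one.
X2 = X[:, :2]
clf2 = svm.SVC(kernel="linear").fit(X2, y)

h = 0.02  # step size in the mesh
x_min, x_max = X2[:, 0].min() - 1, X2[:, 0].max() + 1
y_min, y_max = X2[:, 1].min() - 1, X2[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Predict on the mesh points (one row per grid point, two columns),
# then reshape back to the mesh so contourf can draw it.
grid = np.c_[xx.ravel(), yy.ravel()]
Z = clf2.predict(grid).reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)
plt.scatter(X2[:, 0], X2[:, 1], c=y, cmap=plt.cm.Paired)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.savefig("decision_boundary.png")
```

In a notebook you would drop the matplotlib.use("Agg") line and call plt.show() instead of saving to a file.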

Also you may want to change the title of your question to something like "Plotting 2-D Decision Boundary," since this has nothing to do with SVMs specifically. You'll encounter this kind of issue with any of the sklearn classifiers.
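To illustrate that last point, here is a small sketch with two other classifiers. predict_on_grid is a hypothetical helper (not part of sklearn) and the data is synthetic; the only thing it relies on is that every sklearn classifier exposes predict().

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier


def predict_on_grid(clf, xx, yy):
    """Predict a label for every mesh point and return it in the mesh's shape."""
    grid = np.c_[xx.ravel(), yy.ravel()]  # shape: (n_grid_points, 2)
    return clf.predict(grid).reshape(xx.shape)


# Synthetic 2-feature data, used only for illustration.
rng = np.random.RandomState(0)
X2 = rng.randn(100, 2)
y = (X2[:, 0] * X2[:, 1] > 0).astype(int)

# A coarse mesh: 50 points along x, 40 along y.
xx, yy = np.meshgrid(np.linspace(-3, 3, 50), np.linspace(-3, 3, 40))

results = {}
for clf in (LogisticRegression(), DecisionTreeClassifier(random_state=0)):
    clf.fit(X2, y)
    results[type(clf).__name__] = predict_on_grid(clf, xx, yy)
```

Each entry in results has the same shape as xx, so it can be passed straight to plt.contourf(xx, yy, Z).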