machine learning - SVM - passing a string to the CountVectorizer in Python vectorizes each character?


I have a working SVM, and CountVectorizer works fine when the input to the transform function is a list of strings. However, if I pass a single string to it, the vectorizer iterates through each character in the string and vectorizes each one, even though I set the analyzer parameter to word when constructing the CountVectorizer.

for x in range(0, 3):
    test = raw_input("type message classify: ")
    v = vectorizer.transform(test).toarray()
    print(v)
    print(len(v))
    print(svm.predict(vectorizer.transform(test).toarray()))

I'm able to fix the issue by changing the second line in the above code to:

test=[raw_input("type message classify: ")] 

But it seems strange to have a 1-item list. Isn't there a better way to do this without constructing a list?

It expects a list or array of documents, so when you pass in a single string it assumes each element of the string is a document (i.e., a character).
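This is ordinary Python iteration rather than anything to do with the analyzer setting. A minimal sketch of the effect, where `split_into_documents` is a hypothetical stand-in for any function that consumes an iterable of documents (it is not part of scikit-learn):

```python
def split_into_documents(raw_documents):
    """Hypothetical stand-in: treat each element of the iterable as one document."""
    return [doc for doc in raw_documents]


message = "free prize"

# Iterating over a bare string yields its characters one by one.
as_string = split_into_documents(message)

# Wrapping the string in a list yields exactly one document.
as_list = split_into_documents([message])

print(len(as_string))  # 10
print(len(as_list))    # 1
```

So the 1-item list is not a workaround; it is how a single document is expressed as a collection of documents.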

Try changing svm.predict(vectorizer.transform(test).toarray()) to svm.predict(vectorizer.transform([test]).toarray()).

PS: the toarray() part is not going to scale if you use a real-world corpus. SVMs in sklearn can operate on sparse matrices, so I'd drop that part altogether.
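Putting both suggestions together might look like the sketch below. It is written for Python 3, and everything not in the question is an assumption made for illustration: the toy training messages, the labels, and the choice of `LinearSVC` as the classifier (the question does not say which SVM class is used).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# Toy training data, invented for illustration only.
train_messages = [
    "win a free prize now",
    "free cash offer",
    "are we meeting for lunch",
    "see you at the meeting tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer(analyzer="word")
X_train = vectorizer.fit_transform(train_messages)  # SciPy sparse matrix

svm = LinearSVC()  # assumed classifier; it accepts sparse input
svm.fit(X_train, train_labels)

message = "free prize offer"
X_test = vectorizer.transform([message])  # one-item list gives one row
prediction = svm.predict(X_test)          # no toarray() call needed

print(X_test.shape)
print(prediction)
```

The transformed test matrix has a single row, one per document in the list, and as many columns as the vocabulary learned during fitting.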

