python - Linear Regression Lasagne / Theano


I'm trying to build a simple multivariate linear regression with Lasagne. This is my input:

x_train = np.array([[37.93, 139.5, 329., 16.64,
                     16.81, 16.57, 1., 707.,
                     39.72, 149.25, 352.25, 16.61,
                     16.91, 16.60, 40.11, 151.5,
                     361.75, 16.95, 16.98, 16.79]]).astype(np.float32)
y_train = np.array([37.92, 138.25, 324.66, 16.28, 16.27, 16.28]).astype(np.float32)

For these 2 data points, the network should be able to learn y perfectly.

Here is the model:

i1 = T.matrix()
y = T.vector()
lay1 = lasagne.layers.InputLayer(shape=(None, 20), input_var=i1)
out1 = lasagne.layers.get_output(lay1)
lay2 = lasagne.layers.DenseLayer(lay1, 6, nonlinearity=lasagne.nonlinearities.linear)
out2 = lasagne.layers.get_output(lay2)
params = lasagne.layers.get_all_params(lay2, trainable=True)
cost = T.sum(lasagne.objectives.squared_error(out2, y))
grad = T.grad(cost, params)
updates = lasagne.updates.sgd(grad, params, learning_rate=0.1)

f_train = theano.function([i1, y], [out1, out2, cost], updates=updates)

After executing this multiple times:

f_train(x_train, y_train)

the cost explodes to infinity. Any idea what is going wrong here?

Thanks!

The network has too much capacity for a single training instance. You would need to apply strong regularization to prevent training from diverging. Alternatively, and more realistically, give it more complex training data (many instances).
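To make the divergence concrete, here is a minimal sketch in plain Python (no Theano or Lasagne required) that applies the same kind of SGD update to the question's single instance and records the cost before each step. It assumes the weights and bias start at zero, which is a simplification (Lasagne's default initializer for W is random), and `sgd_costs` is a name made up for this illustration.

```python
# The question's single training instance: 20 inputs, 6 targets.
X = [37.93, 139.5, 329., 16.64, 16.81, 16.57, 1., 707.,
     39.72, 149.25, 352.25, 16.61, 16.91, 16.60, 40.11, 151.5,
     361.75, 16.95, 16.98, 16.79]
Y = [37.92, 138.25, 324.66, 16.28, 16.27, 16.28]


def sgd_costs(x, y, learning_rate, steps, use_bias=True):
    """Run plain SGD on cost = sum((W x + b - y)^2).

    Returns the list of costs measured before each update.
    Hypothetical helper, not part of Lasagne.
    """
    w = [[0.0] * len(x) for _ in y]  # assumption: zero initial weights
    b = [0.0] * len(y)               # assumption: zero initial bias
    costs = []
    for _ in range(steps):
        # residual_j = (W x + b - y)_j
        residual = [sum(wji * xi for wji, xi in zip(row, x)) + bj - yj
                    for row, bj, yj in zip(w, b, y)]
        costs.append(sum(r * r for r in residual))
        for j, r in enumerate(residual):
            # d cost / d W[j][i] = 2 * r * x[i]
            for i, xi in enumerate(x):
                w[j][i] -= learning_rate * 2.0 * r * xi
            # d cost / d b[j] = 2 * r
            if use_bias:
                b[j] -= learning_rate * 2.0 * r
    return costs


if __name__ == "__main__":
    print(sgd_costs(X, Y, learning_rate=0.1, steps=4))   # cost grows every step
    print(sgd_costs(X, Y, learning_rate=1e-7, steps=4))  # cost shrinks every step
```

With `learning_rate=0.1` each step multiplies the cost by many orders of magnitude, so in float32 the value would run past the largest representable number within a few calls, which is consistent with the infinity reported in the question. The inputs here go up to 707, and the step size has to be scaled down accordingly.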

With a single instance, the task can be solved using just 1 input instead of 20, and with the DenseLayer's bias disabled:

import numpy as np
import theano
import lasagne
import theano.tensor as T


def compile():
    x, z = T.matrices('x', 'z')
    lh = lasagne.layers.InputLayer(shape=(None, 1), input_var=x)
    # b=None disables the bias, leaving only the weight matrix to train.
    ly = lasagne.layers.DenseLayer(lh, 6, nonlinearity=lasagne.nonlinearities.linear,
                                   b=None)
    y = lasagne.layers.get_output(ly)
    params = lasagne.layers.get_all_params(ly, trainable=True)
    cost = T.sum(lasagne.objectives.squared_error(y, z))
    updates = lasagne.updates.sgd(cost, params, learning_rate=0.0001)
    return theano.function([x, z], [y, cost], updates=updates)


def main():
    f_train = compile()

    x_train = np.array([[37.93]]).astype(theano.config.floatX)
    y_train = np.array([[37.92, 138.25, 324.66, 16.28, 16.27, 16.28]])\
        .astype(theano.config.floatX)

    # Each call performs one SGD step and prints the prediction and the cost.
    for _ in range(100):
        print(f_train(x_train, y_train))


main()

Note that the learning rate needs to be reduced a lot to prevent divergence.
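A rough way to judge how small the learning rate has to be: for a single instance x with a summed squared-error cost and a linear layer, one SGD step multiplies the residual (prediction minus target) by 1 - 2 * lr * (|x|^2 + 1), where the + 1 comes from the bias and is dropped when the bias is disabled. Training therefore only converges when lr is below 1 / (|x|^2 + 1). The sketch below evaluates this for the one-input example; the function names are made up for illustration and are not part of Lasagne.

```python
def curvature(x, use_bias=True):
    """|x|^2, plus 1 if a bias term is trained alongside the weights."""
    return sum(xi * xi for xi in x) + (1.0 if use_bias else 0.0)


def max_stable_learning_rate(x, use_bias=True):
    """Upper bound on the SGD step size; converging needs a value strictly below it."""
    return 1.0 / curvature(x, use_bias)


def shrink_factor(x, learning_rate, use_bias=True):
    """Factor applied to the residual by one SGD step on a single instance."""
    return 1.0 - 2.0 * learning_rate * curvature(x, use_bias)


if __name__ == "__main__":
    x = [37.93]  # the single input from the example, bias disabled
    print(max_stable_learning_rate(x, use_bias=False))  # about 0.0007
    print(shrink_factor(x, 0.0001, use_bias=False))     # about 0.71
    print(shrink_factor(x, 0.1, use_bias=False))        # magnitude far above 1
```

Under these assumptions, 0.0001 sits below the bound and shrinks the residual by roughly 29% per step, while 0.1 makes the residual grow on every step. The bound gets tighter as the inputs get larger, since it scales with the squared norm of x.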

