python - Linear Regression with Lasagne / Theano
I'm trying to make a simple multivariate linear regression with Lasagne. My input:
    x_train = np.array([[37.93, 139.5, 329., 16.64, 16.81, 16.57, 1., 707.,
                         39.72, 149.25, 352.25, 16.61, 16.91, 16.60,
                         40.11, 151.5, 361.75, 16.95, 16.98, 16.79]]).astype(np.float32)
    y_train = np.array([37.92, 138.25, 324.66, 16.28, 16.27, 16.28]).astype(np.float32)
For 2 data points the network should be able to learn y perfectly.
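As a quick sanity check (a sketch with plain NumPy, separate from the Lasagne code): with 20 inputs, 6 outputs and the single instance shown, the system is underdetermined, so an exact linear map exists:

    import numpy as np

    # Minimum-norm solution of the underdetermined system x_train.dot(W) == y_train;
    # with 20 inputs, 6 outputs and one instance, an exact fit exists.
    x_train = np.array([[37.93, 139.5, 329., 16.64, 16.81, 16.57, 1., 707.,
                         39.72, 149.25, 352.25, 16.61, 16.91, 16.60,
                         40.11, 151.5, 361.75, 16.95, 16.98, 16.79]], dtype=np.float32)
    y_train = np.array([37.92, 138.25, 324.66, 16.28, 16.27, 16.28], dtype=np.float32)

    W = np.linalg.lstsq(x_train, y_train[None, :])[0]   # W has shape (20, 6)
    print x_train.dot(W)                                # reproduces y_train up to rounding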
Here is my model:
    i1 = T.matrix()
    y = T.vector()
    lay1 = lasagne.layers.InputLayer(shape=(None, 20), input_var=i1)
    out1 = lasagne.layers.get_output(lay1)
    lay2 = lasagne.layers.DenseLayer(lay1, 6, nonlinearity=lasagne.nonlinearities.linear)
    out2 = lasagne.layers.get_output(lay2)
    params = lasagne.layers.get_all_params(lay2, trainable=True)
    cost = T.sum(lasagne.objectives.squared_error(out2, y))
    grad = T.grad(cost, params)
    updates = lasagne.updates.sgd(grad, params, learning_rate=0.1)
    f_train = theano.function([i1, y], [out1, out2, cost], updates=updates)
After executing

    f_train(x_train, y_train)

multiple times, the cost explodes to infinity. Any idea what is going wrong here?
Thanks!
The network has too much capacity for a single training instance. You would need to apply strong regularization to prevent training from diverging. Alternatively, and more realistically, give it more complex training data (many instances).
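For example, the regularization option could look like the sketch below, using Lasagne's built-in penalty helpers on the question's model (the 0.01 penalty strength is an arbitrary illustration value, not a recommendation):

    from lasagne.regularization import regularize_layer_params, l2

    # L2 weight decay on the DenseLayer's parameters, added to the squared-error cost
    l2_penalty = regularize_layer_params(lay2, l2)
    cost = T.sum(lasagne.objectives.squared_error(out2, y)) + 0.01 * l2_penalty
    grad = T.grad(cost, params)
    updates = lasagne.updates.sgd(grad, params, learning_rate=0.1)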
With a single instance, the task can be solved using just 1 input instead of 20, and with the DenseLayer's bias disabled:
    import numpy as np
    import theano
    import lasagne
    import theano.tensor as T


    def compile():
        x, z = T.matrices('x', 'z')
        lh = lasagne.layers.InputLayer(shape=(None, 1), input_var=x)
        # bias disabled: with one input and one instance the weights alone suffice
        ly = lasagne.layers.DenseLayer(lh, 6, nonlinearity=lasagne.nonlinearities.linear,
                                       b=None)
        y = lasagne.layers.get_output(ly)
        params = lasagne.layers.get_all_params(ly, trainable=True)
        cost = T.sum(lasagne.objectives.squared_error(y, z))
        updates = lasagne.updates.sgd(cost, params, learning_rate=0.0001)
        return theano.function([x, z], [y, cost], updates=updates)


    def main():
        f_train = compile()
        x_train = np.array([[37.93]]).astype(theano.config.floatX)
        y_train = np.array([[37.92, 138.25, 324.66, 16.28, 16.27, 16.28]]) \
            .astype(theano.config.floatX)
        for _ in xrange(100):
            print f_train(x_train, y_train)


    main()
Note that the learning rate needs to be reduced a lot to prevent divergence.
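If hand-tuning the learning rate is inconvenient, another option (not part of the original answer) is an adaptive update rule such as Adam, which Lasagne also provides; inside compile() the SGD line would become something like:

    # Adam adapts per-parameter step sizes, so it is less sensitive to the raw
    # scale of the inputs than plain SGD with a fixed global learning rate.
    updates = lasagne.updates.adam(cost, params, learning_rate=0.001)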