python - PyBrain: overflow encountered in square, invalid value encountered in multiply
I create a neural network like this:
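    from pybrain.structure import (FeedForwardNetwork, LinearLayer, SigmoidLayer,
                                   BiasUnit, FullConnection)

    n = FeedForwardNetwork()

    # Modules: 43 linear inputs, a bias unit, 100 sigmoid hidden units, 1 linear output
    inLayer = LinearLayer(43)
    bias = BiasUnit()
    hiddenLayer = SigmoidLayer(100)
    outLayer = LinearLayer(1)
    n.addInputModule(inLayer)
    n.addModule(bias)
    n.addModule(hiddenLayer)
    n.addOutputModule(outLayer)

    # Connections: input -> hidden, bias -> hidden, hidden -> output
    in_to_hidden = FullConnection(inLayer, hiddenLayer)
    bias_to_hidden = FullConnection(bias, hiddenLayer)
    hidden_to_out = FullConnection(hiddenLayer, outLayer)
    n.addConnection(in_to_hidden)
    n.addConnection(bias_to_hidden)
    n.addConnection(hidden_to_out)

    n.sortModules()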
I train it the following way (I'm simplifying; it is actually trained over multiple iterations):
    self.trainer = BackpropTrainer(self.neural_net, learningrate=0.8)
    (...)
    ds = SupervisedDataSet(self.net_input_size, 1)
    ds.addSample([...], np.float64(learned_value))
    (...)
    self.trainer.trainOnDataset(ds)
Sometimes I get the following warnings:
    (...)/lib/python3.5/site-packages/pybrain-0.3.1-py3.5.egg/pybrain/supervised/trainers/backprop.py:99: RuntimeWarning: overflow encountered in square
      error += 0.5 * sum(outerr ** 2)
    (...)/lib/python3.5/site-packages/pybrain-0.3.1-py3.5.egg/pybrain/structure/modules/sigmoidlayer.py:14: RuntimeWarning: invalid value encountered in multiply
      inerr[:] = outbuf * (1 - outbuf) * outerr
And when I check the saved network file, I see that all weights are nan:
(...) <fullconnection class="pybrain.structure.connections.full.fullconnection" name="fullconnection-8"> <inmod val="biasunit-5"/> <outmod val="sigmoidlayer-11"/> <parameters>[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]</parameters> </fullconnection> (...)
As suggested, here comes the answer:
A learning rate of 0.8 is ineffective, since it can lead to errors like yours and prevent the network from learning effectively.
With such a high learning rate, the network changes its weights by very large amounts in response to the cost function, so the weights may overflow and end up as nan values.
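To illustrate how this can happen, here is a minimal sketch in plain Python. It is not PyBrain code: it is a toy one-weight linear neuron trained by gradient descent on a single sample, and the input value, target and step count are assumptions chosen for the demonstration. With a learning rate of 0.8 every update overshoots, the weight grows until it overflows to infinity, and the next update computes inf - inf, which is nan.

```python
import math


def train_linear_neuron(learning_rate, steps=500):
    """Fit y = w * x to one sample with plain gradient descent; return the final w."""
    x, target = 10.0, 5.0  # hypothetical unscaled input and target
    w = 0.0
    for _ in range(steps):
        error = w * x - target          # output error of the neuron
        w -= learning_rate * error * x  # gradient of 0.5 * error * error w.r.t. w
    return w


w_high = train_linear_neuron(0.8)    # diverges: overflows to inf, then becomes nan
w_low = train_linear_neuron(0.001)   # converges towards target / x = 0.5

print("learning rate 0.8   ->", w_high)
print("learning rate 0.001 ->", w_low)
print("weight is nan with 0.8:", math.isnan(w_high))
```

In this toy setup the update is only stable for learning rates below 2 / x**2 = 0.02, which is also why large, unscaled inputs make the problem worse.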
Generally (even if your weights do not overflow to nan values), a high learning rate is not a good idea in terms of learning either. Your network solves a specific problem by learning from a large training data set. If your learning rate is very high, like 0.8, the network adapts very strongly to the current epoch's data. Most of the information / learned features from former epochs is lost, because the network adjusts mostly to the current epoch's error.
For most problems, typical learning rates are something like 0.01 or 0.001 or even less, because you want to draw only a small conclusion from one single epoch and rather learn the invariant features of several epochs.
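The forgetting effect can be sketched with another toy example in plain Python (again not PyBrain, and the two target values are made up for illustration). A single weight is trained online on two alternating targets. With a learning rate of 0.8 the weight ends up close to whichever target it saw last; with 0.01 it settles near the mean of both, which is the feature the two samples have in common.

```python
def online_fit(learning_rate, passes=2000):
    """Train a single weight w online on two alternating targets; return the final w."""
    targets = [1.0, 3.0]  # hypothetical data; their mean is 2.0
    w = 0.0
    for _ in range(passes):
        for t in targets:
            w += learning_rate * (t - w)  # gradient step on 0.5 * (t - w) * (t - w)
    return w


w_high = online_fit(0.8)   # pulled strongly towards the last target seen (3.0)
w_low = online_fit(0.01)   # stays close to the mean of both targets

print("learning rate 0.8  ->", round(w_high, 3))
print("learning rate 0.01 ->", round(w_low, 3))
```

The high rate does not diverge here, but the weight keeps swinging between the samples instead of settling on a value that fits both.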