machine learning - Trouble understanding Convolutional Neural Network

I read about convolutional neural networks here and started playing with Torch7. I am confused about the convolutional layer of a CNN.

From the tutorial:


The neurons in a layer will only be connected to a small region of the layer before it, instead of all of the neurons in a fully-connected manner.


For example, suppose that the input volume has size [32x32x3] (e.g. an RGB CIFAR-10 image). If the receptive field is of size 5x5, then each neuron in the conv layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights.


If the input layer is [32x32x3], the conv layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and the region they are connected to in the input volume. This may result in a volume such as [32x32x12].

I started playing with what a conv layer might do to an image. I did that in Torch7. Here is my implementation:

    require 'image'
    require 'nn'

    -- load the sample image (a 3x512x512 tensor, see the printed size)
    i = image.lena()

    model = nn.Sequential()
    model:add(nn.SpatialConvolutionMM(3, 10, 5, 5)) -- depth = 3, #output layers = 10, filter = 5x5

    res = model:forward(i)
    itorch.image(res)
    print(#i)
    print(#res)

Output of the CNN transformation:

       3
     512
     512
    [torch.LongStorage of size 3]

      10
     508
     508
    [torch.LongStorage of size 3]

Now let's look at the structure of a CNN.


So, my questions are:

Question 1

Is the convolution done like this: let's take a 32x32x3 image and a 5x5 filter. Will the 5x5 filter pass over the whole 32x32 image and produce the convolved images? Okay, so sliding a 5x5 filter across the whole image gives one image; if there are 10 output layers, we get 10 images (as you can see in the output). How do we get these? (See the image for clarification if required.)


Question 2

What is the number of neurons in the conv layer? Is it the number of output layers? In the code I've written above, model:add(nn.SpatialConvolutionMM(3, 10, 5, 5)), is it 10 (the number of output layers)?

If so, the second point does not make sense to me. According to the tutorial, if the receptive field is of size 5x5, then each neuron in the conv layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights. What is the weight here? I am confused about this. In the model defined in Torch there is no weight, so how does the weight play a role here?

Can someone explain what is going on?

Is the convolution done like this: let's take a 32x32x3 image and a 5x5 filter. Will the 5x5 filter pass over the whole 32x32 image and produce the convolved images?

For a 32x32x3 input image, a 5x5 filter would iterate over every single pixel and, for each pixel, look at its 5x5 neighborhood. That neighborhood contains 5*5*3=75 values. Below is an example image for a 3x3 filter over a single input channel, i.e. one with a neighborhood of 3*3*1 values (source).


For each individual neighbor the filter has one parameter (aka weight), so 75 parameters in total. To calculate one single output value (the value at pixel x, y), it reads those neighbor values, multiplies each one by its respective parameter/weight and adds the products up at the end (see discrete convolution). The optimal weights have to be learned during training.
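To make the multiply-and-add step concrete, here is a minimal sketch in plain Lua. It is not how Torch implements the layer: nested Lua tables stand in for tensors, the names conv_at and filled are hypothetical helpers, and stride 1 with no padding and no bias is assumed.

```lua
-- Build a channels x height x width nested table filled with one value.
local function filled(channels, height, width, value)
  local t = {}
  for c = 1, channels do
    t[c] = {}
    for y = 1, height do
      t[c][y] = {}
      for x = 1, width do
        t[c][y][x] = value
      end
    end
  end
  return t
end

-- One output value of one filter whose top-left corner sits at (x, y).
-- input[c][row][col] is the image, weights[c][i][j] is a k x k filter
-- with one weight per neighbor value.
local function conv_at(input, weights, k, x, y)
  local sum = 0
  for c = 1, #input do        -- every input channel
    for i = 1, k do           -- filter rows
      for j = 1, k do         -- filter columns
        sum = sum + input[c][y + i - 1][x + j - 1] * weights[c][i][j]
      end
    end
  end
  return sum
end

local input = filled(3, 32, 32, 1)      -- 32x32x3 image, every value 1
local weights = filled(3, 5, 5, 0.5)    -- one 5x5x3 filter, every weight 0.5
print(conv_at(input, weights, 5, 1, 1)) -- 75 products of 1 * 0.5, i.e. 37.5
```

The three nested loops visit exactly 5*5*3 = 75 input values, one per weight.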

So one filter iterates over the image and generates a new output, pixel by pixel. If you have multiple filters (i.e. the second parameter of SpatialConvolutionMM is >1), you get multiple outputs ("planes" in Torch).
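The sliding itself can be sketched the same way. The code below is only an illustration in plain Lua under the same assumptions (nested tables instead of tensors, stride 1, no padding, no bias; conv_layer and filled are hypothetical names). It also shows why the output shrinks: a k x k filter fits into a width of W at W - k + 1 positions, which is how 512 becomes 508 in the question's output.

```lua
-- Build a channels x height x width nested table filled with one value.
local function filled(channels, height, width, value)
  local t = {}
  for c = 1, channels do
    t[c] = {}
    for y = 1, height do
      t[c][y] = {}
      for x = 1, width do
        t[c][y][x] = value
      end
    end
  end
  return t
end

-- Slide every filter over the input and return one output plane per filter.
local function conv_layer(input, filters, k)
  local out_h = #input[1] - k + 1     -- valid vertical positions
  local out_w = #input[1][1] - k + 1  -- valid horizontal positions
  local output = {}
  for f = 1, #filters do              -- each filter produces its own plane
    local plane = {}
    for y = 1, out_h do
      plane[y] = {}
      for x = 1, out_w do
        local sum = 0
        for c = 1, #input do
          for i = 1, k do
            for j = 1, k do
              sum = sum + input[c][y + i - 1][x + j - 1] * filters[f][c][i][j]
            end
          end
        end
        plane[y][x] = sum
      end
    end
    output[f] = plane
  end
  return output
end

local input = filled(3, 32, 32, 1)    -- 32x32x3 image, every value 1
local filters = {}
for f = 1, 10 do                      -- 10 filters, each 5x5x3
  filters[f] = filled(3, 5, 5, 1)
end

local planes = conv_layer(input, filters, 5)
print(#planes, #planes[1], #planes[1][1]) -- 10 planes, 28 rows, 28 columns
```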

Okay, so sliding a 5x5 filter across the whole image gives one image; if there are 10 output layers, we get 10 images (as you can see in the output). How do we get these? (See the image for clarification if required.)

Each output plane is generated by its own filter. Each filter has its own parameters (5*5*3 parameters in your example). The process for multiple filters is exactly the same as for one.

What is the number of neurons in the conv layer? Is it the number of output layers? In the code I've written above, model:add(nn.SpatialConvolutionMM(3, 10, 5, 5)), is it 10 (the number of output layers)?

You should call them weights or parameters; "neurons" doesn't really fit for convolutional layers. The number of parameters is, as described, 5*5*3=75 per filter in your example. You have 10 filters ("output planes"), so you have 750 parameters in total (each filter also has one bias, which is not counted here). If you add a second layer to your network with model:add(nn.SpatialConvolutionMM(10, 10, 5, 5)), you have an additional 5*5*10=250 parameters per filter and 250*10=2500 in total. Notice how quickly this number can grow (512 filters/output planes in one layer operating on 256 input planes is nothing uncommon).
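The arithmetic above can be checked with a few lines of plain Lua. This is only a sketch: conv_weights is a hypothetical helper, not part of Torch's nn package, its argument order simply mirrors the SpatialConvolutionMM calls above, biases are left out, and the 5x5 filter size in the last call is an assumption because no size is given for that case.

```lua
-- Weights of one convolutional layer, biases not included.
-- Returns the count per filter and the total over all filters.
local function conv_weights(n_input_planes, n_output_planes, kw, kh)
  local per_filter = kw * kh * n_input_planes
  return per_filter, per_filter * n_output_planes
end

print(conv_weights(3, 10, 5, 5))    -- first layer: 75 per filter, 750 total
print(conv_weights(10, 10, 5, 5))   -- second layer: 250 per filter, 2500 total
print(conv_weights(256, 512, 5, 5)) -- 6400 per filter, 3276800 total
```

The per-filter count depends only on the filter size and the number of input planes, while the number of output planes multiplies it.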

For further reading you should look at . Scroll down to the chapter "Introducing convolutional networks". Under "Local receptive fields" there are visualizations that help you understand what a filter does (one is shown above).

