#12. Convolutional Neural Networks: Step by Step
* Convolutional Neural Networks
(1) Zero-Padding
- Zero-padding adds zeros around the border of an image.
- It lets a CONV layer operate without shrinking the height and width of the volume. Without padding, the height and width shrink as you go to deeper layers, so padding is needed.
- 'same' convolution: the height/width is exactly preserved after one layer
- It helps preserve the information at the border of an image.
def zero_pad(X, pad):
    """
    Pad with zeros all images of the dataset X. The padding is applied to the height and width of an image,
    as illustrated in Figure 1.

    Argument:
    X -- python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
    pad -- integer, amount of padding around each image on vertical and horizontal dimensions

    Returns:
    X_pad -- padded image of shape (m, n_H + 2*pad, n_W + 2*pad, n_C)
    """

    X_pad = np.pad(X, ((0,0), (pad,pad), (pad,pad), (0,0)), 'constant', constant_values = 0)

    return X_pad
- X: python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
- a = [1, 2, 3, 4, 5], np.pad(a, (2, 3), 'constant', constant_values = 0): [0, 0, 1, 2, 3, 4, 5, 0, 0, 0]
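- A minimal usage sketch for zero_pad; the batch shape and pad amount below are illustrative, not from the assignment:

import numpy as np

np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)      # a batch of 4 images, 3x3, with 2 channels
x_pad = zero_pad(x, 2)               # pad height and width by 2 on each side

print("x.shape =", x.shape)          # (4, 3, 3, 2)
print("x_pad.shape =", x_pad.shape)  # (4, 7, 7, 2)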
(2) Single step of convolution
def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
    of the previous layer.

    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)

    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
    """

    # Element-wise product between a_slice_prev and W. Do not add the bias yet.
    s = np.multiply(a_slice_prev, W)
    # Sum over all entries of the volume s.
    Z = np.sum(s)
    # Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
    Z = Z + float(b)

    return Z
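- A quick sanity check of conv_single_step; the shapes below (one 4x4x3 slice and one matching filter) are illustrative:

import numpy as np

np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)   # one f x f x n_C_prev slice of the previous activation
W = np.random.randn(4, 4, 3)              # one filter
b = np.random.randn(1, 1, 1)              # its bias

Z = conv_single_step(a_slice_prev, W, b)
print("Z =", Z)                           # a single scalar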
(3) Convolutional Neural Networks - Forward pass
def conv_forward(A_prev, W, b, hparameters):
    """
    Implements the forward propagation for a convolution function

    Arguments:
    A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
    b -- Biases, numpy array of shape (1, 1, 1, n_C)
    hparameters -- python dictionary containing "stride" and "pad"

    Returns:
    Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward() function
    """

    ### START CODE HERE ###
    # Retrieve dimensions from A_prev's shape (≈1 line)
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve dimensions from W's shape (≈1 line)
    (f, f, n_C_prev, n_C) = W.shape

    # Retrieve information from "hparameters" (≈2 lines)
    stride = hparameters['stride']
    pad = hparameters['pad']

    # Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor. (≈2 lines)
    n_H = int((n_H_prev - f + 2 * pad)/stride) + 1
    n_W = int((n_W_prev - f + 2 * pad)/stride) + 1

    # Initialize the output volume Z with zeros. (≈1 line)
    Z = np.zeros((m, n_H, n_W, n_C))

    # Create A_prev_pad by padding A_prev
    A_prev_pad = zero_pad(A_prev, pad)

    for i in range(m):                   # loop over the batch of training examples
        a_prev_pad = A_prev_pad[i]       # Select ith training example's padded activation
        for h in range(n_H):             # loop over vertical axis of the output volume
            for w in range(n_W):         # loop over horizontal axis of the output volume
                for c in range(n_C):     # loop over channels (= #filters) of the output volume

                    # Find the corners of the current "slice" (≈4 lines)
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f

                    # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]

                    # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈1 line)
                    Z[i, h, w, c] = conv_single_step(a_slice_prev, W[...,c], b[...,c])

    ### END CODE HERE ###

    # Making sure your output shape is correct
    assert(Z.shape == (m, n_H, n_W, n_C))

    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)

    return Z, cache
- vert_start: first index of the vertical (row) position of the sliding window / horiz_start: first index of the horizontal (column) position
- output shape of the CONV layer: n_H = floor((n_H_prev - f + 2*pad)/stride) + 1, n_W = floor((n_W_prev - f + 2*pad)/stride) + 1, n_C = number of filters
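- A minimal usage sketch of conv_forward, assuming zero_pad and conv_single_step from above; the shapes and hyperparameters are illustrative:

import numpy as np

np.random.seed(1)
A_prev = np.random.randn(10, 4, 4, 3)    # batch of 10 images, 4x4, 3 channels
W = np.random.randn(2, 2, 3, 8)          # 8 filters of size 2x2x3
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad": 2, "stride": 2}

Z, cache = conv_forward(A_prev, W, b, hparameters)
print("Z.shape =", Z.shape)              # (10, 4, 4, 8): floor((4 - 2 + 2*2)/2) + 1 = 4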
* Pooling layer
- The pooling layer reduces the height and width of the input. It reduces computation and helps make the feature detectors more invariant to position in the input.
- Types:
(1) Max-pooling layer: stores the max value
(2) Average-pooling layer: stores the average value
- A pooling layer has no parameters to train, but it does have hyperparameters (window size f, stride, etc.).
(1) Forward Pooling
- Since no padding is used, the output size is: n_H = floor((n_H_prev - f)/stride) + 1, n_W = floor((n_W_prev - f)/stride) + 1, n_C = n_C_prev.
# GRADED FUNCTION: pool_forward

def pool_forward(A_prev, hparameters, mode = "max"):
    """
    Implements the forward pass of the pooling layer

    Arguments:
    A_prev -- Input data, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    hparameters -- python dictionary containing "f" and "stride"
    mode -- the pooling mode you would like to use, defined as a string ("max" or "average")

    Returns:
    A -- output of the pool layer, a numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache used in the backward pass of the pooling layer, contains the input and hparameters
    """

    # Retrieve dimensions from the input shape
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve hyperparameters from "hparameters"
    f = hparameters["f"]
    stride = hparameters["stride"]

    # Define the dimensions of the output
    n_H = int(1 + (n_H_prev - f) / stride)
    n_W = int(1 + (n_W_prev - f) / stride)
    n_C = n_C_prev

    # Initialize output matrix A
    A = np.zeros((m, n_H, n_W, n_C))

    for i in range(m):                   # loop over the training examples
        for h in range(n_H):             # loop on the vertical axis of the output volume
            for w in range(n_W):         # loop on the horizontal axis of the output volume
                for c in range(n_C):     # loop over the channels of the output volume

                    # Find the corners of the current "slice" (≈4 lines)
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f

                    # Use the corners to define the current slice on the ith training example of A_prev, channel c. (≈1 line)
                    a_prev_slice = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]

                    # Compute the pooling operation on the slice. Use an if statement to differentiate the modes. Use np.max/np.mean.
                    if mode == "max":
                        A[i, h, w, c] = np.max(a_prev_slice)
                    elif mode == "average":
                        A[i, h, w, c] = np.mean(a_prev_slice)

    # Store the input and hparameters in "cache" for pool_backward()
    cache = (A_prev, hparameters)

    # Making sure your output shape is correct
    assert(A.shape == (m, n_H, n_W, n_C))

    return A, cache
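- A minimal usage sketch of pool_forward; the shapes and hyperparameters are illustrative:

import numpy as np

np.random.seed(1)
A_prev = np.random.randn(2, 4, 4, 3)
hparameters = {"stride": 2, "f": 2}

A_max, _ = pool_forward(A_prev, hparameters, mode = "max")
A_avg, _ = pool_forward(A_prev, hparameters, mode = "average")
print("A_max.shape =", A_max.shape)      # (2, 2, 2, 3): floor((4 - 2)/2) + 1 = 2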
* Backpropagation in convolutional neural networks
(1) Convolutional layer backward pass
1. Computing dA
- Formula for dA (W_c: a filter, dZ_hw: a scalar corresponding to the gradient of the cost with respect to the output of the conv layer Z at the hth row and wth column): dA += Σ_h Σ_w W_c * dZ_hw
2. Computing dW
- Formula for dW_c, the derivative of one filter with respect to the loss (a_slice: the slice which was used to generate the activation Z_ij): dW_c += Σ_h Σ_w a_slice * dZ_hw
3. Computing db
- Formula for db, the gradient of the cost with respect to the bias of filter W_c: db = Σ_h Σ_w dZ_hw
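- The three formulas above can be accumulated inside the same four nested loops used in conv_forward. Below is a minimal sketch of such a conv_backward, assuming numpy imported as np, the zero_pad helper, and the cache returned by conv_forward above (it also assumes pad > 0 when stripping the padding back off); it is a sketch, not necessarily the graded implementation:

def conv_backward(dZ, cache):
    """
    dZ -- gradient of the cost with respect to the conv output Z, shape (m, n_H, n_W, n_C)
    cache -- (A_prev, W, b, hparameters) as stored by conv_forward
    Returns dA_prev, dW, db.
    """
    # Retrieve information from the cache produced by conv_forward
    (A_prev, W, b, hparameters) = cache
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    (f, f, n_C_prev, n_C) = W.shape
    stride = hparameters["stride"]
    pad = hparameters["pad"]
    (m, n_H, n_W, n_C) = dZ.shape

    # Initialize the gradients with zeros
    dA_prev = np.zeros((m, n_H_prev, n_W_prev, n_C_prev))
    dW = np.zeros((f, f, n_C_prev, n_C))
    db = np.zeros((1, 1, 1, n_C))

    # Pad A_prev and dA_prev so slices line up with the forward pass
    A_prev_pad = zero_pad(A_prev, pad)
    dA_prev_pad = zero_pad(dA_prev, pad)

    for i in range(m):                   # loop over the training examples
        a_prev_pad = A_prev_pad[i]
        da_prev_pad = dA_prev_pad[i]
        for h in range(n_H):             # vertical axis of the output volume
            for w in range(n_W):         # horizontal axis of the output volume
                for c in range(n_C):     # channels (= filters) of the output volume

                    # Corners of the current slice, same as in the forward pass
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    a_slice = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]

                    # dA += Σ_h Σ_w W_c * dZ_hw
                    da_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :] += W[:, :, :, c] * dZ[i, h, w, c]
                    # dW_c += Σ_h Σ_w a_slice * dZ_hw
                    dW[:, :, :, c] += a_slice * dZ[i, h, w, c]
                    # db = Σ_h Σ_w dZ_hw
                    db[:, :, :, c] += dZ[i, h, w, c]

        # Strip the padding back off to recover the ith example's gradient (assumes pad > 0)
        dA_prev[i, :, :, :] = da_prev_pad[pad:-pad, pad:-pad, :]

    assert(dA_prev.shape == (m, n_H_prev, n_W_prev, n_C_prev))
    return dA_prev, dW, db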