#12. Convolutional Neural Networks: Step by Step
연구실 | 2019. 10. 13. 15:56


    * Convolutional Neural Networks

    (1) Zero-Padding

    - Adds a border of padding around the edges of an image.

    - Lets you use a CONV layer without shrinking the height and width of the volume. Since the height and width shrink as you go to deeper layers, padding is needed to keep them from collapsing in deep networks.

    - 'same' convolution: the height/width is exactly preserved after one layer (with stride 1, this means pad = (f - 1) / 2)

    - Preserves the information at the borders of an image.

    import numpy as np

    def zero_pad(X, pad):
        """
        Pad with zeros all images of the dataset X. The padding is applied to the height and width of an image, 
        as illustrated in Figure 1.
        
        Argument:
        X -- python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
        pad -- integer, amount of padding around each image on vertical and horizontal dimensions
        
        Returns:
        X_pad -- padded image of shape (m, n_H + 2*pad, n_W + 2*pad, n_C)
        """
        
        X_pad = np.pad(X, ((0,0), (pad,pad), (pad,pad), (0,0)), 'constant', constant_values = 0)
        
        return X_pad

    - X: python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images

    - a = [1, 2, 3, 4, 5], np.pad(a, (2, 3), 'constant', constant_values = 0): [0, 0, 1, 2, 3, 4, 5, 0, 0, 0]
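
    A quick sanity check of zero_pad (a minimal sketch; the array sizes are just small test values):

    np.random.seed(1)
    X = np.random.randn(4, 3, 3, 2)    # batch of 4 images, 3x3, 2 channels
    X_pad = zero_pad(X, 2)
    print(X.shape)       # (4, 3, 3, 2)
    print(X_pad.shape)   # (4, 7, 7, 2)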


    (2) Single step of convolution

    def conv_single_step(a_slice_prev, W, b):
        """
        Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation 
        of the previous layer.
        
        Arguments:
        a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
        W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
        b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)
        
        Returns:
        Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
        """
        
        # Element-wise product between a_slice_prev and W. Do not add the bias yet.
        s = np.multiply(a_slice_prev, W)
        # Sum over all entries of the volume s.
        Z = np.sum(s)
        # Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
        Z = Z + float(b)
    
        return Z
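
    Applying one filter to a single matching slice (a minimal sketch; the shapes follow the docstring above):

    np.random.seed(1)
    a_slice_prev = np.random.randn(4, 4, 3)
    W = np.random.randn(4, 4, 3)
    b = np.random.randn(1, 1, 1)
    Z = conv_single_step(a_slice_prev, W, b)
    print(Z)   # a single scalar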


    (3) Convolutional Neural Networks - Forward pass


    def conv_forward(A_prev, W, b, hparameters):
        """
        Implements the forward propagation for a convolution function
        
        Arguments:
        A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
        W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
        b -- Biases, numpy array of shape (1, 1, 1, n_C)
        hparameters -- python dictionary containing "stride" and "pad"
            
        Returns:
        Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
        cache -- cache of values needed for the conv_backward() function
        """
        
        ### START CODE HERE ###
        # Retrieve dimensions from A_prev's shape (≈1 line)  
        (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
        
        # Retrieve dimensions from W's shape (≈1 line)
        (f, f, n_C_prev, n_C) = W.shape
        
        # Retrieve information from "hparameters" (≈2 lines)
        stride = hparameters['stride']
        pad = hparameters['pad']
        
        # Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor. (≈2 lines)
        n_H = int((n_H_prev - f + 2 * pad)/stride) + 1
        n_W = int((n_W_prev - f + 2 * pad)/stride) + 1
        
        # Initialize the output volume Z with zeros. (≈1 line)
        Z = np.zeros((m, n_H, n_W, n_C))
        
        # Create A_prev_pad by padding A_prev
        A_prev_pad = zero_pad(A_prev, pad)
        
        for i in range(m):                               # loop over the batch of training examples
            a_prev_pad = A_prev_pad[i]                               # Select ith training example's padded activation
            for h in range(n_H):                           # loop over vertical axis of the output volume
                for w in range(n_W):                       # loop over horizontal axis of the output volume
                    for c in range(n_C):                   # loop over channels (= #filters) of the output volume
                        
                        # Find the corners of the current "slice" (≈4 lines)
                        vert_start = h * stride
                        vert_end = vert_start + f
                        horiz_start = w * stride
                        horiz_end = horiz_start + f
                        
                        # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                        a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end,:]
                        
                        # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈1 line)
                        Z[i, h, w, c] = conv_single_step(a_slice_prev, W[...,c], b[...,c])
                                            
        ### END CODE HERE ###
        
        # Making sure your output shape is correct
        assert(Z.shape == (m, n_H, n_W, n_C))
        
        # Save information in "cache" for the backprop
        cache = (A_prev, W, b, hparameters)
        
        return Z, cache

    - vert_start: starting index of the window as it moves along the vertical axis / horiz_start: starting index of the window as it moves along the horizontal axis

    - output shape of the CONV layer:

      n_H = floor((n_H_prev - f + 2*pad) / stride) + 1
      n_W = floor((n_W_prev - f + 2*pad) / stride) + 1
      n_C = number of filters used in the convolution
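
    A quick shape check of conv_forward (a minimal sketch; the sizes are arbitrary test values):

    np.random.seed(1)
    A_prev = np.random.randn(10, 4, 4, 3)
    W = np.random.randn(2, 2, 3, 8)
    b = np.random.randn(1, 1, 1, 8)
    hparameters = {"pad": 2, "stride": 2}
    Z, cache = conv_forward(A_prev, W, b, hparameters)
    print(Z.shape)   # (10, 4, 4, 8): floor((4 - 2 + 2*2)/2) + 1 = 4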

    * Pooling layer

    - A pooling layer reduces the height and width of the input. It cuts down computation and helps make feature detectors more invariant to their position in the input.

    - 종류:

        (1) Max-pooling layer: stores the max value

        (2) Average-pooling layer: stores the average value

    - There are no parameters to learn, but there are hyperparameters (window size f, stride, etc.). A small numeric example follows.
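
    For intuition, pooling a single 2x2 window with plain numpy (a minimal sketch):

    window = np.array([[1., 9.],
                       [4., 2.]])
    print(np.max(window))    # 9.0 -- max-pooling keeps the strongest activation
    print(np.mean(window))   # 4.0 -- average-pooling keeps the mean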


    (1) Forward Pooling

    - Since no padding is used, the output size becomes:

      n_H = floor((n_H_prev - f) / stride) + 1
      n_W = floor((n_W_prev - f) / stride) + 1
      n_C = n_C_prev

    # GRADED FUNCTION: pool_forward
    
    def pool_forward(A_prev, hparameters, mode = "max"):
        """
        Implements the forward pass of the pooling layer
        
        Arguments:
        A_prev -- Input data, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
        hparameters -- python dictionary containing "f" and "stride"
        mode -- the pooling mode you would like to use, defined as a string ("max" or "average")
        
        Returns:
        A -- output of the pool layer, a numpy array of shape (m, n_H, n_W, n_C)
        cache -- cache used in the backward pass of the pooling layer, contains the input and hparameters 
        """
        
        # Retrieve dimensions from the input shape
        (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
        
        # Retrieve hyperparameters from "hparameters"
        f = hparameters["f"]
        stride = hparameters["stride"]
        
        # Define the dimensions of the output
        n_H = int(1 + (n_H_prev - f) / stride)
        n_W = int(1 + (n_W_prev - f) / stride)
        n_C = n_C_prev
        
        # Initialize output matrix A
        A = np.zeros((m, n_H, n_W, n_C))              
        
        for i in range(m):                         # loop over the training examples
            for h in range(n_H):                     # loop on the vertical axis of the output volume
                for w in range(n_W):                 # loop on the horizontal axis of the output volume
                    for c in range (n_C):            # loop over the channels of the output volume
                        
                        # Find the corners of the current "slice" (≈4 lines)
                        vert_start = h * stride
                        vert_end = vert_start + f
                        horiz_start = w * stride
                        horiz_end = horiz_start + f
                        
                        # Use the corners to define the current slice on the ith training example of A_prev, channel c. (≈1 line)
                        a_prev_slice = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
    
                        # Compute the pooling operation on the slice. Use an if statement to differentiate the modes. Use np.max/np.mean.
                        if mode == "max":
                            A[i, h, w, c] = np.max(a_prev_slice)
                        elif mode == "average":
                            A[i, h, w, c] = np.mean(a_prev_slice)
        
        
        # Store the input and hparameters in "cache" for pool_backward()
        cache = (A_prev, hparameters)
        
        # Making sure your output shape is correct
        assert(A.shape == (m, n_H, n_W, n_C))
        
        return A, cache
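
    A quick check of pool_forward in both modes (a minimal sketch; the sizes are arbitrary test values):

    np.random.seed(1)
    A_prev = np.random.randn(2, 4, 4, 3)
    hparameters = {"stride": 2, "f": 2}
    A, cache = pool_forward(A_prev, hparameters, mode="max")
    print(A.shape)   # (2, 2, 2, 3): floor((4 - 2)/2) + 1 = 2
    A, cache = pool_forward(A_prev, hparameters, mode="average")
    print(A.shape)   # (2, 2, 2, 3)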



    * Backpropagation in convolutional neural networks

    (1) Convolutional layer backward pass

    1. Computing dA

    - Formula for computing dA (W_c: a filter, dZ_hw: a scalar corresponding to the gradient of the cost with respect to the output of the conv layer Z at the h-th row and w-th column):

      dA += \sum_{h=0}^{n_H} \sum_{w=0}^{n_W} W_c \times dZ_{hw}

    2. Computing dW

    - Formula for computing dW_c (the derivative of the loss with respect to one filter W_c; a_slice: the slice which was used to generate the activation Z_ij):

      dW_c += \sum_{h=0}^{n_H} \sum_{w=0}^{n_W} a_slice \times dZ_{hw}


    3. Computing db

    - Formula for computing db (the derivative of the cost with respect to the bias of a certain filter W_c):

      db = \sum_h \sum_w dZ_{hw}
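
    The notebook puts these three formulas together in conv_backward. A minimal sketch, assuming the cache layout (A_prev, W, b, hparameters) returned by conv_forward above and pad > 0:

    def conv_backward(dZ, cache):
        # Retrieve information from the forward cache
        (A_prev, W, b, hparameters) = cache
        (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
        (f, f, n_C_prev, n_C) = W.shape
        stride = hparameters["stride"]
        pad = hparameters["pad"]
        (m, n_H, n_W, n_C) = dZ.shape

        # Initialize the gradients with the correct shapes
        dA_prev = np.zeros((m, n_H_prev, n_W_prev, n_C_prev))
        dW = np.zeros((f, f, n_C_prev, n_C))
        db = np.zeros((1, 1, 1, n_C))

        # Pad A_prev and dA_prev
        A_prev_pad = zero_pad(A_prev, pad)
        dA_prev_pad = zero_pad(dA_prev, pad)

        for i in range(m):                       # loop over the training examples
            a_prev_pad = A_prev_pad[i]
            da_prev_pad = dA_prev_pad[i]
            for h in range(n_H):
                for w in range(n_W):
                    for c in range(n_C):
                        # Corners of the current slice, as in conv_forward
                        vert_start = h * stride
                        vert_end = vert_start + f
                        horiz_start = w * stride
                        horiz_end = horiz_start + f

                        a_slice = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]

                        # The three update rules from above
                        da_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :] += W[:, :, :, c] * dZ[i, h, w, c]
                        dW[:, :, :, c] += a_slice * dZ[i, h, w, c]
                        db[:, :, :, c] += dZ[i, h, w, c]

            # Strip the padding to recover dA for the ith example (assumes pad > 0)
            dA_prev[i, :, :, :] = da_prev_pad[pad:-pad, pad:-pad, :]

        assert(dA_prev.shape == (m, n_H_prev, n_W_prev, n_C_prev))

        return dA_prev, dW, db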

