Course Code: bspcomv
Duration: 35 hours
Course Outline:
  • Introduction to Computer Vision
    • Introduction
    • Image representation and analysis
      • Numerical representation of images
      • Image processing techniques like color and geometric transforms
      • Program a convolution kernel for object edge detection
    • Convolutional Neural Networks
      • Layers of a CNN: convolutional, max pooling, and fully connected layers
      • Build a CNN-based image classifier in TensorFlow/PyTorch
      • Layer activation and feature visualization techniques
    • Features and Object Recognition
      • Learn why distinguishing features are important in pattern and object recognition tasks.
      • Write code to extract information about an object’s color and shape.
      • Use features to identify areas on a face and to recognize the shape of a car or pedestrian on a road.
    • Image Segmentation
      • Implement k-means clustering to break an image up into parts.
      • Find the contours and edges of multiple objects in an image.
      • Learn about background subtraction for video.
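As an illustration of the edge-detection kernel topic in this module, the following is a minimal sketch, assuming a grayscale image stored as a 2-D NumPy array. The helper name `filter2d` and the toy image are hypothetical and not part of the course material; the Sobel kernel itself is a standard choice, though the course may use a different one.

```python
import numpy as np

# Sobel kernel that responds to left-to-right intensity changes (vertical edges).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def filter2d(image, kernel):
    """Slide `kernel` over `image` (valid region only) and sum the products.

    Hypothetical helper: this is cross-correlation, the operation a CNN
    convolutional layer applies with its learned kernels.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x6 image (an assumption for illustration): dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = filter2d(image, SOBEL_X)
print(edges)  # non-zero values mark the columns where the brightness changes
```

The explicit loops keep the arithmetic visible; in practice a library routine would normally replace them.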
  • Advanced Computer Vision and Deep Learning
    • Advanced CNN Architecture
      • Learn about advances in CNN architectures.
      • See how region-based CNNs, such as Faster R-CNN, have enabled fast, localized object recognition in images.
      • Work with a YOLO/single-shot object detection system.
    • Recurrent Neural Networks
      • Learn how recurrent neural networks learn from ordered sequences of data.
      • Implement an RNN for sequential text generation.
      • Explore how memory can be incorporated into a deep learning model.
      • Understand where RNNs are used in deep learning applications.
    • Attention Mechanisms
      • Learn how attention allows models to focus on a specific piece of input data.
      • Understand where attention is useful in natural language and computer vision applications.
    • Image Captioning
      • Learn how to combine CNNs and RNNs to build a complex captioning model.
      • Implement an LSTM for caption generation.
      • Train a model to predict captions and understand a visual scene.
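To make the attention topic in this module concrete, here is a minimal sketch of scaled dot-product attention in NumPy. It is one common formulation and not necessarily the one taught in the course; the function names, array shapes, and random inputs are all assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(query, keys, values):
    """Weight each row of `values` by how well its key matches the query.

    query:  (n_queries, d_k)
    keys:   (n_items, d_k)
    values: (n_items, d_v)
    Returns the context vectors (n_queries, d_v) and the weights (n_queries, n_items).
    """
    d_k = keys.shape[-1]
    scores = query @ keys.T / np.sqrt(d_k)   # similarity of the query to every key
    weights = softmax(scores, axis=-1)       # each row sums to 1
    context = weights @ values               # weighted average of the values
    return context, weights

# Hypothetical inputs: 4 image regions with 8-dim keys and 16-dim values.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 8))
values = rng.normal(size=(4, 16))
query = rng.normal(size=(1, 8))

context, weights = scaled_dot_product_attention(query, keys, values)
print(weights)  # how strongly the model "focuses" on each region
```

In a captioning model the query would typically come from the decoder state and the keys and values from CNN features, but that wiring is outside this sketch.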
  • Object Tracking and Localization
    • Object Motion and Tracking
      • Learn how to programmatically track a single point over time.
      • Understand motion models that define object movement over time.
      • Learn how to analyze videos as sequences of individual image frames.
    • Optical Flow and Feature Matching
      • Implement a method for tracking a set of unique features over time.
      • Learn how to match features from one image frame to another.
      • Track a moving car using optical flow.
    • Robot Localization
      • Use Bayesian statistics to locate a robot in space.
      • Learn how sensor measurements can be used to safely navigate an environment.
      • Understand Gaussian uncertainty.
      • Implement a histogram filter for robot localization in Python.
    • Graph SLAM
      • Identify landmarks and build up a map of an environment.
      • Learn how to simultaneously localize an autonomous vehicle and create a map of landmarks.
      • Implement move and sense functions for a robotic vehicle.
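As a rough illustration of the histogram filter and the move and sense functions listed in this module, here is a minimal sketch for a robot in a cyclic one-dimensional world. The world layout, the probability values, and the function signatures are assumptions chosen for illustration, not the course's reference implementation.

```python
def sense(belief, world, measurement, p_hit=0.6, p_miss=0.2):
    """Bayesian measurement update: reweight each cell by how well it
    explains the measurement, then normalize so the belief sums to 1."""
    weighted = [b * (p_hit if cell == measurement else p_miss)
                for b, cell in zip(belief, world)]
    total = sum(weighted)
    return [w / total for w in weighted]

def move(belief, step, p_exact=0.8, p_under=0.1, p_over=0.1):
    """Motion update for a cyclic world: shift the belief by `step` cells,
    spreading probability to account for under- and overshooting by one cell."""
    n = len(belief)
    moved = []
    for i in range(n):
        prob = p_exact * belief[(i - step) % n]      # moved exactly `step`
        prob += p_over * belief[(i - step - 1) % n]  # moved one cell too far
        prob += p_under * belief[(i - step + 1) % n] # moved one cell too few
        moved.append(prob)
    return moved

# Hypothetical world: the robot can only observe the color of its current cell.
world = ["green", "red", "red", "green", "green"]
belief = [1.0 / len(world)] * len(world)  # uniform prior: position unknown

belief_after_sense = sense(belief, world, "red")
belief_after_move = move(belief_after_sense, 1)
print(belief_after_sense)
print(belief_after_move)
```

Sensing concentrates the belief on the cells that match the measurement, while moving spreads it out again; alternating the two steps is what lets the filter localize the robot over time.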