Supervision from Context


Ideas & Goals

  • Learn word representations from context in sentence
  • Allow accurate syntactic & semantic arithmetic [1]
  • Create comprehensive test set for regularities

Semantic & Syntactic Relations in Test

Proposed Architectures

  • CBOW = Continuous Bag-of-Words
  • CBOW shares projection matrices
  • Words are one-hot vocabulary vectors


Ideas & Goals

  • Leverage intra-image correlations for learning
  • Improve "unsupervised" pre-training

Context Classification

  • Predict relative position of second patch
  • Random variation and offset to not learn trivial relations


Nearest-Neighbour Examples

  • Representation from fc6 layer
  • NN via normalized cross-correlation

Results (VOC-2007)


  • Not as good as supervised pre-training (ImageNet)
  • Better than existing unsupervised pre-training
  • Representation still not ideal (chromatic aberration)

Transformation Invariance


Ideas & Goals

  • Use images from different angles to predict camera motion
  • Find representation useful for higher level tasks

SF and KITTI dataset examples

MNIST Results: Random Transformations

  • Random translation & rotation $\rightarrow$ classify transformation
  • Simple siamese network w/ varying base net

ILSVRC-12 Top-5 Accuracy

  • Base architecture similar to AlexNet
  • Also good results for keypoint matching & odometry
  • (1, 5, 10, 20) examples per class finetuning

Environment Exploration


Further Reading