Posted on: May 10, 2017 at 06:13 AM

GTC 2017, GPU Technology Conference

Dealing with the Small Dataset Problem

Artem Semyanov, Prisma AI

http://on-demand.gputechconf.com/gtc/2017/presentation/s7402-artem-semyanov-dealing-with-small-dataset-problem.pdf

  • why the small dataset problem?

    • building solutions to real-world problems
    • academic settings
      • iid = independent, identically distributed data at training and test time
    • reality settings
      • domain shift, dependent samples, non-stationary distributions, noise in data
    • one-shot learning
  • Data Augmentation

    • think about the distribution of user input data
    • how is it different from the current training dataset? (a minimal pipeline sketch follows this list)
      • random crop
      • random distort
      • random occlusion
      • random lighting conditions
    • making images smaller + changing orientation
    • background change (random saturation)
    • gamma correction
    • lighting color
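
A minimal sketch of such a pipeline, assuming PyTorch/torchvision (my choice, not named in the talk) and input images of at least 224×224; the occlusion helper is a hypothetical stand-in for the "random occlusion" bullet, and gamma correction could be added the same way via `torchvision.transforms.functional.adjust_gamma`:

```python
import random
from torchvision import transforms

def random_occlusion(img, max_frac=0.25):
    """Zero out a random rectangle of a CxHxW tensor to mimic occlusion."""
    _, h, w = img.shape
    oh = random.randint(1, int(h * max_frac))
    ow = random.randint(1, int(w * max_frac))
    y = random.randint(0, h - oh)
    x = random.randint(0, w - ow)
    img = img.clone()
    img[:, y:y + oh, x:x + ow] = 0.0
    return img

augment = transforms.Compose([
    transforms.RandomRotation(15),           # orientation changes
    transforms.RandomCrop(224, padding=8),   # random crop (needs >=224px input)
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3,   # lighting conditions
                           contrast=0.3,
                           saturation=0.5),  # background / saturation change
    transforms.ToTensor(),
    transforms.Lambda(random_occlusion),     # random occlusion
])
```
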
  • Building Embeddings:

    • from image classification to image retrieval (retrieval and loss sketches follow this list)
    • L2 distance or cosine distance between embeddings
    • [conv, avgpool, maxpool, concat, dropout, fully connected, softmax]
    • principal component analysis (PCA) to compress the descriptors
    • Triplet Loss or Coupled Clusters Loss
    • region proposal
      • Faster R-CNN
      • fully convolutional semantic segmentation net: U-net
      • using bounding boxes
      • Faster R-CNN is a single, unified network for object detection
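
On the retrieval side, a sketch of indexing and search with PCA-whitened, L2-normalized descriptors, so that cosine similarity reduces to a dot product; scikit-learn and the dimension choice are my assumptions, and the descriptors are assumed to come from a net like the one above:

```python
import numpy as np
from sklearn.decomposition import PCA

def build_index(raw_vectors, dim=128):
    """PCA-whiten conv-net descriptors and L2-normalize them so that
    cosine similarity becomes a plain dot product."""
    pca = PCA(n_components=dim, whiten=True).fit(raw_vectors)
    index = pca.transform(raw_vectors)
    index /= np.linalg.norm(index, axis=1, keepdims=True)
    return pca, index

def search(query_vec, pca, index, k=5):
    """Return indices of the k nearest database images to a query descriptor."""
    q = pca.transform(query_vec[None, :])[0]
    q /= np.linalg.norm(q)
    return np.argsort(-(index @ q))[:k]   # rank by cosine similarity
```
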
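And a minimal triplet-loss sketch in PyTorch; the margin value is illustrative (PyTorch also ships `torch.nn.TripletMarginLoss` with the same semantics):

```python
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull the anchor toward the positive and away from the negative
    until the distance gap exceeds `margin` (inputs: batch x dim)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```
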
  • MAC or R-MAC

    • (regional maximum activation of convolutions; see the pooling sketch below)
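
A simplified sketch of R-MAC pooling on a PyTorch conv feature map; the published method additionally uses overlapping regions and per-region PCA-whitening, both omitted here:

```python
import torch

def rmac(fmap, levels=3):
    """Simplified R-MAC over a conv feature map of shape (C, H, W):
    at each scale l, max-pool every cell of an l x l grid (the MAC
    step), L2-normalize each regional vector, then sum and normalize."""
    C, H, W = fmap.shape
    agg = torch.zeros(C)
    for l in range(1, levels + 1):
        rh, rw = -(-H // l), -(-W // l)  # ceil(H / l), ceil(W / l)
        for gy in range(l):
            for gx in range(l):
                y0, x0 = gy * rh, gx * rw
                if y0 >= H or x0 >= W:
                    continue
                region = fmap[:, y0:y0 + rh, x0:x0 + rw]
                v = region.reshape(C, -1).max(dim=1).values  # per-channel max
                agg = agg + v / (v.norm() + 1e-8)            # normalized sum
    return agg / (agg.norm() + 1e-8)
```
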
  • Applying transfer learning

    • fine-tuning an already trained model on your dataset
    • Adam, AdaGrad, Nesterov optimizers - use momentum of gradients
    • Adam update: $m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t$, $v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2$, $\hat m_t = m_t/(1-\beta_1^t)$, $\hat v_t = v_t/(1-\beta_2^t)$, $\theta_{t+1} = \theta_t - \eta\,\hat m_t/(\sqrt{\hat v_t} + \epsilon)$
    • BCE with momentum

    • start with a significantly lower learning rate (see the fine-tuning sketch after this list)
      • i.e. a normal from-scratch schedule goes from 1e-3 down to 1e-6
      • start fine-tuning at 1e-4 when you first trained on a general-purpose dataset and now move to a domain-specific one
      • start fine-tuning at 1e-5 when switching to another type of augmentation
    • the first batches should not be randomly selected
    • mind batch normalization inside the model architecture (its running statistics come from the original training data)
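
A sketch of this fine-tuning setup in PyTorch; the backbone, class count, and the batch-norm freeze are my illustration of the bullets above, not values from the talk:

```python
import torch
from torchvision import models

num_classes = 10  # hypothetical number of target-task classes

# Reuse general-purpose pretrained weights and replace the task head.
# (older torchvision API; newer versions use the weights= argument)
model = models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# 1e-4: moving from a general-purpose dataset to a domain-specific one;
# drop to 1e-5 when only the augmentation scheme changes.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def freeze_bn(model):
    """Keep batch-norm running statistics fixed during fine-tuning.
    Call this after every model.train(), which resets module modes."""
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.eval()
```
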