- Date: May 8 2017 ~ May 11 2017
- Venue: San Jose Convention Center
- https://www.gputechconf.com
Dealing with the small dataset problem
Artem Semyanov, Prisma AI
-
Why the small dataset problem?
- building real-world problem solutions
- academic settings
- i.i.d. = independent, identically distributed data at training and test time
- reality settings
- domain shift, dependent samples, non-stationary distributions, noise in data
- one-shot learning
-
Data Augmentation
- think about the distribution of user input data
- how does it differ from the current training dataset?
- random crop
- random distort
- random occlusion
- random lighting conditions
- making smaller + changing orientation
- with background change (random saturation)
- gamma correction
- light color
-
Building Embeddings:
- from image classification to image retrieval
- [[1]]L2 distance or cosine distance
- [conv, avgpool, maxpool, concat, dropout, fully connected, softmax]
- [[2]]From image classification to image retrieval
- principal component analysis(PCA)
- [[3]]Triplet Loss or Coupled Clusters Loss
- region proposal
- Faster R-CNN
- fully convolutional semantic segmentation net: U-net
- using bounding boxes
- Faster R-CNN is a single, unified network for object detection.
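The triplet loss mentioned above can be sketched as follows; this is a minimal NumPy version with an illustrative margin of 0.2, not Prisma's implementation:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss pushing d(anchor, positive) below
    d(anchor, negative) by at least `margin`.
    Inputs: batches of embeddings, shape (batch, dim)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # squared L2 distance
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

a = np.array([[1.0, 0.0]])
p = np.array([[1.0, 0.0]])     # same point: an easy positive
n = np.array([[0.0, 1.0]])     # far away: an easy negative
easy = triplet_loss(a, p, n)   # satisfied triplet -> loss 0
hard = triplet_loss(a, n, p)   # roles swapped: violated triplet -> positive loss
```

Training with this loss shapes the embedding space so that L2 or cosine distance becomes meaningful for retrieval.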
-
MAC or R-MAC
- [[4]](Regional Maximum Activation of Convolutions)
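A rough sketch of MAC and a simplified R-MAC over a convolutional feature map; a full R-MAC uses overlapping multi-scale regions plus PCA-whitening, which this illustrative version omits:

```python
import numpy as np

def mac(feature_map):
    """MAC descriptor: per-channel max over all spatial positions.
    feature_map: (H, W, C) conv activations -> vector of shape (C,)."""
    return feature_map.max(axis=(0, 1))

def r_mac(feature_map, region=4):
    """Simplified R-MAC: MAC over a non-overlapping grid of square
    regions, each region vector L2-normalized, then summed and
    re-normalized into one global descriptor."""
    h, w, c = feature_map.shape
    desc = np.zeros(c)
    for top in range(0, h - region + 1, region):
        for left in range(0, w - region + 1, region):
            v = feature_map[top:top + region, left:left + region].max(axis=(0, 1))
            desc += v / (np.linalg.norm(v) + 1e-12)
    return desc / (np.linalg.norm(desc) + 1e-12)
```

The resulting unit-norm descriptor can be compared with cosine distance for retrieval.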
-
Applying transfer learning
- fine-tuning an already trained model with your dataset
- Adam, AdaGrad, Nesterov optimizers - momentum of gradients
- Adam update: m_t = β1·m_{t-1} + (1-β1)·g_t,  v_t = β2·v_{t-1} + (1-β2)·g_t²,  θ_{t+1} = θ_t - η·m̂_t / (√v̂_t + ε)
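The Adam moment estimates noted above translate into one update step like this; `adam_step` is a hypothetical helper using the standard default hyperparameters:

```python
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient g.
    Returns (new_theta, new_m, new_v); t is the 1-based step count."""
    m = b1 * m + (1 - b1) * g            # first moment: momentum of gradients
    v = b2 * v + (1 - b2) * g * g        # second moment: squared gradients
    m_hat = m / (1 - b1 ** t)            # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimize f(x) = x^2 (gradient 2x) starting from x = 1.0
theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adam_step(theta, 2.0 * theta, m, v, t=1, lr=0.1)
```

The bias correction matters early in training, when m and v are still close to their zero initialization.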
-
BCE with Momentum
-
Start with a significantly lower learning rate
- i.e., a normal learning schedule goes from 1e-3 down to 1e-6
- start fine-tuning at 1e-4 when
- a general-purpose dataset was used first, then a domain-specific one
- start fine-tuning at 1e-5 when
- switching to another type of augmentation
- the first batches are not randomly selected
- batch normalization inside the model architecture
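The 1e-3 to 1e-6 schedule above can be sketched as a simple exponential decay; `lr_schedule` is a hypothetical helper, and the fine-tuning call just lowers the starting rate as the notes suggest:

```python
def lr_schedule(step, total_steps, lr_start=1e-3, lr_end=1e-6):
    """Exponential decay from lr_start to lr_end over total_steps."""
    frac = step / max(1, total_steps - 1)
    return lr_start * (lr_end / lr_start) ** frac

normal = lr_schedule(0, 100)                    # 1e-3 at the start of normal training
finetune = lr_schedule(0, 100, lr_start=1e-4)   # 10x lower start for fine-tuning
```

Real schedules are often stepwise or cosine-shaped, but the key point from the talk is the lower starting value, not the decay shape.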
-
-
Reference [email protected]