Biowulf High Performance Computing at the NIH
Few-Shot-ssl: semi-supervised learning from a few labeled samples

The few-shot semi-supervised learning (few-shot-ssl) package implements learning algorithms designed to generalize better on problems with small labeled training sets.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

ssh -Y biowulf.nih.gov
[user@biowulf ~]$ sinteractive --gres=gpu:k80:1 --mem=12g
[user@cn3316 ~]$ module load Few-Shot-ssl
Download the source code and enter the few-shot-ssl-public folder:
[user@cn3316 ~]$ git clone https://github.com/renmengye/few-shot-ssl-public.git
[user@cn3316 ~]$ cd few-shot-ssl-public
Set the environment variable DATA_ROOT to the root folder containing your data (in this example, /fdb/Few-Shot-ssl), and create a folder to store the checkpoints/models:
[user@cn3316 ~]$ export DATA_ROOT=/fdb/Few-Shot-ssl
[user@cn3316 ~]$ mkdir checkpoints_mini-imagenet
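The two setup steps above can be wrapped in a small helper that stops early if the data folder is missing. This is only a minimal sketch: the function name prepare_run is hypothetical and is not provided by the Few-Shot-ssl module.

```shell
# Hypothetical helper (not part of the Few-Shot-ssl module).
# Usage: prepare_run DATA_DIR RESULTS_DIR
# Checks that DATA_DIR exists, creates RESULTS_DIR if needed,
# and exports DATA_ROOT for later commands in the same shell.
prepare_run() {
    data_dir=$1
    results_dir=$2
    if [ ! -d "$data_dir" ]; then
        echo "error: data folder not found: $data_dir" >&2
        return 1
    fi
    # -p makes the call safe to repeat if the folder already exists
    mkdir -p "$results_dir" || return 1
    export DATA_ROOT="$data_dir"
}
```

For the session shown here, the equivalent call would be: prepare_run /fdb/Few-Shot-ssl checkpoints_mini-imagenet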
Train a Few-Shot basic model on sample data:
[user@cn3316 ~]$ few-shot python run_exp.py --data_root $DATA_ROOT --dataset mini-imagenet --model basic --label_ratio 0.4 --results checkpoints_mini-imagenet
Using GPU implementation
I9008 2018-07-25 13:48:25.272414 run_exp.py:317 Not using the test set, using val
I9008 2018-07-25 13:48:25.289666 run_exp.py:320 Use split `train` for training
I9008 2018-07-25 13:48:25.290517 ...ni_imagenet.py:87 Label ratio 0.4
split train
num unlabel 5
num test 5
num distractor 5
Num label 15360
Num unlabel 23040
I9008 2018-07-25 13:48:28.479375 ...ni_imagenet.py:87 Label ratio 1
split val
num unlabel 5
num test 5
num distractor 5
Num label 9600
Num unlabel 0
I9008 2018-07-25 13:48:30.439604 ...del_factory.py:43 Model basic
I9008 2018-07-25 13:48:30.761852 ...odels/nnlib.py:55 Weight shape [3, 3, 3, 64]
I9008 2018-07-25 13:48:30.849676 ...odels/nnlib.py:71 Truncated normal initialization stddev=0.01
I9008 2018-07-25 13:48:30.852028 ...dels/nnlib.py:102 Weight decay 5e-05
I9008 2018-07-25 13:48:30.967584 ...dels/nnlib.py:117 Model/phi/cnn/layer_0/w:0
I9008 2018-07-25 13:48:31.014525 ...odels/nnlib.py:55 Weight shape [64]
I9008 2018-07-25 13:48:31.054281 ...odels/nnlib.py:86 Constant initialization value=0
...
...
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 199996/200000 [4:56:56<00:00, 11.23it
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 199998/200000 [4:56:56<00:00, 11.23it
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 199998/200000 [4:57:06<00:00, 11.22it
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 199998/200000 [4:57:46<00:00, 11.19it
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 199998/200000 [4:57:47<00:00, 11.19it
mini-imagenet_basic_2018-07-24-11-58-51-650200-588: 100% 200000/200000 [4:57:48<00:00, 11.19it/s, ce=1.087e+00, lr=1.563e-05, trn_acc=68.207, val_acc=42.980]
evaluation: 100% 600/600 [00:23<00:00, 25.42it/s]
evaluation: 100% 600/600 [00:22<00:00, 26.93it/s]
I9008 2018-07-24 16:57:26.396680 run_exp.py:372 Final train acc 68.553% (1.048%)
I9008 2018-07-24 16:57:26.402938 run_exp.py:374 Final test acc 42.907% (0.978%)
This step creates a new folder under checkpoints_mini-imagenet and stores the training results there. The name of the new folder serves as the MODEL_ID, for example:
MODEL_ID = mini-imagenet_basic_2018-07-25-11-23-10-119665-033
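Rather than copying the folder name by hand, the most recently modified folder under the results directory can be looked up from the shell. The sketch below assumes that the newest subfolder belongs to the training run you want to evaluate; the function name latest_model_id is hypothetical.

```shell
# Hypothetical helper: print the name of the most recently modified
# subfolder of RESULTS_DIR (assumed to be the latest training run).
# Usage: latest_model_id RESULTS_DIR
latest_model_id() {
    results_dir=$1
    # ls -t sorts newest first; the trailing slash in the glob together
    # with -d restricts the listing to directories themselves.
    newest=$(ls -td "$results_dir"/*/ 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "error: no model folders in $results_dir" >&2
        return 1
    fi
    # basename drops the leading path and the trailing slash
    basename "$newest"
}
```

A possible use is MODEL_ID=$(latest_model_id checkpoints_mini-imagenet), after which $MODEL_ID can be passed to the --pretrain option. If several runs share the results folder, check the printed name before relying on it.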
Now test the model by using additional options --eval and --pretrain $MODEL_ID:
[user@cn3316 ~]$ few-shot python run_exp.py --data_root $DATA_ROOT --dataset mini-imagenet --label_ratio 0.4 --model basic --results checkpoints_mini-imagenet --eval --pretrain mini-imagenet_basic_2018-07-25-11-23-10-119665-033
Using GPU implementation
I9008 2018-07-25 14:20:03.405697 run_exp.py:317 Not using the test set, using val
I9008 2018-07-25 14:20:03.408262 run_exp.py:320 Use split `train` for training
I9008 2018-07-25 14:20:03.408977 ...ni_imagenet.py:87 Label ratio 0.4
split train
num unlabel 20
num test 20
num distractor 0
Num label 15360
Num unlabel 23040
I9008 2018-07-25 14:20:05.090839 ...ni_imagenet.py:87 Label ratio 1
split val
num unlabel 20
num test 20
num distractor 0
Num label 9600
Num unlabel 0
I9008 2018-07-25 14:20:06.671241 ...del_factory.py:43 Model basic
I9008 2018-07-25 14:20:07.030679 ...odels/nnlib.py:55 Weight shape [3, 3, 3, 64]
I9008 2018-07-25 14:20:07.099157 ...odels/nnlib.py:71 Truncated normal initialization stddev=0.01
I9008 2018-07-25 14:20:07.124904 ...dels/nnlib.py:102 Weight decay 5e-05
I9008 2018-07-25 14:20:07.264019 ...dels/nnlib.py:117 Model/phi/cnn/layer_0/w:0
I9008 2018-07-25 14:20:07.327411 ...odels/nnlib.py:55 Weight shape [64]
I9008 2018-07-25 14:20:07.342690 ...odels/nnlib.py:86 Constant initialization value=0
I9008 2018-07-25 14:20:07.391453 ...dels/nnlib.py:117 Model/phi/cnn/layer_0/b:0
...
...
name: Tesla P100-PCIE-16GB
major: 6 minor: 0 memoryClockRate (GHz) 1.3285
pciBusID 0000:13:00.0
Total memory: 15.90GiB
Free memory: 15.61GiB
2018-07-25 17:08:27.905954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2018-07-25 17:08:27.906006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2018-07-25 17:08:27.906085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:13:00.0)
evaluation: 100% 600/600 [00:21<00:00, 28.02it/s]
evaluation: 100% 600/600 [00:22<00:00, 26.11it/s]
I9008 2018-07-25 17:36:02.193464 run_exp.py:372 Final train acc 74.713% (1.035%)
I9008 2018-07-25 17:36:02.204525 run_exp.py:374 Final test acc 45.907% (1.011%)
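If the program output was saved to a file (an assumption; the session above prints it to the terminal only), the final accuracies can be pulled out of the log with sed. The sketch below relies only on the "Final train acc" / "Final test acc" line format shown above; the function name final_acc is hypothetical.

```shell
# Hypothetical helper: print the final accuracy (in percent) reported by
# run_exp.py for the given split ("train" or "test") in a saved log file.
# Usage: final_acc SPLIT LOGFILE
final_acc() {
    split=$1
    logfile=$2
    # Match lines such as:
    #   ... run_exp.py:374 Final test acc 45.907% (1.011%)
    # and keep only the first number after "acc". If the log contains
    # several such lines, the last one is reported.
    sed -n "s/.*Final $split acc \([0-9.]*\)%.*/\1/p" "$logfile" | tail -n 1
}
```

For example, final_acc test run.log would print the test accuracy from a log saved under the hypothetical name run.log.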
Exit the current session:
[user@cn3316 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$