The deepmedic project aims to offer easy access to Deep Learning for segmentation of structures of interest in biomedical 3D scans. It is a system that allows the easy creation of a 3D Convolutional Neural Network, which can be trained to detect and segment structures if corresponding ground truth labels are provided for training.
[user@biowulf]$ sinteractive --gres=lscratch:10 --mem=10g -c8
[user@cn0861 ~]$ module load deepmedic
[+] Loading singularity 3.10.5 on cn4185
[+] Loading deepmedic 0.8.4

Download and unpack the deepmedic v0.8.4 release archive from GitHub, then run deepmedic in CPU mode on the provided example data:
[user@biowulf]$ wget https://github.com/deepmedic/deepmedic/archive/refs/tags/v0.8.4.tar.gz
[user@biowulf]$ tar -zxf v0.8.4.tar.gz && rm -f v0.8.4.tar.gz && cd deepmedic-0.8.4
[user@biowulf]$ deepMedicRun -model ./examples/configFiles/tinyCnn/model/modelConfig.cfg \
                             -train examples/configFiles/tinyCnn/train/trainConfigWithValidation.cfg
Given configuration file: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/configFiles/tinyCnn/model/modelConfig.cfg
Given configuration file: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/configFiles/tinyCnn/train/trainConfigWithValidation.cfg
Creating necessary folders for training session...
=============================== logger created =======================================
======================== Starting new session ============================
Command line arguments given:
Namespace(device='cpu', model_cfg='./examples/configFiles/tinyCnn/model/modelConfig.cfg', reset_trainer=False, saved_model=None, test_cfg=None, train_cfg='examples/configFiles/tinyCnn/train/trainConfigWithValidation.cfg')
...
Available devices to Tensorflow: [name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 17546255720476806479 , name: "/device:XLA_CPU:0" device_type: "XLA_CPU" memory_limit: 17179869184 locality { } incarnation: 16465827944861712538 physical_device_desc: "device: XLA_CPU device" ] CONFIG: The configuration file for the [model] given is: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/configFiles/tinyCnn/model/modelConfig.cfg ============================================================= ========== PARAMETERS FOR MAKING THE ARCHITECTURE =========== ============================================================= CNN model's name = tinyCnn ~~~~~~~~~~~~~~~~~Model parameters~~~~~~~~~~~~~~~~ Number of Classes (including background) = 5 ~Normal Pathway~~ Number of Input Channels = 2 Number of Layers = 3 Number of Feature Maps per layer = [4, 5, 6] Kernel Dimensions per layer = [[3, 3, 3], [3, 3, 3], [3, 3, 3]] Padding mode of convs per layer = ['VALID', 'VALID', 'VALID'] Residual connections added at the output of layers (indices from 0) = [] Layers that will be made of Lower Rank (indices from 0) = [] Lower Rank layers will be made of rank = [] ~Subsampled Pathway~~ Use subsampled Pathway = True Number of subsampled pathways that will be built = 1 Number of Layers (per sub-pathway) = [3] Number of Feature Maps per layer (per sub-pathway) = [[4, 5, 6]] Kernel Dimensions per layer = [[3, 3, 3], [3, 3, 3], [3, 3, 3]] Padding mode of convs per layer = ['VALID', 'VALID', 'VALID'] Subsampling Factor per dimension (per sub-pathway) = [[3, 3, 3]] Residual connections added at the output of layers (indices from 0) = [] Layers that will be made of Lower Rank (indices from 0) = [] Lower Rank layers will be made of rank = [] ~Fully Connected Pathway~~ Number of additional FC layers (Excluding the Classif. 
Layer) = 0 Number of Feature Maps in the additional FC layers = [] Padding mode of convs per layer = ['VALID'] Residual connections added at the output of layers (indices from 0) = [] Layers that will be made of Lower Rank (indices from 0) = [] Dimensions of Kernels in final FC path before classification = [[1, 1, 1]] ~Size Of Image Segments~~ Size of Segments for Training = [25, 25, 25] Size of Segments for Validation = [7, 7, 7] Size of Segments for Testing = [45, 45, 45] ~Dropout Rates~~ Drop.R. for each layer in Normal Pathway = [] Drop.R. for each layer in Subsampled Pathway = [] Drop.R. for each layer in FC Pathway (additional FC layers + Classific.Layer at end) = [0.5] ~Weight Initialization~~ Initialization method and params for the conv kernel weights = ['fanIn', 2] ~Activation Function~~ Activation function to use = prelu ~Batch Normalization~~ Apply BN straight on pathways' inputs (eg straight on segments) = [False, False, True] Batch Normalization uses a rolling average for inference, over this many batches = 60 ========== Done with printing session's parameters ========== ============================================================= CONFIG: The configuration file for the [session] was loaded from: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/configFiles/tinyCnn/train/trainConfigWithValidation.cfg ============= NEW TRAINING SESSION ============== ============================================================= ========= PARAMETERS FOR THIS TRAINING SESSION ============== ============================================================= Session's name = trainSessionWithValidTiny Model will be loaded from save = None ~Output~~ Main output folder = /vf/users/user/deepmedic/deepmedic-0.8.4/examples/output Log performance metrics for tensorboard = True Path and filename to save trained models = /vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/saved_models//trainSessionWithValidTiny//tinyCnn.trainSessionWithValidTiny ~~~~~~~~~~~~~~~~~Generic 
Information~~~~~~~~~~~~~~~~ Number of Cases for Training = 2 Number of Cases for Validation = 2 ~~~~~~~~~~~~~~~~~Training parameters~~~~~~~~~~~~~~~~ Dataframe (csv) filename = None Filepaths to Channels of the Training Cases = [['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/Flair_subtrMeanDivStd.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/T1c_subtrMeanDivStd.nii.gz'], ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0006_1/Flair_subtrMeanDivStd.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0006_1/T1c_subtrMeanDivStd.nii.gz']] Filepaths to Ground-Truth labels of the Training Cases = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/OTMultiClass.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0006_1/OTMultiClass.nii.gz'] Filepaths to ROI Masks of the Training Cases = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/brainmask.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0006_1/brainmask.nii.gz'] ~ Sampling (train) ~~ Type of Sampling = Fore/Background (0) Sampling Categories = ['Foreground', 'Background'] Percent of Samples to extract per Sampling Category = [0.5 0.5] Paths to weight-Maps for sampling of each category = None ~Training Cycle~~ Number of Epochs = 2 Number of Subepochs per epoch = 2 Number of cases to load per Subepoch (for extracting the samples for this subepoch) = 50 Number of Segments loaded per subepoch for Training = 1000. 
NOTE: This number of segments divided by the batch-size defines the number of optimization-iterations that will be performed every subepoch! Batch size (train) = 10 Number of parallel processes for sampling = 0 ~Learning Rate Schedule~~ Type of schedule = poly [Predef] Predefined schedule of epochs when the LR will be lowered = None [Predef] When decreasing Learning Rate, divide LR by = 2.0 [Poly] Initial epochs to wait before lowering LR = 0.6666666666666666 [Poly] Final epoch for the schedule = 2 [Auto] Initial epochs to wait before lowering LR = 5 [Auto] When decreasing Learning Rate, divide LR by = 2.0 [Auto] Minimum increase in validation accuracy (0. to 1.) that resets the waiting counter = 0.0 [Expon] (Deprecated) parameters = {'epochs_wait_before_decr': 0.6666666666666666, 'final_ep_for_sch': 2, 'lr_to_reach_at_last_ep': 0.00390625, 'mom_to_reach_at_last_ep': 0.9} ~Data Augmentation During Training~~ Image level augmentation: Parameters for image-level augmentation: {'affine': ...} affine: OrderedDict([('prob', 0.7), ('max_rot_xyz', (45.0, 45.0, 45.0)), ('max_scaling', 0.1), ('seed', None), ('interp_order_imgs', 1), ('interp_order_lbls', 0), ('interp_order_roi', 0), ('interp_order_wmaps', 1), ('boundary_mode', 'nearest'), ('cval', 0.0)]) Patch level augmentation: Mu and std for shift and scale of histograms = {'shift': {'mu': 0.0, 'std': 0.05}, 'scale': {'mu': 1.0, 'std': 0.01}} Probabilities of reflecting each axis = (0.5, 0.0, 0.0) Probabilities of rotating planes 0/90/180/270 degrees = {'xy': {'0': 0.8, '90': 0.1, '180': 0.0, '270': 0.1}, 'yz': {'0': 0.0, '90': 0.0, '180': 0.0, '270': 0.0}, 'xz': {'0': 0.0, '90': 0.0, '180': 0.0, '270': 0.0}} ~~~~~~~~~~~~~~~~~Validation parameters~~~~~~~~~~~~~~~~ Perform Validation on Samples throughout training? = True Perform Full Inference on validation cases every few epochs? 
= True Dataframe (csv) filename = None Filepaths to Channels of Validation Cases = [['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/Flair_subtrMeanDivStd.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/T1c_subtrMeanDivStd.nii.gz'], ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/Flair_subtrMeanDivStd.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/T1c_subtrMeanDivStd.nii.gz']] Filepaths to Ground-Truth labels of the Validation Cases = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/OTMultiClass.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/OTMultiClass.nii.gz'] Filepaths to ROI masks for Validation Cases = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/brainmask.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/brainmask.nii.gz'] ~~~~~~Validation on Samples throughout Training~~~~~~~ Number of Segments loaded per subepoch for Validation = 5000 Batch size (val on samples) = 50 ~ Sampling (val) ~~ Type of Sampling = Uniform (1) Sampling Categories = ['Uniform'] Percent of Samples to extract per Sampling Category = [1.0] Paths to weight-maps for sampling of each category = None ~~~~Validation with Full Inference on Validation Cases~~~~~ Perform Full-Inference on Val. cases every that many epochs = 1 Batch size (val on whole volumes) = 10 ~Predictions (segmentations and prob maps on val. 
cases)~~ Save Segmentations = True Save Probability Maps for each class = [True, True, True, True, True] Filepaths to save results per case = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/predictions/trainSessionWithValidTiny/predictions//pred_brats_2013_pat0003_1.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/predictions/trainSessionWithValidTiny/predictions//pred_brats_2013_pat0004_1.nii.gz'] Suffixes with which to save segmentations and probability maps = {'segm': 'Segm', 'prob': 'ProbMapClass'} ~Feature Maps~~ Save Feature Maps = False Min/Max Indices of FMs to visualise per pathway-type and per layer = None Filepaths to save FMs per case = ['/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/predictions/trainSessionWithValidTiny/features//pred_brats_2013_pat0003_1.nii.gz', '/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/predictions/trainSessionWithValidTiny/features//pred_brats_2013_pat0004_1.nii.gz'] ~Optimization~~ Initial Learning rate = 0.001 Optimizer to use: SGD(0), Adam(1), RmsProp(2) = 2 Parameters for Adam: b1= placeholder, b2=placeholder, e= placeholder Parameters for RmsProp: rho= 0.9, e= 0.0001 Momentum Type: Classic (0) or Nesterov (1) = 1 Momentum Non-Normalized (0) or Normalized (1) = 1 Momentum Value = 0.6 ~Costs~~ Loss functions and their weights = {'xentr': 1.0, 'iou': None, 'dsc': None} Reweight samples in cost on a per-class basis = {'type': None, 'prms': None, 'schedule': [0, 2]} L1 Regularization term = 1e-06 L2 Regularization term = 0.0001 ~Freeze Weights of Certain Layers~~ Indices of layers from each type of pathway that will be kept fixed (first layer is 0): Normal pathway's layers to freeze = [] Subsampled pathway's layers to freeze = [] FC pathway's layers to freeze = [] ~~~~~~~~~~~~~~~~~ PRE-PROCESSING ~~~~~~~~~~~~~~~~ ~Data Compabitibility Checks~~ Check whether input data has correct format (can slow down process) = True ~Padding~~ Pad Input Images = True ~Intensity Normalization~~ 
Verbosity level = 0 Z-Score parameters = {'apply_to_all_channels': False, 'apply_per_channel': None, 'cutoff_percents': None, 'cutoff_times_std': None, 'cutoff_below_mean': False} ========== Done with printing session's parameters ========== ============================================================= ======================================================= =========== Making the CNN graph... =============== ...Building the CNN model... [Pathway_NORMAL] is being built... Block [0], FMs-In: 2, FMs-Out: 4, Conv Filter dimensions: [3, 3, 3] WARNING:tensorflow:From /opt/conda/envs/deepmedic/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py:1659: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers. Block [1], FMs-In: 4, FMs-Out: 5, Conv Filter dimensions: [3, 3, 3] Block [2], FMs-In: 5, FMs-Out: 6, Conv Filter dimensions: [3, 3, 3] [Pathway_SUBSAMPLED[3, 3, 3]] is being built... Block [0], FMs-In: 2, FMs-Out: 4, Conv Filter dimensions: [3, 3, 3] Block [1], FMs-In: 4, FMs-Out: 5, Conv Filter dimensions: [3, 3, 3] Block [2], FMs-In: 5, FMs-Out: 6, Conv Filter dimensions: [3, 3, 3] [Pathway_FC] is being built... Block [0], FMs-In: 12, FMs-Out: 5, Conv Filter dimensions: [1, 1, 1] Adding the final Softmax layer... Finished building the CNN's model. 
Pathway [NORMAL], Mode: [train], Input's Shape: (None, 2, 25, 25, 25) Block [0], Mode: [train], Input's Shape: (None, 2, 25, 25, 25) Block [1], Mode: [train], Input's Shape: (None, 4, 23, 23, 23) Block [2], Mode: [train], Input's Shape: (None, 5, 21, 21, 21) Pathway [NORMAL], Mode: [train], Output's Shape: (None, 6, 19, 19, 19) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [train], Input's Shape: (None, 2, 13, 13, 13) Block [0], Mode: [train], Input's Shape: (None, 2, 13, 13, 13) Block [1], Mode: [train], Input's Shape: (None, 4, 11, 11, 11) Block [2], Mode: [train], Input's Shape: (None, 5, 9, 9, 9) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [train], Output's Shape: (None, 6, 7, 7, 7) Pathway [FC], Mode: [train], Input's Shape: (None, 12, 19, 19, 19) Block [0], Mode: [train], Input's Shape: (None, 12, 19, 19, 19) Pathway [FC], Mode: [train], Output's Shape: (None, 5, 19, 19, 19) Pathway [NORMAL], Mode: [infer], Input's Shape: (None, 2, 7, 7, 7) Block [0], Mode: [infer], Input's Shape: (None, 2, 7, 7, 7) Block [1], Mode: [infer], Input's Shape: (None, 4, 5, 5, 5) Block [2], Mode: [infer], Input's Shape: (None, 5, 3, 3, 3) Pathway [NORMAL], Mode: [infer], Output's Shape: (None, 6, 1, 1, 1) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [infer], Input's Shape: (None, 2, 7, 7, 7) Block [0], Mode: [infer], Input's Shape: (None, 2, 7, 7, 7) Block [1], Mode: [infer], Input's Shape: (None, 4, 5, 5, 5) Block [2], Mode: [infer], Input's Shape: (None, 5, 3, 3, 3) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [infer], Output's Shape: (None, 6, 1, 1, 1) Pathway [FC], Mode: [infer], Input's Shape: (None, 12, 1, 1, 1) Block [0], Mode: [infer], Input's Shape: (None, 12, 1, 1, 1) Pathway [FC], Mode: [infer], Output's Shape: (None, 5, 1, 1, 1) Pathway [NORMAL], Mode: [infer], Input's Shape: (None, 2, 45, 45, 45) Block [0], Mode: [infer], Input's Shape: (None, 2, 45, 45, 45) Block [1], Mode: [infer], Input's Shape: (None, 4, 43, 43, 43) Block [2], Mode: [infer], Input's Shape: (None, 5, 41, 41, 41) Pathway [NORMAL], 
Mode: [infer], Output's Shape: (None, 6, 39, 39, 39) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [infer], Input's Shape: (None, 2, 19, 19, 19) Block [0], Mode: [infer], Input's Shape: (None, 2, 19, 19, 19) Block [1], Mode: [infer], Input's Shape: (None, 4, 17, 17, 17) Block [2], Mode: [infer], Input's Shape: (None, 5, 15, 15, 15) Pathway [SUBSAMPLED[3, 3, 3]], Mode: [infer], Output's Shape: (None, 6, 13, 13, 13) Pathway [FC], Mode: [infer], Input's Shape: (None, 12, 39, 39, 39) Block [0], Mode: [infer], Input's Shape: (None, 12, 39, 39, 39) Pathway [FC], Mode: [infer], Output's Shape: (None, 5, 39, 39, 39) =========== Building Trainer =========== Building Trainer. COST: Using cross entropy with weight: 1.0 ...Initializing state of the optimizer... ----------- Creating Tensorboard Loggers ----------- Loggers created successfully -----------=============================----------- =========== Compiling the Training Function =========== ======================================================= ...Building the training function... ...Collecting ops and feeds for training... Done. =========== Compiling the Validation Function ========= ...Building the validation function... ...Collecting ops and feeds for validation... Done. =========== Compiling the Testing Function ============ ...Building the function for testing and visualisation of FMs... ...Collecting ops and feeds for testing... Done. =========== Initializing network and trainer variables =============== All variables were initialized. Saving the initial model at:/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/saved_models//trainSessionWithValidTiny//tinyCnn.trainSessionWithValidTiny.initial.2023-07-26.12.10.28.410496 ======================================================= ============== Training the CNN model ================= ======================================================= ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ Starting new Epoch! 
Epoch #0/2 ~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *********************************************************************************** * Starting new Subepoch: #0/2 * *********************************************************************************** [MAIN|PID:1245871] MULTIPROC: Before Validation in subepoch #0, submitting sampling job for next [VALIDATION]. [VAL|SAMPLER|PID:1245871] :=:=:=:=:=:=: Starting to sample for next [Validation]... :=:=:=:=:=:=: [VAL|SAMPLER|PID:1245871] Out of [2] subjects given for [Validation], we will sample from maximum [50] per subepoch. [VAL|SAMPLER|PID:1245871] Shuffled indices of subjects that were randomly chosen: [0, 1] [VAL|SAMPLER|PID:1245871] Will sample from [2] subjects for next Validation... [VAL|JOB:0|PID:1245871] Started. (#0/2) sampling job. Load & sample from subject of index (in user's list): 0 [VAL|JOB:0|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/Flair_subtrMeanDivStd.nii.gz [VAL|JOB:0|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [VAL|JOB:0|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [VAL|JOB:0|PID:1245871] Done. Samples per category: [Uniform: 2500/2500] [VAL|JOB:0|PID:1245871] TIMING: [Load: 0.7] [Preproc: 0.1] [Augm-Img: 0.0] [Sample Coords: 0.1] [Extract Sampl: 0.4] [Augm-Samples: 0.0] secs [VAL|JOB:1|PID:1245871] Started. (#1/2) sampling job. Load & sample from subject of index (in user's list): 1 [VAL|JOB:1|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/Flair_subtrMeanDivStd.nii.gz [VAL|JOB:1|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! 
[VAL|JOB:1|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [VAL|JOB:1|PID:1245871] Done. Samples per category: [Uniform: 2500/2500] [VAL|JOB:1|PID:1245871] TIMING: [Load: 0.6] [Preproc: 0.1] [Augm-Img: 0.0] [Sample Coords: 0.1] [Extract Sampl: 0.4] [Augm-Samples: 0.0] secs [VAL|SAMPLER|PID:1245871] TIMING: Sampling for next [Validation] lasted: 2.6 secs. [VAL|SAMPLER|PID:1245871] :=:=:=:=:=:= Finished sampling for next [Validation] =:=:=:=:=:=: [MAIN|PID:1245871] MULTIPROC: Before Validation in subepoch #0, submitting sampling job for next [TRAINING]. V-V-V-V- Validating for subepoch before starting training iterations -V-V-V-V [TRA|SAMPLER|PID:1245871] :=:=:=:=:=:=: Starting to sample for next [Training]... :=:=:=:=:=:=: [VALIDATION] Processed 0/100 batches for this subepoch... [TRA|SAMPLER|PID:1245871] Out of [2] subjects given for [Training], we will sample from maximum [50] per subepoch. [TRA|SAMPLER|PID:1245871] Shuffled indices of subjects that were randomly chosen: [0, 1] [TRA|SAMPLER|PID:1245871] Will sample from [2] subjects for next Training... [TRA|JOB:0|PID:1245871] Started. (#0/2) sampling job. Load & sample from subject of index (in user's list): 0 [TRA|JOB:0|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/Flair_subtrMeanDivStd.nii.gz [VALIDATION] Processed 20/100 batches for this subepoch... [TRA|JOB:0|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [VALIDATION] Processed 40/100 batches for this subepoch... [VALIDATION] Processed 60/100 batches for this subepoch... [TRA|JOB:0|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [VALIDATION] Processed 80/100 batches for this subepoch... [VALIDATION] Processed 100/100 batches for this subepoch... 
+++++++++++++++++++++++ Reporting Accuracy over whole subepoch +++++++++++++++++++++++ VALIDATION: Epoch #0, Subepoch #0, Overall: mean accuracy: 0.0436 => Correctly-Classified-Voxels/All-Predicted-Voxels = 218/5000 +++++++++++++++ Reporting Accuracy over whole subepoch for Class-0 ++++++++ [Whole Foreground (Pos) Vs Background (Neg)] ++++++++++++++++ VALIDATION: Epoch #0, Subepoch #0, Class-0: mean accuracy: 0.0950 => (TruePos+TrueNeg)/All-Predicted-Voxels = 475/5000 VALIDATION: Epoch #0, Subepoch #0, Class-0: mean sensitivity: 1.0000 => TruePos/RealPos = 475/475 ... VALIDATION: Epoch #0, Subepoch #0, Class-4: mean Dice: 0.0258 =============== LOGGING TO TENSORBOARD =============== Logging VALIDATION metrics Epoch: 0 | Subepoch 0 Step number (index of subepoch since start): 0 --- Logging per class metrics --- Logged metrics: ['samples: accuracy', 'samples: sensitivity', 'samples: precision', 'samples: specificity', 'samples: Dice'] ====================================================== TIMING: Validation on batches of subepoch #0 lasted: 0.9 secs. [TRA|JOB:0|PID:1245871] Done. Samples per category: [Foreground: 250/250] [Background: 250/250] [TRA|JOB:0|PID:1245871] TIMING: [Load: 0.8] [Preproc: 0.1] [Augm-Img: 3.9] [Sample Coords: 0.1] [Extract Sampl: 0.1] [Augm-Samples: 0.3] secs [TRA|JOB:1|PID:1245871] Started. (#1/2) sampling job. Load & sample from subject of index (in user's list): 1 [TRA|JOB:1|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0006_1/Flair_subtrMeanDivStd.nii.gz [TRA|JOB:1|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [TRA|JOB:1|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [TRA|JOB:1|PID:1245871] Done. 
Samples per category: [Foreground: 250/250] [Background: 250/250] [TRA|JOB:1|PID:1245871] TIMING: [Load: 0.6] [Preproc: 0.1] [Augm-Img: 0.0] [Sample Coords: 0.1] [Extract Sampl: 0.1] [Augm-Samples: 0.3] secs [TRA|SAMPLER|PID:1245871] TIMING: Sampling for next [Training] lasted: 6.8 secs. [TRA|SAMPLER|PID:1245871] :=:=:=:=:=:= Finished sampling for next [Training] =:=:=:=:=:=: [MAIN|PID:1245871] MULTIPROC: Before Training in subepoch #0, submitting sampling job for next [VALIDATION]. -T-T-T-T- Training for this subepoch... May take a few minutes... -T-T-T-T- [VAL|SAMPLER|PID:1245871] :=:=:=:=:=:=: Starting to sample for next [Validation]... :=:=:=:=:=:=: [TRAINING] Processed 0/100 batches for this subepoch... [VAL|SAMPLER|PID:1245871] Out of [2] subjects given for [Validation], we will sample from maximum [50] per subepoch. [VAL|SAMPLER|PID:1245871] Shuffled indices of subjects that were randomly chosen: [0, 1] [VAL|SAMPLER|PID:1245871] Will sample from [2] subjects for next Validation... [VAL|JOB:0|PID:1245871] Started. (#0/2) sampling job. Load & sample from subject of index (in user's list): 0 [VAL|JOB:0|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0003_1/Flair_subtrMeanDivStd.nii.gz [VAL|JOB:0|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [VAL|JOB:0|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [VAL|JOB:0|PID:1245871] Done. Samples per category: [Uniform: 2500/2500] [VAL|JOB:0|PID:1245871] TIMING: [Load: 0.6] [Preproc: 0.1] [Augm-Img: 0.0] [Sample Coords: 0.0] [Extract Sampl: 0.5] [Augm-Samples: 0.0] secs [VAL|JOB:1|PID:1245871] Started. (#1/2) sampling job. 
Load & sample from subject of index (in user's list): 1 [VAL|JOB:1|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/validation/brats_2013_pat0004_1/Flair_subtrMeanDivStd.nii.gz [VAL|JOB:1|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [VAL|JOB:1|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [VAL|JOB:1|PID:1245871] Done. Samples per category: [Uniform: 2500/2500] [VAL|JOB:1|PID:1245871] TIMING: [Load: 0.7] [Preproc: 0.2] [Augm-Img: 0.0] [Sample Coords: 0.1] [Extract Sampl: 0.6] [Augm-Samples: 0.0] secs [VAL|SAMPLER|PID:1245871] TIMING: Sampling for next [Validation] lasted: 3.0 secs. [VAL|SAMPLER|PID:1245871] :=:=:=:=:=:= Finished sampling for next [Validation] =:=:=:=:=:=: [TRAINING] Processed 20/100 batches for this subepoch... [TRAINING] Processed 40/100 batches for this subepoch... [TRAINING] Processed 60/100 batches for this subepoch... [TRAINING] Processed 80/100 batches for this subepoch... [TRAINING] Processed 100/100 batches for this subepoch... +++++++++++++++++++++++ Reporting Accuracy over whole subepoch +++++++++++++++++++++++ TRAINING: Epoch #0, Subepoch #0, Overall: mean accuracy: 0.3992 => Correctly-Classified-Voxels/All-Predicted-Voxels = 2738260/6859000 TRAINING: Epoch #0, Subepoch #0, Overall: mean cost: 1.38944 +++++++++++++++ Reporting Accuracy over whole subepoch for Class-0 ++++++++ [Whole Foreground (Pos) Vs Background (Neg)] ++++++++++++++++ ... 
TRAINING: Epoch #0, Subepoch #0, Class-4: mean precision: 0.3182 => TruePos/(TruePos+FalsePos) = 299811/942154 TRAINING: Epoch #0, Subepoch #0, Class-4: mean specificity: 0.8967 => TrueNeg/RealNeg = 5577735/6220078 TRAINING: Epoch #0, Subepoch #0, Class-4: mean Dice: 0.3792 =============== LOGGING TO TENSORBOARD =============== Logging TRAINING metrics Epoch: 0 | Subepoch 0 Step number (index of subepoch since start): 0 --- Logging average metrics for all classes --- Logged metrics: ['samples: accuracy', 'samples: cost'] --- Logging per class metrics --- Logged metrics: ['samples: accuracy', 'samples: sensitivity', 'samples: precision', 'samples: specificity', 'samples: Dice'] ====================================================== TIMING: Training on batches of this subepoch #0 lasted: 16.7 secs. *********************************************************************************** * Starting new Subepoch: #1/2 * *********************************************************************************** [MAIN|PID:1245871] MULTIPROC: Before Validation in subepoch #1, submitting sampling job for next [TRAINING]. V-V-V-V- Validating for subepoch before starting training iterations -V-V-V-V [TRA|SAMPLER|PID:1245871] :=:=:=:=:=:=: Starting to sample for next [Training]... :=:=:=:=:=:=: [VALIDATION] Processed 0/100 batches for this subepoch... [TRA|SAMPLER|PID:1245871] Out of [2] subjects given for [Training], we will sample from maximum [50] per subepoch. [TRA|SAMPLER|PID:1245871] Shuffled indices of subjects that were randomly chosen: [0, 1] [TRA|SAMPLER|PID:1245871] Will sample from [2] subjects for next Training... [TRA|JOB:0|PID:1245871] Started. (#0/2) sampling job. 
Load & sample from subject of index (in user's list): 0 [TRA|JOB:0|PID:1245871] Loading subject with 1st channel at: /vf/users/user/deepmedic/deepmedic-0.8.4/examples/dataForExamples/brats2015TrainingData/train/brats_2013_pat0005_1/Flair_subtrMeanDivStd.nii.gz [VALIDATION] Processed 20/100 batches for this subepoch... [VALIDATION] Processed 40/100 batches for this subepoch... [VALIDATION] Processed 60/100 batches for this subepoch... [VALIDATION] Processed 80/100 batches for this subepoch... [VALIDATION] Processed 100/100 batches for this subepoch... +++++++++++++++++++++++ Reporting Accuracy over whole subepoch +++++++++++++++++++++++ VALIDATION: Epoch #0, Subepoch #1, Overall: mean accuracy: 0.8306 => Correctly-Classified-Voxels/All-Predicted-Voxels = 4153/5000 ... VALIDATION: Epoch #0, Subepoch #1, Class-4: mean sensitivity: 0.7910 => TruePos/RealPos = 53/67 VALIDATION: Epoch #0, Subepoch #1, Class-4: mean precision: 0.1312 => TruePos/(TruePos+FalsePos) = 53/404 VALIDATION: Epoch #0, Subepoch #1, Class-4: mean specificity: 0.9288 => TrueNeg/RealNeg = 4582/4933 VALIDATION: Epoch #0, Subepoch #1, Class-4: mean Dice: 0.2251 =============== LOGGING TO TENSORBOARD =============== Logging VALIDATION metrics Epoch: 0 | Subepoch 1 Step number (index of subepoch since start): 1 --- Logging per class metrics --- Logged metrics: ['samples: accuracy', 'samples: sensitivity', 'samples: precision', 'samples: specificity', 'samples: Dice'] ====================================================== TIMING: Validation on batches of subepoch #1 lasted: 0.6 secs. [TRA|JOB:0|PID:1245871] WARN: Loaded labels are dtype [float32]. Rounding and casting to [int16]! [TRA|JOB:0|PID:1245871] WARN: Loaded ROI-mask is dtype [float64]. Rounding and casting to [int16]! [TRA|JOB:0|PID:1245871] Done. Samples per category: [Foreground: 250/250] [Background: 250/250] ... TIMING: Training process lasted: 134.6 secs. Closing worker pool. 
Saving the final model at:/vf/users/user/deepmedic/deepmedic-0.8.4/examples/output/saved_models//trainSessionWithValidTiny//tinyCnn.trainSessionWithValidTiny.final.2023-07-26.12.12.43.174418
The whole do_training() function has finished.
=======================================================
=========== Training session finished =================
=======================================================
Finished.
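The shapes and batch counts printed in the log above follow from simple arithmetic, which the sketch below reproduces. It is an illustration only, not deepmedic code: the helper names are hypothetical, and the rule used for the subsampled pathway's input size (ceiling division by the subsampling factor, plus the border removed by the convolutions) is an assumption chosen because it matches the logged shapes.

```python
# Values copied from the log above; helper names are hypothetical.
KERNELS = [3, 3, 3]        # kernel edge length of each conv layer
SUBSAMPLING_FACTOR = 3     # per-dimension factor of the subsampled pathway


def valid_conv_output(size, kernels):
    """Edge length left after a stack of unpadded ('VALID') convolutions."""
    for k in kernels:
        size -= k - 1      # each VALID conv trims (k - 1) voxels per dimension
    return size


def subsampled_input(normal_output, factor, kernels):
    """Assumed rule: the low-resolution pathway must output
    ceil(normal_output / factor) voxels per dimension, so its input is that
    plus the border removed by the same conv stack."""
    needed = -(-normal_output // factor)   # ceiling division
    return needed + sum(k - 1 for k in kernels)


# Segment sizes for training, validation and testing, as listed in the log.
for segment in (25, 7, 45):
    out = valid_conv_output(segment, KERNELS)
    sub_in = subsampled_input(out, SUBSAMPLING_FACTOR, KERNELS)
    print(f"segment {segment}: normal output {out}, subsampled input {sub_in}")

# Segments per subepoch divided by batch size gives iterations per subepoch.
train_iterations = 1000 // 10
# Each training segment yields 19^3 predicted voxels.
train_voxels = 1000 * valid_conv_output(25, KERNELS) ** 3
print(train_iterations, train_voxels)
```

For the tinyCnn example this gives normal-pathway outputs of 19, 1 and 39 with subsampled inputs of 13, 7 and 19, as well as 100 training batches and 6859000 predicted voxels per subepoch, in agreement with the log.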
In order to run deepmedic in GPU mode on the same data, pass it the additional option -dev cuda0:

[user@biowulf]$ deepMedicRun -model ./examples/configFiles/tinyCnn/model/modelConfig.cfg \
                             -train examples/configFiles/tinyCnn/train/trainConfigWithValidation.cfg -dev cuda0
...
TIMING: Training process lasted: 85.7 secs.
Closing worker pool.
Saving the final model at:/vf/users/$USER/deepmedic/deepmedic-0.8.4/examples/output/saved_models//trainSessionWithValidTiny//tinyCnn.trainSessionWithValidTiny.final.2023-07-26.12.19.06.158002
The whole do_training() function has finished.
=======================================================
=========== Training session finished =================
=======================================================
Finished.
[user@cn0861 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
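When reading the per-class metrics in the training logs, it can help to recompute them from the raw counts. The sketch below does this for the Class-4 validation figures of Epoch #0, Subepoch #1 in the CPU run, using the standard definitions of sensitivity, precision, specificity and Dice. The function and variable names are hypothetical, and it is an assumption that deepmedic computes the metrics in exactly this way, although the results agree with the logged values.

```python
# Counts copied from the log (VALIDATION, Epoch #0, Subepoch #1, Class-4).
true_pos = 53
real_pos = 67         # TruePos + FalseNeg
pred_pos = 404        # TruePos + FalsePos
true_neg = 4582
real_neg = 4933       # TrueNeg + FalsePos


def dice(tp, real_positive, predicted_positive):
    """Standard Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (real_positive + predicted_positive)


sensitivity = true_pos / real_pos
precision = true_pos / pred_pos
specificity = true_neg / real_neg
dice_score = dice(true_pos, real_pos, pred_pos)

print(f"sensitivity {sensitivity:.4f}")
print(f"precision   {precision:.4f}")
print(f"specificity {specificity:.4f}")
print(f"Dice        {dice_score:.4f}")
```

The printed values (0.7910, 0.1312, 0.9288 and 0.2251) match the log. The low precision alongside high sensitivity shows that after a single subepoch the tiny network still over-predicts this class.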