BETA RELEASE
BIANCA is currently in beta testing, so you should inspect its output carefully before relying on it. The output depends critically on the choice of options and on the quality of the training data and manual segmentations. Further recommendations and automated tuning methods will be supported in an upcoming release.
Data preparation
Image preparation
BIANCA works in each subject's own space, but all the MRI modalities need to be registered to a common base image.
Before running BIANCA you need to:
- Choose your base image. This will be your reference space for your input and output images.
- Perform brain extraction on (at least) one modality to be able to derive a brain mask *
- Register all the other modalities to your base image for each subject.
* If you want to further restrict the area where lesions will be detected (and reduce false positives), consider pre-masking as described in the Masking section.
Training dataset preparation
The algorithm requires a training set with pre-classified voxels (i.e. manually segmented images) that is used to create a set of feature vectors for lesion and non-lesion classes.
The lesion masks to be used as training dataset need to be:
- binary (1=lesion; 0=non-lesion)
- in nifti (nii.gz) format
- in the same space as your base image. If the manual segmentation was done on an image that was not the base image, the lesion mask needs to be registered and transformed to the base image space.
Running BIANCA
Master file preparation
The master file is a text file that contains a row per subject (training or query) and on each row a list of all files needed for that subject:
- The images you want to use for classification (e.g. T1 and FLAIR), all coregistered to the same base space (see above).
- One brain extracted image to be able to derive a brain mask (see above)
- The binary manual lesion mask, coregistered to the base space if needed (for query subjects, use any placeholder name to keep the same column order as for the training subjects).
- [optional] The transformation matrix from subject space to standard space. Needed to calculate spatial features (from MNI coordinates)
These can be in any (consistent) order, as the following options will specify the meaning of each column.
Example master file (masterfile.txt):
subj01/FLAIR_brain.nii.gz subj01/T1_to_FLAIR.nii.gz subj01/FLAIR_to_MNI.mat subj01/WMHmask.nii.gz
subj02/FLAIR_brain.nii.gz subj02/T1_to_FLAIR.nii.gz subj02/FLAIR_to_MNI.mat subj02/WMHmask.nii.gz
...
subj<N>/structural/FLAIR_brain.nii.gz subj<N>/T1_to_FLAIR.nii.gz subj<N>/FLAIR_to_MNI.mat subj<N>/WMHmask.nii.gz
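A master file with this layout can be generated with a short loop instead of being typed by hand. The sketch below assumes three hypothetical subject directories (subj01 to subj03) and the file names used in the example above; adjust both to match your own data.

```shell
#!/bin/sh
# Hypothetical subject directories; replace with your own list.
subjects="subj01 subj02 subj03"

# Start from an empty master file.
: > masterfile.txt

for s in $subjects; do
    # Columns: 1 = brain-extracted FLAIR, 2 = T1 registered to FLAIR,
    #          3 = FLAIR-to-MNI matrix,   4 = manual lesion mask
    printf '%s %s %s %s\n' \
        "$s/FLAIR_brain.nii.gz" \
        "$s/T1_to_FLAIR.nii.gz" \
        "$s/FLAIR_to_MNI.mat" \
        "$s/WMHmask.nii.gz" >> masterfile.txt
done
```

Each row then has the same four columns, so the column numbers passed to the BIANCA options stay valid for every subject.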
BIANCA options
Compulsory arguments:
--singlefile=<masterfile> name of the master file (e.g. masterfile.txt)
--querysubjectnum=<num> row number in the master file of the query subject (the one to be segmented)
--brainmaskfeaturenum=<num> column number in the master file containing the name of the image to derive non-zero (brain) mask from (1 in the example above). Note that this does not need to be a binary/mask image - it only needs to have zeros outside the brain (or ROI) and non-zeros inside.
- Training dataset specification:
- If the training subjects to use are listed in the master file, the following arguments need to be specified:
--labelfeaturenum=<num> column number in the master file containing the name of the manual lesion mask files (labelled images; 4 in the example above)
- and
--trainingnums=<val> subjects to be used in training. List of row numbers (comma separated, no spaces) or all to use all the subjects in the master file. If the query subject is also a training subject, it is automatically excluded from the training dataset and the lesions are estimated from the remaining training subjects
- Alternatively, load the training data from a file previously saved with --saveclassifierdata (see below):
--loadclassifierdata=<name> load training data (and labels) from file
Optional arguments:
-o output (base) file name (default: bianca_output)
--featuresubset=<num>,<num>,... list of column numbers (comma separated, no spaces) in the master file containing the names of the images to use as intensity features (1,2 in the example above to use FLAIR and T1) (default: use all modalities as features). The image used to derive the non-zero (brain) mask must be part of the feature subset.
--matfeaturenum=<num> column number in the master file containing the matrix files (transformation matrix from the base space to MNI space). Needed to extract spatial features (MNI coordinates; 3 in the example above)
--spatialweight=<value> weighting for spatial coordinates (default = 1, i.e. variance-normalised MNI coordinates). Requires --matfeaturenum to be specified. If set to 0 the spatial coordinates will be ignored (no need to specify --matfeaturenum). A higher spatial weighting makes the neighbouring feature vectors more likely to come from similar spatial locations (effectively making the training data more local). At present only linear transformations are supported, with non-linear transforms planned for a future release.
--patchsizes=<num>,<num>,... list of patch sizes in voxels (comma separated, no spaces) for local averaging.
--patch3D use 3D patches (default is 2D)
--selectpts=<val> where to select the non-lesion points from the training dataset. Options: any (anywhere outside the lesion - default), noborder (exclude voxels close to the lesion’s edge), surround (preferably close to the lesion’s edge)
--trainingpts=<val> number (max) of lesion points to use (per training subject), or equalpoints to select all lesion points and an equal number of non-lesion points (default: 2000)
--nonlespts=<val> number (max) of non-lesion points to use (per training subject). If not specified, it will be set to the same number as the lesion points (specified in --trainingpts)
--saveclassifierdata=<name> save training data to file. Two files will be saved: <name> and <name>_labels. When loading the training dataset with --loadclassifierdata, just specify <name> and both files will be loaded.
-v use verbose mode
BIANCA example call
bianca --singlefile=masterfile.txt --labelfeaturenum=4 --brainmaskfeaturenum=1 --querysubjectnum=1 --trainingnums=1,2,3,4,5,6,7,8,9,10 --featuresubset=1,2 --matfeaturenum=3 --trainingpts=2000 --nonlespts=10000 --selectpts=noborder -o sub001_bianca_output -v
With this command BIANCA will use data from masterfile.txt. It will look for the pre-labelled images (manual lesion masks) in the 4th column of the master file and will limit the search to the mask derived from the image in the 1st column. The subject to segment is the first subject in the master file (first row). Since this subject is also one of the training subjects, BIANCA will use only the remaining 9 for training (as in a leave-one-out (LOO) approach, to avoid bias and overfitting). BIANCA will use the images in the 1st and 2nd columns of the master file as intensity features. It will also extract the spatial features (MNI coordinates) using the transformation matrix listed in the 3rd column of the master file. For the training, BIANCA will use, for each training subject, (up to) 2000 points among the voxels labelled as lesion and (up to) 10000 points among the non-lesion voxels, excluding voxels close to the lesion's edge. The output image will be called sub001_bianca_output. Verbose mode is on.
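To segment every training subject in turn (leave-one-out), the same call can be repeated with a different --querysubjectnum each time. The sketch below is one possible way to do this, assuming a master file with 10 rows laid out as in the example above. The function name print_loo_commands and the output naming are illustrative choices, not part of BIANCA. The calls are only printed; remove the echo to run them.

```shell
#!/bin/sh
# Print one BIANCA call per query subject (leave-one-out over the master file).
# The calls are only echoed; remove 'echo' to actually run BIANCA.
print_loo_commands() {
    n_subjects=$1          # number of rows in masterfile.txt
    q=1
    while [ "$q" -le "$n_subjects" ]; do
        out=$(printf 'sub%03d_bianca_output' "$q")
        # With --trainingnums=all the query subject is excluded automatically.
        echo bianca --singlefile=masterfile.txt --labelfeaturenum=4 \
            --brainmaskfeaturenum=1 --querysubjectnum="$q" \
            --trainingnums=all --featuresubset=1,2 --matfeaturenum=3 \
            --trainingpts=2000 --nonlespts=10000 --selectpts=noborder \
            -o "$out" -v
        q=$((q + 1))
    done
}

print_loo_commands 10
```

Using --trainingnums=all avoids having to rebuild the list of row numbers for each query subject, since BIANCA drops the query subject from the training set itself.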
Post-processing
Threshold
BIANCA’s output is a 'probability' map of voxels to be classified as lesions. In order to obtain a binary mask, a thresholding step is needed. This can be easily done with fslmaths (e.g. to threshold at 0.9):
fslmaths sub001_bianca_output -thr 0.9 -bin sub001_bianca_output_thr09bin
Check your own data to establish the best threshold (e.g. by evaluating the overlap with the manual mask; see section Performance Evaluation for more details).
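One way to compare thresholds is to run bianca_overlap_measures (described in the Performance evaluation section) once per candidate value and compare the resulting measures. The sketch below only prints the calls. The list of thresholds, the function name print_threshold_sweep and the file names are illustrative assumptions; remove the echo to run the calls.

```shell
#!/bin/sh
# Print one evaluation call per candidate threshold.
# Remove 'echo' to run the calls (requires FSL and the files to exist).
print_threshold_sweep() {
    lesionmap=$1      # BIANCA output (probability map)
    manualmask=$2     # manual reference mask
    for thr in 0.5 0.6 0.7 0.8 0.9 0.95 0.99; do   # hypothetical candidates
        # Last argument 0: print the measures on screen instead of saving them.
        echo bianca_overlap_measures "$lesionmap" "$thr" "$manualmask" 0
    done
}

print_threshold_sweep sub001_bianca_output.nii.gz subj01/WMHmask.nii.gz
```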
Masking
If you see false positive hyperintensities due to artefacts such as CSF pulsation artefacts on FLAIR, it might be useful to apply a mask to exclude the affected region(s). Note that BIANCA is not optimized for segmentation of (juxta)cortical, cerebellar and subcortical lesions.
The script below creates an example of inclusion mask from T1 images, which excludes cortical GM and the following structures: putamen, globus pallidus, nucleus accumbens, thalamus, brainstem, cerebellum, hippocampus, amygdala. The cortical GM is excluded from the brain mask by extracting the cortical CSF from single-subject’s CSF pve map, dilating it to reach the cortical GM, and excluding these areas. The other structures are identified in MNI space, non-linearly registered to the single-subjects’ images, and removed from the brain mask.
make_bianca_mask <structural_image> <CSF pve> <warp_file_MNI2structural> <keep_intermediate_files>
The first input is the basename of the structural image (e.g. T1_biascorr). The script assumes that the brain-extracted image is called <structural image>_brain.nii.gz. The second input is the CSF pve map (e.g. the output from FSL-FAST). The third input is the non-linear warp file from standard space to the structural image. If you ran fsl_anat, you can use the file MNI_to_T1_nonlin_field.nii.gz in the fsl_anat output directory. If you have the warp file from structural to MNI, you can calculate the inverse with the command invwarp (invwarp -w warpvol -o invwarpvol -r refvol). If you set the last command line argument (keep_intermediate_files) to 1, the folder containing temporary files will not be deleted.
Main output file:
<structural image>_bianca_mask.nii.gz. Binary mask with 0 for regions to exclude and 1 to include.
In case T1 is not your base space, you need to register the mask to the base space.
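As a sketch of that registration step, the mask can be resampled with FLIRT using an existing linear T1-to-base transformation. The matrix name T1_to_FLAIR.mat and the function name print_mask_registration are assumptions for illustration; use whatever matrix your own T1-to-FLAIR registration produced. The call is only printed; remove the echo to run it.

```shell
#!/bin/sh
# Print a FLIRT call that resamples the T1-space mask into the base space.
# Remove 'echo' to run it (requires FSL).
print_mask_registration() {
    mask=$1      # mask in T1 space, e.g. T1_bianca_mask
    ref=$2       # base image, e.g. FLAIR
    mat=$3       # hypothetical T1-to-base linear matrix, e.g. T1_to_FLAIR.mat
    out=$4       # name of the mask in base space
    # Nearest-neighbour interpolation keeps the mask binary.
    echo flirt -in "$mask" -ref "$ref" -applyxfm -init "$mat" \
        -interp nearestneighbour -out "$out"
}

print_mask_registration T1_bianca_mask FLAIR T1_to_FLAIR.mat T1_bianca_mask_to_FLAIR
```

Nearest-neighbour interpolation is chosen here so that the resampled mask contains only 0 and 1; other interpolation schemes would need a further thresholding step.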
This mask can be applied to the BIANCA output (either before or after thresholding):
fslmaths sub001_bianca_output -mas T1_bianca_mask_to_FLAIR sub001_bianca_output_masked
Alternatively, this can be used to mask the input image, creating a tighter brain mask:
fslmaths FLAIR -mas T1_bianca_mask_to_FLAIR FLAIR_masked
where FLAIR_masked.nii.gz can be used instead of FLAIR_brain.nii.gz in the master file and used for the --brainmaskfeaturenum option.
Additional output:
<structural image>_vent.nii.gz. Binary mask of segmented ventricles. This can be used to extract periventricular lesions (see Volume Extraction for details)
Performance evaluation
The script below can be used to evaluate BIANCA performance against a manual (reference) segmentation:
bianca_overlap_measures <lesionmask> <threshold> <manualmask> <saveoutput>
It extracts the following overlap measures (see reference paper for details):
- Dice Similarity Index (SI): calculated as 2*(voxels in the intersection of manual and BIANCA masks)/(manual mask lesion voxels + BIANCA lesion voxels).
- Voxel-level false positive fraction (FPF): number of voxels incorrectly labelled as lesion divided by the total number of voxels labelled as lesion by BIANCA; also called False Discovery Rate
- Voxel-level false negative ratio (FNR): number of voxels incorrectly labelled as non-lesion divided by the total number of voxels labelled as lesion in the manual mask
- Cluster-level FPF: number of clusters incorrectly labelled as lesion divided by the total number of clusters found by BIANCA
- Cluster-level FNR: number of clusters incorrectly labelled as non-lesion divided by the total number of lesions in the manual mask
- Mean Total Area (MTA): average total WMH volume of the manual mask and BIANCA output.
- Detection error rate (DER): sum of voxels (WMH volume) belonging to FP or FN clusters, divided by MTA.
- Outline error rate (OER): sum of voxels belonging to true positive clusters (WMH clusters detected by both manual and BIANCA segmentation), excluding the overlapping voxels, divided by MTA
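To make the voxel-level definitions concrete, the sketch below computes SI, FPF and FNR from made-up voxel counts (they do not come from real data). It only illustrates the arithmetic of the definitions above and is not the implementation used by bianca_overlap_measures.

```shell
#!/bin/sh
# Hypothetical voxel counts (not from real data):
tp=80     # lesion in both the BIANCA mask and the manual mask
fp=20     # lesion in the BIANCA mask only
fneg=40   # lesion in the manual mask only

measures=$(LC_ALL=C awk -v tp="$tp" -v fp="$fp" -v fneg="$fneg" 'BEGIN {
    bianca = tp + fp      # voxels labelled as lesion by BIANCA
    manual = tp + fneg    # voxels labelled as lesion in the manual mask
    printf "SI=%.3f FPF=%.3f FNR=%.3f", \
        2 * tp / (manual + bianca), fp / bianca, fneg / manual
}')
echo "$measures"   # prints: SI=0.727 FPF=0.200 FNR=0.333
```

With these counts BIANCA labels 100 voxels and the manual mask 120, so SI = 160/220, FPF = 20/100 and FNR = 40/120.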
In addition it calculates:
- Volume of BIANCA segmentation (after applying the specified threshold)
- Volume of manual mask
The first input is the lesion mask calculated by BIANCA (e.g. sub001_bianca_output.nii.gz). The second input is the threshold that will be applied to <lesionmask> before calculating the overlap measures (if you have already thresholded and binarised the lesion mask you can simply put 0). The third input is the manual mask, used as reference to calculate the overlap measures. If <saveoutput> is set to 0, the script prints the measures' names and values on the screen in the following order: SI, FPF, FNR, FPF (cluster-level), FNR (cluster-level), DER, OER, MTA, lesion mask's volume, manual mask's volume. If <saveoutput> is set to 1, it saves only the values (in the same order) in a file called Overlap_and_Volumes_<lesionmask>_<threshold>.txt in the same folder as the lesion mask.
Volume extraction
The script below can be used to extract the number of clusters (lesions) and the volume of lesions in any BIANCA output image.
bianca_cluster_stats <bianca_output_map> <threshold> <min_cluster_size> [<mask>]
Total
If only 3 inputs are provided it will output the total number of clusters and the total lesion volume after applying <threshold> (if you have already thresholded and binarised the lesion mask you can simply put 0) and including clusters bigger than <min_cluster_size>, where the size is expressed in number of voxels.
Within a mask
If the optional <mask> file is specified, it will also calculate the number of clusters and lesion volume within the specified mask. The mask needs to be in the same space as <bianca_output_map>
To extract WMH volumes within a certain distance from the ventricles, as a measure of periventricular (or deep) WMH, you can do the following:
- Generate a ventricle mask (<structural image>_vent.nii.gz) from your T1 image with make_bianca_mask (see section Masking for details).
- (If needed) register it to your base image.
- Create a distance map from the ventricles using the FSL tool distancemap, where the intensity of each voxel represents its distance in mm from the ventricle mask:
distancemap -i <structural image>_vent.nii.gz -o dist_to_vent
- Threshold the distance map to obtain the region within <dist> mm of the ventricles:
fslmaths dist_to_vent -uthr <dist> -bin dist_to_vent_periventricular
or, alternatively, to look for deep WMH (more than <dist> mm from the ventricles):
fslmaths dist_to_vent -thr <dist> -bin dist_to_vent_deep
- Use the obtained mask to calculate the volumes of WMH within that area with bianca_cluster_stats, or to mask the BIANCA output with fslmaths.
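The steps above can be collected into a small script. The sketch below prints the commands for one subject, assuming the ventricle mask is already in the base space. The function name print_periventricular_steps, the file names, the 10 mm distance, the 0.9 threshold and the minimum cluster size of 0 are all illustrative assumptions, and distancemap is called with its -i and -o options. Remove the echo commands to run the steps.

```shell
#!/bin/sh
# Print the periventricular / deep WMH steps for one subject.
# Remove 'echo' to run them (requires FSL).
print_periventricular_steps() {
    vent=$1        # ventricle mask in base space
    dist=$2        # distance from the ventricles, in mm
    bianca_out=$3  # BIANCA output map
    thr=$4         # threshold applied to the BIANCA output
    min_size=$5    # minimum cluster size, in voxels

    # Distance (mm) of every voxel from the ventricle mask.
    echo distancemap -i "$vent" -o dist_to_vent
    # Regions within, and further than, $dist mm from the ventricles.
    echo fslmaths dist_to_vent -uthr "$dist" -bin dist_to_vent_periventricular
    echo fslmaths dist_to_vent -thr "$dist" -bin dist_to_vent_deep
    # Cluster counts and volumes inside each region.
    echo bianca_cluster_stats "$bianca_out" "$thr" "$min_size" dist_to_vent_periventricular
    echo bianca_cluster_stats "$bianca_out" "$thr" "$min_size" dist_to_vent_deep
}

# Hypothetical inputs: 10 mm cut-off, threshold 0.9, no minimum cluster size.
print_periventricular_steps T1_vent_to_FLAIR 10 sub001_bianca_output 0.9 0
```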