This page introduces training and inference with an InversionNet model on the FlatVel-A dataset, and shows how to visualise the seismic data and velocity maps. It has been adapted from the official OpenFWI documentation.
Create your working space
You can copy the ready-to-use archive from /g/data/up99/sandbox/openFWI/repo to your working space as shown below.
$ cd YOUR_WORKING_DIRECTORY
$ cp -r /g/data/up99/sandbox/openFWI/repo openFWI
$ cd openFWI
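Before submitting any jobs, you may want a quick look at the data itself. Below is a minimal sketch of loading and plotting one FlatVel-A sample with numpy and matplotlib. The file names 'data1.npy' and 'model1.npy' and the array shapes are assumptions based on the OpenFWI data convention, so adjust them to match where the FlatVel-A files live in your copy.

import numpy as np
import matplotlib.pyplot as plt

# Assumed OpenFWI convention: seismic data shaped
# (samples, sources, time steps, receivers) and velocity maps
# shaped (samples, 1, depth, width).
seismic = np.load("data1.npy")    # e.g. (500, 5, 1000, 70)
velocity = np.load("model1.npy")  # e.g. (500, 1, 70, 70)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# One shot gather: receivers along x, time running down the y-axis.
ax1.imshow(seismic[0, 2], cmap="gray", aspect="auto")
ax1.set_title("Seismic data (middle source)")
ax1.set_xlabel("Receiver")
ax1.set_ylabel("Time sample")

# The corresponding velocity map: width along x, depth down the y-axis.
im = ax2.imshow(velocity[0, 0], cmap="jet")
ax2.set_title("Velocity map")
ax2.set_xlabel("Width")
ax2.set_ylabel("Depth")
fig.colorbar(im, ax=ax2, label="Velocity (m/s)")

plt.tight_layout()
plt.show()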
OpenFWI training in a batch job
In your working space, you can submit a PBS job with GPU resources to run the OpenFWI training process.
Within the job script, you need to load the following module:
$ module use /g/data/up99/modulefiles
$ module load openfwi/23.11
Then you can run the training process via train.py with the appropriate configuration flags. For example, you could run:
python train.py -ds flatvel-a -n nci_v100_eb-40 -j 12 --print-freq 20 -eb 40 -nb 1 -m InversionNet -t flatvel_a_train.txt -v flatvel_a_val.txt
The flags used in the command above are listed below:
'-ds': dataset name
'-n': folder name for this experiment
'-j': number of data-loading workers (default: 16)
'--print-freq': print frequency
'-eb': number of epochs in a saved block
'-nb': number of saved blocks
'-m': inverse model name
'-t': name of the training annotation file
'-v': name of the validation annotation file
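For reference, below is a minimal sketch of how flags like these could be declared with Python's argparse. This is illustrative only: the option names are taken from the list above, and train.py's actual definitions and defaults may differ.

import argparse

# Hypothetical re-declaration of the training flags for illustration;
# consult train.py for the authoritative definitions and defaults.
parser = argparse.ArgumentParser(description="OpenFWI training (sketch)")
parser.add_argument("-ds", "--dataset", type=str, help="dataset name")
parser.add_argument("-n", "--save-name", type=str, help="folder name for this experiment")
parser.add_argument("-j", "--workers", type=int, default=16, help="number of data-loading workers")
parser.add_argument("--print-freq", type=int, default=20, help="print frequency")
parser.add_argument("-eb", "--epoch-block", type=int, help="epochs in a saved block")
parser.add_argument("-nb", "--num-block", type=int, help="number of saved blocks")
parser.add_argument("-m", "--model", type=str, help="inverse model name")
parser.add_argument("-t", "--train-anno", type=str, help="training annotation file")
parser.add_argument("-v", "--val-anno", type=str, help="validation annotation file")

args = parser.parse_args()
print(args)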
The above command runs the training process with the following configuration settings; settings not supplied on the command line take their default values. Note that the total number of epochs is epoch_block × num_block (40 × 1 = 40 here):
device           : cuda
dataset          : flatvel-a
file_size        : None
anno_path        : split_files
train_anno       : split_files/flatvel_a_train.txt
val_anno         : split_files/flatvel_a_val.txt
output_path      : Invnet_models/nci_v100_eb-40/
log_path         : Invnet_models/nci_v100_eb-40/
save_name        : nci_v100_eb-40
suffix           : None
model            : InversionNet
up_mode          : None
sample_spatial   : 1.0
sample_temporal  : 1
batch_size       : 256
lr               : 0.0001
lr_milestones    : []
momentum         : 0.9
weight_decay     : 0.0001
lr_gamma         : 0.1
lr_warmup_epochs : 0
epoch_block      : 40
num_block        : 1
workers          : 12
k                : 1
print_freq       : 20
resume           : None
start_epoch      : 0
lambda_g1v       : 1.0
lambda_g2v       : 1.0
sync_bn          : False
world_size       : 1
dist_url         : env://
tensorboard      : False
epochs           : 40
distributed      : False
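The lambda_g1v and lambda_g2v settings weight the two components of the training loss. As a rough sketch, assuming the standard OpenFWI formulation where the loss is a weighted sum of L1 and L2 distances between the predicted and ground-truth velocity maps:

import torch
import torch.nn as nn

l1loss = nn.L1Loss()
l2loss = nn.MSELoss()

def training_loss(pred, target, lambda_g1v=1.0, lambda_g2v=1.0):
    # Weighted sum of L1 and L2 distances between predicted and
    # ground-truth velocity maps; with the defaults above, both
    # terms contribute equally.
    return lambda_g1v * l1loss(pred, target) + lambda_g2v * l2loss(pred, target)

# Example with dummy tensors shaped like a batch of velocity maps.
pred = torch.randn(8, 1, 70, 70)
target = torch.randn(8, 1, 70, 70)
print(training_loss(pred, target))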
An example OpenFWI training PBS job script (training.sh) is given below:
#!/bin/bash
#PBS -q gpuvolta
#PBS -P <insert_compute_project_code>
#PBS -l ngpus=1
#PBS -l ncpus=12
#PBS -l mem=190GB
#PBS -l jobfs=100GB
#PBS -l walltime=01:00:00
#PBS -l storage=gdata/up99+<insert_gdata_or_scratch_project_code_that_contains_the_OpenFWI_scripts>
#PBS -l wd
#PBS -N OpenFWI_training

module use /g/data/up99/modulefiles
module load openfwi/23.11

### or cd to where the OpenFWI repo is located (the directory containing train.py, test.py, training.sh, etc.)
cd $PBS_O_WORKDIR

python train.py -ds flatvel-a -n nci_v100_eb-40 -j 12 --print-freq 20 -eb 40 -nb 1 -m InversionNet -t flatvel_a_train.txt -v flatvel_a_val.txt
To submit training.sh:
$ qsub training.sh
After the training process finishes, you will see a checkpoint file and a model file under the directory "Invnet_models/nci_v100_eb-40", the output directory specified above. You are now ready to run the testing process.
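If you want to confirm what was saved, you can inspect the checkpoint from Python. A minimal sketch, assuming a standard PyTorch checkpoint dictionary (the exact key names depend on train.py and may differ):

import torch

# Load on CPU so no GPU is needed just to inspect the file.
ckpt = torch.load("Invnet_models/nci_v100_eb-40/checkpoint.pth",
                  map_location="cpu")

# A typical checkpoint stores the model weights alongside optimizer
# state and an epoch counter; "model" here is an assumed key name.
print(ckpt.keys())
if "model" in ckpt:
    for name, tensor in list(ckpt["model"].items())[:5]:
        print(name, tuple(tensor.shape))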
OpenFWI testing in a batch job
After the training process has produced the trained model, you can run the testing process to check how the network performs on the validation data and compare its predictions with the ground truth.
You can put the following command into a PBS job with GPU resources:
python test.py -ds flatvel-a -j 12 -n nci_v100_eb-40 -m InversionNet -v flatvel_a_val.txt -r checkpoint.pth --vis -vb 2 -vsa 3
The flags have the same meanings as in train.py. Make sure the experiment folder name is consistent between the training and testing processes. The three new flags are listed below:
'--vis': visualisation option
'-vb': number of batches to be visualised
'-vsa': number of samples per batch to be visualised
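Beyond the figures, you may want a quantitative comparison of the predictions against the ground truth. The sketch below computes simple MAE and RMSE between two velocity maps, assuming you have them available as numpy arrays; test.py reports its own metrics, so this is an illustration only.

import numpy as np

def compare_velocity(pred, truth):
    # Simple error metrics between two velocity maps of the same shape.
    mae = np.abs(pred - truth).mean()
    rmse = np.sqrt(((pred - truth) ** 2).mean())
    return mae, rmse

# Dummy example with random maps shaped like a FlatVel-A velocity model.
pred = np.random.rand(70, 70)
truth = np.random.rand(70, 70)
mae, rmse = compare_velocity(pred, truth)
print(f"MAE: {mae:.4f}, RMSE: {rmse:.4f}")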
An example OpenFWI testing PBS job script (testing.sh) is provided below:
#!/bin/bash
#PBS -q gpuvolta
#PBS -P <insert_compute_project_code>
#PBS -l ngpus=1
#PBS -l ncpus=12
#PBS -l mem=50GB
#PBS -l jobfs=100GB
#PBS -l walltime=00:02:00
#PBS -l storage=gdata/up99+<insert_gdata_or_scratch_project_code_that_contains_the_OpenFWI_scripts>
#PBS -l wd
#PBS -N OpenFWI_testing

module use /g/data/up99/modulefiles
module load openfwi/23.11

cd $PBS_O_WORKDIR

python test.py -ds flatvel-a -j 12 -n nci_v100_eb-40 -m InversionNet -v flatvel_a_val.txt -r checkpoint.pth --vis -vb 2 -vsa 3
To submit testing.sh:
$ qsub testing.sh
After running the testing process, you will find several figure files under the folder "nci_v100_eb-40/visualization". Each file contains both a "Prediction" and a "Ground Truth" velocity figure.
OpenFWI inference in JupyterLab
If you've set up your working environment by copying the ready-to-use archive from the 'project up99' directory on Gadi, you'll find a Jupyter notebook named 'inference.ipynb'. This notebook, prepared by NCI, is designed for visualising prediction results.
The archive already includes pretrained checkpoint files under the 'Invnet_models/nci_v100_eb-40' directory, so you can run the inference tasks directly. Alternatively, you can rerun the training process, as demonstrated in the section above, to regenerate the checkpoint files.
To run this notebook, launch a JupyterLab session on the Australian Research Environment (ARE) available at https://are.nci.org.au. For further details about ARE, please refer to the ARE User Guide.
To run the inference notebook, request the following in an ARE JupyterLab session:
Walltime (hours): 2
Under the “Advanced options…” tab, add the following module settings:
Module directories: /g/data/up99/modulefiles
Modules: openfwi/23.11
Now click “Launch” and your JupyterLab job will join the queue.
After starting JupyterLab, navigate to the 'Open JupyterLab' tab.
In the file browser pane on the left of your JupyterLab session, navigate to the directory containing the 'inference.ipynb' notebook. For instance, if you copied the repo to your home directory, navigate to the 'home' folder.
Launch and execute the OpenFWI inference notebook by opening the 'inference.ipynb' file.
In this notebook, you can run inference on the clean test dataset by setting args to "-ds flatvel-a -j 12 -n nci_v100_eb-40 -m InversionNet -v flatvel_a_val.txt -r checkpoint.pth --vis -vb 2 -vsa 3".
It will plot figures comparing the "Prediction" and "Ground Truth" velocity maps.
You can also add Gaussian noise to the test datasets by using the "--missing" and "--std" flags in args. For example:
"-ds flatvel-a -j 12 -n nci_v100_eb-40 -m InversionNet -v flatvel_a_val.txt -r checkpoint.pth --missing 1 --std 0.001 --vis -vb 2 -vsa 3".
In this case, the notebook will plot 5 channels of the seismic data, showing:
- the "Prediction" input with the Gaussian noise added;
- the original "Clean" seismic data;
- their difference, i.e. the noise itself.
The notebook will also produce the "Prediction" and "Ground Truth" velocity models. The velocity model on the left is predicted by the network from the noisy test data (i.e., the 5 noisy channels shown in the figure above).
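To see roughly what the '--std' flag does to the input, the sketch below adds zero-mean Gaussian noise with a given standard deviation to a seismic array. This assumes the noise is sampled independently per sample; the notebook's exact noise model may differ.

import numpy as np

def add_gaussian_noise(seismic, std=0.001, seed=0):
    # Return a noisy copy of the seismic data plus the noise itself.
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=std, size=seismic.shape)
    return seismic + noise, noise

# Dummy example: 5 sources x 1000 time samples x 70 receivers.
clean = np.zeros((5, 1000, 70), dtype=np.float32)
noisy, noise = add_gaussian_noise(clean, std=0.001)

# The difference between noisy and clean recovers the noise,
# mirroring the three panels the notebook plots.
print(np.allclose(noisy - clean, noise))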