Parabricks provides a VM that can be easily run on GCP.
This VM comes pre-installed with the Parabricks genomic pipeline and algorithms.
By the end of this tutorial, you will have a running instance of Parabricks deployed on GCP and should be able to SSH into the instance to run genomic analyses (for example, the germline pipeline).
This installation method will work well for customers who want to run one or more servers on GCP with Parabricks installed. For advanced use cases, you also have the option of installing the software yourself (see the installation guide for Server Installation) on local servers.
Running Parabricks on GCP requires a valid software license. Free trials are available so you can test the software without incurring licensing fees from Parabricks. You can contact email@example.com if you'd like to request a license for a software pilot. (Note: GCP will charge its own fees for any computing resources that you consume.)
By default, GCP Compute Engine has resource quotas in place. Parabricks requires some special compute resources which are not enabled under the default quotas. In order to run Parabricks you will need to increase some of the quotas.
Quotas for regular (non-preemptible) Virtual Machine instances:
Quota: Minimum Per Parabricks Node
NVIDIA V100 GPUs: 4-8 when using V100 GPUs
NVIDIA P100 GPUs: 4 when using P100 GPUs
CPUs: 48 for nodes with 4 GPUs; 80 for nodes with 8 GPUs
Local SSD (GB): (Optional) If you attach a data volume to the App in Google Marketplace, it must be smaller than this quota.
Quotas for preemptible Virtual Machine instances: The quotas for preemptible instances are the same as the ones above except that GCP adds the word "Preemptible" in front of them. For example, "CPUs" becomes "Preemptible CPUs" and "NVIDIA V100 GPUs" becomes "Preemptible NVIDIA V100 GPUs".
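As a rough illustration, the per-node minimums and the preemptible naming rule described above can be written down as two small shell helpers. This is only a sketch: the function names min_cpus_for_gpus and preemptible_quota_name are hypothetical and are not part of Parabricks or gcloud; the numbers are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Parabricks or gcloud) that encode the
# per-node quota minimums and the preemptible naming rule described above.

# Print the minimum CPU quota for a node with the given number of GPUs.
min_cpus_for_gpus() {
  case "$1" in
    4) echo 48 ;;
    8) echo 80 ;;
    *) echo "unsupported GPU count: $1 (expected 4 or 8)" >&2; return 1 ;;
  esac
}

# Print the name GCP shows for the preemptible variant of a quota.
preemptible_quota_name() {
  echo "Preemptible $1"
}

# Example: quotas to request for one preemptible 8-GPU V100 node.
# echo "$(preemptible_quota_name 'NVIDIA V100 GPUs'): 8"
# echo "$(preemptible_quota_name 'CPUs'): $(min_cpus_for_gpus 8)"
```

To compare these numbers with what your project currently has, `gcloud compute regions describe <region>` prints the quota limits and usage for that region.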
Step 1: Visit your GCP Console and choose the project in which you want to launch Parabricks. Find the Marketplace and click on it.
Step 2: Search for the "Parabricks" application in the Marketplace and select it. Choose "Launch".
Step 3: Verify that you are satisfied with the virtual machine settings and choose "Deploy". (The default options are suitable for many use cases, but most options are configurable in case you wish to change them to suit your needs.)
Step 4: The virtual machine may take 5-15 minutes to start up. Once the instance has started, you should be able to SSH into it to run genomic analysis using Parabricks.
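If you prefer not to watch the Console while the instance starts, the wait in Step 4 can be scripted. The sketch below polls the instance status with gcloud until it reports RUNNING. The function names get_status and wait_for_running are hypothetical, and the defaults (30 checks, 30 seconds apart, about 15 minutes in total) are assumptions chosen to match the startup time mentioned above. A RUNNING status does not guarantee that SSH is already accepting connections, so a short extra wait may still be needed.

```shell
#!/usr/bin/env bash
# Sketch: poll until a Compute Engine instance reports status RUNNING.
# Function names and default timings are illustrative assumptions.

# Print the instance status (for example PROVISIONING, STAGING, RUNNING).
get_status() {
  gcloud compute instances describe "$1" --zone "$2" --format='value(status)'
}

# Poll get_status up to MAX_TRIES times, sleeping between attempts.
# Usage: wait_for_running INSTANCE ZONE [MAX_TRIES] [SLEEP_SECONDS]
wait_for_running() {
  local instance="$1" zone="$2"
  local max_tries="${3:-30}" delay="${4:-30}"
  local try=1
  while [ "$try" -le "$max_tries" ]; do
    if [ "$(get_status "$instance" "$zone")" = "RUNNING" ]; then
      echo "$instance is RUNNING"
      return 0
    fi
    sleep "$delay"
    try=$((try + 1))
  done
  echo "$instance did not reach RUNNING after $max_tries checks" >&2
  return 1
}

# Example (placeholders in angle brackets are yours to fill in):
# wait_for_running <name-of-your-instance> <zone-of-your-instance>
```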
Copy your local license file up to the virtual machine:
$ gcloud compute scp ~/your-parabricks-license.bin <your-instance>:~/license.bin
You can SSH into your instance directly from the GCP Console or from your local terminal (assuming you have gcloud installed and have permission to SSH into the instance):
$ gcloud compute --project <your-project> ssh --zone <zone-of-your-instance> <name-of-your-instance>
Copy your license from your home folder to the installation location:
$ sudo cp ~/license.bin /opt/parabricks
Now you are ready to run Parabricks on Google Cloud Platform.
# Step 3: verify your installation.
# This should display the parabricks version number:
$ pbrun version
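The license and verification commands above can also be run from your local machine in one go. The following is a minimal sketch rather than an official Parabricks script: the function names run and install_license and the DRY_RUN switch are hypothetical. It assumes your default gcloud project is already set and that pbrun is on the PATH for non-interactive SSH sessions. With DRY_RUN=1 the gcloud commands are printed (without their original quoting) instead of executed, which is useful for checking them first.

```shell
#!/usr/bin/env bash
# Sketch: copy the license to the instance, install it, and check the version.
# Function names and the DRY_RUN switch are illustrative assumptions.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: install_license LICENSE_FILE INSTANCE ZONE
install_license() {
  local license="$1" instance="$2" zone="$3"
  # Copy the local license file to ~/license.bin on the instance.
  run gcloud compute scp "$license" "$instance:~/license.bin" --zone "$zone" || return 1
  # Copy it to the installation location, then print the Parabricks version.
  run gcloud compute ssh "$instance" --zone "$zone" \
    --command "sudo cp ~/license.bin /opt/parabricks && pbrun version"
}

# Example (placeholders in angle brackets are yours to fill in):
# DRY_RUN=1
# install_license ~/your-parabricks-license.bin <name-of-your-instance> <zone-of-your-instance>
```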