The most common obstacle users encounter when trying to create a CloudyCluster environment is a conflict with their GCP Project Quotas.
Projects have resource limits in place, which are called Quotas. GCP Quotas are designed to prevent users of a project from accidentally creating too much of a certain resource. By default, the quotas for most resources are set relatively low, which protects your billing account against accidental charges but can prevent CloudyCluster from creating a functional Torque/Slurm environment.
Estimating and adjusting the quotas in your project is critical for accommodating your HPC workloads inside the GCP project running CloudyCluster. This brief guide explains how to estimate resource requirements and adjust quotas accordingly when using CloudyCluster.
Understanding Quotas
The most important step in this process is to understand what quotas actually are and how to adjust them.
Please start by reading the GCP documentation on quotas. You will learn how to check and adjust quotas for each resource type.
Estimating Quotas
You will need to estimate the total resource requirements of a deployed CloudyCluster environment running your specific workload. This can be simplified into a basic formula:
GCP_QUOTA = BASE_ENVIRONMENT_RESOURCES + MAX_WORKLOAD_RESOURCES
GCP_QUOTA is the necessary quota for a particular resource type.
BASE_ENVIRONMENT_RESOURCES are the resources needed to start a CloudyCluster environment of the size you desire. This includes the control, login, scheduler, and filesystem VM instances and their CPU, memory, and disk requirements. Quotas must be adjusted to accommodate your base environment before it is created.
MAX_WORKLOAD_RESOURCES are the maximum resources required to accommodate your various workloads. This includes your compute VM instances and any hardware they use (including GPUs). Compute instances are created dynamically at the time a CloudyCluster job is submitted. Quotas can be adjusted to accommodate larger workloads at any time, even after the base environment has been created.
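The formula above can be applied once per resource type. The following is a minimal Python sketch of that bookkeeping, not part of CloudyCluster itself; the function name, the resource keys, and all of the numbers are hypothetical and only illustrate the arithmetic.

```python
def required_quota(base_environment, max_workload):
    """Apply GCP_QUOTA = BASE_ENVIRONMENT_RESOURCES + MAX_WORKLOAD_RESOURCES
    to every resource type that appears in either dictionary."""
    resource_types = sorted(set(base_environment) | set(max_workload))
    return {
        name: base_environment.get(name, 0) + max_workload.get(name, 0)
        for name in resource_types
    }


# Hypothetical estimates, for illustration only.
base = {"instances": 4, "cpus": 12, "disk_gb": 365}
workload = {"instances": 8, "cpus": 64, "disk_gb": 440, "v100_gpus": 8}

quota = required_quota(base, workload)
for name, amount in quota.items():
    print(f"{name}: {amount}")
```

A resource that appears in only one of the two estimates (GPUs in this example) is treated as zero in the other, so the result always lists every resource type you are tracking.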
It is highly recommended that you create a document or spreadsheet to keep track of your resource requirements!
Base CloudyCluster Resources
To determine the resource requirements of your base environment, first decide how large you want it to be!
Start with the control instance. When you deploy CloudyCluster from the GCP Marketplace, take note of the Persistent Disk (PD), Memory, and CPU requirements of your desired control instance type. The default values are sufficient, but you may want to increase the boot disk size to create a custom image, or choose a more powerful instance type for the control node.
Next, follow the GCP Quick Start Deployment Guide to deploy the control node via the Marketplace and gain access. Once you enter the cluster creation wizard, follow the Quick Start Environment Setup Procedure to determine the right size cluster for your use case. Keep careful note of the Persistent Disk, CPU, and Memory of each node: login, scheduler, and filesystem(s). Also be sure to keep track of the number of instances, because that is a quota as well! The number of instances for a base environment is 3 plus the number of filesystem instances specified (at least 1).
Add these requirements to your doc, check and adjust the quotas if necessary, and then create your cluster!
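As an illustration of the tally described above, here is a small Python sketch that sums the per-node figures for a base environment. The `Node` class, the role names, and every size shown are hypothetical placeholders; substitute the values you noted in the Marketplace form and the cluster creation wizard.

```python
from dataclasses import dataclass


@dataclass
class Node:
    role: str
    cpus: int
    memory_gb: float
    disk_gb: int


def base_environment_totals(nodes):
    """Sum the instance count, CPU, memory, and disk of the base environment."""
    return {
        "instances": len(nodes),
        "cpus": sum(n.cpus for n in nodes),
        "memory_gb": sum(n.memory_gb for n in nodes),
        "disk_gb": sum(n.disk_gb for n in nodes),
    }


# Hypothetical sizes: three fixed nodes plus one filesystem instance (3 + 1 = 4).
nodes = [
    Node("control", cpus=4, memory_gb=15.0, disk_gb=55),
    Node("login", cpus=2, memory_gb=7.5, disk_gb=55),
    Node("scheduler", cpus=2, memory_gb=7.5, disk_gb=55),
    Node("filesystem", cpus=4, memory_gb=15.0, disk_gb=200),
]

totals = base_environment_totals(nodes)
print(totals)
```

Adding a second filesystem instance to the list raises the instance count to 5, which matches the 3 plus filesystem instances rule.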
Workload Resources
While you are reading about the different resource types in the GCP documentation, keep your particular workload in mind, and think of specific requirements in terms of the resource types outlined in the documentation. For example, if you know your jobs require GPUs and will run best on NVIDIA V100s, be sure to adjust the NVIDIA_V100_GPUS quota in the region where you intend to run your CloudyCluster environment.
Then calculate the totals for the number of desired compute instances, memory, CPU, and disk requirements. The disk size per compute instance will be the default size of the CloudyCluster VM image (currently 55 GB), unless you created a custom image.
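The workload totals can be worked out with simple multiplication. The sketch below assumes every compute instance in the largest workload uses the same machine shape; the function name, parameter names, and example figures are hypothetical, and only the 55 GB default disk size comes from this guide.

```python
# Default CloudyCluster VM image size stated in this guide; use your own
# value if you created a custom image.
DEFAULT_IMAGE_DISK_GB = 55


def max_workload_totals(instance_count, cpus_per_instance, memory_gb_per_instance,
                        gpus_per_instance=0, disk_gb_per_instance=DEFAULT_IMAGE_DISK_GB):
    """Total resources for the largest workload, assuming identical compute instances."""
    return {
        "instances": instance_count,
        "cpus": instance_count * cpus_per_instance,
        "memory_gb": instance_count * memory_gb_per_instance,
        "gpus": instance_count * gpus_per_instance,
        "disk_gb": instance_count * disk_gb_per_instance,
    }


# Hypothetical workload: 8 compute instances, each with 8 CPUs, 52 GB of
# memory, and 1 GPU.
workload = max_workload_totals(8, cpus_per_instance=8, memory_gb_per_instance=52,
                               gpus_per_instance=1)
print(workload)
```

If your jobs use several different instance types, compute the totals for each type separately and add them together.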
Adjusting Quotas
Once you have the totals for each quota, you will need to adjust the quotas for your GCP project. Follow the GCP documentation and head to the Quotas page to view and adjust them. If you run into permissions issues, contact the owner of your project, or GCP customer support if you are using a project Google has provisioned for you.
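Before requesting increases, it can help to compare your totals against the current limits. The sketch below assumes you have saved the JSON output of `gcloud compute regions describe <region> --format=json` and that it contains a `quotas` list whose entries have `metric`, `limit`, and `usage` fields; the sample data, the function name, and the required amounts are hypothetical. It also assumes the cluster has not been created yet, so current usage is subtracted from each limit.

```python
import json


def quota_shortfalls(region_description, required):
    """Return {metric: (needed, available)} for each metric whose remaining
    quota (limit minus current usage) is smaller than the required amount."""
    available = {
        q["metric"]: q["limit"] - q.get("usage", 0)
        for q in region_description.get("quotas", [])
    }
    shortfalls = {}
    for metric, needed in required.items():
        remaining = available.get(metric)  # None if the metric is not listed
        if remaining is None or needed > remaining:
            shortfalls[metric] = (needed, remaining)
    return shortfalls


# Hypothetical sample in the assumed shape of the gcloud JSON output.
sample = json.loads("""
{
  "name": "us-central1",
  "quotas": [
    {"metric": "CPUS", "limit": 24.0, "usage": 0.0},
    {"metric": "INSTANCES", "limit": 24.0, "usage": 2.0},
    {"metric": "NVIDIA_V100_GPUS", "limit": 1.0, "usage": 0.0}
  ]
}
""")

required = {"CPUS": 76, "INSTANCES": 12, "NVIDIA_V100_GPUS": 8}
result = quota_shortfalls(sample, required)
for metric, (needed, remaining) in result.items():
    print(f"{metric}: need {needed}, available {remaining}")
```

Only the metrics listed in the result need a quota increase request; in this sample the instance quota already has enough headroom.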
Help
Our team is here to help! If you are unsure about what your quotas should be, contact us by emailing support@cloudycluster.com. Our support team will be happy to assist you with this process!