Many workloads need substantial CPU power, either because not every task in the pipeline can be accelerated on the GPU or because doing so would require a custom fork of the machine learning algorithm in question. For instance, object detection algorithms such as YOLOv5 benefit from careful image pre-processing, which is handled mainly by the CPU. In such cases, the CPU becomes a bottleneck in training runs and prevents optimal GPU utilization. To mitigate this problem, we have introduced a new type of NVIDIA® GeForce® RTX 3090 and NVIDIA® GeForce® RTX 3080 GPU instance. For only a 10% price increase, the new ‘CPU and Memory optimized’ instances offer double the CPU count and system memory of the normal instances.
In this blog post, we share the results of one of our benchmarking tests, performed on both CPU and Memory optimized and normal RTX 3090 GPU instances. We trained the YOLOv5 model on the COCO dataset to investigate the differences between the two instance types and to show how the optimized one can help the Genesis Cloud community. We trained the model with a batch size of 48 for 10 epochs and compared CPU/GPU utilization and total training time between the two instance types.
When training the YOLOv5 model on a normal instance, CPU utilization reached 97.7%, making the CPU a bottleneck in the training run and limiting the GPU to an average utilization of 38%. In contrast, when running the training on an optimized instance, the maximum CPU utilization was only 89.9%, allowing the GPU to reach up to 78% utilization and 65% on average, as we can see in the table below:
Mitigating CPU bottlenecks can bring customers a substantial benefit in training time, and therefore in service costs: training on an optimized instance took about half the time of training on a normal instance. Even with the 10% higher price, the optimized instances enable users to train their algorithms at a 1.5x lower cost.
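The cost argument above is simple arithmetic: total cost scales with training time multiplied by hourly price. Below is a minimal sketch in Python; the function names are hypothetical, and the time ratios are illustrative inputs rather than measured values from our benchmark. Only the 10% price increase and the 1.5x saving are taken from this post.

```python
def relative_cost(time_ratio: float, price_ratio: float) -> float:
    """Cost of a run on the optimized instance relative to the normal one.

    time_ratio  = optimized training time / normal training time
    price_ratio = optimized hourly price  / normal hourly price
    """
    return time_ratio * price_ratio


def implied_time_ratio(savings_factor: float, price_ratio: float) -> float:
    """Time ratio that yields a given savings factor at a given price ratio."""
    return 1.0 / (savings_factor * price_ratio)


PRICE_RATIO = 1.10  # 10% price increase, as stated in this post

# Illustrative case: training takes exactly half the time.
cost_at_half_time = relative_cost(0.5, PRICE_RATIO)

# Time ratio implied by the 1.5x saving stated in this post.
time_ratio_for_1_5x = implied_time_ratio(1.5, PRICE_RATIO)

# Above this time ratio, the optimized instance stops paying for itself.
break_even_time_ratio = 1.0 / PRICE_RATIO

print(f"relative cost at half the time:  {cost_at_half_time:.2f}")
print(f"time ratio implied by 1.5x:      {time_ratio_for_1_5x:.2f}")
print(f"break-even time ratio:           {break_even_time_ratio:.2f}")
```

Under these assumptions, a 1.5x saving corresponds to a training time of roughly 0.6 of the normal run, which is consistent with "about half". The break-even value shows that the optimized instance only pays off if it shortens training by more than about 9%.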
The table and graph below showcase the training time and price comparison between the optimized and normal RTX 3090 instances (prices are as of November 2022):
As highlighted throughout this blog post, the new CPU and Memory optimized RTX 3090 and 3080 GPU instances offer a more cost-effective option for Genesis Cloud customers whose training runs are CPU-bound. However, training a model does not always result in a CPU bottleneck: if your model doesn’t require much CPU processing power and the CPU does not limit the GPU utilization of your instance, it is advisable to use the normal instances, as they are cheaper and will give the same results in the end.
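One way to apply this advice is to look at the utilization figures from a short trial run before choosing an instance type. The sketch below is a rough heuristic, not a Genesis Cloud tool: the function name and both thresholds (90% peak CPU, 50% average GPU) are assumptions chosen for illustration, and the inputs are the utilization figures reported earlier in this post.

```python
def is_cpu_bound(cpu_peak_pct: float, gpu_mean_pct: float,
                 cpu_high: float = 90.0, gpu_low: float = 50.0) -> bool:
    """Heuristic: a run looks CPU-bound when the CPU is close to saturation
    while the GPU sits idle much of the time.

    cpu_high and gpu_low are illustrative thresholds (assumptions),
    not values recommended by Genesis Cloud.
    """
    return cpu_peak_pct >= cpu_high and gpu_mean_pct <= gpu_low


# Figures reported in this post for the YOLOv5 run on a normal instance.
normal_run = is_cpu_bound(cpu_peak_pct=97.7, gpu_mean_pct=38.0)

# Figures reported in this post for the same run on an optimized instance.
optimized_run = is_cpu_bound(cpu_peak_pct=89.9, gpu_mean_pct=65.0)

print("normal instance looks CPU-bound:   ", normal_run)
print("optimized instance looks CPU-bound:", optimized_run)
```

If a trial run on a normal instance does not look CPU-bound by a check like this, the normal instance is likely the cheaper choice for that workload.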
The Genesis Cloud team