# 📖 How to Find a GPU Hosting Service – a Guide by Viraaj Akuthota

*For his project "[Human Rights Predictor](https://prototypefund.de/project/human-rights-predictor/)" (Round 15) our grantee Viraaj Akuthota was looking for a GPU hosting service. Here he explains how he went about it:*

To fine-tune models and create embeddings on large corpora of qualitative data, a large amount of GPU RAM (VRAM) is required. For example, fine-tuning BERT on a dataset of 15k cases that vary in size creates roughly 100k-200k sequences at a 512-token limit, which requires approximately 140 GB of VRAM. This hardware requirement means such tasks cannot be run on most consumer-grade machines. I therefore set out to identify an affordable and relatively easy-to-use cloud compute option, and ran into many difficulties along the way. The benefits and disadvantages of most of the service providers I reviewed are listed in the table below.
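
To make the sequence count above more concrete, here is a minimal sketch of how a corpus might be sized before renting hardware. It assumes documents are split into non-overlapping chunks and that two of BERT's 512 positions are taken by the special `[CLS]` and `[SEP]` tokens. The function name `count_sequences` and the example corpus (15,000 cases of 5,000 tokens each) are illustrative assumptions, not figures from my project.

```python
import math


def count_sequences(token_counts, max_len=512, reserved_tokens=2):
    """Estimate how many fixed-length sequences a corpus splits into.

    token_counts: number of tokens in each document (hypothetical input).
    max_len: model token limit (512 for BERT).
    reserved_tokens: positions taken by special tokens such as [CLS] and [SEP].
    """
    usable = max_len - reserved_tokens
    if usable <= 0:
        raise ValueError("max_len must exceed reserved_tokens")
    # Each document needs ceil(tokens / usable) non-overlapping chunks.
    return sum(math.ceil(n / usable) for n in token_counts)


# Assumed corpus: 15,000 cases of 5,000 tokens each (illustrative only).
corpus = [5000] * 15000
print(count_sequences(corpus))  # 150000
```

With real data, the token counts would come from the tokenizer of the model being fine-tuned, and an overlapping (strided) split would produce more sequences than this estimate.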

Overall, the production setup I landed on uses:  
· Paperspace Core with a Windows Server instance, to avoid using the terminal as much as possible.  
· Always-available multi-GPU instances, for example 4 x Nvidia A6000 GPUs with 192 GB VRAM in total for roughly $7 USD an hour.  
· Approximately $3 USD per month for 50 GB of persistent storage, making costs negligible while the machine is offline.  
· For Linux users, Paperspace offers a Python ML template, which saves time installing Python, packages, CUDA, etc.
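
The cost of this setup can be estimated with simple arithmetic. The sketch below uses the $7.56 hourly rate quoted for the 4 x A6000 machine in the table and the roughly $3 monthly storage fee. The function names and the 20 hours of usage are assumptions chosen for illustration, and actual prices may differ.

```python
def total_vram_gb(gpu_count, vram_per_gpu_gb):
    """Combined VRAM across all GPUs in one instance."""
    return gpu_count * vram_per_gpu_gb


def monthly_cost_usd(gpu_hours, hourly_rate=7.56, storage_per_month=3.0):
    """Pay-per-hour compute plus the flat monthly storage fee."""
    return round(gpu_hours * hourly_rate + storage_per_month, 2)


print(total_vram_gb(4, 48))   # 192, enough for the ~140 GB requirement
print(monthly_cost_usd(0))    # 3.0, a month with the machine switched off
print(monthly_cost_usd(20))   # 154.2, assuming 20 GPU hours in the month
```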

Before production, I use either Google Colab or HuggingFace:  
· For testing fine-tuning or creating embeddings, I believe Google Colab's free T4 instance provides the most VRAM of any free tier.  
· For testing LLMs, HuggingFace's serverless inference free tier lets you use a variety of LLMs such as Llama 405B. The Pro tier at $9 USD per month raises the rate limit on this inference; I receive approximately 300 API calls per hour.
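
A limit of about 300 calls per hour works out to one call every 12 seconds. The following is a minimal sketch of pacing requests to stay under such a limit. The `call` argument is a hypothetical stand-in for whatever function sends the inference request; no real HuggingFace client API is used here, and the `sleep` and `clock` parameters exist only so the pacing can be tested without waiting.

```python
import time


def min_interval_seconds(calls_per_hour):
    """Smallest gap between calls that stays within the hourly limit."""
    return 3600.0 / calls_per_hour


def paced_calls(items, call, calls_per_hour=300,
                sleep=time.sleep, clock=time.monotonic):
    """Apply `call` to each item, waiting between calls as needed.

    `call` is a placeholder for the real request function (assumption).
    """
    interval = min_interval_seconds(calls_per_hour)
    results = []
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last call.
            remaining = interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(call(item))
    return results


print(min_interval_seconds(300))  # 12.0
```

For a batch of prompts, this would be used as `paced_calls(prompts, send_request)`, where `send_request` is your own function wrapping the API call.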

<table border="1" id="bkmrk-provider-benefits-di" style="border-collapse: collapse; width: 100%; height: 823.4px;"><colgroup><col style="width: 24.9383%;"></col><col style="width: 24.9383%;"></col><col style="width: 24.9383%;"></col><col style="width: 24.9383%;"></col></colgroup><tbody><tr style="height: 29.6px;"><td style="height: 29.6px;">Provider</td><td style="height: 29.6px;">Benefits</td><td style="height: 29.6px;">Disadvantages</td><td style="height: 29.6px;">GPU Limit  
</td></tr><tr style="height: 155.45px;"><td style="height: 155.45px;">Amazon EC2  
</td><td style="height: 155.45px;">- Relatively affordable compared to other cloud providers

</td><td style="height: 155.45px;">- Requires familiarity with AWS
- Application for quotas is not straightforward and the approval process takes time

</td><td style="height: 155.45px;">- Essentially unlimited

</td></tr><tr style="height: 222.65px;"><td style="height: 222.65px;">Amazon Notebooks  
</td><td style="height: 222.65px;">- Easy to set up an ML system
- Relatively affordable compared to other cloud providers

</td><td style="height: 222.65px;">- Notebooks are limited to certain GPU sizes, essentially under 100 GB VRAM.
- Even if you have a quota for the underlying resource, it will not work for notebooks

</td><td style="height: 222.65px;">- Under 100 GB VRAM

</td></tr><tr style="height: 205.85px;"><td style="height: 205.85px;">Microsoft Azure  
</td><td style="height: 205.85px;">  
</td><td style="height: 205.85px;">- The registration system and console are complicated enough that I did not use this service.
- The quota application process did not seem straightforward.

</td><td style="height: 205.85px;">- Essentially unlimited

</td></tr><tr style="height: 121.05px;"><td style="height: 121.05px;">Google Cloud  
</td><td style="height: 121.05px;">  
</td><td style="height: 121.05px;">- Unable to secure access to a high-end GPU as they were **ALWAYS** unavailable

</td><td style="height: 121.05px;">  
</td></tr><tr style="height: 29.6px;"><td style="height: 29.6px;">Google Colab  
</td><td style="height: 29.6px;">- Very easy to use and set up

</td><td style="height: 29.6px;">- Relatively more expensive
- No guaranteed access to the most powerful GPUs that are advertised as available, even with premium services

</td><td style="height: 29.6px;">- A100 GPU with 40 GB VRAM, if available, which is rare

</td></tr><tr style="height: 29.6px;"><td style="height: 29.6px;">Paperspace Notebooks  
</td><td style="height: 29.6px;">- Very easy to use and set up
- Several ‘free’ GPUs available with unlimited hours on the premium plan

</td><td style="height: 29.6px;">  
</td><td style="height: 29.6px;">- Paperspace plans provide various machines for up to 6 hours of continuous use, with a mix of free and paid options. The free options still require a base payment plan to be purchased
- On the premium plan, a single P5000 machine with 15 GB VRAM is available for free.
- A 'Core' machine can also be rented by the hour without paying for a monthly plan. I currently have 4 x A6000 with 48 GB VRAM each for $7.56 an hour.

</td></tr><tr style="height: 29.6px;"><td style="height: 29.6px;">Paperspace Server/Console  
</td><td style="height: 29.6px;">- Always available multi-GPU instance
- ML template server instances
- Easy server setup

</td><td style="height: 29.6px;">- More expensive than the big players
- Some of the ML template server instances come with library issues

</td><td style="height: 29.6px;">- Essentially unlimited

</td></tr></tbody></table>