# 📖 How to Find a GPU Hosting Service – a Guide by Viraaj Akuthota

*For his project "[Human Rights Predictor](https://prototypefund.de/project/human-rights-predictor/)" (Round 15), our grantee Viraaj Akuthota was looking for a GPU hosting service. Here he explains how he went about it:*

Fine-tuning models and creating embeddings on large corpora of qualitative data requires a large amount of GPU RAM (VRAM). For example, fine-tuning BERT on a dataset of 15k cases that vary in size creates roughly 100k-200k sequences at a 512-token limit. This requires approximately 140 GB of VRAM, which means such tasks cannot be run on most consumer-grade machines.

I set out to identify an affordable and relatively easy-to-use cloud compute option, and ran into many difficulties along the way. The benefits and disadvantages of most of the service providers I reviewed are listed in the table below.

Overall, the production setup I landed on is:

- Paperspace's Core with a Windows Server instance, to avoid using the terminal as much as possible.
- Always-available multi-GPU instances, for example 4 x Nvidia A6000 GPUs with 192 GB VRAM in total for roughly $7 USD an hour.
- Approximately $3 USD per month for 50 GB of persistent storage, which makes the cost while the machine is offline negligible.
- For Linux users, Paperspace offers a Python ML template, which saves time installing Python, packages, CUDA, etc.

Before production, I use either Google Colab or HuggingFace:

- For testing fine-tuning or creating embeddings, I believe Google Colab's free T4 instance provides the most VRAM of any free tier.
- For testing LLMs, HuggingFace's serverless inference free tier lets you use a variety of LLMs such as LLAMA 405B. However, the Pro tier at $9 USD per month increases the rate limit on this inference; I receive approximately 300 API calls per hour.
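
As a rough illustration of how a corpus of cases turns into 512-token sequences, the sketch below counts the fixed-length windows needed to cover each document. It is not the author's pipeline: the function name `count_sequences`, the `stride` parameter (overlap between consecutive windows) and the per-document token counts are assumptions made for this example, and real tokenizers also reserve a few positions for special tokens.

```python
import math


def count_sequences(token_counts, max_len=512, stride=0):
    """Count the windows of at most `max_len` tokens needed to cover each document.

    `token_counts` holds one token count per document (hypothetical input).
    `stride` is the number of tokens shared by consecutive windows (assumption).
    """
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    step = max_len - stride  # new tokens covered by each additional window
    total = 0
    for n in token_counts:
        if n <= max_len:
            total += 1  # a short document fits into a single sequence
        else:
            # one full window, plus enough further windows for the remainder
            total += 1 + math.ceil((n - max_len) / step)
    return total


# Hypothetical corpus: 15,000 cases of 5,000 tokens each, no overlap
print(count_sequences([5000] * 15000))
```

Under that hypothetical average of 5,000 tokens per case, each case yields 10 sequences and the corpus yields 150,000, which falls inside the 100k-200k range quoted above. Adding overlap between windows increases the count.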

| Provider | Benefits | Disadvantages | GPU Limit |
| --- | --- | --- | --- |
| Amazon EC2 | Relatively affordable compared to other cloud providers | Requires familiarity with AWS; applying for quotas is not straightforward and the approval process takes time | Essentially unlimited |
| Amazon Notebooks | Easy to set up an ML system; relatively affordable compared to other cloud providers | Notebooks are limited to certain GPU sizes, essentially under 100 GB VRAM; even if you have a quota for the underlying resource, it will not work for a notebook | Under 100 GB VRAM |
| Microsoft Azure | | The registration system and console are sufficiently complicated that I did not use this service; the quota application process did not seem straightforward | Essentially unlimited |
| Google Cloud | | Unable to secure access to a high-end GPU, as they were ALWAYS unavailable | |
| Google Colab | Very easy to use and set up | Relatively more expensive; access to the most powerful GPUs that are advertised is not guaranteed, even with premium services | A100 GPU with 40 GB VRAM, if available, which is rare |
| Paperspace Notebooks | Very easy to use and set up; multiple "free" GPUs available with unlimited hours on the premium option | | Paperspace has plans that provide various systems at 6 hours of continuous use, with a mix of free and paid options; the free options still require a base payment plan. On the premium plan, a single P5000 machine with 15 GB VRAM is available for free. A "core" machine can also be purchased, paying per hour without a monthly plan; I currently have 4 x A6000 (48 GB VRAM each) for $7.56 an hour. |
| Paperspace Server/Console | Always-available multi-GPU instances; ML template server instances; easy server setup | More expensive than the big players; some of the ML template server instances come with library issues | Essentially unlimited |
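
To make the cost figures quoted in this guide easier to compare, here is a minimal back-of-the-envelope sketch. It uses only the numbers given above ($7.56 per hour for 4 x A6000 with 48 GB VRAM each, and roughly $3 per month for 50 GB of storage). The function names are hypothetical, and the simple "hours times rate plus storage" billing model is an assumption that ignores taxes, partial-hour rounding and any other fees a provider may charge.

```python
def total_vram_gb(num_gpus=4, vram_per_gpu_gb=48):
    """Combined VRAM of a multi-GPU instance, in GB."""
    return num_gpus * vram_per_gpu_gb


def monthly_cost_usd(gpu_hours, hourly_rate=7.56, storage_per_month=3.0):
    """Estimated monthly bill: hourly GPU usage plus flat persistent storage.

    Assumes billing is linear in hours, which is a simplification.
    """
    if gpu_hours < 0:
        raise ValueError("gpu_hours must be non-negative")
    return gpu_hours * hourly_rate + storage_per_month


print(total_vram_gb())                  # VRAM of the 4 x A6000 machine
print(round(monthly_cost_usd(0), 2))    # a month with the machine switched off
print(round(monthly_cost_usd(10), 2))   # a month with 10 hours of training
```

With these inputs the instance offers 192 GB of VRAM, comfortably above the roughly 140 GB needed for the BERT fine-tuning example, and a month in which the machine stays off costs only the storage fee.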