Some Background

A while back I got tired of the hassle of colocating a server and took it out of colo. My important data got migrated to a couple of external HDDs (yay sneakernet!) and my essential services to an R210 II hosted locally. It stayed this way for a while.

Enter December 2018: I caught wind of BuyVM's block storage feature and pricing. $1.25/mo per 256GB is hard to pass up! I've hosted "cloud storage" for myself in the past using Seafile (which is why I contributed that functionality to ShareX), and embarking on a new project felt like fun.

I hatched my plan - instead of Seafile I'd run NextCloud with OnlyOffice (mainly because I liked the sound of that challenge). I also decided to run everything in Docker for encapsulation, with Minio providing object storage as the backend for NextCloud's filesystem.

I also decided to migrate all of my existing services away from cPanel (for websites) and LXC (via Proxmox, for apps) to Docker. This was quite the challenging endeavor (having minimal prior experience with Docker) but was well worth it in the end.

The "stack" I settled on for Docker is Traefik, Portainer, and NetData. This gives me a reverse proxy, container management, and host performance data - all by simply copying some Docker Compose configs, updating hostnames, and running docker-compose up -d.

I run Docker Compose configs for all my services and use bind mounts to local directories for persistent data. I have a cronjob that backs all the configs and their local persistent data up. If I lose a server I can restore my config & persistent data, update DNS records, and spin it up without much hassle at all. I quite enjoy this setup.
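For illustration, here is a minimal sketch of what a backup job like that could look like. It is not my actual cronjob; it assumes each service lives in its own directory (compose file plus bind-mounted data) under a common root, and the function name and directory layout are made up for the example.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def backup_services(services_root: Path, backup_dir: Path) -> list[Path]:
    """Archive every service directory under services_root.

    Assumed (hypothetical) layout: services_root/<service>/ holds that
    service's docker-compose.yml and its bind-mounted data directories.
    backup_dir is assumed to live outside services_root.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archives = []
    for service in sorted(p for p in services_root.iterdir() if p.is_dir()):
        archive = backup_dir / f"{service.name}-{stamp}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            # Store paths relative to the service name so a restore can
            # be unpacked into any directory on a replacement server.
            tar.add(service, arcname=service.name)
        archives.append(archive)
    return archives
```

A cron entry would then just call this with the real paths. Stopping containers (or dumping databases) before archiving is left out here, and may matter for data that changes while the archive is being written.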

Anyway, that's enough of the backstory.

December 22nd, 2018 - the brown out

I woke up to downtime alerts. All the services I had migrated to BuyVM were down. Around when I started investigating, a few started coming back up and I received this email.

Outage Notification

Alright - that's a bad sign. First of all: their entire block storage array had no redundant power? Who thought launching before finishing that installation was a great idea?

As I began investigating my remaining offline service, NextCloud, I found that the entire block storage volume was missing random assortments of data, with corruption throughout. I feared the worst, but decided to open a ticket just in case (read bottom to top).

All the data ended up being scrapped. As the data was just my local synced folder I hadn't seen the need to back it up (it's backed up on my desktop and laptop by being synced). In retrospect I should have backed up the Minio configs as those were a hassle to create.

At this point I should have cut ties and found a new provider, but Francisco had apologized and guaranteed redundant power would be set up soon. Apologizing means a lot to me. He also gave me a free month of hosting as an apology (which helped) - granted, it got mishandled and I had to remind him of that promise.

I could no longer trust their block storage not to be a hassle in the future, especially after noticing that none of their material mentions any sort of redundancy, so I relegated that slab to being a backup endpoint for other projects. I did, however, keep all my Docker containers on my VMs with them.

June 2019 - terms of service inconsistencies and preventable downtime

At this point half a year had passed without significant issues or downtime. Starting June 1st, I began getting payment failure emails. Unfortunately I was unable to resolve the issue at the time because of 2FA problems (phone died, waiting on replacement).

Having checked BuyVM's terms of service I noted I didn't have to worry about suspension until 5 days after the due date. I opted to wait for my replacement instead of restoring my 2FA tokens to a friend's phone temporarily.

The replacement arrived nice and early (thanks to FedEx Early AM and an effective warranty). After restoring my backup my first order of business was to resolve the billing issue.

I'm not quite sure what was wrong with their billing system, as my solution was to log in and pay using the same payment method it had been trying for the last 2 days. No changes were made on my end ¯\_(ツ)_/¯.

Roughly 20 minutes after I paid all my services went offline and I got spammed with the inevitable wall of downtime notifications. I logged back in and checked their control panel, which said the services were "suspended".

I logged into the client area and checked them there - each service was listed as suspended, with the supplied reason being "Overdue on Payment". Less than half an hour after I paid. And hold up - the terms of service give 5 days, and at this point it was only day 4? As "Karen" later clarified, they had updated their policy but neglected to update their terms of service, essentially putting themselves in violation of their own ToS.

I use code-server for software development and was actively working when they went down. Among other reasons, I definitely wanted my services back online ASAP so I opened a ticket. For a simple system error I would expect their 24/7 support to handle it.

Wrong. My ticket was promptly flagged to billing, which was unavailable at the time. I submitted a few other tickets to different departments in the hopes that a human would see it ASAP. The original ticket was replied to by "System" - not knowing their internal support structure I assumed that was possibly an automation / bot.

My other tickets were merged and I was forced to wait. And wait... 2 hours later (I enjoyed my extended break, I guess) I received a response and my services were restored. I also asked for clarification as to why they were suspended in the first place.

The reason given was that the suspension cron was already running when I paid, and couldn't be stopped. Perhaps I just haven't worked on projects at a large enough scale to understand, but I would expect any queuing system to double check the job before processing it, just in case?
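To be concrete about what I mean by "double check the job", here is a rough sketch. I have no idea how BuyVM's billing system actually works, so every name here (SuspensionJob, is_invoice_paid, suspend) is hypothetical; the only point is that the payment state gets re-read at the moment each job is processed, rather than trusted from when the queue was built.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class SuspensionJob:
    # Hypothetical job record: which service to suspend, and for which invoice.
    service_id: str
    invoice_id: str


def process_suspensions(
    jobs: Iterable[SuspensionJob],
    is_invoice_paid: Callable[[str], bool],
    suspend: Callable[[str], None],
) -> list[str]:
    """Suspend services for queued jobs whose invoice is still unpaid."""
    suspended = []
    for job in jobs:
        # The queue may be stale: the customer could have paid after the
        # job was enqueued, so look up the current state before acting.
        if is_invoice_paid(job.invoice_id):
            continue
        suspend(job.service_id)
        suspended.append(job.service_id)
    return suspended
```

With a check like this, a payment that lands while the cron is mid-run only gets missed for the one job already in flight, not for everything still waiting in the queue.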

I never received an apology. I wouldn't expect an account credit from a deep-budget provider meant only for hobbyists, with no service level agreement to speak of, but I would expect at least a simple "We apologize for the inconvenience". This, alongside the realization that despite offering "24/7 support" they couldn't resolve a simple ticket in a timely manner, was the final straw.

It was a short relationship, BuyVM, and a rough one. You lost my data, you had network issues, and you dropped the ball at the end. I can't say I'll miss you.

Ticket regarding the suspension (I realize I was being more passive aggressive than absolutely necessary).