Hitting The Ground Running: My First Ever Crash

Sunday, September 3, 2017

By Etin Obaseki

Today, I became a man. No, I didn’t lose my virginity or battle a wild tiger; it was something even grander than that. Today, my servers crashed. Multiple times. But to tell this story properly, I must start from the beginning.

The Prelude

I’ve recently gotten a job as a web developer for a new-age media outfit in Lagos. At first, I thought it was going to be a really easy gig. I mean, I was being offered money to maintain a regular old WordPress site, which I was sure I could do in my sleep. My prospective employer is on the phone telling me how challenging it’ll be, and I’m like, “Pfft…oga relax, I got this.”

I arrive at my new workplace, and one after the other, all my expectations of mediocrity are shattered.
First, they have an incredibly cool office. One of the nicest I’ve seen (and that is saying something). They’ve got really cool portraits around the office and there’s a lounge chair for me to lie on while debugging, just like home.

My boss is a Yale alumnus. Worked as an executive for one of the big oil guys for a while too. Then, for my direct supervisor, I get this really smooth dude (don’t tell him I called him that, he’s a father already) who worked at IBM for 11 years and now consults for the government. My colleagues include people who’ve worked at big publications like Business Day, artists whose “cover bottle” masterpieces I heard about while I was in secondary school, and First Class alumni from a university I couldn’t get into, amongst others.

So, I’m just there feeling like a small fish in a big pond.

My first few days on the job turn out to be fun. I don’t even have to do any generic WP “Edit some text here” stuff for a while; instead, I’m building APIs in Lumen and consuming them to build voting platforms for a comedy contest. I’m also creating a local file server and generally just enjoying linuxing around the whole place.

Then, on one fateful day, I go to bed content and looking forward to another good day at the office, only to wake up to disaster.

The Immediate Past

My precious system is down. Over the past few weeks, my workplace’s website and I had really begun to get cozy. I was getting to know her, she was sure she liked me and our intimacy was progressing at a good pace.

I wake up to a flurry of messages on the work group chat, and I’m wondering what these people are doing awake at such an ungodly hour (for God’s sake, it’s 6:30 AM!). I read through the messages, trying to find the cause of the fuss. I scroll past messages like “It seems to be down” and “I haven’t been able to access it”, sure that our rival sites must be having problems again. I then see a screenshot with an “Error in Database Connection”, with the URL unmistakably showing my website. I nearly burst into tears. Questions like “But how?!” and “Why na?!” flash through my mind. With very little time to lose, I spring into action. Not even bothering to take a bath or dress properly, I head straight for the office in my SpongeBob shorts (What? He’s an inspiration, never gives up, no matter what!), ignoring the disbelieving looks I’m getting from everyone.

As soon as I arrive at the office, I boot up my trusty Titan-II and immediately SSH into the remote server. Upon doing this, I discover something grave: we’ve run out of disk space. “How could this have happened?” I wondered. I sprang into action immediately and, thanks of course to some of the brilliant people on Stack Overflow, I found a snippet to help me find the files that took up the most space on our server. Using my trusty $ rm -f, I deleted the offending files and restarted the server.
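The Stack Overflow snippet itself isn’t reproduced here. As a rough idea of what such a “find the biggest files” helper does, here is a minimal Python sketch. The function name largest_files and the /var/log starting directory are assumptions for illustration, not what was actually run on the server.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat measures a symlink itself rather than its target
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # The file vanished or is unreadable; skip it
                continue
    return heapq.nlargest(count, sizes)


if __name__ == "__main__":
    # "/var/log" is only an example starting point (assumption)
    for size, path in largest_files("/var/log"):
        print(f"{size / 1024 ** 2:10.1f} MiB  {path}")
```

Looking at the list before deleting anything is the whole point: it tells you which file is growing, which is usually a better clue than the free space you get back.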

Hooray! The site was back up. I strutted back home (I don’t live too far away from work) with all the swagger and confidence of a man who had just decluttered his hard drive. I arrived home and began explaining to my roommate just how smart I was to have fixed the problem so quickly. I was not even halfway done with listing out my numerous accomplishments to date (as a precursor to telling him the specifics of today) when my phone began to buzz again.

It’s from the work group chat, and I’m sure they want to congratulate me on a job well done. I open the messages and it’s just the opposite: the “Load More News” feature isn’t working. With a growing apprehension in my belly, I dismiss the complaint, thinking his browser might have the content cached. Three seconds after deceiving myself, my phone rings. It’s my boss (the IBM one, so I can’t pull the “it’s caching” scam like I would with anyone else) and he says “the site is down”, so I begin to trudge to the office again, wondering what could have happened.

I get there, try to use bash’s autocomplete feature, and get a “no more space on device” error. I’m confused. I eventually find out that our error log has swelled to 11 GB!
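A check like the following could have raised the alarm before the disk filled up a second time. This is a minimal sketch: the function names, the 90% threshold, the 1 GiB limit and the error.log path are all assumptions for illustration, not the actual setup on that server.

```python
import os
import shutil


def percent_used(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


def oversized(path, limit_bytes):
    """True if the file at `path` exists and is larger than `limit_bytes`."""
    try:
        return os.path.getsize(path) > limit_bytes
    except OSError:
        # A missing or unreadable file is not treated as oversized
        return False


if __name__ == "__main__":
    # Threshold and log path below are hypothetical examples
    if percent_used("/") > 90:
        print("Disk is more than 90% full")
    if oversized("error.log", 1024 ** 3):
        print("error.log is over 1 GiB")
```

Run from cron every few minutes, something this small turns “the site is down” into “the log is growing fast”, which is a much nicer message to wake up to.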

Eleven! At this point, I’m almost panicking. Have we been hacked? I delete the log again and try to access the WP Dashboard, only to find that it’s stuck in a loop of 302 redirects: Too Many Redirects. Aha! Now we’re getting somewhere! I inspect the logs, which I have to delete again, and find that there’s a problem with the SSL we installed yesterday.
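A redirect loop like this can be confirmed by following the redirects by hand and watching for a URL that shows up twice. Here is a minimal Python sketch using only the standard library. The names fetch_once and trace_redirects are made up for illustration, and the HTTP-to-HTTPS bounce used in the example is only one common shape of such a loop, not the confirmed cause in this story.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url):
    """Make one HEAD request without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=20):
    """Follow redirects manually; return (chain_of_urls, looped)."""
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return chain, False  # reached a real response
        url = urljoin(url, location)
        chain.append(url)
        if url in seen:
            return chain, True  # same URL twice: a loop
        seen.add(url)
    return chain, True  # too many hops, treated as a loop
```

Calling trace_redirects on the dashboard URL prints nothing by itself; it returns the chain of URLs visited and a flag saying whether the chain looped, so you can see exactly which two addresses are bouncing off each other.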

Actually, at this point, I’m a little confused because I didn’t install any SSL. So, what happened?

Turns out I went to bed a little too early, so they started doing it without me.

At this point, my supervisor has arrived at the office and at least now it’s not only me trying to figure out how quickly a 40GB HDD can be filled up by text files of less than a kilobyte each.

So, we call the consultant who’s helping us out with some other stuff and we begin piecing together bits and pieces of the solution.

Over the next 10 hours (using my incredible flair for bashing), we continuously debugged, implemented, thought we were done, breathed a sigh of relief, discovered that we weren’t done, had anxiety again, rinse and repeat.

In the end, it was a day of plenty (plenty) of highs and lows, excitement and code, but thankfully it’s finally over and I can return to working on my file server (which is driving me nuts with a permissions problem, by the way).