Upgrading the SFG WordPress Site on AWS
This past week I needed to upgrade my blog (Spring Framework Guru – aka SFG). This post is not a technical how-to. It is more of an overview, aimed at new developers, of the steps I took to migrate a production WordPress site to a new instance with near zero downtime.
Disclaimer – I am not a WordPress expert, nor a PHP developer. But I have been doing this IT stuff for about 25 years.
Overview
The SFG website is hosted on AWS. It runs in a t2.medium instance. This instance type gives the VM 2 vCPUs and 4 GB of RAM.
I created the VM in 2015, when I launched Spring Framework Guru. The base AMI I selected was from a company called Bitnami. They develop a variety of ready made images for AWS and other platforms.
While I certainly have the skills to start from a base Linux instance and install Apache, PHP, MySQL, and WordPress – a ready made image was the path of least resistance.
Bitnami actually did a really nice job with the base image.
I selected a WordPress theme called ‘Loma’ from Dahz.
It gave me a real nice looking theme for the site. The Loma theme was also highly customizable. Thus SFG was a decent looking site just a few mouse clicks away.
All together, this setup quickly gave me a nice website.
The only downside was performance. Always seemed a bit sluggish to me.
I did some optimizations (caching, memory, etc.), which did help some.
This little LAMP stack was running on a good bit of hardware. The performance was not passing my personal smell test. Something seemed off.
But, I called it good enough for now.
Victim of Success – Part 1
The first month the SFG site was live, I think I had 500 page views – for the whole month.
Not a huge surprise, new website. It had very little content.
Over the course of 2015-2017 I added a steady stream of posts to SFG.
Which steadily grew traffic to the site.
The Wall
Around 5,000 page views a day, the site started to hit a wall.
The server could not keep up with the load.
Apache was spawning a process to service each request.
Running the Linux ‘top’ command, all I saw were PHP processes.
The server was running at 100% CPU, with the load average sitting at 25.
A load of 25 would not be bad for a 16 CPU server. For 2 CPUs – not good.
Not good at all.
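To put numbers on that, here is a minimal sketch of the arithmetic: divide the load average by the number of CPUs to see how many runnable tasks are competing for each one. The function names are my own, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def current_load_per_cpu() -> float:
    """Read the 1-minute load average of this machine (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    return load_per_cpu(one_minute, os.cpu_count() or 1)


# The figures from this post:
print(load_per_cpu(25, 2))   # 12.5 tasks per CPU on the 2 vCPU instance
print(load_per_cpu(25, 16))  # 1.5625 tasks per CPU on a 16 CPU server
```

As a rough rule of thumb, a value near or below 1 means the CPUs are keeping up. At 12.5, most requests are just waiting in line.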
The server was not keeping up with the load from website traffic.
It was completely swamped.
5,000 page views a day is not huge by any means. I’d expect that hardware profile to support 5-10x that load.
This Java guy was not exactly impressed with this PHP stuff.
Part of the problem was the server was running PHP 5.4.
PHP 7 was out, but the theme I had did not support PHP 7.
PHP 7 is significantly more performant than PHP 5.x.
I also suspect the Loma theme was not very efficient.
Again, I’m not a PHP programmer. Just my suspicions.
Near term, I could either reduce load on the server, or add hardware.
Going to a larger AWS instance was not all that attractive of an option. The next size instance would roughly double my monthly costs.
Let’s go back to reducing load.
Enter Cloudflare
Cloudflare is a really slick DNS-based CDN (Content Delivery Network).
Actually, it is much more than just a CDN.
It’s also stupid easy to use.
I decided to give it a try.
Turned out to be exactly what the doctor ordered.
The Cloudflare CDN started to serve ~70% of the incoming requests.
Server load dropped from 25 to 3. CPU was hovering around 25-30%.
Winning!
The Plan
I had a soft plan to re-skin the SFG site.
To do so, I figured I’d hold a design contest for the look and feel. Then hire a firm to create the theme. I’d require it to be optimized for PHP 7 – for performance.
But that soft plan never materialized.
And traffic on the site continued to grow.
Victim of Success – Part 2
In November of 2018, the SFG site started seeing 8,000 page views a day.
New daily record!
Also, found a new limit around 8,100 page views.
Back to that tipping point I saw before I implemented Cloudflare.
Server was exhausted again. 100% CPU, load 25+.
Completely CPU bound.
MySQL was largely idle. No easy fix by adding a missing index here.
I looked for updates to the Loma theme. Hoping there was PHP 7 support.
Turns out Loma is no longer supported. So, that’s a dead end.
I wanted to see if the Loma theme would work under PHP 7.
This was not an experiment I wanted to run on my production server.
It may have been limping, but it was still running!
I needed a dev instance to experiment with.
Enter AWS and virtualization.
Creating a Dev Instance on AWS
My AWS instance for SFG was using a modest 10GB of EBS storage.
A cool part about EBS storage is you can create snapshots.
A snapshot captures the point-in-time state of the storage on a running system.
Great for back-ups.
Also, great for creating a new instance.
Which is exactly what I did. I took a snap of the SFG production instance. Then told AWS to use that snap to create an AWS AMI (Amazon Machine Image).
From the newly created AMI, I launched a new VM – using the same t2.medium spec.
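For anyone who would rather script these steps, here is a rough sketch using boto3, the AWS SDK for Python. It is not a record of the exact steps I ran. It assumes an x86_64 HVM instance with a single root EBS volume at `/dev/sda1`; the function name, volume ID, and AMI name are all hypothetical. The EC2 client is passed in so the function can be exercised without touching a real AWS account.

```python
def clone_instance_from_volume(ec2, volume_id, ami_name,
                               instance_type="t2.medium",
                               root_device="/dev/sda1"):
    """Snapshot a root volume, register an AMI from it, launch one instance.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The root device name and architecture below are assumptions; check
    what the source instance actually uses before relying on them.
    """
    # 1. Point-in-time snapshot of the production root volume.
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Snapshot for {ami_name}",
    )
    snapshot_id = snapshot["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 2. Register an AMI whose root device is backed by that snapshot.
    image = ec2.register_image(
        Name=ami_name,
        Architecture="x86_64",
        RootDeviceName=root_device,
        VirtualizationType="hvm",
        BlockDeviceMappings=[
            {"DeviceName": root_device, "Ebs": {"SnapshotId": snapshot_id}},
        ],
    )
    image_id = image["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # 3. Launch a single new VM from the AMI, same size as production.
    reservation = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )
    return reservation["Instances"][0]["InstanceId"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   new_id = clone_instance_from_volume(
#       boto3.client("ec2"), "vol-0123456789abcdef0", "sfg-dev")
```

The sketch leaves out the key pair, security groups, and subnet, so a real launch would need those added to the `run_instances` call.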
Creating dev.springframework.guru
The new VM had its own public IP.
But Apache on that VM was configured to use vhosts (Virtual Hosts).
Nifty way to run multiple websites off a single server.
One server can handle requests for foo.com and foobar.com. Requests for the respective websites will get routed differently.
Which is the case I had.
On my server, I’m actually supporting several websites.
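If vhosts are new to you, the idea is that the web server looks at the Host header of each request and picks a site based on it. Here is a toy model of that lookup in Python. It is not how Apache is implemented, and the directory paths are made up for illustration.

```python
def pick_vhost(host_header, vhosts, default_root):
    """Return the document root for a request, based on its Host header.

    Toy model of name-based virtual hosting: strip any port, lowercase
    the name, and fall back to a default site when nothing matches.
    (IPv6 literals in the Host header are ignored here for brevity.)
    """
    hostname = host_header.split(":", 1)[0].strip().lower()
    return vhosts.get(hostname, default_root)


# Hypothetical document roots, for illustration only.
vhosts = {
    "foo.com": "/var/www/foo",
    "foobar.com": "/var/www/foobar",
}
default_root = "/var/www/default"

print(pick_vhost("foo.com", vhosts, default_root))
print(pick_vhost("FooBar.com:80", vhosts, default_root))
print(pick_vhost("unknown.example", vhosts, default_root))
```

The fallback case is worth remembering: a request for a host name the server does not recognize still gets an answer, just not from the site you expected.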
Creating dev.springframework.guru was a two step process.
- Tell Cloudflare (my DNS provider) to route traffic for dev.springframework.guru to the IP of the new dev server.
- Configure Apache’s springframework.guru vhost to answer for dev.springframework.guru
I now had a development SFG site to work with.
As we say in the ‘biz, now we’re cooking with gas!
Going Scorched Earth in the Freedom of Dev
Now I had the freedom to do whatever I wanted with the dev instance.
If I completely crash it, no worries. I’ll just delete it and create another.
First thing I did was update the server. Update the OS and all libs.
All good. Updates went smoothly.
Next step was to try PHP 7.
After backing up the MySQL database and WordPress directory, I used an installer from Bitnami to install their latest WordPress stack, which included PHP 7.
Then it was just a matter of restoring my WordPress files and MySQL database.
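A backup like this can be scripted in a few lines. The sketch below is one possible approach, not the exact commands I ran. It assumes `mysqldump` is on the PATH, that the MySQL credentials live in a client option file (so no password appears on the command line), and that the tables are InnoDB, which is what `--single-transaction` is meant for. The function names and any paths you pass in are hypothetical.

```python
import subprocess
import tarfile
from pathlib import Path


def mysqldump_command(database, user, output_file):
    """Build the mysqldump command line for a single database."""
    return [
        "mysqldump",
        "--single-transaction",          # consistent dump without locking InnoDB tables
        f"--user={user}",
        f"--result-file={output_file}",  # write the SQL dump to this file
        database,
    ]


def archive_directory(source_dir, archive_path):
    """Write a gzip-compressed tar archive of source_dir."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store paths relative to the directory name, not the full path.
        tar.add(str(source), arcname=source.name)


def backup_wordpress(database, user, wordpress_dir, backup_dir):
    """Dump the database and archive the WordPress files into backup_dir."""
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    dump_file = backup / f"{database}.sql"
    subprocess.run(mysqldump_command(database, user, str(dump_file)), check=True)
    archive_directory(wordpress_dir, backup / "wordpress-files.tar.gz")
```

Restoring is the reverse: unpack the archive over the new WordPress directory and load the SQL file into the new database.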
The process went surprisingly smoothly.
WordPress came right up on the restored database.
The Loma theme was an epic fail though.
It was not happy at all about PHP 7.
Blog pages failed to render. Browser would get a few lines of HTML, then nothing.
Server error logs had chatter about unsupported PHP functions.
Loma on PHP 7 was a fail.
Plan B
Dahz has a similar theme to Loma called ‘Verko’.
I decided to give that a try.
Installed the theme, and verified it worked under PHP 7.
In about 20 minutes I had the Verko theme styled pretty darn close to the production SFG site.
Working with the new dev server, WordPress, and theme, everything felt much faster overall.
Good sign, but just me on the box. So I was cautiously optimistic.
I felt my newly created dev instance was ready for production.
Now to take dev to production with zero downtime.
Migrating to Production with Zero Downtime
First step is to update the vhosts on the dev instance from dev.springframework.guru to just springframework.guru.
Simple, but important step. This tells Apache to serve requests for the host ‘springframework.guru’ from this site.
Next is to update the DNS in Cloudflare.
I deleted the A record for ‘dev.springframework.guru’.
Then changed the IP for the A record of ‘springframework.guru’ to the IP of my new VM.
Now for the moment of truth. Try going to my blog…
An Apache welcome page… Doh!
I realized right away what I did wrong.
I forgot to bounce Apache after updating the vhosts settings.
After a quick bounce of Apache, I refreshed my browser – and there was SFG on the new theme!
I failed at the goal of zero downtime. But, I don’t think many requests were affected.
It can take some time for DNS changes to propagate around the world. So, I had that going for me!
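A check like the one below, run before touching DNS, would likely have caught my mistake. It is a minimal sketch using only the Python standard library: connect straight to the new server's IP, but send the production host name in the Host header, and see what comes back. The function name is my own, the IP in the usage comment is a documentation placeholder, and it assumes the server answers plain HTTP on port 80.

```python
import http.client


def fetch_as_host(server_ip, hostname, port=80, path="/", timeout=5.0):
    """Request `path` from server_ip while presenting `hostname` as the Host.

    Returns (status_code, first_2048_bytes_of_body_as_text). This bypasses
    DNS entirely, so it shows what the server would serve for that host
    name once DNS points at it.
    """
    conn = http.client.HTTPConnection(server_ip, port, timeout=timeout)
    try:
        # Supplying our own Host header replaces the one http.client
        # would otherwise derive from server_ip.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        body = response.read(2048).decode("utf-8", errors="replace")
        return response.status, body
    finally:
        conn.close()


# Hypothetical usage, with a placeholder IP for the new VM:
#   status, body = fetch_as_host("203.0.113.10", "springframework.guru")
#   print(status, body[:200])
```

If the body looks like the default welcome page instead of the blog, the vhost change has not taken effect yet, and it is not time to flip the A record.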
Results of the Migration
I’ve been impressed with the results of the new server.
Both instances are on AWS using the t2.medium instance size and in the same region. I have the same AWS spec, but no way of knowing the underlying hardware the VM is really running on. This is a variable I cannot account for.
Observations of the new server handling the same page load (or more):
- Site response time significantly improved. Pages load much faster.
- CPU rarely gets over 10%. Usually under 5%. (previously 100%)
- Load average is 0.25 – 0.30 (previously 23 – 25)
- Rare to see more than 5 http processes from Apache. (previously dozens)
- Day after new server implemented, had record page views. Day after that, also set a new record.
Conclusion
Overall, I’m more than happy with the update. I expected to see an improvement, but certainly not to this degree.
For folks new to development and IT, I hope you’ve gained something from this post. My aim was to show how you can use the tools of AWS and Cloudflare to migrate to a new server with little or no downtime.
Just to recap what I did:
- Snapped the storage of the production instance.
- Created an AMI from the snap.
- Launched new VM
- Assigned dev URL to the IP of the new VM
- Made changes to the dev instance until it was ‘production’ ready.
- Changed DNS record for production URL to the IP of the new instance.
One caveat to keep in mind: DNS updates can take some time to propagate (i.e., hours or days). So, don’t drop that old production VM right away!
Shailesh Prakash
Next time experiment with Nginx and PHP-FPM. I am sure it will break your next wall.