#StackBounty: #apache-2.2 #php #wordpress #rds #elastic-beanstalk WordPress admin pages timeout, or take 30s to load on Elastic Beansta…

Bounty: 50

I recently migrated a WordPress website I inherited over to AWS from Rackspace and am experiencing large performance decreases. I’m pretty new to the whole DevOps thing so I’m not sure where to start looking.

The problem:

After I point the DNS (Cloudflare) from our old server to the new one, WordPress (specifically in the admin section) things are find for a short bit. But after an hour or so admin page loads timeout, or take up to 30 seconds to load. I haven’t tested all pages just yet, but this seems to happen on pages with editors, so “new post” or editing a page. When the page times out, I see a Cloudflare timeout message.

I should note that I am making this switch at around 11pm at night, no one but me should be logged in, and web traffic is at about 150 users.

I have also noticed that our RDS instance starts hitting 100% CPU usage, and the DB connections skyrocket up to about 1000 connections.

Because RDS has WELL surpassed it’s max_connections limit, WordPress can no longer connect to the database and now the front end of the site shows the “unable to establish connection to database” message.

At this point, I could restart my ECV2 instances, but the RDS database is still sorting out the 1000 connections.

I have also noticed that it seems the (classic) Elastic Load Balancer stops distributing traffic evenly across both instances.

I’ll have another go at it this weekend, but what should I be looking for in the logs? Before the EC2 instances shut down I tailed the logs and al I saw was:

pid 17186:tid 139743773734656] (70007)The timeout specified has expired: [client 127.0.0.1:58604] AH01075: Error dispatching request to : (polling), referer: http://m.facebook.com

Spec overview:

Server –
On AWS we are using 2 to 6 load balanced/autoscaled m3.large EC2 servers, managed by Elastic Beanstalk (for the Github integration) and a classic load balancer, Cloudflare is used for the DNS and SSL termination.

We are using Apache 2.4 with PHP 5.6, and PHP-FPM all on a 64bit Amazon Linux/2.7.1 AMI

RDS –
R3.Large running Aurora and part of a cluster with a read replica running. I tried using an R3 Large a couple days ago, and admin page loads were 15-30 seconds… Still very slow.

I should also mention that RDS was setup on it’s, outside of Elastic Beanstalk. I don’t think that should matter tho. There are however 2 other databases on that server, for a couple smaller sites that get basically no traffic and will be decommissioned soon.

I have enabled object caching via W3TC, and I have added some Cloudflare rules to disable performance and apps for /wp-admin* as recommended here

A couple things I’ve read around the internet

  • Maybe the ELB timeout limit is lower than the Apache limit and thats causing issues
  • This article suggests changing my MPM from event to prefork or worker


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.