Blingy Slowness Issues

May 10th, 2008

We’ve continued to offload the problematic file server, and loads across the cluster have actually looked quite good (almost all servers under a load of 5, many around 2-3, and only one ever at or above 10), so we had hoped that we could restore snapshot backups for users on that server. Unfortunately, doing so increased the disk usage on that file server again to the point that it has been displaying the same symptoms as before. The good news is that the fix is simply a matter of disabling the backup snapshots again and then dropping that data. We’ve already disabled them and are in the process of dropping the data. This is resulting in some very inflated loads across the cluster, but as soon as we have completed this I’ll be issuing soft reboots (easier on the hardware) that will fix the loads you’re seeing as well [...]

Failing over Milk

May 10th, 2008

The web server ‘Milk’ is being failed over because bad hardware is causing it to crash a couple of minutes after it boots. It is being moved to new hardware, and the move should be completed within 30 minutes. This should only affect people with ‘milk’ as their web server. You can find out if milk is your server by clicking “account status” in the panel; it will be listed as “Your web server”.

Emergency OS patch on file server

May 9th, 2008

Per an open ticket with Sun, we need to apply two patches to one of our file servers. We hope this will fix a degraded zpool that will not finish a parity rebuild. There are exactly 78 users in the frisky cluster on this file server. Applying the patches will take your email and web services offline for the duration. The patch should only require about 15 minutes of downtime. I apologize for doing this patch during peak hours, but we really need to get this data back up to full integrity.
Tech nerd details: The RAID array is a raidz2 operating with one failed disk. It has been rebuilding onto a hot spare for a day or two, and the rebuild process reset itself after getting to 99%. We contacted Sun and, after analyzing the troubleshooting information, believe a kernel + ZFS patch should resolve the problem. Fortunately this is a [...]

DingDong and Pizarro down

May 8th, 2008

The HTTP servers DingDong and Pizarro are both currently unresponsive to our reboot efforts. We are working on getting manual reboots done in their respective data centers and evaluating whether moving to new hardware is necessary. Estimated downtime is up to 1 hour.
Update 7:44p
Pizarro is back up and seemingly stable on new hardware.
DingDong is still waiting for a tech to reach its data center.

Central database crash

May 7th, 2008

Our central database server crashed and restarted itself. It is currently replaying transaction logs and should be back in under an hour. This should not affect your websites, email, etc., but the user control panel (https://panel.dreamhost.com) and similar services are down until it comes online.
We are monitoring the situation and will report back here when it comes online!
Edit: And webmail! I forgot those two were tied together. Regular IMAP/POP3 email access should continue to work.
Edit: 5:28PM Pacific It looks like we’re back in business! The user control panel and webmail are working. We will continue to check the rest of our central services and update this if we find anything else still broken! If you are having problems, please contact technical support.

FTP problems (connection drops)

May 7th, 2008

Some of our customers are experiencing problems related to their FTP service. This includes error messages while connecting or dropped connections. We’re looking into it, and will post an update as soon as we know more. Sorry about the inconvenience.
Please check back here for updates.

Webserver Hermes being moved to new hardware

May 7th, 2008

Hermes crashed earlier and the server isn’t coming back up when rebooted. It’s currently being migrated to new hardware and should be back up shortly.
Sorry for the inconvenience this has caused.
UPDATE: Migration is complete. Your sites should be back up and running. Contact support if your sites are still down.

Email server emergency maintenance tonight (janky,randy,postal,spunky)

May 6th, 2008

About 30 minutes ago we had some major problems with one of our email load balancers, the one that handles email for the janky, randy, spunky, and postal clusters. Tonight at approximately 11:30 PM PST we will be doing some maintenance that may affect the performance or uptime of email for any customers in these clusters. It is only expected to last about 10-15 minutes at most, and we apologize for the short notice. This post will be updated as soon as the maintenance has completed.

Problems caused by Apache service updates

May 6th, 2008

We have noticed that any Apache service updates initiated from the control panel are breaking the service. This includes anything that has to do with the domain’s web service, like adding a new domain, changing FTP users on a domain, etc. This doesn’t just break the domain that initiated the change, but all domains on that Apache service.
We are fixing the affected services as we find them, and trying to catch up with the ones that broke a little while ago. We are very sorry about the web downtime this has caused you. Please check back here for updates.

Delay in quarantined junk mail delivery

May 3rd, 2008

We have noticed that messages that were quarantined by our junk filter are not getting delivered to the Junk Folder. We checked on the messages, and they are still on the servers. They weren’t getting delivered because of a connection problem with the MySQL database that controls the Junk Folder. The problem is now fixed, but it will be a while before it all catches up, as mail has been backed up since last night. We apologize for any inconvenience this has caused. We’ll post updates on the progress of junk mail delivery as soon as we have them.
Update 05/04/2008, noon Pacific: Quarantined junk mail is again being delivered to the Junk Folder. Sorry about the delay.