Full Version: Outage?
ElfinStrider
None of my servers are responding, network has been spotty all morning, now it's one big bad gateway. What's going on?
pblinux
Is there a problem with the nameservers? I'm getting errors in my logs about domains not resolving.

QUOTE (ElfinStrider @ Sep 17 2009, 02:23 PM) *
None of my servers are responding, network has been spotty all morning, now it's one big bad gateway. What's going on?

Tomy Durden
I just received word that we're having some intermittent issues with the authoritative DNS infrastructure. I'm digging up more details and an ETA.
spicyjem
Yeah, I am noticing the same thing. My server stopped responding and wouldn't ping. Even the ThePlanet.com site was down. Seems to be intermittent, like Tomy says... hopefully it gets resolved soon.
Evision
I'm having trouble resolving external hosts using the default caching name servers.

CODE
root@governator [~]# cat /etc/resolv.conf
nameserver 67.15.31.131
nameserver 66.98.240.131


CODE
root@governator [~]# ping api-3t.sandbox.paypal.com
ping: unknown host api-3t.sandbox.paypal.com


CODE
root@governator [~]# dig api-3t.sandbox.paypal.com

; <<>> DiG 9.2.4 <<>> api-3t.sandbox.paypal.com
;; global options:  printcmd
;; connection timed out; no servers could be reached


This is intermittent though.

CODE
root@governator [~]# dig api-3t.sandbox.paypal.com

; <<>> DiG 9.2.4 <<>> api-3t.sandbox.paypal.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11988
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 4

;; QUESTION SECTION:
;api-3t.sandbox.paypal.com.     IN      A

;; ANSWER SECTION:
api-3t.sandbox.paypal.com. 3050 IN      A       216.113.191.88

;; AUTHORITY SECTION:
paypal.com.             2683    IN      NS      ppns1.den.paypal.com.
paypal.com.             2683    IN      NS      ppns1.phx.paypal.com.
paypal.com.             2683    IN      NS      ppns2.phx.paypal.com.
paypal.com.             2683    IN      NS      ppns2.den.paypal.com.

;; ADDITIONAL SECTION:
ppns1.den.paypal.com.   3175    IN      A       216.113.188.121
ppns1.phx.paypal.com.   3175    IN      A       66.211.168.226
ppns2.den.paypal.com.   3175    IN      A       216.113.188.122
ppns2.phx.paypal.com.   3175    IN      A       66.211.168.227

;; Query time: 4 msec
;; SERVER: 67.15.31.131#53(67.15.31.131)
;; WHEN: Thu Sep 17 13:45:30 2009
;; MSG SIZE  rcvd: 211


This server is in Phase I of H2.
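
To tell whether one of the two caching name servers is timing out or both are, each can be probed on its own instead of going through the system resolver. What follows is only a minimal sketch using the Python standard library: the function names, the 2-second timeout and the probe count are assumptions for illustration, not anything The Planet provides. It reads the `nameserver` lines from resolv.conf, sends a hand-built A-record query over UDP to each server, and reports how many probes got no reply.

```python
import socket
import struct
import time


def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query for an A record with the RD flag set."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server, hostname, timeout=2.0):
    """Send one UDP query; return elapsed seconds, or None if no usable reply."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable servers
            return None
    if reply[:2] != query[:2]:  # reply id must match the query id
        return None
    return time.monotonic() - start


def report(path="/etc/resolv.conf", hostname="api-3t.sandbox.paypal.com", tries=5):
    """Print, per name server, how many of `tries` probes went unanswered."""
    with open(path) as handle:
        servers = parse_nameservers(handle.read())
    for server in servers:
        results = [probe(server, hostname) for _ in range(tries)]
        print(f"{server}: {results.count(None)}/{tries} probes got no reply")
```

Calling `report()` on the affected server would show whether 67.15.31.131, 66.98.240.131, or both are dropping queries. The sketch only checks that a matching reply arrived; it does not parse the answer section.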
pblinux
Forums are not responsive enough, so the conversation has moved to Twitter. Check my post at

Twitter update on @theplanet

or search Twitter for @theplanet

QUOTE (Evision @ Sep 17 2009, 02:47 PM) *
I'm having trouble resolving external hosts using the default caching name servers.

pblinux
The Planet needs to notify us in advance of DNS server maintenance and schedule it for off-hours, not during the prime-time business day!

EV1 was good about this. The Planet, not.

Then they're not even responding on Twitter, leaving us all to fend for ourselves and speculate.

Not good when their forums aren't working half the time either.

Get with the plan already.


QUOTE (pblinux @ Sep 17 2009, 03:22 PM) *
Forums are not responsive enough, so the conversation has moved to Twitter. Check my post at

Twitter update on @theplanet

or search Twitter for @theplanet

Tomy Durden
QUOTE (pblinux @ Sep 17 2009, 02:22 PM) *
Forums are not responsive enough, so the conversation has moved to Twitter. Check my post at

Twitter update on @theplanet

or search Twitter for @theplanet

Ref that RT:
There were no changes to our recursive/caching DNS infrastructure and it wasn't affected by the authoritative DNS issue, unless you were attempting to look up a domain on the authoritative DNS infrastructure.

It's inadvisable to change the primary to something outside of our network. Not only does this increase the latency of DNS lookups and traffic across our network and the Internet, but external providers have been known to block DNS lookups coming from outside of their network because of this. In addition to this, the propagation of our zones as perceived by you may be slower due to their configuration.

Kevin (@ThePlanet) isn't available at this time and he's our official Twitterer. In the end, while the forums and Twitter are a supplemental communication standard, the official support channels will remain tickets, phone, and chat.
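
The recursive versus authoritative distinction above is visible in the DNS message header itself: the `flags: qr rd ra` line in the dig output earlier in the thread is a decoding of one 16-bit field. As a rough illustration (the function names here are made up for this sketch), the flags can be read from the first bytes of a raw reply. `aa` marks an authoritative answer, while `ra` means the server offers recursion, as a caching resolver does.

```python
import struct

# Header flag bits from the DNS message format, listed in the order dig prints them.
FLAG_BITS = [
    ("qr", 0x8000),  # message is a response
    ("aa", 0x0400),  # authoritative answer
    ("tc", 0x0200),  # truncated
    ("rd", 0x0100),  # recursion desired
    ("ra", 0x0080),  # recursion available
]


def decode_flags(message):
    """Return the names of the header flags set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return [name for name, bit in FLAG_BITS if flags & bit]


def is_authoritative_answer(message):
    """True if the responding server claims authority for the answer."""
    return "aa" in decode_flags(message)


def offers_recursion(message):
    """True if the responding server advertises recursion."""
    return "ra" in decode_flags(message)
```

A header with the flags word 0x8180 decodes to `qr`, `rd`, `ra`, which matches the successful reply from 67.15.31.131 shown earlier: a recursive answer, not an authoritative one.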
Tomy Durden
QUOTE (pblinux @ Sep 17 2009, 02:57 PM) *
The Planet needs to notify us in advance of DNS server maintenance and schedule it for off-hours, not during the prime-time business day!

EV1 was good about this. The Planet, not.

Then they're not even responding on Twitter, leaving us all to fend for ourselves and speculate.

Not good when their forums aren't working half the time either.

Get with the plan already.

Again, there were NO changes to the resolving infrastructure. There was an unplanned incident with the authoritative infrastructure, which was resolved well before most people saw the residual fallout.
kennygadams
QUOTE (Tomy Durden @ Sep 17 2009, 08:07 PM) *
Again, there were NO changes to the resolving infrastructure. There was an unplanned incident with the authoritative infrastructure, which was resolved well before most people saw the residual fallout.


Let's try to keep these servers up, okay? This was a big loss for me, and I'm sure it was for you all at ThePlanet as well. I absolutely agree with pblinux. ThePlanet needs to keep us all updated with the latest real-time news! Maybe you could do a status page like Google's (http://www.google.com/appsstatus#hl=en). These random outages have made me consider using other hosting companies such as [MODERATED: Removed link]. Do any of you have experience with them?

Kenny
www.ClipArtof.com
ElfinStrider
Bad Gateway errors and intermittent problems for hours during the prime hours of the day aren't what I'd call "before most people saw"... I didn't have a client on your network who did NOT notice.

We'd really like to see The Planet take a proactive stance on informing us. You'd find we'd probably submit fewer tickets if there were a common point of reference.
pblinux
First off, we were very much affected by this. How does email work? Well, let's see: we look up the domains of all incoming AND outgoing email to make sure the domains are valid. FAIL when The Planet DNS does not respond. So this caused an outage of our email at the least. All outgoing email backed up and incoming email was rejected during this time period because DNS wasn't working.

Secondly, I can't just sit there for hours with NO information posted ANYWHERE I could find talking about this from "official" channels. I couldn't even get the forums to load half the time, and when they did there was no mention of this from official The Planet sources. At least Twitter is a different service, and it was up and running!

Thirdly, if you're going to be on Twitter, then BE ON TWITTER. This should be covered 24x7 for a service like The Planet. Let Kevin do it when he is available, but someone should be on it 24x7 in shifts. If you're not willing to do that, then shut down your Twitter account.

There was no word at all from The Planet, leaving us all in the dark, speculating and furious.
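
On the email point above: whether incoming mail is rejected outright or merely delayed during a resolver outage depends on how the mail server treats a failed lookup. The following is a hedged sketch only, not the configuration of any server in this thread. It assumes a Unix-like system, uses a plain address lookup as a stand-in for whatever lookups a real mail server performs (typically MX records), and the function name and reply texts are invented for illustration. The idea is that only a definite "name does not exist" result gets a permanent 5xx reply, while timeouts and other resolver failures get a temporary 4xx reply so the sending server retries later.

```python
import socket

# getaddrinfo error codes meaning "this name definitely does not exist".
# EAI_NODATA is not defined on every platform, hence the getattr.
PERMANENT_ERRORS = {
    code
    for code in (
        getattr(socket, "EAI_NONAME", None),
        getattr(socket, "EAI_NODATA", None),
    )
    if code is not None
}


def smtp_reply_for_sender_domain(domain, resolve=socket.getaddrinfo):
    """Map the outcome of a DNS lookup to an illustrative SMTP reply line.

    `resolve` is injectable so the logic can be exercised without a network.
    """
    try:
        resolve(domain, None)
    except socket.gaierror as exc:
        if exc.errno in PERMANENT_ERRORS:
            # The resolver answered and the name does not exist: reject.
            return "550 sender domain does not resolve"
        # Timeout or other resolver trouble: defer, so the sender retries.
        return "451 temporary DNS failure, please try again later"
    return "250 OK"
```

With this kind of split, an unreachable resolver leads to deferred mail that is retried by the sender, not bounced.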

QUOTE (Tomy Durden @ Sep 17 2009, 04:06 PM) *
Ref that RT:
There were no changes to our recursive/caching DNS infrastructure and it wasn't affected by the authoritative DNS issue, unless you were attempting to look up a domain on the authoritative DNS infrastructure.

It's inadvisable to change the primary to something outside of our network. Not only does this increase the latency of DNS lookups and traffic across our network and the Internet, but external providers have been known to block DNS lookups coming from outside of their network because of this. In addition to this, the propagation of our zones as perceived by you may be slower due to their configuration.

Kevin (@ThePlanet) isn't available at this time and he's our official Twitterer. In the end, while the forums and Twitter are a supplemental communication standard, the official support channels will remain tickets, phone, and chat.

Tomy Durden
QUOTE (ElfinStrider @ Sep 17 2009, 11:13 PM) *
Bad Gateway errors and intermittent problems for hours during the prime hours of the day aren't what I'd call "before most people saw"... I didn't have a client on your network who did NOT notice.

We'd really like to see The Planet take a proactive stance on informing us. You'd find we'd probably submit fewer tickets if there were a common point of reference.

We encourage you to submit tickets, because that's a valid way to measure the impact and it makes the information consistently available to everyone within the company. The forums, Twitter, and any status pages will be supplemental to the official channels. Not to mention, in the case of SLA-qualified outages, a ticket is a requirement if you want to make an SLA request.

Ref: http://content.theplanet.com/Documents/legal/Planet-SLA.pdf