Full Version: Spotty Access Problems in New Data Center
abovetopsecret.com
Last Saturday, we completed a transition of our site from a couple of dual-Xeons to three new Dual Opterons in the new Data Center. A full week after the transition, we're getting reports of spotty access problems that seem to indicate network issues.
http://www.abovetopsecret.com

1) Intermittent Slow Connections: Regular users report occasional slowness in establishing an HTTP connection during their sessions (it's a very active discussion board). They see "connecting to" in their browser status bar for several seconds at a time, followed by a very fast page render once the connection is established (footer "page process" speeds are under 0.4 seconds).

2) Minority of very slow or failed connections: Some regular users report exceptionally slow access, or no access at all. This began happening as soon as their local DNS began resolving to the new server IPs.

3) Overall 10% decrease in regular user traffic, 30% decrease in impressions: For several months leading up to this server upgrade, our traffic had been growing steadily, with daily averages over 60,000 unique user sessions. After the transition to the new servers, our 5-day user session average is 10% lower than in previous weeks (not counting the first 3 days after the transition). Overall page impressions are down over 30% from previous weeks.

One traceroute this morning from a user with intermittent access issues:
CODE
2  204.249.177.1 (204.249.177.1)  1.689 ms  2.913 ms  1.718 ms
3  sl-gw27-nyc-4-0-ts25.sprintlink.net (160.81.172.101)  4.311 ms  4.317 ms  4.338 ms
4  sl-bb23-nyc-15-1.sprintlink.net (144.232.7.21)  14.593 ms  4.794 ms  4.803 ms
5  sl-bb20-nyc-8-0.sprintlink.net (144.232.7.13)  98.839 ms  12.368 ms  37.031 ms
6  144.232.8.74 (144.232.8.74)  19.081 ms  27.361 ms  20.560 ms
7  tbr2-g3301.n54ny.ip.att.net (12.123.0.102)  49.152 ms  49.750 ms  54.854 ms
8  tbr2-cl15.wswdc.ip.att.net (12.122.10.54)  52.359 ms  58.433 ms  51.793 ms
9  tbr1-cl17.attga.ip.att.net (12.122.10.70)  50.895 ms  51.336 ms  51.053 ms
10  tbr2-cl1835.attga.ip.att.net (12.122.9.158)  52.960 ms  51.629 ms  51.795 ms
11  tbr1-cl13.dlstx.ip.att.net (12.122.2.89)  52.112 ms  50.626 ms  50.924 ms
12  ar8-p3120.dlstx.ip.att.net (12.123.16.161)  48.230 ms  92.191 ms  72.658 ms
13  12.119.136.18 (12.119.136.18)  47.718 ms  50.288 ms  53.340 ms
14  vl32.dsr01.dllstx3.theplanet.com (70.85.127.61)  48.168 ms vl31.dsr02.dllstx3.theplanet.com (70.85.127.30)  203.281 ms vl32.dsr02.dllstx3.theplanet.com (70.85.127.62)  192.416 ms
15  vl21.dsr01.dllstx2.theplanet.com (70.85.127.67)  50.895 ms vl22.dsr02.dllstx2.theplanet.com (70.85.127.76)  53.802 ms vl21.dsr01.dllstx2.theplanet.com (70.85.127.67)  57.045 ms
16  vl2.car08.dllstx2.theplanet.com (12.96.160.55)  49.404 ms  54.016 ms  49.198 ms
17  d6.25.344a.static.theplanet.com (74.52.37.214)  55.659 ms  49.367 ms  48.533 ms

Two more from a user with more serious connection issues:
http://images.abovetopsecret.com/trace1.jpg
http://images.abovetopsecret.com/trace4.jpg



Is anyone else experiencing anything like this?
Hogie
So the 25% and 10% packet loss at hop 2 is not your problem, but The Planet's? When you have packet loss that close to you, you can't expect the trace to show each later hop correctly. Once packet loss shows up at hop 2, you can see random packet loss anywhere else along the path, because the bad link is on your local side.
abovetopsecret.com
It does appear as though there are local ISP issues in some of the traces we're seeing.

However, these are from regular (several times a day) users of our site who have not had access problems until the move to the new data center & servers.

This combined with the sudden drop in page views causes us to wonder if there is an intermittent issue that can be discovered and repaired.
nForcer
Keep in mind that ISPs and datacenters are notorious for giving ICMP packets 'lower priority', so you cannot base your results on pings and traceroutes alone, even over a length of time.

I ping from St Louis to Dallas via ATT and get an average of 50ms response times, but it's only the 'ping' packets that are slow, as everything else flows as fast as fiber.
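
Since ICMP results can mislead, one alternative is to time the TCP handshake itself, which is the same step users see as "connecting to" in the status bar. Below is a minimal Python sketch of that idea; the function names, the sample count, and the pause between attempts are assumptions of mine, and nothing here is a tool provided by The Planet.

```python
import socket
import time


def time_tcp_connect(host, port=80, timeout=10.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    If host is a name rather than an IP address, the DNS lookup is
    included in the measured time.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return None


def summarize(samples):
    """Summarize connect times: attempts, failures, fastest and slowest in ms."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failed": len(samples) - len(ok),
        "min_ms": round(min(ok) * 1000, 1) if ok else None,
        "max_ms": round(max(ok) * 1000, 1) if ok else None,
    }


def probe(host, port=80, count=20, pause=1.0):
    """Attempt `count` connections, `pause` seconds apart, and summarize them."""
    samples = []
    for _ in range(count):
        samples.append(time_tcp_connect(host, port))
        time.sleep(pause)
    return summarize(samples)
```

An affected user could run something like `probe("www.abovetopsecret.com")` during a slow spell. A large gap between `min_ms` and `max_ms`, or any failed attempts, would point at connection setup rather than page generation, though it would not say where along the path the delay occurs.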
abovetopsecret.com
I understand, but it's pretty much the only diagnostic for an issue like this... unless someone has some ideas? ;)
Elena
Are you still having this problem? If you are, would it be possible to find out which ISP your users are using to access your server? I've been seeing problems with Cox users for a few months which is why I'm curious.
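
One rough way to get at the ISP question is to reverse-resolve the IP addresses of the affected users and group them by provider domain. The Python sketch below assumes you already have those addresses (from access logs or from the users themselves); the function names are hypothetical, and taking the last two labels of the hostname is only a heuristic that will misfire on domains like `example.co.uk`.

```python
import socket
from collections import Counter


def reverse_lookup(ip):
    """Return the reverse-DNS hostname for an IP address, or None if none is found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # Raised when the address has no PTR record or the lookup fails.
        return None


def isp_domain(hostname):
    """Rough guess at the provider: the last two labels of the hostname."""
    if not hostname:
        return "unknown"
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def count_by_isp(hostnames):
    """Count how many hostnames fall under each guessed provider domain."""
    return Counter(isp_domain(name) for name in hostnames)
```

Feeding the output of `reverse_lookup` for each affected address into `count_by_isp` gives a quick tally. If one provider dominates the list of users with problems, that would be worth raising with the data center.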
www.areyouserved.com
I am having lag spikes with our new cortex Dual Opterons. When I do a tracert, the lag always shows up inside The Planet network.