Saturday, March 31, 2012

"Connection Reset" errors, MTU, DHCP, and Time Warner Cable

So long, AT&T DSL

Not too long ago, I made the move from AT&T DSL to Time Warner Cable for my family's home Internet connection. AT&T's pricing was no longer competitive, and their terms of service were nothing to be proud of.

Hopefully most readers have heard about AT&T's recent policy of 150 GB data caps for DSL connections. Most people have recently been complaining about similar caps on mobile data plans, but those are caps I largely agree with, and only for wireless. There is only so much wireless spectrum to share among everyone within a given area, and the laws of physics don't make these limits easy to overcome, at least without having towers on every street corner. (This is what we have Wi-Fi for.) While it is great that the wireless industry is marketing its mobile video capabilities, etc., carriers need to ensure that they are offering the actual network services to match, instead of balancing on the edge of false advertising. For wired connections, however, there should be few excuses, as a wire can always be upgraded, or another wire (or fiber!) can always be added. With my family watching more Netflix and other video content online, combined with regular remote sessions to work and other computer-related activities, hitting 150 GB would not be too difficult. Personally, I also don't expect to be able to watch Netflix videos on mobile; I don't even have a mobile data plan. There is no place for data caps on wired Internet connections.

The other concern that I had was a provision that allowed AT&T to forcefully upgrade us from our DSL account to a higher-priced U-verse account at their discretion. I agree that U-verse is cool, but I'd really like to see it stand up to competition by also having something like Verizon FiOS available to the same customers. Verizon FiOS is not currently available in my area. I had also expressed some related thoughts in a previous post, when Appleton was considering a bid for Google Fiber. In that post, I had also stated some concerns with Time Warner Cable (TWC) - but they seem to have cleaned up their act a bit since then, including with a new, impressive local retail presence.

"Connection Reset" errors, MTU, and DHCP with Time Warner Cable

So we switched to TWC's Road Runner service for our Internet service. Of course, it would be too easy if this went without issue. Without any other changes in my computer or network configurations, I noticed "The Connection was reset" errors the very first night with the service. Unfortunately, these issues were very difficult to troubleshoot, as the errors were quite sporadic, and I wasn't able to reproduce them on demand. The issues were also more prevalent on some devices than others - even though all devices worked just fine when connected to other networks. Interestingly, Google sites and services also seemed to be affected more than others. I'm guessing this had to do with most of Google's services being accessed over https:// - which increased the packet sizes and likely led to some of the issues. Having the issues happen most often with encrypted connections also made troubleshooting with Wireshark, etc., quite difficult.

I called TWC about this, but got the typical run-around. (Obligatory technical support comic: http://xkcd.com/806/.) They didn't see any issues with their service - simply saying that it had to be an issue with my computer or router. They were somewhat correct - but only because my router was following TWC's direction.

eth1      Link encap:Ethernet  HWaddr **:**:**:**:**:**
          inet addr:75.87.***.***  Bcast:255.255.255.255  Mask:255.255.240.0
          UP BROADCAST RUNNING MULTICAST  MTU:576  Metric:1
          RX packets:807 errors:0 dropped:0 overruns:0 frame:0
          TX packets:482 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:85701 (85.7 KB)  TX bytes:77492 (77.4 KB)

See the issue? Frustratingly, it took me at least another week to spot it. Per TWC's own Road Runner help pages, the Maximum Transmission Unit (MTU) should be the Ethernet standard of 1,500 bytes. So why was my router being configured for an MTU of only 576 bytes? Here was the latest from my /var/lib/dhcp3/dhclient.eth1.leases file after reproducing the issue:

lease {
  interface "eth1";
  fixed-address 75.87.***.***;
  option subnet-mask 255.255.240.0;
  option routers 75.87.192.1;
  option dhcp-lease-time 43200;
  option dhcp-message-type 5;
  option dhcp-server-identifier 10.65.64.1;
  option interface-mtu 576;
  option broadcast-address 255.255.255.255;
  option host-name "********";
  renew 0 2012/04/01 05:31:32;
  rebind 0 2012/04/01 10:41:37;
  expire 0 2012/04/01 12:11:37;
}
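To make this kind of check repeatable, here is a minimal Python sketch that scans lease text for the interface-mtu option and reports what each offered MTU leaves for TCP payload. The function names (lease_mtus, tcp_mss) and the sample text are hypothetical, not part of dhclient. The sketch assumes lease blocks contain no nested braces, and that the IPv4 and TCP headers are 20 bytes each with no options.

```python
import re

ETHERNET_MTU = 1500  # standard Ethernet MTU in bytes, as cited in the post


def lease_mtus(lease_text):
    """Return the interface-mtu value of each lease block, in file order.

    Assumes each lease block ends at its first closing brace.
    """
    mtus = []
    for block in re.findall(r"lease\s*\{(.*?)\}", lease_text, flags=re.DOTALL):
        match = re.search(r"option\s+interface-mtu\s+(\d+)\s*;", block)
        if match:
            mtus.append(int(match.group(1)))
    return mtus


def tcp_mss(mtu, ip_header=20, tcp_header=20):
    """TCP payload bytes per packet, assuming option-free IPv4 and TCP headers."""
    return mtu - ip_header - tcp_header


# Abbreviated sample modeled on the lease shown above (illustrative only).
SAMPLE_LEASE = """lease {
  interface "eth1";
  option subnet-mask 255.255.240.0;
  option interface-mtu 576;
}
"""

if __name__ == "__main__":
    for mtu in lease_mtus(SAMPLE_LEASE):
        note = "below Ethernet's 1500" if mtu < ETHERNET_MTU else "ok"
        print(f"offered MTU {mtu}: {tcp_mss(mtu)} bytes of TCP payload ({note})")
```

Under those header-size assumptions, an MTU of 576 leaves 536 bytes of TCP payload per packet, compared with 1,460 bytes at the Ethernet standard of 1,500. To check a real file, read it and pass its contents to lease_mtus in place of the sample.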

As I mentioned, my router was only doing "as it was told". I'm not exactly sure what is acting as the DHCP server or where it is located, but as the 10.65.64.1 address is not resolvable, nor does it appear active, my guess is that this is being served from the cable modem itself, as a function of firmware loaded and controlled by TWC's Road Runner servers. I'm also guessing that this option is not requested or respected by most other devices (including SOHO routers and Microsoft Windows, etc.) - otherwise I'm sure TWC would have recognized this and corrected it by now. Fortunately, dhclient under Linux provides an easy work-around - with "work-around" probably being too drastic a name, as it is really just a simple configuration change. Here is a portion of the default /etc/dhcp/dhclient.conf on my distribution; the critical detail is the interface-mtu entry in the request list:

request subnet-mask, broadcast-address, time-offset, routers,
  domain-name, domain-name-servers, domain-search, host-name,
  netbios-name-servers, netbios-scope, interface-mtu,
  rfc3442-classless-static-routes, ntp-servers,
  dhcp6.domain-search, dhcp6.fqdn,
  dhcp6.name-servers, dhcp6.sntp-servers;

I simply commented out the interface-mtu field, so that dhclient no longer requests it, and my Internet connection has been stable since.
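For anyone who would rather script the change than edit the file by hand, the following is a rough Python sketch that removes one option name from the request statement in dhclient.conf-style text. The helper name drop_requested_option and the abbreviated sample are hypothetical. The sketch assumes the request list is a comma-separated statement ending in a semicolon, as in the excerpt above, and it rewrites the statement onto a single line.

```python
import re


def drop_requested_option(conf_text, option):
    """Remove `option` from every `request ...;` statement in the given text."""

    def rewrite(match):
        # Split the list on commas; strip() also discards the line breaks.
        names = [name.strip() for name in match.group(1).split(",")]
        kept = [name for name in names if name and name != option]
        return "request " + ", ".join(kept) + ";"

    return re.sub(r"\brequest\s+([^;]*);", rewrite, conf_text)


# Abbreviated sample modeled on the request list shown above (illustrative only).
SAMPLE_CONF = """request subnet-mask, broadcast-address, time-offset, routers,
  domain-name, domain-name-servers, host-name,
  netbios-name-servers, netbios-scope, interface-mtu,
  ntp-servers;
"""

if __name__ == "__main__":
    print(drop_requested_option(SAMPLE_CONF, "interface-mtu"))
```

This only transforms text in memory. Writing the result back to /etc/dhcp/dhclient.conf, and keeping a backup of the original, is left to the reader, and the interface will most likely need a fresh lease before the change takes effect.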

I called TWC back to report my findings, in an effort to help others having the same issue. Realizing that they refused to recognize that anything was wrong, I also wrote up a summary of the issue and hand-delivered it to the local TWC retail office - including an invitation to contact me with any requests for further details - hoping that it would be directed to someone who would be empowered to fix this. Needless to say, 4 months later, this issue still has not been addressed or resolved by TWC.

2 comments:

- said...

Just ran into this. I spent way too much time debugging this before I discovered your post and face palmed my way to yet other TW connection issues that were holding me up.

In my case, the MTU really needed to be 546, and my router was manually configured to 1492 instead. *facepalm*

Anonymous said...

EXCELLENT post. Fixed my issue. Thanks.