Wednesday 30 August 2017

Incompetent ISPs

Some background history


Local education authorities in the UK often operate a "broadband consortium" for the schools in the region.  The cost of providing a decent internet connection has often been very dependent on the school's geographic location - some schools can be connected up fairly cheaply whilst others are out in the sticks and very costly to connect.  So the idea is that the authority can provide internet connections to all of their schools at a fixed(ish) price to make things fairer.  Essentially, the schools that can have cheap connections subsidise those that can't.

Connecting schools into a single WAN offers a few other advantages, such as being able to centralise the internet filtering for the entire region.

As safeguarding requirements have become more stringent, the centralised filtering systems have often not kept pace, and some schools have opted instead to buy third-party filtering systems which they considered superior.  So as an online safety supplier, we have picked up a number of customers who still use the authority-supplied internet connection, but buy in their own filtering system.

More recently, bandwidth requirements have been increasing at an unprecedented rate, and internet connections have been getting steadily cheaper, so the original reasons for the broadband consortia are becoming less important.

Some authorities also found that their networks couldn't cope with the increased bandwidth requirements.  Somerset was one of those authorities, with a network that was part of the South West Grid for Learning (SWGfL - operated by RM Plc), and decided to disband their network, leaving the schools to choose between contracting directly with the SWGfL and switching to a new ISP.

For the schools that were still using the centralised filtering system, sticking with the SWGfL made sense, since they could continue more or less as-is (their internet connection itself was replaced, but the filtering system remained the same).

As an aside, I'll note that my experience of the SWGfL is that they seem to have significantly more problems than other ISPs and that they take a lot longer to sort those problems out.  Part of this may be that operating a filtering system introduces a few more points of failure on their network.  But their support is done through a foreign call centre (so you're already having to deal with a language barrier when trying to get problems sorted) and their staff don't really seem technical enough to do their jobs.  On many occasions I have been told that a problem is "fixed" when it clearly isn't, and it seems they simply didn't have the technical skills to actually test it themselves.

The problem

 

One of our customers, who was using the Somerset connection only for internet connectivity (not filtering), asked us to help them choose which ISP to switch to.  We got them a few quotes for leased lines, but they eventually decided to stick with the SWGfL - I'm not completely clear on the costs but I'm under the impression that it wasn't terribly dissimilar to the other quotes, and the reason given for picking the SWGfL was that it's "what they know".  I must admit that I don't quite understand that, because as far as I can tell "what they know" is that they have had no end of problems with the SWGfL-based authority-supplied connection in the past.

It's also interesting to note that most ISPs now do IPv6 as standard over leased lines (although it's still hit and miss for VDSL, etc.) but IPv6 isn't an option at all on an SWGfL connection and it doesn't sound like RM have any plans to implement it any time soon!  You also don't get a public subnet like you would with pretty much every other ISP - instead everything is done on private RFC1918 addresses and the ISP does NAT for you.  This might make sense for schools who rely on RM's filtering, but is a bit nonsensical for anyone who just wants a plain unfiltered connection.

Anyway, the school bought an unfiltered connection from the SWGfL and it was duly connected.  They paid for one of our engineers to go on site to ensure the switch over went smoothly.  One of the things we had to do was expand their external-facing IP network because RM informed the school that they needed to set up VRRP for the backup internet connection.

Very soon, the school complained about problems connecting to HTTPS websites and we did some diagnosis - a high proportion of HTTPS connections just seemed to not be successful.  This was very odd behaviour - you'd try to make a connection and the SYN packets would all just be dropped somewhere in the ISP's network, but traffic for other connections would be working just fine.  We decided the likely explanations were that the traffic was being intercepted by a broken transparent proxy, or that RM's CGNAT system was broken.
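To put a number on "a high proportion", a repeated connection test is enough.  The following is a minimal sketch rather than the tool we actually used: the function name, the attempt count and the timeout are all assumptions made for illustration.  It simply opens a series of TCP connections to port 443 and counts how many fail.  A timeout here only tells you that the handshake didn't complete, not where the packets were dropped, so it needs backing up with packet captures.

```python
import socket

def probe(host, port=443, attempts=20, timeout=3.0, connect=socket.create_connection):
    """Attempt repeated TCP connections and return the number that failed.

    `connect` is injectable so the function can be exercised without a network;
    by default it is the standard library's socket.create_connection.
    """
    failures = 0
    for _ in range(attempts):
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers timeouts (e.g. silently dropped SYNs) as well as refusals
            # and unreachable-network errors.
            failures += 1
        else:
            conn.close()
    return failures

if __name__ == "__main__":
    # "www.example.com" is a placeholder; substitute a site that is failing.
    failed = probe("www.example.com")
    print(f"{failed} of 20 connection attempts failed")
```

Running the same test from a connection on a different ISP gives a baseline: if the failures only show up on one connection, the remote site is unlikely to be the culprit.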

We also noticed that some proxy headers were being added to HTTP traffic, indicating that it was being transparently proxied.  This suggested that the HTTPS traffic was probably also being transparently proxied.  Transparently proxying HTTPS traffic doesn't make any sense for an unfiltered connection, and RM eventually confirmed that the traffic was indeed being directed at their filtering system.  They agreed to turn off the filtering and the school submitted a change request for this change to be made.
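Spotting this sort of thing is easy to automate.  The sketch below is illustrative only: the list of header names is an assumption (this post doesn't record exactly which headers we saw), and the function name is made up.  It takes a set of HTTP headers, for example the response headers seen by a client or the request headers logged by a test server you control, and picks out the ones that intermediaries typically add.

```python
# Header names commonly added by HTTP intermediaries.  This list is an
# assumption for illustration, not a record of what was actually observed.
PROXY_HEADERS = ("via", "x-forwarded-for", "x-cache", "x-cache-lookup", "proxy-connection")

def proxy_indicators(headers):
    """Return the headers (names lower-cased) that suggest a proxy is in the path."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() in PROXY_HEADERS
    }

# Hypothetical sample headers, not captured from the school's connection.
sample = {"Content-Type": "text/html", "Via": "1.1 proxy.example (squid)"}
print(proxy_indicators(sample))  # {'via': '1.1 proxy.example (squid)'}
```

An empty result doesn't prove the absence of a transparent proxy, since a proxy is free not to announce itself, but a non-empty one on a supposedly unfiltered connection is a strong hint.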

Several months later, the problem reoccurred and we found that the traffic was once again being directed through a transparent proxy.  The school raised a support request and RM confirmed that the traffic was being sent through their filtering system, and that the school would need to submit another change request to have this corrected.  However, RM flatly refused to turn off filtering for the whole school's network, only for a single IP address.

After an official complaint was made by the school (since RM were not supplying the "unfiltered connection" that the school had been sold), they agreed to move the school onto an unfiltered connection, but said that this would require moving them onto a public IP subnet.  They allocated a /28 network and told us that the school currently had 16 public IPs allocated to 1:1 NAT rules which would need to be moved into that new /28.

We pointed out that a /28 network doesn't actually have 16 usable IP addresses: it contains 16 addresses in total, but the network and broadcast addresses can't be assigned to hosts and at least one more is needed for the ISP's router.  (Surely this was obvious and should have been taken into account when they allocated it?)  We also asked which IPs were going to be reserved for VRRP - that totally confused the issue because it turns out that, despite specifically asking us to reconfigure things so that they could use VRRP, they never configured it and don't actually need it for anything.
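The arithmetic is easy to check with Python's ipaddress module.  This is just a sketch: the prefixes are documentation ranges (RFC 5737) rather than the school's real allocation, and the number of addresses reserved for the ISP's equipment is an assumption (one for a single gateway, three if two routers shared a virtual address over VRRP).

```python
import ipaddress

def usable_hosts(cidr, reserved=0):
    """Return the host addresses left over after the first `reserved`
    are set aside for ISP equipment (gateway, VRRP routers, etc.)."""
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())  # already excludes the network and broadcast addresses
    return hosts[reserved:]

# Documentation prefixes, not the real allocations.
small = usable_hosts("192.0.2.0/28", reserved=1)     # one address for the ISP gateway
large = usable_hosts("198.51.100.0/27", reserved=1)

print(len(small))  # 13
print(len(large))  # 29
```

Either way, 16 one-to-one NAT mappings can't fit in a /28, while a /27 leaves room to spare.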

So eventually, RM allocated a new /27 network and we sent an engineer over for another half day to make sure things were reconfigured properly.  Of course, everything got reconfigured, the switch-over occurred, and nothing worked.  After some toing and froing it turned out that RM had configured a completely different network from the /27 they had allocated.  So more reconfiguration later and everything worked.

Then a few hours later, everything broke again and the school staff came in in the morning to find no internet connection.  We are assured by RM that this was a problem with their core network and nothing to do with the reconfiguration, but I certainly have my suspicions.

Conclusions


  • Using SWGfL may make sense for schools who want to use their filtering system, but they seem ill equipped to provide reliable unfiltered connections (I can't comment on the reliability of filtered connections, but we did see significant enough problems with their transparent proxies that I wouldn't want to vouch for the reliability of their filters).
  • RM's network seems to have significantly more problems than other ISPs.
  • They seem to take a lot longer to resolve problems than other ISPs - the school has had significant problems associated with RM's broken transparent proxies for months.
  • Dealing with foreign call centres is very frustrating, if only because you're constantly battling with a language/strong-accent barrier (I would have similar concerns with any company basing call centres in "strong accent" areas in the UK too).
  • The competence of their technical staff seems very questionable - we shouldn't be told to reconfigure stuff to allow services that they don't need and won't ever set up (VRRP), and subnet allocations shouldn't spontaneously change without anyone being informed.
  • We had considered doing the second network reconfig remotely - it's lucky we didn't, since RM's mistake would have left the school's network unreachable from the internet.
