Tuesday, 18 September 2018

Chasing wild geese

So, a quick one this morning.  One of our customers has been having problems accessing Lloyds Bank's corporate payments gateway.

The first thing they did was phone Lloyds (very sensible).  Lloyds told them that there were no problems at their end and to clear cookies, add the site to the ActiveX trusted sites list (seriously, why is anyone using ActiveX these days?), etc.  Still not working, so must be a problem with the customer's firewall.

So the customer phoned us.  We pointed a browser at https://payments.corporate.lloydsbank.com/ (on an independent internet connection) and nothing happens - just sits there waiting.  So clearly Lloyds are having problems.

Let's try some slightly lower-level debugging:
# openssl s_client -connect payments.corporate.lloydsbank.com:443 -servername payments.corporate.lloydsbank.com
And it just sits there...  Eventually:
And more sitting there doing nothing...  Then eventually:
no peer certificate available
No client certificate CA names sent
SSL handshake has read 0 bytes and written 0 bytes
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
Oh dear...  I should've got a certificate back from that but instead the web server dropped the connection.  So a few obvious problems to look into here:
  1. It took a really long time for "CONNECTED" to appear...
  2. It took a really long time for anything to happen once we're connected...
  3. Finally, the web server failed to send us a certificate and negotiate an encrypted TLS session.
Firing up tcpdump, I found that:
  1. There was a significant delay before the first packet (SYN) even appeared.  So something else was going on before the connection was even attempted.  A DNS problem was a good bet.
  2. The first packet (SYN) was resent about 3 times before the web server responded.  This would also cause a significant delay in starting the connection.
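As a rough guide to how much delay those resent SYNs add, here is a small Python sketch.  It assumes an initial retransmission timeout of 1 second that doubles on each retry (the common default on modern TCP stacks, but an assumption about the client used here), and the function name is made up for illustration.

```python
def syn_delay(retransmissions: int, initial_rto: float = 1.0) -> float:
    """Seconds from the first SYN until the last retransmitted SYN is sent,
    assuming the retransmission timeout doubles after every attempt."""
    return sum(initial_rto * 2 ** i for i in range(retransmissions))

# Three retransmissions: waits of 1s, 2s and 4s before the SYN that finally got an answer.
print(syn_delay(3))
```

Under those assumptions, three resends cost about 7 seconds before the handshake even starts.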
So, investigating a potential DNS problem:
# dig payments.corporate.lloydsbank.com
Resulted in a long wait before failing - definitely a DNS problem then.  Let's find the name servers responsible:
# dig payments.corporate.lloydsbank.com ns +trace

payments.corporate.lloydsbank.com. IN NS  ns-lv6.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv2.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv7.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv3.lloydsbanking.com.
Looking those up gives me:
ns-lv6.lloydsbanking.com. IN    A
ns-lv2.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv7.lloydsbanking.com. IN    A
ns-lv3.lloydsbanking.com. IN    A
So there are 11 name servers.  And by trying to look up against each of those name servers I find that 9 of them are down.  That means that about 82% of DNS requests are going to time out - at best things are going to be very slow while the customer's DNS server makes repeated DNS lookups and waits for each to time out; at worst it will fail to find a working DNS server and give up, rendering the website inaccessible.
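To put rough numbers on that, here is a minimal Python sketch.  It assumes that a resolver picks one of the 11 addresses uniformly at random and never retries an address it has already tried; real resolvers use smarter server selection, so treat this as an illustration only.  The function name is hypothetical.

```python
from fractions import Fraction

def all_dead_probability(total: int, dead: int, tries: int) -> Fraction:
    """Chance that `tries` distinct servers, picked at random, are all down."""
    p = Fraction(1)
    for i in range(tries):
        p *= Fraction(dead - i, total - i)
    return p

TOTAL, DEAD = 11, 9  # figures from the lookups above

for tries in (1, 2, 3):
    p = all_dead_probability(TOTAL, DEAD, tries)
    print(f"{tries} tries: {float(p):.1%} chance that every server tried is down")
```

Under this model the first query times out about 82% of the time, and there is still roughly an even chance that the first three queries all time out.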

To summarise:
  • 9 out of 11 of Lloyds' DNS servers were down, resulting in intermittently very slow or even completely broken DNS lookups.
  • If you managed to resolve the web server's IP address, it took a long time to accept the connection.
  • If you managed to get a connection, the web server may fail to negotiate an encrypted TLS session with the client.
With multiple Lloyds Bank servers having serious problems, I wouldn't mind betting that they are being attacked.  But why didn't Lloyds' own support people know / admit that there were problems on their end rather than sending the customer on a wild goose chase?  It only took us 15 minutes to diagnose a problem that Lloyds' own people must have already known about.

Tuesday, 31 July 2018

Why do they not listen?

I don't usually talk much about our customers, but sometimes things happen which truly beggar belief.

For many years we have been contracted by a consortium of schools who were geographically close and originally wanted to be able to share a single connection for financial reasons.  This is quite a common arrangement.  Due to the shifting landscape of internet provision, costs and politics, the arrangement came to an end some time ago, which is fine - projects don't last forever, customers come and go and the project ended amicably, with all of the schools involved being pretty happy with us.  In fact, they all ended up taking out independent contracts with us for one thing or another after the project ended anyway.

Just one of the schools has been somewhat... "odd" at times though (I will refer to them simply as "School" to retain their anonymity), and despite our best efforts it has gradually caused problems for them.  Their ICT support is outsourced, which can be either good or bad - some of the companies that provide outsourced ICT services are pretty good, but some of them seem to have a "rip everything out" attitude and insist on unnecessarily replacing a school's equipment on day one instead of spending some time looking to see what is working well and what isn't and only replacing the stuff that isn't working.  This generally seems to be because they want to use systems that they are already used to rather than learning something new.  There's some merit in trying to standardise on systems you know, but it obviously leads to a lot of disruption and expense for the school, so in my view is not a great way of doing things.

Anyway, the story probably starts in 2013.

Summer 2013

We had been providing the connectivity between the schools for some years by this point.  Because of limitations of the technology which was available at the time when it was installed, the schools had unreliable, but redundant, interconnects.  These weren't installed by us, but we were contracted to provide and maintain systems to use those unreliable interconnects to provide reliable connectivity.  We were also contracted to provide online safety systems (web filtering, etc.) to the whole consortium.

School: We've just changed our ICT provider and the new provider has decided to replace the existing online safety system with a third party system.
Us: That's fine, but the connections you use for internet access aren't reliable enough to use independently and the equipment you're proposing to remove is used to provide reliable internet access over those unreliable connections. The third party equipment that you want to install is incompatible with the protocols used by the existing equipment, which means you will also need to replace the equipment at the far end of the connections.

Obviously we prefer not to lose a customer, but if they want to switch to another provider then that's fine and we try to guide them and minimise the disruption as best we can.

The ICT provider ordered the third party system, plugged it in, discovered that it didn't work with the interconnects (as we had told them) and ended up backtracking on the whole thing.  So they left our online safety system in place and in fact I think their ICT provider probably decided it was ok in the end - at least they made no further moves to replace it.

Summer 2015

School changed ICT provider again.  This didn't really have any impact on us.

Summer 2016

By now, all of the schools in the consortium, except for School, had installed their own internet connections and the interconnects were just being used for a small amount of local traffic and as redundancy in case of failure of one of the internet connections.  The consortium decided that the project had run its course and announced that it would be dismantled by summer 2017.

There was also some discussion about retiring the infrastructure early:

Consortium: We should simplify things by removing all of the equipment that is managing the connections immediately.
Us: We can't do that because that equipment is still needed to provide the connection to School.
Consortium: No problem, we'll leave it as-is until summer 2017.

Spring 2017

School: We intend to continue using the existing interconnects after the summer.
Us: It isn't economic to do so since you're now footing the whole bill instead of it being shared by the whole consortium.  The existing equipment is also very old so liable to fail soon.  Replace everything with a new connection, it'll cost less than continuing with the existing equipment.  We can do this for you or you can get a third party in, we don't mind either way.

Summer 2017

Us: As agreed last year, the existing interconnects will now be shut down.  School will need to migrate over to their new connection.

School: No alternative connection has been procured, there's no time left to get one now, we need to keep using the existing connections.
Us: We already said this was uneconomic, but as a good will gesture we'll take some of the hit ourselves and knock 50% off the cost.  But this is a one year only deal - we will not support this next year because the equipment is well past its end of life.  Also, as the equipment is very old, we recommend you follow our original recommendation and replace the interconnect ASAP since it might fail at any point.  If any of the hardware fails, we won't fix it.

Spring 2018

Us: Just a reminder, you need to replace the interconnects ASAP.

Summer 2018

School: We've just changed our ICT provider (again) and the new provider has decided to replace the existing online safety system with a third party system.
Us: That's fine.  As you already know, the connection you are using for your internet access is going out of service this summer, we presume you've procured a replacement?
School: No we haven't, we intend to continue using the existing interconnects.
Us: But those interconnects aren't reliable enough to use independently and the equipment you're proposing to remove is used to provide reliable internet access over those unreliable connections. The third party equipment that you want to install is incompatible with the protocols used by the existing equipment, which means you will also need to replace the equipment at the far end of the connections.
Us: In fact, this is exactly what we said in the summer of 2013, then again in the summer of 2016, then again in spring 2017, and in summer 2017, and in spring 2018.
Us: Also, we already told you a year ago that we weren't going to support any of this equipment which manages those interconnects any more as it is far too old.
School: We don't need that equipment, we're just going to use a single (15 year old, unreliable) connection in isolation.
Us: Errm, that will be really unreliable, here are some statistics from our monitoring data to show just how unreliable it will be.

Panic ensues at School.

School: Why didn't you tell us that we would need a new connection!  It's now far too late to procure one in time.  You must extend our contract for free.
Us: Umm, no.
Us: We've made every effort to recommend the most economic and reliable way forward and have been ignored at every step.
Us: Last year we dug you out of a hole you'd made for yourselves and even gave you a big cost reduction out of our good will.  You have repaid us by continuing to ignore our recommendations, blaming us for the mess you've got yourselves in, paying your last invoice months late and cancelling your contract with us.
Us: You have now demanded that we dig you out of a hole again out of the goodness of our hearts at extremely short notice.
Us: Here is a selection of get-out-of-jail cards from our standard price list, which we are happy for you to buy from us at the standard price, go pick one.

And apparently this is all our fault...  At least now that the contract with them has ended we won't have to deal with any of the fallout from this mess.  Seems like a classic case of "I think we've heard enough from the experts" to me :)

Thursday, 5 April 2018

Thoughts on the gender pay gap

I've already managed to upset someone on Facebook because I apparently said that women are worth less than men (I didn't).  I don't regard myself as a feminist, but rather an equalist.  Discrimination of all forms is bad - women shouldn't be discriminated against based on their gender, but similarly men shouldn't be discriminated against in order to give women an advantage.

Discrimination is as old as time itself but I don't think you fix the problem by just changing which group you discriminate against, any more than you prevent war by changing which group of innocent people you're blowing up.

The British government is now requiring all businesses with over 250 employees to publish figures summarising the gender pay gap.  There are a few bits of information required, but essentially this boils down to a simple average difference in pay between men and women.

The press is then using the figures to bash the companies with the biggest pay gap.  No consideration is being given to what types of work are being done since that isn't in the information that companies are expected to publish.  In other words, this is not a like-for-like comparison.

There are a few potential reasons why women may be earning less than men:
  1. Maybe there are fewer women qualified and applying for the higher paid jobs.
  2. Maybe employers are refusing to employ women in the higher paid jobs.
  3. Maybe employers are employing women in the higher paid jobs but are paying them less than the men who are doing the same work.
All of these points are, of course, a problem.  The last two points are directly under the employer's control and any employer engaging in this kind of discrimination deserves to have the book thrown at them - I firmly believe that equally qualified men and women should be given the same opportunities as each other and the same pay for doing the same work.

However, it is fairly unclear to me how the first point can be regarded as a specific employer's fault.  Since the information that is being made available doesn't do anything to differentiate between these possibilities, it seems completely unfair to vilify an employer based solely on this data.

So we have STEM companies and the construction industry with a fairly big pay gap simply because it's very difficult to recruit women to do the higher paid jobs in these fields.  I'm sure that in some cases there is discrimination going on, but you can't determine that from the data being used by the press to attack employers.

To demonstrate the issue, let's take a simple fictional employer - they aren't discriminating and they have the following breakdown of employees:

             Total  Men  Women  Salary
Engineers      200  180     20  £50,000.00
Admin staff     50   15     35  £20,000.00
Total          250  195     55

The mean pay is £47,692.31 for men and £30,909.09 for women, yielding a pay gap of £16,783.22, even though everyone doing the same job is paid the same.
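For anyone who wants to check the arithmetic, here is a short Python sketch that works the means out from the table.  The data layout and the function name are just my own choices for illustration.

```python
# (role, men, women, salary) taken from the table above
STAFF = [
    ("Engineers", 180, 20, 50_000),
    ("Admin staff", 15, 35, 20_000),
]

def mean_pay(staff, column):
    """Mean salary for one gender; column 1 is men, column 2 is women."""
    headcount = sum(row[column] for row in staff)
    wage_bill = sum(row[column] * row[3] for row in staff)
    return wage_bill / headcount

men = mean_pay(STAFF, 1)
women = mean_pay(STAFF, 2)
print(f"men £{men:,.2f}, women £{women:,.2f}, gap £{men - women:,.2f}")
```

The gap comes entirely from the different mix of roles: within each row of the table, the salary is identical for men and women.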

So, how can the employer fix their pay gap?  Since we're already assuming there is no discrimination going on, we can look at it rationally with maths:

1. Increase the proportion of female engineers
It's very unclear to me how a single employer can increase the proportion of applicants who are female.  In order to do this the proportion of women training in engineering needs to be increased, starting with school kids.  There is some scope for the industry as a whole working to promote engineering to women, but it takes years and a single employer can't do a lot on their own.

At work, when we were last recruiting, we didn't end up hiring a man because we're horrible people who support the patriarchy; we ended up hiring a man because no women applied for the job.  We would usually want to pick the best person for the job, regardless of their gender.

So with far fewer women in the engineering job market than men, the immediate options are:
  • Increase the women's benefits and decrease the men's.  The women will be "overpaid" with respect to other employers and want to work for you whilst the men will be "underpaid" and get a job elsewhere.
  • Recruit underqualified women to make up the numbers, since there aren't enough qualified women applying.  Recruiting people who aren't qualified to do the job sounds like a bad idea for the business.
  • Restrict the number of applications from men.
All of these options discriminate against men based solely on their gender and are therefore pretty unethical (and probably illegal).

2. Increase the proportion of male administrative staff
So given that we probably can't do a lot to recruit more female engineers, we could tweak the balance elsewhere in the business.  The average women's pay is being dragged down by the fact that a disproportionate number of women are employed in the lower paid administrative roles.  Only 10% of the engineers are women, but 70% of the administrative staff are women.  If the employer reduces the proportion of women in the admin roles down to 10%, that will eliminate the pay gap.
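A quick Python sketch of that rebalancing, keeping the 200 engineers as they are and changing only the split of the 50 admin staff (the helper name is hypothetical):

```python
def mean_pay(groups):
    """Mean salary over a list of (headcount, salary) pairs."""
    return sum(n * pay for n, pay in groups) / sum(n for n, _ in groups)

ENGINEER_PAY, ADMIN_PAY = 50_000, 20_000

admin_total = 50
admin_women = admin_total * 10 // 100   # 10% women, matching the engineers
admin_men = admin_total - admin_women

men = mean_pay([(180, ENGINEER_PAY), (admin_men, ADMIN_PAY)])
women = mean_pay([(20, ENGINEER_PAY), (admin_women, ADMIN_PAY)])
print(men, women)
```

With 5 women and 45 men in the admin roles, both means come out at £44,000, so the gap disappears without anyone's salary changing.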

This problem is pretty much the opposite of (1) - the same options apply, but this time the employer must discriminate against women.  Again, doesn't strike me as a good plan.

3. Increase the women's pay or decrease the men's pay
So far, the employer has paid men and women the same amount for the same work, which has led to a big pay gap simply because of the job role demographics.

Adding about 54% to the women's pay in both job roles eliminates the pay gap.  We're now paying female engineers £77,000 and female administrative staff £30,800.  Of course, the employer may well not be able to afford these kinds of expenses, especially when the men find out that they are earning far less than the women and take their employer to court for sexual discrimination.

Similarly, reducing the men's pay by about 35% across the board (to roughly 65% of what it was) achieves a similar result - the employer gets sued into the ground for sexual discrimination, and if they survive that, with the men now being paid far below the market rate, they all leave for greener pastures.
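The size of the adjustment in either direction follows from the ratio of the two means.  A quick sketch, using the headcounts and salaries of the fictional employer above (the variable names are just for illustration):

```python
men_mean = (180 * 50_000 + 15 * 20_000) / 195
women_mean = (20 * 50_000 + 35 * 20_000) / 55

raise_for_women = men_mean / women_mean - 1   # uplift that lifts the women's mean to the men's
cut_for_men = 1 - women_mean / men_mean       # reduction that drops the men's mean to the women's

print(f"raise women's pay by {raise_for_women:.1%}")
print(f"or cut men's pay by {cut_for_men:.1%}")
```

That works out at an uplift of a little over 54% for the women, or a cut of roughly 35% for the men, which leaves them on about 65% of their current pay.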

4. Pay the engineers and administrative staff the same
Another option is to decrease the pay the engineers receive and increase the pay the administrative staff receive.  By paying everyone £44,000 irrespective of what job they do, the pay gap is eliminated whilst keeping the total wage bill the same.
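The flat salary is simply the current wage bill divided by the headcount, as this small sketch shows (figures from the fictional employer above):

```python
wage_bill = 200 * 50_000 + 50 * 20_000   # engineers plus admin staff
headcount = 200 + 50

flat_salary = wage_bill / headcount
print(f"£{flat_salary:,.0f}")
```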

Unfortunately, the company's engineers are now underpaid relative to the market rate, so they will leave and it will be impossible to recruit replacements.

The entire industry could follow suit, but this would lead to a long term shortage in engineers - it costs tens of thousands of pounds and several years to become an engineer, and how many people would do that if their pay is the same as someone who hadn't spent that time and money on training?

I am a firm believer that equally qualified and experienced women and men are equally valuable and should be paid the same.  For some jobs it is easier than others to ensure that this is happening - where there are fixed non-negotiable pay scales things are obviously clearer than jobs where employees are expected to negotiate their salaries.  In the latter case, people who are poor negotiators are obviously not going to do as well as good negotiators.  I don't know if there's a gender bias when it comes to negotiating skill, but any protection should surely extend far beyond the work place since poor negotiators also lose out when negotiating other things, such as buying a car, etc.

There are also big differences in demographics that need to be accounted for: is a 35 year old who graduated from university at 21 and has been in relevant employment for the whole time (giving them 14 years industry experience) as valuable as someone else who took 10 years off between 23 and 33 to bring up their children (so they have only 4 years industry experience)?  These people could be either men or women, but in the current society the majority of child carers are women and it doesn't seem right to completely discount that when comparing pay.

From taking a rational look at the data being collected, it seems clear to me that it is tackling the wrong thing.  Whilst I'm sure that discrimination is happening, the data being published cannot be used to determine who is discriminating.  Indeed, in many cases reducing the pay gap seems to actually require the employer to discriminate, so the whole thing seems very counterproductive to me.

The things that need to be tackled are:
  • Ensuring that men and women are given the same opportunities.
  • Ensuring that men and women receive similar pay for similar work.
  • Ensuring that neither men nor women are put off from taking any opportunities that are open to them.
Let's work on those points instead of attacking employers for things that they can't do much about.