Tuesday, 18 September 2018

Chasing wild geese

So, a quick one this morning.  One of our customers has been having problems accessing Lloyds Bank's corporate payments gateway.

The first thing they did was phone Lloyds (very sensible).  Lloyds told them that there were no problems at their end and to clear cookies, add the site to the ActiveX trusted sites list (seriously, why is anyone using ActiveX these days?), etc.  It still didn't work, so Lloyds said it must be a problem with the customer's firewall.

So the customer phoned us.  We pointed a browser at https://payments.corporate.lloydsbank.com/ (on an independent internet connection) and nothing happened: it just sat there waiting.  So clearly Lloyds were having problems.

Let's try some slightly lower-level debugging:
# openssl s_client -connect payments.corporate.lloydsbank.com:443 -servername payments.corporate.lloydsbank.com
And it just sits there...  Eventually:
CONNECTED(00000003)
And more sitting there doing nothing...  Then eventually:
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 0 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
---
Oh dear...  I should've got a certificate back from that, but instead the web server dropped the connection.  So there are a few obvious problems to look into here:
  1. It took a really long time for "CONNECTED" to appear...
  2. It took a really long time for anything to happen once we're connected...
  3. Finally, the web server failed to send us a certificate and negotiate an encrypted TLS session.
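A rough way to put numbers on those three delays is to let curl report how far into the request each phase finished.  This is only a sketch: phase_durations and time_phases are made-up names, the 120 second limit is arbitrary, and it assumes curl leaves a timer at 0 when that phase never completed.

```shell
# Split curl's cumulative timers into per-phase durations (seconds).
# Arguments: time_namelookup time_connect time_appconnect, as printed by curl.
# ASSUMPTION: a timer of 0 means that phase never completed.
phase_durations() {
    awk -v dns="$1" -v tcp="$2" -v tls="$3" 'BEGIN {
        printf "dns=%.1f", dns
        if (tcp + 0 > 0) { printf " tcp=%.1f", tcp - dns } else { printf " tcp=failed" }
        if (tls + 0 > 0) { printf " tls=%.1f", tls - tcp } else { printf " tls=failed" }
        printf "\n"
    }'
}

# Hypothetical wrapper: time one HTTPS request to the URL given as $1.
time_phases() {
    timers=$(curl --silent --output /dev/null --max-time 120 \
        --write-out '%{time_namelookup} %{time_connect} %{time_appconnect}' "$1")
    # $timers is deliberately unquoted so it splits into three arguments.
    phase_durations $timers
}
```

With made-up timers, phase_durations 5.1 12.3 0 prints dns=5.1 tcp=7.2 tls=failed.  Against the real site you would run time_phases https://payments.corporate.lloydsbank.com/ instead.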
Firing up tcpdump, I found that:
  1. There was a significant delay before the first packet (SYN) even appeared.  So something else was going on before the connection was even attempted.  A DNS problem was a good bet.
  2. The first packet (SYN) was resent about 3 times before the web server responded.  This would also cause a significant delay in starting the connection.
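For anyone wanting to reproduce this, something along these lines would show the same picture.  It is a sketch rather than the exact capture used above: both function names are invented, and the delay calculation assumes a client whose first SYN timeout is 1 second and doubles after every resend (the usual Linux default, but check your own system).

```shell
# Show DNS traffic plus connection set-up packets (SYN and SYN-ACK) on port 443.
# Needs root; stop it with Ctrl-C.
watch_connection_attempts() {
    tcpdump -n 'port 53 or (tcp port 443 and tcp[tcpflags] & tcp-syn != 0)'
}

# Total seconds a client waits if it has to resend its SYN $1 times,
# ASSUMING a 1-second initial timeout that doubles after each resend.
syn_retry_delay() {
    resends=$1
    timeout=1
    total=0
    while [ "$resends" -gt 0 ]; do
        total=$((total + timeout))
        timeout=$((timeout * 2))
        resends=$((resends - 1))
    done
    echo "$total"
}
```

Under that assumption, syn_retry_delay 3 prints 7: three resends cost roughly seven seconds before the handshake even starts.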
So, investigating a potential DNS problem:
# dig payments.corporate.lloydsbank.com
That resulted in a long wait before failing, so definitely a DNS problem then.  Let's find the name servers responsible:
# dig payments.corporate.lloydsbank.com ns +trace
<snip>

payments.corporate.lloydsbank.com. IN NS  ns-lv6.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv2.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv7.lloydsbanking.com.
payments.corporate.lloydsbank.com. IN NS  ns-lv3.lloydsbanking.com.
Looking those up gives me:
ns-lv6.lloydsbanking.com. IN    A    141.92.88.1
ns-lv2.lloydsbanking.com. IN    A    141.92.96.1
ns-lv7.lloydsbanking.com. IN    A    195.171.195.169
ns-lv7.lloydsbanking.com. IN    A    195.171.195.168
ns-lv7.lloydsbanking.com. IN    A    195.171.195.167
ns-lv7.lloydsbanking.com. IN    A    195.171.195.166
ns-lv7.lloydsbanking.com. IN    A    195.171.195.165
ns-lv7.lloydsbanking.com. IN    A    195.171.195.164
ns-lv7.lloydsbanking.com. IN    A    195.171.195.163
ns-lv7.lloydsbanking.com. IN    A    195.171.195.170
ns-lv3.lloydsbanking.com. IN    A    141.92.104.1
So there are 11 name server addresses behind those four names.  Querying each one directly, I found that 9 of them were down.  If a resolver picks among them at random, about 82% (9 out of 11) of its DNS requests are going to time out.  At best, things are going to be very slow while the customer's DNS server makes repeated lookups and waits for each to time out; at worst, it will fail to find a working name server and give up, rendering the website inaccessible.
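Checking every name server by hand gets tedious, so here is a minimal sketch of how it could be scripted.  The function names are invented, the 2 second timeout is an arbitrary choice, and it relies on dig exiting with a non-zero status when a server never replies.

```shell
# Ask each name server address directly for NAME and report up/down.
# Usage: check_nameservers NAME ADDRESS...
check_nameservers() {
    name=$1
    shift
    for ns in "$@"; do
        # One attempt, 2 second timeout; dig exits non-zero if nothing answers.
        if dig +norecurse +time=2 +tries=1 "@$ns" "$name" A >/dev/null 2>&1; then
            echo "$ns up"
        else
            echo "$ns down"
        fi
    done
}

# Percentage of servers that are down, rounded to the nearest whole number.
# Usage: percent_down DOWN TOTAL
percent_down() {
    echo $(( (100 * $1 + $2 / 2) / $2 ))
}
```

Call check_nameservers with the host name followed by the addresses listed above; percent_down 9 11 prints 82.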

To summarise:
  • 9 out of 11 of Lloyds' DNS servers were down, resulting in intermittently very slow or even completely broken DNS lookups.
  • If you managed to resolve the web server's IP address, it took a long time to accept the connection.
  • If you managed to get a connection, the web server may fail to negotiate an encrypted TLS session with the client.
With multiple Lloyds Bank servers having serious problems, I wouldn't mind betting that they are being attacked.  But why didn't Lloyds' own support people know (or admit) that there were problems at their end, rather than sending the customer on a wild goose chase?  It only took us 15 minutes to diagnose a problem that Lloyds' own people must have already known about.
