Thursday 17 July 2014

Data Retention and Investigatory Powers (DRIP) bill

In April, the European Court of Justice ruled that the routine collection of location and traffic data about phone calls, texts, emails and internet use and its retention for between six months and two years meant a very detailed picture of an individual's private life could be constructed, that this amounted to a severe incursion of privacy and it therefore contravened EU law.

In response, the British government have put forward the Data Retention and Investigatory Powers (DRIP) bill to restore their ability to snoop on everyone.  Not wanting this bill to come under too much scrutiny, the government pushed it through the Commons after a single afternoon's debate.  51 MPs voted to do their job properly and take time to make a decision, but they were overruled by 441 votes to just push it through as quickly as possible.  Assuming the Lords agree, this will pass into law tomorrow.

Once enacted, this legislation will allow the government to require internet service providers, internet application providers and telephone companies to record and retain metadata on any members of the public, without needing a warrant.  The bill would also apply to non-UK companies like Facebook (although quite how they expect to enforce this I'm not sure).

"The police will not know who that suspect is until they come to the police’s attention, at which point they have to get historical evidence. These days, part of that historical evidence will be in data records. They have to be able to access everybody’s data records in order to find those of one particular person, because the police, no more than the rest of us, are not given powers of clairvoyance with which to anticipate who is and who is not to be a suspect. Unless or until I hear from opponents of this Bill and of data retention how the police can be expected to identify in advance those who are going to be suspected of crime, I have to say that the whole logical basis of their argument completely falls away." - Jack Straw

Traditionally, a certain amount of information was kept by an ISP/telco/whatever for normal day to day business purposes. For example, a telco may keep call metadata records for a certain amount of time for billing purposes. If the police have a suspect then I see no issue with them getting a warrant to access that data.

If someone is a suspect, then I also see no problem with the police getting a warrant to record extra information that wouldn't normally be recorded/retained.  For example, they may get a warrant to have the ISP log web requests made by a suspect, or have the telco record call audio.

However, there is a distinction between the above examples, which require the police to suspect someone and convince a judge to issue a warrant, and what the government is increasingly trying to do, which is to capture data about *everyone's* activities, specifically for law enforcement purposes, just in case they later become a suspect.  That is something that's fundamentally wrong IMHO - as someone who has committed no crime, I have a right to privacy, and that right is being violated by having data recorded and retained for law enforcement purposes.

The whole "but it won't be used unless you become a suspect" argument is flawed - once the data is there, I have no confidence that access to it will be tightly controlled. Data may be leaked by accident or on purpose (illegally); the laws governing the circumstances under which it can be accessed may not be robust enough to prevent legal access for questionable purposes; and the whole thing is subject to feature creep - the access controls may be OK now, but I can't demand the historical data be deleted if their scope expands in the future.

Once upon a time, people were considered innocent until proved guilty, but these days it seems that everyone is treated as guilty from the start and just happens to be allowed some freedom until the authorities can figure out what crime they committed.

All three main parties are backing this attack on our privacy, but is an afternoon of debate really enough to decide to throw away everyone's freedom?

Wednesday 9 July 2014

Decoding Freesat, Part 2

As I mentioned in the last post, I'm reverse engineering the Freesat transmissions in order to extract the channel numbers so I can automatically update my MythTV system to use sensible channel numbers.

I've now managed to figure out most of the important bits:  Transport stream 2315 is broadcast on 11.428GHz Horizontal, at 27.5 Mbaud, FEC 2/3 and contains a stream with PID 3002.  This stream transmits a carousel of service description tables (SDTs) and bouquet association tables (BATs).  The SDTs aren't especially interesting, so I'm ignoring them for now.

Freesat tailor their channels to groups of consumers by grouping them into bouquets - each of the four countries (England, Wales, Scotland and Northern Ireland) gets three bouquets - one for standard definition receivers, one for high definition receivers and one for G2 (second generation) HD receivers, so 12 bouquets in total for the time being.  (But more about regionalisation later).

Each BAT is packetised into one or more sections, and there is one BAT for each of Freesat's bouquets.  To collect all the information, you just keep watching the carousel until you've seen all of the sections for all of the BATs.
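
A rough sketch of that "keep watching until complete" loop in Python. It assumes something else has already demultiplexed PID 3002 and pulled the bouquet ID, section_number and last_section_number out of each section header (those two counters are standard fields in DVB table sections). The class name BatCollector and the idea of passing in the list of expected bouquet IDs are made up for illustration, and table version changes are ignored.

```python
class BatCollector:
    """Tracks which sections of each bouquet's BAT have been seen so far."""

    def __init__(self, expected_bouquets):
        # Hypothetical: the caller knows which bouquet IDs it wants.
        self.expected = set(expected_bouquets)
        self.sections = {}  # bouquet_id -> {section_number: payload}
        self.last = {}      # bouquet_id -> last_section_number

    def add(self, bouquet_id, section_number, last_section_number, payload):
        """Record one section; repeats from the carousel simply overwrite."""
        self.sections.setdefault(bouquet_id, {})[section_number] = payload
        self.last[bouquet_id] = last_section_number

    def bouquet_complete(self, bouquet_id):
        """True once sections 0..last_section_number have all been seen."""
        if bouquet_id not in self.last:
            return False
        seen = self.sections[bouquet_id]
        return all(n in seen for n in range(self.last[bouquet_id] + 1))

    def complete(self):
        """True once every expected bouquet has a full set of sections."""
        return all(self.bouquet_complete(b) for b in self.expected)
```

Feeding it sections as they arrive and stopping when complete() returns True gives one full copy of every BAT of interest.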

A BAT consists of a header, zero or more "descriptors" (lumps of data that have an ID that identifies the type of data they hold) and zero or more "transport streams".  Each "transport stream" in the BAT contains zero or more descriptors that contain information relating to a DVB transport stream (i.e. satellite transponder).

The top level descriptors in the BAT include standard descriptors (bouquet name, country availability, private data specifier descriptor) and some non-standard ones:

Descriptor ID  Description
0xd4           Region table
0xd8           Category table

I haven't investigated any of these except for the region table.  0xd5 - 0xd7 appear to be binary data.  0xd8 looks like a list of (ID, language, category name) tuples, but I'm not sure what the category IDs are referenced by; the category names are stuff like "Entertainment", "News", "Shopping", etc.

0xd4 is the one that's of interest to me - The bouquets are geographically pretty coarse, and Freesat tailor the channels to much smaller regions.  So the south of England gets BBC One South on channel 101 whilst the East Midlands gets BBC One East Midlands on channel 101, etc.  Descriptor 0xd4 contains a list of the regions that are served by the bouquet (I don't understand why bouquets are used at all for regionalisation though, since I can't see a reason for not handling it all through this fine grained regionalisation system?)  The data in this descriptor is a bunch of variable length chunks concatenated together, with the header of each chunk containing its size so the next chunk can be found. The data format of the chunks is:

Offset (octets)  Length (bits)  Description
0                16             Region ID
2                24             Language
5                8              Length of region name
6                Variable       Region name
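
As a sketch, the chunk layout above could be walked like this in Python. The function name parse_region_table is hypothetical, big-endian integers are assumed (that matches how the service IDs decode elsewhere in these posts), and the Latin-1 decoding of the region name is a guess. The sample bytes are constructed by hand from the table, not captured from the broadcast.

```python
def parse_region_table(data: bytes):
    """Split the body of a 0xd4 descriptor into (region_id, language, name)."""
    regions = []
    pos = 0
    while pos + 6 <= len(data):
        region_id = int.from_bytes(data[pos:pos + 2], "big")
        language = data[pos + 2:pos + 5].decode("ascii")
        name_len = data[pos + 5]
        end = pos + 6 + name_len
        if end > len(data):
            break  # truncated chunk, stop rather than guess
        # Assumption: region names are plain single-byte text.
        name = data[pos + 6:end].decode("latin-1")
        regions.append((region_id, language, name))
        pos = end  # the next chunk starts straight after the name
    return regions


def make_chunk(region_id: int, name: str) -> bytes:
    """Build one hand-made chunk for testing (not real broadcast data)."""
    raw = name.encode("latin-1")
    return region_id.to_bytes(2, "big") + b"eng" + bytes([len(raw)]) + raw


sample = make_chunk(1, "London") + make_chunk(15, "E Midlands/Central E")
print(parse_region_table(sample))
```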

"Language" is a three letter text string and is always "eng" at the moment.

Now, as I mentioned above, the BAT also contains a list of transport streams, with a bunch of descriptors in each.  So looking at the descriptors within a transport stream, as well as a few standard descriptors there is descriptor ID 0xd3, which maps service IDs to channel numbers.  This contains a bunch of variable length chunks concatenated together, with the header of each chunk containing its size. The data format of the chunks is:

Offset (bytes)  Length (bits)  Description
0               16             Service ID
2               16             Unknown
4               8              Length of remainder of the chunk
5               Variable       LCN/region mappings

The "LCN/region mappings" data is a concatenated set of fixed length subchunks as follows:

Offset (bytes)  Length (bits)  Description
0.5             12             Logical channel number
2               16             Region ID
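
A minimal Python sketch of decoding descriptor 0xd3 along these lines. The function name parse_lcn_descriptor is hypothetical and big-endian integers are assumed. The top four bits of each subchunk are not understood (they are the "Unknown 2" nybble in Part 1), so they are just returned as flags. The sample bytes are the ITV 1 Granada chunk from the Part 1 hex dump.

```python
def parse_lcn_descriptor(data: bytes):
    """Split the body of a 0xd3 descriptor into per-service mappings.

    Returns a list of (service_id, unknown, [(lcn, region_id, flags), ...]).
    """
    services = []
    pos = 0
    while pos + 5 <= len(data):
        service_id = int.from_bytes(data[pos:pos + 2], "big")
        unknown = int.from_bytes(data[pos + 2:pos + 4], "big")
        length = data[pos + 4]
        body = data[pos + 5:pos + 5 + length]
        mappings = []
        # Each subchunk is 4 bytes: 4 unknown bits, 12-bit LCN, 16-bit region.
        for i in range(0, len(body) - 3, 4):
            word = int.from_bytes(body[i:i + 2], "big")
            flags = word >> 12
            lcn = word & 0x0FFF
            region_id = int.from_bytes(body[i + 2:i + 4], "big")
            mappings.append((lcn, region_id, flags))
        services.append((service_id, unknown, mappings))
        pos += 5 + length  # skip the header plus the remainder of the chunk
    return services


# ITV 1 Granada, as listed in the Part 1 dump.
granada = bytes.fromhex("2760 83f2 0c d067 0007 d067 0027 d067 002b")
print(parse_lcn_descriptor(granada))
```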

So, we can select the appropriate bouquet (e.g. if we're using an HD receiver in England, we would choose bouquet 272, which is England HD) and pick our region (such as region 15 - "E Midlands/Central E").  In theory we can filter down the data to get a list of channel numbers and what transport ID and service ID (i.e. what channel) they are assigned to.

There are a couple of gotchas:

Firstly, one service can be assigned to multiple channel numbers.  The BBC regions are all typically available on 9xx channel numbers, but your local region is on 10x as well.

Region number 65535 appears to be a fallback or default region.  So, for example, in the England HD bouquet, service ID 10060 (ITV 1 London) is assigned to channel 103 in regions 1, 18, 27, 31 and 38 and channel 977 in region 65535.  Region 1 is London; the other regions don't appear in the region list, so I assume they are legacy IDs.  So in this example, if you're in London then ITV 1 London appears on channel 103, but if you're anywhere else it is on channel 977.

Region number 0 is a complete unknown...  See below!
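
One possible reading of the fallback rule, sketched in Python: a service uses its mappings for the chosen region if it has any, and otherwise falls back to its region 65535 mappings. This is an interpretation rather than confirmed receiver behaviour, the function name channels_for_region is hypothetical, and region 0 entries are simply left out because their meaning is unknown.

```python
FALLBACK_REGION = 65535


def channels_for_region(services, region_id):
    """Map channel number -> list of service IDs for one region.

    services is an iterable of (service_id, [(lcn, region_id), ...]).
    """
    channels = {}
    for service_id, mappings in services:
        local = [lcn for lcn, region in mappings if region == region_id]
        fallback = [lcn for lcn, region in mappings if region == FALLBACK_REGION]
        # Assumption: region-specific numbers replace the fallback ones.
        for lcn in (local or fallback):
            channels.setdefault(lcn, []).append(service_id)
    return channels


# ITV 1 London in the England HD bouquet, using the figures quoted above.
itv_london = (10060, [(103, 1), (103, 18), (103, 27), (103, 31), (103, 38),
                      (977, 65535)])
print(channels_for_region([itv_london], 1))   # London
print(channels_for_region([itv_london], 15))  # E Midlands/Central E
```

Keeping a list of services per channel number means clashes like the region 0 ones would show up as more than one entry instead of being silently overwritten.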

Open questions

Region 0 - I can't figure it out at all.  It only seems to be used for BBC One (logical channel numbers 101 and 108), and it seems that multiple channels can end up assigned to a single channel number in region 0.

For example, looking at the Wales HD bouquet (274), BBC One London, BBC One West Midlands and BBC One South are all assigned to channel 108 in region 0.  Channel 108 isn't assigned in any other region.

Similarly, the Wales SD bouquet (258) assigns these same three channels to 101 for region 0.  We clearly can't just ignore region 0 because there's no other way to assign a channel to 101 in this case, but I can't see how set top boxes can choose between the three assigned channels.  Also, I note that the SD version of BBC One Wales (service ID 10311) isn't listed in the BAT at all - do SD Freesat receivers in Wales no longer get a regional BBC One?

Even more confusing are bouquets 272 and 280 (England HD and England G2), which seem to have BBC One Scotland HD (service ID 8901) assigned to channel 108 in region 0!

It would certainly be interesting to look at a branded Freesat decoder and see what channels appear on it; unfortunately I don't have one.

Monday 7 July 2014

Decoding Freesat, Part 1

To watch/record TV, I use MythTV connected to a satellite receiver.  Unfortunately, MythTV's handling of channel numbering is a bit bonkers... So I've been doing a bit of reverse engineering of the data transmitted by Freesat to try and automagically pull out the local channel numbers and update the MythTV channels database...  Information on the internet seems thin on the ground, so this is what I've figured out so far:

The "Freesat Home" transponder is the interesting one (transport stream ID 2315).  This is located at 11.428GHz Horizontal, with a symbol rate of 27500 kSym/s and FEC 2/3.  PID 3002 on this transponder transmits a bouquet association table (BAT).

For each transport stream, there is an entry in the BAT, containing a descriptor tag 0xd3 and an associated lump of data.  The data is a set of variable length chunks concatenated together, with each chunk containing a length value so the offset of the next chunk can be calculated.

The chunk format appears to be:

Offset (octets)  Length (bits)  Description
0                16             Service ID
2                16             Unknown
4                8              Length of remainder of the chunk
5                4              Unknown
5 + 1 nybble     12             Local channel number
7                Variable       Unknown

I haven't been able to figure out how the channels are selected by region - for example, local channel number 101 is allocated to BBC 1 London if you're in London, BBC 1 Wales if you're in Wales, etc. but I haven't found this information in the BAT yet. Compare:

Field        BBC One London     BBC One West
Service ID   18 9d (6301)       18 c5 (6341)
Unknown 1    81 f9 (33273)      82 01 (33281)
Size         08 (8)             08 (8)
Unknown 2    d (13)             d (13)
LCN          3 b6 (950)         3 c5 (965)
Unknown 3    ff ff f0 6c 00 00  ff ff f0 6c 00 00

Edit: "Unknown 2", "LCN" and "Unknown 3" appear to be an array mapping LCNs to regions:

Field        ITV 1 London    ITV 1 Granada
Service ID   27 4c (10060)   27 60 (10080)
Unknown 1    83 f3           83 f2
Size         18 (24)         0c (12)
Unknown 2    d (13)          d (13)
LCN          067 (103)       067 (103)
Region       00 01           00 07
Unknown 2    d (13)          d (13)
LCN          067 (103)       067 (103)
Region       00 12           00 27
Unknown 2    d (13)          d (13)
LCN          067 (103)       067 (103)
Region       00 1b           00 2b
Unknown 2    d (13)
LCN          067 (103)
Region       00 1f
Unknown 2    d (13)
LCN          067 (103)
Region       00 26
Unknown 2    d (13)
LCN          3d1 (977)
Region       ff ff

Descriptor 0xd4 in the BAT seems to translate the 16 bit region IDs into human readable strings (is it me, or does 16 bits sound a bit excessive for region IDs?)

Changing ISP...

BT just sent round a reminder for me to renew my annual "line rental saver"...  it seems to have gone up significantly - £159.84 (so the equivalent of £13.32/month) and it's kicked me into having a look at my options.  Currently I pay BT for the POTS line and then UK Free Software Network (an EntaNet reseller) get £23.70 a month for my internet connection.  The internet connection is a plain old ADSL2+ connection* with a /29 static IPv4 subnet and a /56 static IPv6 subnet and is currently synced at about 6.7Mbps down, 960Kbps up.

(* It's supposedly ADSL2+, but my TP Link ADSL modem won't resync properly when the noise floor increases, so I actually have to run it in G.DMT mode...  There isn't a huge difference in speed though).

So anyway, all in I'm basically paying £37.02/month for POTS and internet.  The only reason I need the POTS bit at all is because it's required for the ADSL connection - I get free evening/weekend calls from BT, but that's not really worth the cost of the line.  In fact, I think it's bonkers that BT are putting their prices up, given the increasingly wide selection of alternative providers.  SIPgate, for example, charge 1.19p/minute for geographic calls, and even my pay as you go mobile is only 3p/minute.

Unfortunately, UKFSN don't appear to do the POTS bit themselves, expecting you to use BT for that, but I have been pretty happy with them so I've been looking at other EntaNet resellers.  One that has stood out is FalconNet - they are offering FTTC internet connections (40Mbps down, 10Mbps up) for £22 and POTS for £9.50, totalling £31.50/month.  Their installation cost is £96 - given that my "line rental saver" has to be paid up front, a £96 up-front cost doesn't seem bad at all - I basically end up in credit for the first 8 months.  I fired off an email to FalconNet and they confirm that they do IPv6 and a /32 static IPv4 subnet.

This is pretty compelling: amortised over 18 months, I get a much faster internet connection for about the price I'm already paying and everything after that is a saving; and no more up-front annual fees.  I just lose the free evenings and weekends calls - FalconNet charge 1.14p/min for geographic calls, so the amount I'm saving can pay for about 8 hours of calls a month.  Although truth be told, for the sake of 0.05p/minute I'll probably just use SIPgate (or another SIP gateway).
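
For anyone wanting to check the sums, here is the arithmetic as a short Python sketch. It only uses the prices quoted above; the variable names are made up, and the "in credit" comparison assumes the only up-front costs are the £159.84 line rental saver on one side and the £96 installation on the other.

```python
# Prices quoted in this post, in pounds.
bt_line_rental_year = 159.84
ukfsn_month = 23.70
falconnet_month = 22.00 + 9.50  # FTTC plus POTS
falconnet_install = 96.00
falconnet_call_rate = 0.0114    # 1.14p per minute

# Current setup: annual line rental spread over 12 months, plus ADSL.
current_month = bt_line_rental_year / 12 + ukfsn_month

# FalconNet with the install fee spread over 18 months.
amortised_18 = falconnet_month + falconnet_install / 18

# Number of months for which the cumulative FalconNet spend stays
# below the cumulative spend on the current setup.
months_in_credit = max(
    n for n in range(1, 13)
    if falconnet_install + falconnet_month * n
    < bt_line_rental_year + ukfsn_month * n
)

# Monthly saving once the install fee is out of the way, as call time.
saving_month = current_month - falconnet_month
call_hours = saving_month / falconnet_call_rate / 60

print(f"current: {current_month:.2f}/month")
print(f"FalconNet over 18 months: {amortised_18:.2f}/month")
print(f"months in credit: {months_in_credit}")
print(f"saving buys about {call_hours:.1f} hours of calls a month")
```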

Seems like a no brainer.