Thursday, 25 July 2013

'Light touch' police and crime panels must shift scrutiny powers up a gear

Yesterday, The Guardian Public Leaders Network published my article about how Police & Crime Panels need to shift up a gear or three in their scrutiny of PCCs, especially their decisions about scarce resource allocation. You can read the whole article here.

In the article I say:
The question is, how well the police and crime panels (PCPs) will scrutinise these resource decisions over the coming months. Chief constables will be using all their considerable skills to ensure good and professional decisions are made about the operational deployment of tight police resources but they will be subject to the policy influence of their PCCs – and that influence needs to be carefully unpicked by the PCPs...

So how is your PCP doing..? 

Tuesday, 16 July 2013

The politics of stats, trends & probability

I am no statistical expert and it is many years since I studied the application of Student's t-test, two-tailed hypothesis tests and correlation coefficients to psychological experiments. But I have retained just about enough of what I learnt then and since (about targets, trend analysis and statistical process control) to be pretty darn fed up with how most politicians and members of the media treat data.

The ridiculous way in which the media and certain sections of the government are treating the results of Dr Bruce Keogh’s investigation into hospital care is an object lesson in how complex data sets are twisted into political rhetoric. Some of this is clearly about politics and for that I can almost forgive them. But when it comes to the proportion of this twisting that is down to plain ordinary ignorance, I really can’t!

(And if you want to know what I mean about the Keogh report, read this blog post and please read it carefully.)

We entrust politicians with a huge amount of power which they wield on our behalf. They spend vast amounts of money, our money, on projects which the evidence shows (if they looked closely) were never going to work. It is time this ended. All politicians ought to have a good introduction to stats, trends and probability (and scientific methods, while they are about it) so that they are better able to make decisions that will actually make a real difference.

Now I am not saying that all politicians and members of the media are ignorant of such matters, but many are. The information on Payment by Results that I have uncovered in the last few days disturbs me. In fact it horrifies me that we may well be paying service providers for results that could simply be chance results rather than the robust outcomes of a better service.

I do not intend to make this blog post a long lesson in stats. Other people can do that far better than me. But here is just one idea: if you were to throw a dice a couple of dozen times, you would expect the numbers to come up in reasonably even quantities. Perhaps the four might have come up 5 times and the three just once. But you would not presume the dice was loaded. However, if you threw the dice a hundred times and the three only came up (say) 5 times and the four came up (say) 26 times you would begin to think something odd was happening.

Stats is simply about measuring when the threshold between chance and a real variation (a loaded dice in this example) is crossed. Without statistics, you cannot know whether an occurrence of (say) a fall in teenage pregnancies is just random chance or whether something ‘significant’ has happened (i.e. it is NOT chance, or at least very unlikely to be).
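
The dice example above is exactly what a goodness-of-fit test formalises. Here is a minimal sketch in Python; the counts are invented to match the illustration, and 11.07 is the standard chi-square critical value for 5 degrees of freedom at the 5% significance level:

```python
def chi_square(observed):
    """Pearson's chi-square statistic against a fair-die expectation."""
    n = sum(observed)
    expected = n / len(observed)  # a fair die spreads throws evenly
    return sum((o - expected) ** 2 / expected for o in observed)

# Critical value for 5 degrees of freedom at the 5% significance level.
CRITICAL_5DF_5PC = 11.07

few_throws = [4, 4, 1, 5, 4, 6]        # 24 throws: the three once, the four five times
many_throws = [17, 17, 5, 26, 17, 18]  # 100 throws: the three 5 times, the four 26 times

for counts in (few_throws, many_throws):
    stat = chi_square(counts)
    verdict = "looks loaded" if stat > CRITICAL_5DF_5PC else "consistent with chance"
    print(f"{sum(counts)} throws: chi-square = {stat:.2f} -> {verdict}")
```

With two dozen throws the lopsided counts stay comfortably below the threshold; with a hundred throws a similar pattern crosses it. That is the whole point: the more data you have, the smaller the effect you can distinguish from chance.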

I would hope that most people reading this get it. But do most politicians? And journalists? What do you think?

And I won’t even start to talk about systems theory, the role of blame and the fact that complex things really are complex!! (I will leave that for another blog post.)

But please… please can we have less of the ignorance around trends, evidence and chance occurrences and a bit more understanding that ‘wicked’ social (and medical) problems require some pretty darn ‘wicked’ solutions…

Friday, 12 July 2013

Tilting at lampposts

Yesterday I received a response to my inquiries concerning the pilot data being used by the MoJ to support the extension of Payment by Results. I have reprinted the letter I received in the blog post below.

This blog post contains some commentary on the replies I received.
  • The MoJ used Section 22 of the Freedom of Information Act to avoid answering two of my questions. They say that the information I was seeking is due to be published on 25 July. I am prepared to wait until then before deciding whether or not to appeal their reply.
  • I asked why they published the results early. They said they did this "to ensure the information was made public as soon as it was available", despite admitting in other answers that it was incomplete. I suspect they wanted to get in some 'good news' before the summer recess and in advance of CSR negotiations. But in my view, it looks shoddy and is evidence of using data and stats for political purposes. However, I guess all governments do that... don't they?
  • Their answers do seem to assert that they have sought to compare like with like in terms of cohort comparisons.
  • Their answer to question 4 does evidence the fact that this data is incomplete and premature, in my view.
  • They originally said that a key difference between the cohorts is that in this group "reconvictions only count offences for which the offender was convicted at court, whereas the National Statistics proven re-offending measure also includes out of court disposals (cautions)”. I asked what the impact of that difference was likely to be. They referred me to "Table B3 of annex B from the MoJ’s proven re-offending statistics quarterly bulletin: https://www.gov.uk/government/publications/proven-re-offending--2". I have looked at this table and it is not entirely clear, so I think I am going to have to go back to them and seek further clarification. But do note that they said "We have not produced alternative interim figures on what the impact would be if different rules (such as including cautions) had applied". Which seems a bit sloppy to me. This is a critical difference after all, and I suspect that if the data were not showing in favour of the pilot providers, they would be seeking further clarification!
  • I asked whether the comparison groups (to evidence that the pilot intervention was in fact working) were selected using some kind of randomised selection. They said "The control group will be selected by an Independent Assessor using Propensity Score Matching (PSM), the methodology for which has been published at: Peterborough Social Impact Bond: an independent ... - Gov.uk". So the answer is 'NO': comparator groups will be selected by an 'independent' assessor (being paid by the government, I assume). I looked at the reference document and here is a quote from it: "It should be noted that, unlike random control allocation, PSM cannot take account of unmeasured differences which may account for variation in reconviction aside from ‘treatment received’". Uh huh. But it goes on to assert that: "However, PSM [propensity score matching] is widely regarded as one of the best ways of matching quasi-experimentally (Rosenbaum, 2002), and it has been increasingly used in a criminological context (e.g. Wermink et al., 2010)." So that is alright then. Excuse me while I give you this new medicine that has been quasi-experimentally tested on people who are sort of similar to you...
  • I asked "For Doncaster, success “will be determined by comparison with the reconviction rate in the baseline year of 2009”. How will this accommodate national and/or local trends in (say) sentencing practice or levels of crime?". They replied "The five percentage point reduction target was agreed after analysis of historic reconviction rates established that this would illustrate a demonstrable difference which could be attributed to the new system and not just natural variation." That is not an answer to my question, so I will need to go back to them on this.
  • I asked about the 6 versus 12 month comparison and how the headline data (based on six months) was going to look against the usual (12 month) data. They said in reply "The statistical notice made clear the limitations of the information presented and the care that should be taken in interpreting these interim figures." Remind me - was that subtlety in the press releases that went out when this interim data was released...?
  • They missed the point completely on my question about seasonality...
  • Please read their answer to my question about why 19 month data. Please let me know what you think. I am thinking 'wool', 'eyes' and 'what do you really mean?!'
  • The maths question is funny. They said "The figures presented were the rounded versions of the actual figures, which were 68.53 and 79.29". So I have done the calculation again with those figures and this time I get 15.7%. So they are sort of correct - but why round that up to 16% rather than simply report the decimal?
  • I asked about statistical significance (the test of whether a difference is just a chance difference or one that indicates a real effect is in play). This is what they said "We have not carried out statistical significance tests on the interim figures because, when it comes to the final results, neither pilot will be assessed on the basis of whether they have achieved a statistically significant change."
OK. Let me repeat that in big and bold:
We have not carried out statistical significance tests on the interim figures because, when it comes to the final results, neither pilot will be assessed on the basis of whether they have achieved a statistically significant change.
So, Payment by Results could well be based upon purely random chance events that may just have happened.
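
To illustrate the sort of check they say they have not run: question 19 below quotes a fall at Peterborough from 41.6% to 39.2%. Treating those figures as simple proportions (a simplification, since the published figures count re-conviction events per offender) and assuming, purely hypothetically, around 850 offenders in each group, a standard two-proportion z-test can be sketched like this:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the normal CDF, via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

z, p = two_proportion_z(0.416, 850, 0.392, 850)
print(f"z = {z:.2f}, p-value = {p:.2f}")
```

On these assumed numbers the p-value comes out at roughly 0.3, nowhere near the conventional 0.05 threshold. In other words, a drop of that size in cohorts of that size could very easily be chance, which is rather the point.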

Is that a solid basis for the distribution of taxpayers' money?
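
For what it is worth, the Propensity Score Matching that the MoJ's answer to question 9 relies on is a real (if second-best) technique. This toy sketch, on entirely invented data with a single made-up covariate, shows the two steps: estimate each offender's probability of being in the pilot, then pair each pilot offender with the comparator whose estimated probability is nearest. Note the comment on what it cannot do:

```python
import math
import random

random.seed(1)

def sigmoid(t):
    return 1 / (1 + math.exp(-t))

# Invented offenders: one covariate (a risk score) drives selection into
# the "pilot" prison -- the bias that matching tries to correct for.
# Crucially, PSM can only balance covariates we have MEASURED; any
# unmeasured difference between the groups survives the matching.
population = [{"risk": random.gauss(50, 10)} for _ in range(400)]
for person in population:
    person["pilot"] = random.random() < sigmoid((person["risk"] - 60) / 5)

# Step 1: estimate the propensity score P(pilot | risk) with a small
# logistic regression fitted by batch gradient ascent.
xs = [(p["risk"] - 50) / 10 for p in population]  # standardised covariate
ys = [1.0 if p["pilot"] else 0.0 for p in population]
w0 = w1 = 0.0
for _ in range(1500):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = y - sigmoid(w0 + w1 * x)
        g0 += err
        g1 += err * x
    w0 += 0.5 * g0 / len(xs)
    w1 += 0.5 * g1 / len(xs)
for person, x in zip(population, xs):
    person["score"] = sigmoid(w0 + w1 * x)

# Step 2: greedy 1-nearest-neighbour matching on the score, without
# replacement, starting from the hardest-to-match pilot offenders.
pilot = [p for p in population if p["pilot"]]
pool = [p for p in population if not p["pilot"]]
matches = []
for t in sorted(pilot, key=lambda p: -p["score"]):
    best = min(pool, key=lambda c: abs(c["score"] - t["score"]))
    pool.remove(best)
    matches.append((t, best))

mean = lambda vals: sum(vals) / len(vals)
print("pilot mean risk:        ", round(mean([t["risk"] for t, _ in matches]), 1))
print("matched comparator risk:", round(mean([c["risk"] for _, c in matches]), 1))
print("rest of comparator pool:", round(mean([c["risk"] for c in pool]), 1))
```

The matched comparators end up looking much more like the pilot group than the comparator pool as a whole does, which is the whole sales pitch for PSM. The catch, as the published methodology itself concedes, is that this only works for the covariates you thought to measure.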

Payment by Results, lamp posts... lit?

The day after #tagginggate you would expect me to be somewhat sceptical about how well government manages complex contracts with external suppliers. Moreover, of course, questions remain about how well the external suppliers manage these contracts too! But that is for another blog post one day.

But meanwhile, I received a reply to my questions about the Payment by Results pilots (and if you thought tagging contracts were complex...!). Below I have reprinted in full the reply I have received from the relevant person in the Ministry of Justice. It is already quite a long piece, so I will leave my commentary to another posting. Please read what they have to say critically - you will then be able to see whether your thoughts match, contradict or add to my interpretations.

Dear Mr Harvey,

Thank you for your email of 13th June 2013, in which you asked for the following information from the Ministry of Justice (MoJ):

(I have left out their repetition of the questions - as they are shown below anyway)

I can confirm that the department holds information that you have asked for, however, please be aware that questions 15 and 16 of your request [these are the questions in question: 15. Given that you must have the data for Peterborough for the missing 19 month period (September 08 to March 11), and acknowledging that this overlaps with the pilot beginning, please could I have this data nonetheless.
16. Likewise, please could I have the data for the quarter beginning April 2012] have been handled under the Freedom of Information Act 2000 (FOIA) and the remaining questions have been dealt with as normal business.  

Section 84 of the Act states that in order for a request for information to be handled as a Freedom of Information request, it must be for recorded information. For example, a Freedom of Information request would be for a copy of an HR policy, rather than an explanation as to why we have that policy in place. 

Following our assessment of your correspondence we believe that questions 1-14 and 17-21 relate to general questions and not recorded information.

The responses are as follows:

Questions 15 and 16 – Dealt with under the FOIA

I can confirm that the department holds information that you have asked for, but it is exempt from disclosure because it is intended for future publication.

We are not obliged to provide information that is intended for future publication (section 22 of the Act). In line with the terms of this exemption in the Freedom of Information Act, we have considered whether it would be in the public interest for us to provide you with the information ahead of publication, despite the exemption being applicable. In this case, I have concluded that the public interest favours withholding the information.

You can find out more about Section 22 by reading the extract from the Act and some guidance points we consider when applying this exemption, attached at the end of this letter.

You can also find more information by reading the full text of the Act, available at http://www.legislation.gov.uk/ukpga/2000/36/section/22.

When assessing whether or not it was in the public interest to disclose the information to you, we took into account the following factors:

Public interest considerations favouring disclosure
There are public arguments in favour of disclosure of this information at the present time.  Disclosure would for example improve transparency in the operations of Government, and of the justice system in particular.

Public interest considerations favouring withholding the information
There are public interest arguments against disclosure of this information at the present time.  These arguments include that it is in the public interest to adhere to the existing publication process for official statistics, which includes time for the data to be collated and properly verified.

It is also in the public interest to ensure that the publication of official information is a properly planned and managed process, to ensure that data are accurate once it is placed into the public domain.  It is also in the public interest to ensure that the information is available to all members of the public at the same time, and premature publication could undermine the principle of making the information available to all at the same time through the official publication process.

We reached the view that, on balance, the public interest is better served by withholding this information under Section 22 of the Act at this time.

You may be interested to know that this information is due to be published in the MoJ’s Proven Re-offending Statistics Quarterly bulletin on 25th July 2013 at the following link: 

https://www.gov.uk/government/organisations/ministry-of-justice/series/reoffending-statistics

[It seems reasonable to wait until 25/7/13 to decide whether I will appeal this decision or not.]

Questions 1-14 and 17-21 – Dealt with as normal business

As mentioned above, we have dealt with these questions under the provision of normal business.

1.  The pilots began on 9 September 2010 and the 1 October 2011 (Peterborough and Doncaster respectively.) Please can you qualify “began”?

This means that each pilot included eligible offenders (as defined in Table A1, Annex A of the statistical notice) discharged from the pilot prison from these dates onwards. 

2. Given that “the next Proven Reoffending Statistics quarterly bulletin will not be published until 25 July 2013”, why did you publish your results today rather than a few weeks from now?

As set out in the publication, rather than wait until 25 July, the results were published in this ad-hoc bulletin to ensure the information was made public as soon as it was available. In accordance with the Official Statistics Code of Practice the publication date was pre-announced by MoJ statisticians in May 2013. The next Proven Re-offending Statistics quarterly bulletin on 25 July will contain updated interim figures for the pilots, with quarterly updates thereafter.  

3. I understand that “the interim re-conviction figures being published in this statistical bulletin are based on periods half the length of those that will be used for the final results” – daft question I am sure, but presumably this applies to both the ‘experimental’ subject averages and the national comparators?

Yes, the interim re-conviction figures presented in this publication have been produced in exactly the same way for each pilot prison and its national comparator.

4. You say that these “interim 6 month re-conviction figures are available for almost all of Peterborough cohort 1 (around 850 offenders) and half of Doncaster cohort 1 (around 700 offenders)”, please can you explain what has happened to the other portions of the cohorts and why they are not included?

The interim figures have been provided for as much of each cohort as possible, but at this stage they do not include all offenders in cohort 1 of either pilot. This is because some offenders were released from prison too recently to be measured on this basis (the 6 month re-offending window and 3 month waiting period have not yet elapsed). However, they will be included in the interim figures in future as soon as enough time has elapsed to allow us to measure them on a consistent basis.

5. In terms of methodology, you say “offenders enter the PbR pilots after their first eligible release from the prison within the cohort period”, please can you explain “eligible” in this context and whether the national comparator figures also cover the same “eligible” group?

Not all offenders released from the pilot prisons are eligible for the pilots. The Peterborough pilot for example, only includes adult males released from a custodial sentence of less than 12 months, so a prisoner released from a sentence of 2 years would not be eligible.  For each pilot, the national comparator figures have been produced on the same basis using the same eligibility criteria. More details on eligibility are available in Table A1, Annex A of the interim re-conviction figures publication:

https://www.gov.uk/government/publications/interim-re-conviction-figures-for-the-peterborough-and-doncaster-payment-by-results-pilots

6. You explain that the key difference is that “reconvictions only count offences for which the offender was convicted at court, whereas the National Statistics proven re-offending measure also includes out of court disposals (cautions)” and “Additionally, there are a number of other differences between the pilots and the

7. National Statistics proven re-offending measure in terms of which offenders are counted within the cohort”. Are you able to say what difference these differences might make to the figures? For example, what number of offenders per hundred are usually subject to a caution (or similar disposal) as opposed to a court conviction?

We have not produced alternative interim figures on what the impact would be if different rules (such as including cautions) had applied. However for information on the effect cautions have on re-offending, please see Table B3 of annex B from the MoJ’s proven re-offending statistics quarterly bulletin:

https://www.gov.uk/government/publications/proven-re-offending--2

8. Again I assume that given that the “Peterborough pilot includes offenders released from custodial sentences of less than 12 months, whereas the Doncaster pilot includes all offenders released from custody regardless of sentence length”, the national comparisons are on a like for like basis?

Yes, the figures for the national comparisons are calculated on the same basis as their respective pilots.

9. You explain that the “success of each Peterborough cohort will be determined by comparison with a control group (of comparable offenders from across the country)”. How will this ‘control’ group be selected to ensure there is no inadvertent or unknown bias? Indeed was there (will there be) any form of randomised control trial element to either of these two trials (and extensions)? If not, what is your considered professional judgement as a statistician as to the validity of these results to guide future practice?

The control group will be selected by an Independent Assessor using Propensity Score Matching (PSM), the methodology for which has been published at:

Peterborough Social Impact Bond: an independent ... - Gov.uk

10. For Doncaster, success “will be determined by comparison with the reconviction rate in the baseline year of 2009”. How will this accommodate national and/or local trends in (say) sentencing practice or levels of crime?

The five percentage point reduction target was agreed after analysis of historic reconviction rates established that this would illustrate a demonstrable difference which could be attributed to the new system and not just natural variation.

11. Given that normally reconviction rates are measured on a 12 month basis and these interim results are measured on a 6 month one, how much is that likely (based on past data) to have depressed the reconviction rates?

The figures presented are our best assessment of change in re-conviction figures at this time and have been provided as an early indication of each pilot’s progress. It is not possible to say at this stage what the final 12 month re-conviction figures will be, though naturally the final 12 month re-conviction figures will be higher than the interim 6 month figures simply because offenders will have had more time in which to commit offences.  The statistical notice made clear the limitations of the information presented and the care that should be taken in interpreting these interim figures. 

12. You say “Whereas in this publication, to eliminate the risk of seasonality and enable a consistent comparison over time, all figures relate to offenders released in the 6 month period from October to March”. I may well be missing something here, but by only using the six winter months, are you not likely to increase the risk of a seasonal effect in the data? Please explain further. 

We would be risking a seasonal effect if we took the 6 winter months for the pilot period and compared them to a different period in other years. For example if we had compared October 11 to March 12 with January to June 2009, it would be possible that any changes were simply the result of seasonal effects rather than a real change in re-offending. Whereas by only comparing the Oct-Mar pilot period with other Oct-Mar periods, we are comparing like with like and have therefore eliminated the risk of seasonality.

13. Given that the Peterborough cohort finished on 1/7/12, and allowing for the 6 months plus 3 (for court delays), this takes us up to March 2013. So on this basis, why have the last three months of data (April, May and June 2012) been excluded? (As far as I can see there is no explanation of this decision, but forgive me if I have overlooked it.)

Before releasing official statistics the information needs to be collated, processed and quality assured. Re-conviction data for offenders discharged in April, May and June for the Peterborough pilot had not been fully collated, processed and quality assured in time for this publication. However, re-conviction figures for the full Peterborough cohort (including all releases up to the end of June 2012) will be included in the next quarterly update to be published in July. 

14. Given that I assume that data is ordinarily collected on a quarterly basis, it would have been helpful to have presented your data in a similar way so that trends could be spotted over time rather than use the fairly arbitrary 19 month period to show the data. Why did you present it this way? Please could I have the data on a quarterly basis.

The 19 month period was chosen as this shows figures for as much of the cohort as possible as explained in the statistical notice. It is not an arbitrary cut off, but simply the period of the cohort for which we were able to provide interim re-conviction figures. 

The interim figures were published as soon as the MoJ Chief Statistician judged that we were in a position to produce statistically robust interim re-conviction figures, meaning that the number of offenders being reported on was a large enough sample for each pilot.  We have not produced any figures based on quarterly cohorts because the numbers involved would be too small to give statistically robust information. 

Additionally, reporting on the cohort by quarter would not show a like for like comparison across each quarter, and would therefore be more likely to confuse than to provide meaningful information. The reason for this is that offenders join the cohort after their first eligible discharge within the period. However some offenders will be released from the prison more than once within the cohort period. These more prolific offenders (who are more likely to re-offend) would therefore be more likely to appear in earlier quarters than later quarters.

15. Given that you must have the data for Peterborough for the missing 19 month period (September 08 to March 11), and acknowledging that this overlaps with the pilot beginning, please could I have this data nonetheless.

See earlier response.

16. Likewise, please could I have the data for the quarter beginning April 2012.

See earlier response.

17. You say “Nationally the equivalent figures show a rise of 16% from 69 to 79 re-conviction events per 100 offenders”. How do you get 16%? I can see a rise of 10 ‘points’ or a rise of (10/69*100) 14.5%. 

The figures presented were the rounded versions of the actual figures, which were 68.53 and 79.29. 

18. (As an aside, this is quite a large rise nationally in re-conviction rates comparing the period from just before the last election to period after. Have national rates continued to rise or have they levelled off now?)

Re-offending rates for all adult offenders have barely changed in a decade. Please see the quarterly re-offending bulletin for information on national re-offending levels.

https://www.gov.uk/government/publications/proven-re-offending--2

19. You say “these interim figures show a fall in the frequency of re-conviction events at Peterborough”, which is a drop from 41.6% to 39.2%. At what threshold of probability is this statistically significant?

We have not carried out statistical significance tests on the interim figures because, when it comes to the final results, neither pilot will be assessed on the basis of whether they have achieved a statistically significant change.  Peterborough will be assessed by comparison with a national matched control group using a PSM methodology. Doncaster will be assessed against a baseline of calendar year 2009.  

20. Please can you confirm that the OGRS scores cited relate to the cohort groups in both Peterborough and Doncaster (rather than all offenders who were released)?

The OGRS scores relate to the offenders within each cohort.

21. Why do the national re-conviction scores given next to the Doncaster data (which average 32.9%) differ from the scores given next to the Peterborough data (average 37.9%)? I know the period is different and there is some missing data, but this still seems like a large difference…

The criteria used to create the national comparator figures for the Peterborough and Doncaster prisons are different because the 2 pilots have different criteria. For example, the national figures for the Peterborough comparison will only include adult males released from custodial sentences of less than 12 months, whereas the Doncaster comparison includes all prisoners released from custody regardless of sentence length. For more information on the differences between the two pilots please see Table A1, Annex A of the interim re-conviction figures statistical notice.

Generally, re-conviction rates are higher for offenders released from custodial sentences of less than 12 months than for all offenders released from prison.  Hence the national comparator group for Peterborough have higher re-conviction rates than the national comparator group for Doncaster.

You have the right to appeal our decision if you think it is incorrect. Details can be found in the ‘How to Appeal’ section attached at the end of this letter.

I will stop there - the rest is pretty standard boilerplate about how to appeal etc. I will say thank you to the Justice Statistics Analytical Services (who signed the letter) for their work in responding to my challenges.

So what do you think about these answers to my questions? 
What would you comment upon?