Friday, 23 August 2013

The lamp posts are not for bending....

In my ongoing explorations of the statistics and rationale underlying the Government's resolute commitment to Payment by Results (PbR), I asked them a series of further questions after their responses to my last set. (These two blog posts show their answers and my queries). Here is a copy of the letter I received yesterday (my questions in bold, their answers in italics):

Many thanks for your email & attached response to my questions. I note what you say about using section 22 not to answer my questions 15 & 16. I will await publication of the new batch of statistics on 25/7/13 naturally. I reserve the right to appeal against your decision subject to this information being able to answer my questions.

With regard to some of your other answers, I would seek further clarification as follows:

a)      With regard to Q2, you have not answered my question. You have merely reported on a series of facts and not explained the reasoning behind the decision to produce the statistics early. Please, may I request again, for my question to be answered fully. If that is difficult, I am happy to submit an FoI inquiry requesting all the email correspondence between senior civil servants and the Minister which led up to the early publication of this data. Which would you prefer?

As set out in the publication, we published the figures in an ad hoc bulletin on 13 June, rather than waiting to publish them in the Proven Re-offending statistics bulletin on 25 July, to ensure the information was made public as soon as it was available. This is in accordance with the Code of Practice for Official Statistics, which requires us to “release statistical reports as soon as they are judged ready, so that there is no opportunity or perception of opportunity, for the release to be withheld or delayed”.

Once the MoJ Chief Statistician had judged that we were in a position to publish statistically robust interim re-conviction figures, we published them at the earliest opportunity.

b)     In your answer to 6/7 (you correctly identified that this is one question where a rogue carriage return had crept in) you referred me to “Table B3 of annex B from the MoJ’s proven re-offending statistics quarterly bulletin”. I looked at this table carefully but I could not see how it answered my query about the extent of the difference between national stats and the pilot’s stats. Please could you be more precise and show me more clearly how this factor is likely to affect comparisons. Thank you.

The two middle columns of table B3 show (for offenders released from prison or starting court orders) re-offending figures including and excluding cautions. 

Looking at the top section headed ‘Proportion’, column 2 (“Previous measure: re-convictions (prison and probation offenders only), whole year”) is the re-conviction rate excluding cautions – i.e. the Doncaster measure. Column 3 (“New measure: re-offending (prison and probation offenders only), whole year”) is the re-offending rate, including cautions – i.e. the National Statistics measure. This shows, for example, for offenders discharged from prison or starting a court order in 2009 the proportion re-convicted was 34.7 per cent (column 2) but when we also count offences that receive a caution the proportion increases to 36.2 per cent (column 3). The difference over time is small - between 1 and 2 percentage points.

As noted in our previous response, we have not produced alternative interim figures on what the impact would be if different rules (such as including cautions) had applied to the pilots. However, the figures in table B3 show the impact at a national level of including/excluding cautions.

c)     Your answer to Q9 confirms, I think, that there is no element of randomisation in the selection of the ‘control’ comparator groups. As a scientist I find this most disturbing and I am not sure about you, but I don’t think I would be prepared to undergo a course of medical treatment that had gone through a “quasi-experiment”. As PbR spreads (as I assume the Government wants), it will become increasingly difficult to find comparator groups on this basis. Moreover, I cannot see, no matter how independent the people choosing the comparator groups are, that this process will control for hidden factors. As a consequence, I do not think that you have yet answered my query “what is your considered professional judgement as a statistician as to the validity of these results to guide future practice?” I look forward to your thoughts. Thanks

The Payment by Results pilots were set up to test a range of approaches to achieving reductions in re-offending through paying by results, and different pilots use different payment mechanism designs. 

Propensity Score Matching (PSM) is a well established statistical method for creating a control group when it is not possible to carry out a randomised control trial (as it is not in this case). As set out in our previous response, the control group will be selected, using the published PSM methodology, by an Independent Assessor.

The Ministry of Justice’s  consultation response, Transforming Rehabilitation: a Strategy for Reform, described how, under our proposals, to be fully rewarded, providers will need to achieve both an agreed reduction in the number of offenders who go on to commit further offences, and a reduction in the number of further offences committed by the cohort of offenders for which they are responsible.

The consultation response stated that we would discuss the final details of the payment mechanism with practitioners and potential providers. To support this engagement, we have since published a Payment Mechanism Straw Man- available at 

While the final design of the payment mechanism is still to be determined, the model set out in the straw man discusses setting a baseline for all reduction in re-offending targets for each Contract Package Area on the basis of average quarterly re-offending figures for the most recent year that data is available. 
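For readers unfamiliar with the Propensity Score Matching mentioned in the answer above, here is a minimal sketch of the matching step only. It is not the MoJ's published methodology: the function name, the caliper value and all the scores are hypothetical, and it assumes a propensity score has already been estimated for every offender.

```python
def match_controls(treated, pool, caliper=0.05):
    """Greedy 1-to-1 nearest-neighbour matching on propensity score.

    treated, pool: dicts mapping an offender id to a propensity score
    (the estimated probability of being in the pilot group, given
    recorded background characteristics).
    Returns {treated_id: control_id} for pairs within the caliper.
    """
    available = dict(pool)
    matches = {}
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            matches[t_id] = c_id
            del available[c_id]  # match without replacement
    return matches


# Hypothetical scores, invented purely for illustration.
pilot = {"P1": 0.62, "P2": 0.35, "P3": 0.90}
others = {"C1": 0.33, "C2": 0.60, "C3": 0.58, "C4": 0.10}

matches = match_controls(pilot, others)
print(matches)
```

In this made-up example P3 is left unmatched because no control falls within the caliper. The scores are built only from characteristics that were recorded, so matching of this kind balances observed factors and cannot by itself rule out the hidden factors raised in question c).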

d)    In answer to Q10 you say “The five percentage point reduction target was agreed after analysis of historic reconviction rates established that this would illustrate a demonstrable difference which could be attributed to the new system and not just natural variation” (with my added highlight). However later on you also say “we have not carried out statistical significance tests on the interim figures because, when it comes to the final results, neither pilot will be assessed on the basis of whether they have achieved a statistically significant change”. How can these two statements be compatible? Forgive me, but it seems to me you are using significant differences when it suits you and not when it does not…? Please justify this approach.
e)     Moreover, given this last statement, may I confirm that taxpayers’ money may well be doled out to the suppliers on what could be a random happenchance difference in results rather than one which is (say) beyond a standard 5% statistical threshold of significance? I am interested in your views here too.

I can confirm that testing was used in the design of the Payment by Results pilots at both Peterborough and Doncaster, to ensure that the minimum targets for outcome-based payments in each pilot are set at such a level that we can be confident that to achieve them a provider must achieve an improvement which is attributable to their interventions and not just natural variation.  Because this significance testing is built in at the target setting stage, there is then no need to conduct tests for significance again once outcomes are calculated; instead, outcomes can be judged on whether or not they exceed the targets. The benefit of carrying out the statistical significance testing prior to the start of the pilot rather than at the end is that the ‘goal posts’ can then be set and known by all parties at the outset. In addition, because these targets are set in terms of the final 12-month re-offending measure it is not helpful to carry out statistical significance tests on the interim figures, which measure re-offending over just 6 months and, in the case of the figures for the Doncaster pilot, have smaller offender cohorts than the final measure. 
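To make the point about significance concrete, here is a minimal sketch of a standard two-proportion z-test applied to a five percentage point gap in re-conviction rates. It is not the MoJ's analysis: the function name and every count below are hypothetical, chosen only to show how the same gap behaves at different cohort sizes.

```python
from math import erf, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions.

    x1 of n1 offenders re-convicted in one group, x2 of n2 in the other.
    Returns (z, p_value) using the pooled normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: 58% versus 63% re-convicted in both scenarios.
z_small, p_small = two_proportion_z_test(290, 500, 315, 500)
z_large, p_large = two_proportion_z_test(2900, 5000, 3150, 5000)

print(f"cohorts of 500:  z = {z_small:.2f}, p = {p_small:.3f}")
print(f"cohorts of 5000: z = {z_large:.2f}, p = {p_large:.2g}")
```

With these invented numbers the five point gap does not clear a 5% threshold for cohorts of 500 but does for cohorts of 5,000. So whether a fixed target sits outside natural variation depends on the cohort size assumed when the target was set.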

f)       Your answer to my question 12 surprised me. It is well known, I thought, that certain crimes, such as burglary, rise in the winter due to the darker evenings etc. Whilst I recognise that you are comparing ‘like with like’, that does not exclude a seasonal effect; it could merely exacerbate one, since your time sample is not across the whole year. Why not provide the summer six monthly data as well?

Using Doncaster as an example, we are not saying that the re-conviction rate for the Oct-Mar 6 months will necessarily match the re-conviction rate for the Apr-Sep 6 months. In fact, because of seasonality it is more likely that they will differ, as you say. Therefore, because we want to compare re-conviction rates over time, we must use the same period for the comparison in each year – that is comparing the various Oct-Mar periods over time. If instead we compared the pilot period of Oct11-Mar12 with say Jan09-Jun09, any difference could reflect a real change, but it could also simply reflect seasonal effects. By comparing the equivalent period in each year, we eliminate this risk of seasonality. 

g) I hear what you say about the 19 month period but it really does look shady! Why not 18 months? Why not 6 months? Hopefully the overall data will clear all this up.

We process and analyse re-offending data on a quarterly basis. For the interim figures released on 13 June, the latest quarter for which we could provide 6 month re-conviction figures was the quarter ending March 2012. The Peterborough pilot started in September 2010 (partway through a quarter), which meant we were able to report on a maximum of 19 months of the first Peterborough cohort period. We could have chosen to round this down to a more conventional 18 months but we took the decision that we should include as much of the data as possible to maximise the robustness of the figures. The Doncaster pilot began in October 2011, at the start of a quarter, meaning we reported on a more conventional looking 6 month period.

It is all getting rather convoluted (which is one of the problems I have with PbR in that payments will steadily become more and more like arguments about how many angels can fit on a pin head). However, there are some points I will be raising from all this... (for another day)

But what are your thoughts? What questions now need to be asked?

Meanwhile, if you have not read it, here is my blog post about the next batch of results that were published a couple of weeks ago.

Wednesday, 21 August 2013

Technology & the Queen's Peace: a survey

As preparation for a conference in the spring of next year, I want to carry out a survey. I need your help. I would be most grateful if you could answer the questions below. There are just three and you don't even have to answer all three if you do not want to.

You can post your answers below anonymously or using your name. Or you can email me, tweet me (either @JonSHarvey or @CllrJonSHarvey) or text or phone me (details here). I don't mind how.

In return, I promise to publish the survey results here as soon as I have a substantive pool of replies to make that worthwhile.

Now to the questions:
1) What existing technology (that you have come across) would, with substantial investment, be an extraordinarily cost effective way of improving community safety / crime reduction?
2) What possible technology (that is just beyond what we currently have) would, with substantial investment, be an extraordinarily cost effective way of improving community safety / crime reduction?
3) What 'sci fi' technology (that is well beyond what we currently have) would, with substantial investment, be an extraordinarily cost effective way of improving community safety / crime reduction?
You may well have more than one answer to each of these questions, but please try to restrict yourself to the best one, in your opinion, for each.

I look forward to reading your answers. Please spread this around to others who you think might also wish to participate.

Thank you!

Monday, 19 August 2013

More tilting at lamp posts...

Thanks to a heads up from Kyle McKay I see that the MoJ have now published an update on the Payment by Results pilots in Doncaster and Peterborough. You can access it here. (My previous blog posts can be accessed here.)

Proving the PbR pilots have worked is still a long way off, it would seem to me: the MoJ concedes they do need full 12 month post release data in order to do a full blown comparison ("final results will not be available until 2014"). However, this does not prevent them from interpreting the results (in my view) creatively to show how PbR is working in these two areas.

However, I would point out the following:
  • They say the "interim re-conviction figures being published in this statistical bulletin are based on periods half the length of those that will be used for the final results". I say this is not just half: it is the first half of a 12 month period. Second halves are also a little harder...
  • How typical are Doncaster and Peterborough compared to the rest of the country? All the comparisons made are with national data. Given the news about Doncaster in recent months & years, it is hardly a standard place. Also Peterborough is a 'new town' and (according to Wikipedia) "Peterborough's population grew by 45.4% between 1971 and 1991" which I think makes it a somewhat unusual place. So are national comparisons really valid?
  • They say that "Both PbR prison pilots use a 12 month re-conviction measure which differs from the National Statistics proven re-offending measure. The key difference is that re-convictions only count offences for which the offender was convicted at court, whereas the National Statistics proven re-offending measure also includes out of court disposals (cautions)" and "Additionally, there are a number of other differences between the pilots and the National Statistics proven re-offending measure in terms of which offenders are counted within the cohort". That all seems pretty important to me... does it to you?
  • Indeed the whole document seems peppered with so many caveats, footnotes and explanations as to make me wonder just what we are being told.
  • They say "Success of the Peterborough pilot will be measured against a control group of similar offenders released from other prisons, with the target met if the frequency of re-conviction events is 10 per cent lower for the Peterborough cohort than for the control group. It is not possible to replicate that comparison for these interim figures". I say: why not? Why is the 'control group' not being monitored in a similar way? It does not make it much of a control group..!
  • They say "The national comparisons included with the previous interim figures published on 13 June 2013 included all prisons, not just local prisons. However, because Peterborough is a local prison, using national figures for other local prisons provides a better comparison". Huh? Why were the local prison data not used before? Are they just using whatever data seems to give the 'best' result?
  • It would appear that the frequency of re-conviction in Peterborough has been coming down since September 2007, whereas the national figures have been showing a rise over the same period. The Peterborough pilot began in October 2010. At the very least this shows that national comparisons are dodgy, since the trends were going in opposite ways before all the PbR pilots began. It also potentially shows that other significant forces are present in Peterborough that could be creating the positive trends other than the PbR pilots...
  • Similarly, the data appears to suggest that the Doncaster reconviction rates have been on a downward trend since October 2007: again, well before the pilots began.
  • (As an aside: the national data shows some very worrying trends: reconviction events per 100 offenders have gone from 66 to 84 between June 2007 and 2012. That is a rise of over 27%. That is a bit concerning, isn't it?)
  • But for me the biggest problem with this whole comparison approach is that a potentially huge Hawthorne Effect is not being controlled for. In other words, the mere presence and attention being given to the PbR pilots is what is creating any positive effect, not the pilots themselves. This is not, I repeat NOT, being controlled for. For me this calls into question the whole edifice on which these pilots are based.
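The rise in the national figures quoted in the aside above (66 to 84 reconviction events per 100 offenders) can be checked with a couple of lines of arithmetic. The variable names are just for illustration.

```python
# National reconviction events per 100 offenders, as quoted in the aside.
start, end = 66, 84

absolute_rise = end - start               # extra events per 100 offenders
relative_rise = absolute_rise / start * 100

print(f"{absolute_rise} more events per 100 offenders")
print(f"{relative_rise:.1f}% rise")
```

That is where the "over 27%" comes from: 18 more events on a starting figure of 66.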
Anyone with an ounce of independent thought will understand that, at the very least, these results are not the basis on which to build a whole reform of the offender management system. The comparisons are shaky and riven with cautions.

It is time to develop a better experimental framework.

British Remote & Online Police Service

I first came across the internet acronym 'irl' many years ago, not long after I began living some of my life online. Part of who I am, my identity, exists in the virtual world of the internet. There are large bits of me, as it were, held in binary code on a whole variety of servers. Some of this is current, some of this is historical. I have friends on the internet whom I will probably never meet in real life. We exchange greetings. I have been trolled and verbally abused. I have probably made thousands of financial transactions. Last week I initiated court action against a person online. Last night I watched ten minutes of a programme about One Direction's fan base and how they felt connected to their idols in ways that teenage fans of Donny Osmond and David Cassidy could only have dreamt of...

In recent weeks, there have been several tragic stories of young people being driven to suicide by abuse and threats via the internet. No doubt, although much less reported, many hundreds of people will have parted with thousands of pounds via various kinds of internet scams. Probably also a fair few (dozens perhaps?) credit cards will have been stolen, skimmed or cloned and used to pay for all manner of items from airline tickets to tube fares. Also, sadly, some more children will have been groomed and put in danger of abuse. I could go on.

But who polices this virtual world?

Who has the resources not only to tackle such crimes when they occur but also the resources to reduce the risk of such crimes in the future? Who has the ear of the internet industry be they service providers, web designers, cloud managers and all manner of commercial people who make the internet work, so that robust preventative action can be comprehensively taken? What is the internet equivalent of a car immobiliser?

Who is taking a joined up strategic view on all this?

And yes I know we have CEOP and Action Fraud, and probably other units that I do not know about but I do wonder whether we now need a single joined up virtual police service to assemble all these resources together into one centre of excellence? Just like we have the British Transport Police (which in my view ought to look after policing at all airports and sea ports too - but that is another blog post) why do we not have the British Online Police Service?

And to complete the picture as I suspect many of these crimes overlap, I have added in the idea of 'remote' crime which would include in my book rogue phone calls and mail order scams (etc.) which also cause huge distress.

And so I arrive at the idea of the British Remote & Online Police Service as a new legal entity, probably with new enforcement powers, a governance structure that includes the internet industry (similar to BTP) and clear partnership liaison with 'irl' police services and financial regulators. In these straitened times this will need some imaginative sources of funding (a broadband / junk mail tax perhaps?) to ensure it is adequately resourced.

Can I interest one of the major political parties in this idea in time for the next election? Or even sooner perhaps...?