Tuesday, April 7, 2009

Discredit HBOT, Take III

The nice people over at Left Brain Right Brain are at it again with yet another post about why the recent study on HBOT is all wrong.  The title of this one is:

I couldn't have put it better myself.  

This time they appear to have handed the torch, so to speak, to Prometheus, who normally writes on his own blog.  I have to say, this post is better written than the earlier ones, but it still falls short of the mark.

I think it is worth reiterating at this point what I have said before about this study.  It is a step in the right direction but the results are still preliminary and need to be verified by further studies.

So with that in mind, let's look at what the "problems" are this time:

The six centers that were involved in the study are small locations with primarily one doctor and make money from selling HBOT services.

Well, this one is true at least on its face.  The centers in the study are smaller locations that have HBOT equipment available (and it isn't cheap) - so surprise, surprise, they actually use it.

It would have been better if the centers involved in the study were all part of a large organization that happened to have HBOT equipment on hand and was willing to conduct a study with it.  However, this is an experimental treatment for autism - until there is a good amount of evidence that this is going to work then I don't think it is likely that large organizations are going to get involved.

Given that larger organizations aren't going to be involved (yet), that leaves smaller groups.  And if you are looking for smaller groups with HBOT equipment, they are likely to have it because, just maybe, they use it as part of their business.

Therefore it is not surprising that small centers would sell services using the equipment that they have purchased.

As to the one doctor point, look at the small centers in your area like MRI clinics.  How many doctors are on staff at those places?  A quick check around my area showed that one or two is normal.

[T]he decision to include one treatment subject who only completed nine sessions was curious. Why they included this subject and not any of the other three treatment subjects and three control subjects who also failed to complete the entire course of the study is concerning. The smart thing – and the proper response – would have been to drop this subject from analysis.

To answer this one, I think we should look at the participants who dropped out and see if the reason becomes apparent. Since the full text of the study is available, that is a good place to start:
"In the treatment group, two children dropped out of the study prior to beginning any treatments due to an illness (one with otitis media, the other with bronchitis). Another child dropped out before finishing one full treatment due to anxiety in both the child and the parent."
So in the treatment group, three of the four that dropped out did not complete a single treatment.
"In the control group, two children dropped out of the study prior to beginning any treatments (one because of a death in the family, the other because of the time commitment). One child dropped out prior to finishing one full treatment due to parental claustrophobia."
So in the control group, all three that dropped out did not complete a single treatment.  This is the same criterion that was used to exclude children from the treatment group.

The last one, the one that was included from the treatment group, was different - "one child was removed from the study after nine sessions because asthma symptoms worsened".  This child is clearly different from the other six.

So that is why the others were excluded. Now, why were this one child's results included?
"this child's scores performed at time of drop-out showed mild improvements in behavior (as separately ranked by both the physician and the parents) and these scores were included in the intention-to-treat analysis. The inclusion or exclusion of this child's scores had no significant effect on the statistical analysis."
In other words, it didn't change the outcome one way or the other.

The scores on the Clinical Global Impression (CGI) are not linear and you should not do math on them.

The issue here is that the authors took the CGI scores (1, very much improved; 2, much improved; 3, minimally improved; 4, no change; 5, minimally worse; 6, much worse; or 7, very much worse) and subtracted 4 from each number to change the scale to run from -3 to 3.  His other complaint is that these scores aren't supposed to be real numbers, so the difference between "very much improved" and "much improved" isn't necessarily the same as the difference between "no change" and "minimally improved".  So in numeric terms, even though the difference between neighboring scores is always written as 1, he is saying that in some cases the real difference might be more like 1.5 or 0.8.

Unfortunately for Prometheus, the decision to subtract a fixed number from all of the scores does nothing except change each score from one pseudo number to another pseudo number.  Since every score is translated from the old scale to the new scale in exactly the same way, and since the relative values of the scores don't change, the change is completely benign and meaningless.  This simple subtraction is not going to change any of the results of the analysis.
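To make that concrete, here is a minimal sketch in Python. The scores are made up for illustration and are not from the study, and `welch_t` and `sort_order` are helper names I am assuming here, not anything the authors used. It only shows that shifting every score by the same constant leaves the gap between the group averages, a t statistic, and the ordering of the scores exactly where they were.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


def sort_order(values):
    """Positions of the values when sorted from lowest to highest."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical CGI scores on the original 1-7 scale (NOT data from the study).
treatment = [2, 3, 3, 4, 2, 3, 4, 3]
control = [3, 4, 4, 4, 3, 5, 4, 4]

# Subtract 4 from every score so the scale runs from -3 to 3.
shift = 4
treatment_shifted = [s - shift for s in treatment]
control_shifted = [s - shift for s in control]

# Gap between the group averages, before and after the shift.
mean_gap = mean(treatment) - mean(control)
mean_gap_shifted = mean(treatment_shifted) - mean(control_shifted)

# t statistic, before and after the shift.
t_original = welch_t(treatment, control)
t_shifted = welch_t(treatment_shifted, control_shifted)

# Ordering of all scores, before and after the shift.
same_order = sort_order(treatment + control) == sort_order(
    treatment_shifted + control_shifted
)

print("mean gap:", mean_gap, "->", mean_gap_shifted)
print("t statistic:", t_original, "->", t_shifted)
print("same ordering:", same_order)
```

The shift moves both group averages by the same amount and leaves the spread of each group alone, which is why nothing that compares the two groups can notice it.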

As to whether the scale itself is linear, I would point out that the CGI scale has been in use for 30+ years, so I would think that any serious problem with the representation would have been found by now.

Prometheus also raises the point that:
"This may seem like nit-picking, but it is a serious concern. Imagine, if you will, that the numbers were replaced by colors. Is the difference between green and orange twice the difference between orange and red? If half of a population of birds are blue and the other half are yellow, is the “average” bird green? The simple fact is that it is not appropriate to treat these “scores” as though they were real numbers, to be added, subtracted and averaged."
This may seem like nit-picking, but the analogy comparing these scores to colors is completely wrong.  There is an implicit ranking in these scores that is not present in colors.  For example, a rating of 1 (very much improved) is "better" than a rating of 2 (much improved).  There is no corresponding concept in colors - green is not "better" than red.

So you could argue that the relative change between the ratings isn't always the same but you really can't say that there is no relation between the two.  And as I pointed out this scale has been in use for a long time so I think this point is moot.
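Here is a rough sketch of that distinction in Python. Everything in it is an assumption for illustration: the scores are made up, the `uneven` spacing is invented (nobody has measured these gaps), and `pairs_favoring` is a helper name of my own. It compares the two groups once by counting, pair by pair, which score is better (the counting idea behind the Mann-Whitney U statistic) and once by averaging.

```python
from statistics import mean

# Hypothetical uneven spacing for the seven CGI labels. The order of the
# labels is kept, but the gaps between neighbors are no longer all equal.
uneven = {1: 0.0, 2: 1.5, 3: 2.3, 4: 3.0, 5: 3.7, 6: 4.5, 7: 6.0}


def pairs_favoring(a, b):
    """Count (a, b) pairs where the score from a is lower (better).

    Ties count as half a pair. This uses only the order of the scores.
    """
    return sum(
        1.0 if x < y else 0.5 if x == y else 0.0 for x in a for y in b
    )


# Hypothetical CGI scores on the 1-7 scale (NOT data from the study).
treatment = [2, 3, 3, 4, 2, 3, 4, 3]
control = [3, 4, 4, 4, 3, 5, 4, 4]

# The same scores, re-spaced with the made-up uneven gaps.
treatment_uneven = [uneven[s] for s in treatment]
control_uneven = [uneven[s] for s in control]

# Order-based comparison: unchanged by the re-spacing.
u_equal = pairs_favoring(treatment, control)
u_uneven = pairs_favoring(treatment_uneven, control_uneven)

# Average-based comparison: depends on the spacing that was chosen.
gap_equal = mean(control) - mean(treatment)
gap_uneven = mean(control_uneven) - mean(treatment_uneven)

print("pairs favoring treatment:", u_equal, "->", u_uneven)
print("gap between averages:", gap_equal, "->", gap_uneven)
```

The ranking survives any re-spacing that keeps the labels in order, which is the part the color analogy misses. The size of the gap between the averages does depend on the spacing, which is the part you could still argue about.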

The statistics are wrong

I have a rule for myself - whenever I don't really know the subject matter that well, I tend not to comment.  The different types of statistics, and when they apply and when they don't, make up one of those areas that I don't know that well, so I won't spend any time rebutting the specifics of what Prometheus is saying.

The only thing I will point out here is that I assume that a journal has reviewers checking the statistical methods that are used in studies and would not let anything that was a misapplication of statistics be published.  

So my first reaction is this criticism is not valid.  My second reaction is that if you have any doubts, find someone you trust who really understands this stuff and ask for their opinion - don't rely on a random blogger's opinion of something like this (mine included).

The other issue is that there is no discussion of why HBOT is thought to be superior to providing the same partial pressure of oxygen at room pressure

They also didn't discuss why breathing heavily or breathing in the southern hemisphere wouldn't work equally well.  For that matter, they completely ignored the fact that the same benefits could have been had by standing at sea level while hopping on one leg.

This is the same thing that DoC said in his second post on the topic and the answer is still that it is irrelevant.  You can add on all of the extra things that you think the authors should have covered, but that doesn't make it a relevant criticism. 

This study was not about simple O2 vs HBOT vs normal environment.  This study was about HBOT vs an almost normal environment - that was the defined scope and that was what was presented.

So finally, in closing ....

I went back and read what I wrote above and I guess it is on the longer side.  I am not sure how many of you are still with me here, but let me close with a simple piece of advice to LBRB - 

Give it a rest already. 
