Recently, a paper entitled "Trends in the Prevalence of Autism on the Basis of Special Education Data"^{1} was published in Pediatrics that took a slightly different approach to the subject. In this paper, researchers looked at special education data from Wisconsin and concluded that the prevalence of autism in school districts appeared to be leveling off.

To reach these conclusions, the researchers took publicly available educational statistics^{2} for public elementary schools in Wisconsin and used them to calculate a special education autism prevalence by year (2002 to 2008) for each of the 415 school districts. They then sorted the districts into eight groups based on the autism prevalence in 2002 - these groups are called octiles. The first octile contained all of the districts with the highest prevalence while the eighth contained the districts with the lowest prevalence.
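To make the grouping step concrete, here is a minimal sketch of how districts could be ranked by their 2002 prevalence and cut into octiles. The function name and the sample data are made up for illustration, and the way ties and uneven group sizes are handled is my assumption, not something the paper spells out.

```python
def assign_octiles(prevalence_2002):
    """Map each district to an octile from 1 (highest 2002 prevalence) to 8 (lowest).

    prevalence_2002: dict of district name -> prevalence per 1,000 in 2002.
    Ties are broken by sort order, which is an assumption of this sketch.
    """
    ranked = sorted(prevalence_2002, key=prevalence_2002.get, reverse=True)
    n = len(ranked)
    # Rank 0 is the highest prevalence; integer division cuts the ranking
    # into eight nearly equal groups.
    return {district: (rank * 8) // n + 1 for rank, district in enumerate(ranked)}


# Hypothetical districts D0..D15 with prevalence 0.0 to 15.0 per 1,000.
sample = {f"D{i}": float(i) for i in range(16)}
octiles = assign_octiles(sample)
print(octiles["D15"], octiles["D0"])  # highest-prevalence and lowest-prevalence district
```

The key point is that the groups are fixed by the 2002 figures, so a district stays in its octile no matter what happens to it in later years.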

When they analyzed the resulting data, the researchers found that the autism prevalence in the highest octile did not increase nearly as much as the other seven. Their results looked approximately like the chart I produced below.

As you can see on the chart, the line representing the top octile is increasing slowly while the lines for the seven other octiles are increasing faster and seem to be converging on the top line. The researchers took this as evidence that the growth of autism prevalence in Wisconsin seems to be "leveling off" as it approaches the level of the highest octile.

The crux of the paper's argument is that to properly see this leveling off you have to consider changes in smaller areas that are "not necessarily influenced by special education practices in surrounding areas". The idea is to try to separate out the effect of changing special education policies from an actual increase in autism prevalence.

While this argument sounds good in theory, I don't think it is realistic in practice. The reasons get a little involved and I am going to talk about them in some depth below. But, for those of you who don't want all of the details, the short version is that it isn't that the prevalence is leveling off but rather it is becoming more uniform. Instead of having a large variation in autism prevalence by area, we are instead seeing a more consistent range across all of the areas. But still, the overall trend shown by the data is up and autism is still becoming more common in Wisconsin, at least by special education measures.

Before I start, let me say that all of the data in this post came from the same publicly available source that the paper used. I tried to match what the paper did as closely as possible but there are some differences between my numbers and the ones used in the paper.

The major difference was that I included data from 2009 while the paper only included up to 2008. I don't know why the paper stopped at 2008 - I suppose it is possible that the 2009 data was not available when they pulled their data. I don't think that is too likely since the paper said the data was downloaded only six weeks ago (Sept 13, 2010), but who knows. Other than that, there are a few places where I arrived at a slightly different number than the paper had but I don't think any of these differences are significant.

Ok, first of all, you have to keep in mind what these numbers represent and what they don't represent. These numbers represent the number of children in Wisconsin public schools that have a primary special education label of autism. These numbers do not represent all of the children in Wisconsin who have a diagnosis of autism (a medical diagnosis is different from a special education label). There are going to be other children in the same age group who do not go to public school, who have a diagnosis of autism but not a primary special education label of autism, or who don't have a diagnosis of autism but have a special education label of one so that they are eligible for services. Or any permutation of the above.

So you can't look at these numbers by themselves and say that autism is becoming more common. You can, however, compare the trends seen in these numbers to the trends seen in other data sets - such as national surveys or HMO medical records - and see that the general trend shown by all of the data is up. But that is a topic for another time.

The first problem with the analysis done in the paper is that there are a lot of very small school districts in Wisconsin and the prevalence per 1,000 in these districts can move violently with the addition of even a single child. Take for example the district "Dover #1" which had a total of 92 elementary students in 2002. This district had 1 child with a label of autism, giving it a prevalence of 10.9 per 1,000 children, and placing it in the highest octile in 2002.

If you added a second child to the district, that would basically double the prevalence and take it to 21.5 per 1,000. If you took one child away, the district would have dropped from the highest octile all the way down to the lowest octile. Quite a big swing.

The point is that even a change of a single child is enough to skew the results when the total numbers are small. While there is no hard and fast rule about how many children you would need to get a reliable estimate, you can safely say that you need more than 100.

Since the "accepted" rate of autism in 2002 was something like 6.6 per 1,000, I think you would need to have at least 500 children in your group to overcome this bias. With 500 children, changing the number of children with autism by 1 will move the resulting figure by 2 per 1,000 while with 100 children a change of 1 child will result in a change of 10 per 1,000.
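The arithmetic behind the small-district problem is easy to check. The sketch below uses the Dover #1 figures quoted above and then shows how far a single child moves the figure at a few district sizes. The function name is my own, and the 1,000 and 5,000 student sizes are just illustrative.

```python
def prevalence_per_1000(cases, students):
    """Special education autism prevalence per 1,000 students."""
    return 1000 * cases / students


# Dover #1 in 2002: 1 child with an autism label out of 92 students.
dover = prevalence_per_1000(1, 92)           # about 10.9 per 1,000
# Adding a second child (93 students in total) roughly doubles the figure.
dover_plus_one = prevalence_per_1000(2, 93)  # about 21.5 per 1,000
# Taking the one child away leaves a prevalence of zero.
dover_minus_one = prevalence_per_1000(0, 91)

print(round(dover, 1), round(dover_plus_one, 1), dover_minus_one)

# How much one additional child with the label moves the figure
# for districts of different sizes.
for students in (100, 500, 1000, 5000):
    print(students, round(prevalence_per_1000(1, students), 1))
```

The smaller the district, the bigger the jump per child, which is why the small districts dominate the swings in the top and bottom octiles.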

Unfortunately, fewer than half of the school districts in this paper had more than 500 students. When you look at the octiles in the paper, almost 60% of the highest octile and 80% of the bottom octile had fewer than 500 students. That means that those two octiles are going to be heavily influenced by districts that can be moved by the addition or subtraction of a single child with the label of autism.

Which brings me to the second problem - availability of services. The paper makes the assumption that the smaller geographic areas - especially ones with higher prevalence - are not going to be as heavily influenced by overall policy changes as larger ones would be. It argues that the highest octile represents the vanguard of the shift towards a higher acceptance of autism. These districts would be the early adopters if you will, and the rest of the districts are simply playing catch up.

The problem is, if this is true, that these districts would attract students from the surrounding areas because of the better services offered. I can tell you from experience that parents whose children have autism compare notes about how good the services offered by a certain school district are. If there is a district in an area that has better services available than the others, parents can either ask their local district to send their children to the other district or move into the other district to access the services. This exporting of children with autism would be especially common in districts with poor or no services.

On the flip side, districts with poor services are going to attract fewer children. Parents will seek out and move into districts with better services or, failing that, will pull their children out of the public school system altogether and send them to private school or home school them.

But over time, as the lagging districts start catching up with the early adopters, parents will have less incentive to move to other districts and will be more likely to send their children to their current district. This would result in the districts with better services losing students (or not growing as fast) and the lagging districts gaining students more rapidly.

The net effect would be that the prevalence would become more even across the districts over time, independent of any real increase or decrease in prevalence. I think that is what the Wisconsin special education figures actually show.

To give a theoretical example, assume that you have two adjacent school districts (A and B) that both have 1,000 students, 5 of whom have a special education label of autism. The one district (A) has better services than the other (B), so 3 of the students with autism from B switch to A. So District A starts off with 8 students with autism (8 per 1,000) while district B only has 2 (2 per 1,000).

Now assume over the next few years that two things happen. First, the rate of autism doubles so that both districts would have 10 children with autism. Second, the services between the two districts even out and, as a result, the children who switched to A go back to B. The net effect would be that district A goes from 8 students to 10 while B goes from 2 to 10. If you then compared their relative growths, you could conclude that B is simply catching up to the higher rate in A and that A isn't growing as fast as B. This is what the paper concluded - but that isn't the complete picture.
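Here is the same two-district example worked through in a few lines, so you can see the growth figures side by side. All of the numbers come straight from the hypothetical above; the variable names are mine.

```python
def per_1000(cases, students):
    return 1000 * cases / students


STUDENTS = 1000  # enrollment of each district in the example

# Start: both districts really have 5 students with the label,
# but 3 of B's students attend A because of the better services.
start = {"A": 5 + 3, "B": 5 - 3}
# Later: the underlying count doubles to 10 in each district
# and the students who switched go back to B.
end = {"A": 10, "B": 10}

for name in ("A", "B"):
    growth = end[name] / start[name]
    print(name, per_1000(start[name], STUDENTS), "->",
          per_1000(end[name], STUDENTS), f"({growth:.2f}x)")

# Pooled across both districts, the prevalence simply doubles.
combined_start = per_1000(sum(start.values()), 2 * STUDENTS)
combined_end = per_1000(sum(end.values()), 2 * STUDENTS)
print("combined", combined_start, "->", combined_end)
```

District A appears to grow by only a quarter while B grows fivefold, yet the pooled figure goes from 5 to 10 per 1,000. Comparing the districts separately hides the fact that the underlying rate doubled in both.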

If that example is a bit too theoretical for you, how about a real one from the Wisconsin data? Consider two adjacent districts in northwest Wisconsin - Amery and Clayton - and their seven surrounding school districts. In 2002, these districts were both included in the top octile with Amery having a total of 7 children with autism out of 835 students and Clayton having 6 out of 205. These two districts had a total of 13 children and 1,040 students for a combined rate of 12.5 per 1,000. In 2002, the other seven surrounding districts had a total of 16 children with autism out of a total of 3,545 students, or 4.5 per 1,000.

In 2009 Amery and Clayton decreased to a combined 10 students with autism out of 983 students (10.2 per 1,000) while the surrounding seven districts increased to 30 out of 4,247 or 7.1 per 1,000. Now, does that mean that the surrounding districts are simply coming up to the level of Amery and Clayton or are the students with autism simply being more evenly distributed?

If you look at the chart below, I think the answer is clear.

The prevalences of autism in these nine districts are getting closer together but, as you can see from the red trend line, the overall trend of the data is going up.
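You can also check this with the counts quoted above by pooling all nine districts together. The pooled figures below are my own calculation from those counts rather than numbers taken from the paper, and the helper names are just for illustration.

```python
def per_1000(cases, students):
    return 1000 * cases / students


def pooled(*groups):
    """Prevalence per 1,000 after summing cases and students across groups."""
    cases = sum(c for c, _ in groups)
    students = sum(s for _, s in groups)
    return per_1000(cases, students)


# (children with an autism label, total students)
top_2002 = (7 + 6, 835 + 205)   # Amery + Clayton
surrounding_2002 = (16, 3545)   # the seven surrounding districts
top_2009 = (10, 983)
surrounding_2009 = (30, 4247)

gap_2002 = per_1000(*top_2002) - per_1000(*surrounding_2002)
gap_2009 = per_1000(*top_2009) - per_1000(*surrounding_2009)
all_2002 = pooled(top_2002, surrounding_2002)
all_2009 = pooled(top_2009, surrounding_2009)

print(f"gap between the two groups: {gap_2002:.1f} -> {gap_2009:.1f} per 1,000")
print(f"all nine districts pooled:  {all_2002:.1f} -> {all_2009:.1f} per 1,000")
```

The gap between the two groups shrinks while the pooled prevalence for the whole area still rises, which is the pattern you would expect if the students are being more evenly distributed on top of an overall increase.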

Now consider the fact that Amery and Clayton were in the top octile group while none of the other surrounding districts were and I think you can see what is going on with the data. If you look at the change in prevalence data for all of the districts statewide, you can see a similar pattern.

It is a little hard to see, but the overall trend is that the prevalences are becoming less varied and grouped closer together - meaning that the prevalence is becoming more uniform. And, the overall trend across all of the data, as shown by the trend line, is heading up.

To address the problems of sample size and availability of services, you have to consider a wider area. I know the idea of the paper was to look for localized trends in the data but, as I have shown, areas this small are going to be affected by the services offered by their neighboring regions and have few enough children that even a small change in the number of children with autism can greatly change the result.

To that end, I grouped the data based on Wisconsin's 12 CESA areas and charted the result. While these regions are not equal in size - the smallest had 8,000 elementary students in 2009 while the largest had 138,746 - they do have the advantages of not being so easily skewed because of small numbers and getting rid of some of the noise created by services. As you can see below, the result is quite different when you look at it like this.
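For anyone who wants to try a similar regrouping, the sketch below shows one way to roll district counts up to regions: add up the children and the students in each region first, then compute the prevalence. The rows are invented for illustration and are not real Wisconsin figures, and the function name is my own.

```python
from collections import defaultdict


def regional_prevalence(rows):
    """Pool district counts by region and return prevalence per 1,000.

    rows: iterable of (region, cases, students), one entry per district.
    """
    totals = defaultdict(lambda: [0, 0])
    for region, cases, students in rows:
        totals[region][0] += cases
        totals[region][1] += students
    return {region: 1000 * c / s for region, (c, s) in totals.items()}


# Hypothetical districts: one tiny and one large district per region.
rows = [
    ("CESA 1", 1, 92),
    ("CESA 1", 30, 5000),
    ("CESA 2", 0, 150),
    ("CESA 2", 12, 3000),
]
result = regional_prevalence(rows)
for region, value in result.items():
    print(region, round(value, 1))
```

Summing the counts before dividing means each district contributes in proportion to its enrollment, so a tiny district with a single child cannot swing the regional figure the way it swings its own.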

All of the CESAs showed a similar gain over the seven-year period and the overall trend is unmistakable - the prevalence is going up. It would be tempting to look at a chart like this and assume that autism did become much more common over this time period, but you have to remember what this data is and what it isn't. You can't look at special education numbers alone and conclude that autism is increasing.

However, I do think it is clear that the special education prevalence of autism in Wisconsin is not showing any signs of leveling off.

**References**

1. Maenner, M. J., and M. S. Durkin. 2010. “Trends in the Prevalence of Autism on the Basis of Special Education Data.” Pediatrics 126:e1018-e1025. http://pediatrics.aappublications.org/cgi/doi/10.1542/peds.2010-1023.

2. Wisconsin WINSS data, accessed Oct 29, 2010. http://data.dpi.state.wi.us/data/

Excellent post. There is a more fundamental problem with the data: the statistical trap called 'regression to the mean'. This could lead to an apparent "convergence" even if there is no moving of ASD kids from one district to another, just pure random variation.

As I posted over at Left Brain/Right Brain:

Imagine you had 100 six-sided dice. You roll them all. You get about equal numbers of 1s, 2s, 3s, etc.

You divide the dice into six categories, from the highest scoring to the lowest scoring. The top category contains mostly sixes. The bottom category contains mostly ones, a 6:1 ratio.

Now you roll all the dice again. To your amazement, in the top category, the scores went down and in the bottom category they went up. All the categories have converged on an average score of about 3.5! The ratio between the highest category and the lowest is pretty much 1.

Of course. You selected your categories based on the first scores, but they were random, so when you rolled the dice again, the categories stopped being meaningful – and the scores, therefore, converged…
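The dice experiment is easy to simulate. This is a minimal sketch, assuming fair dice and categories cut by rank on the first roll; the function name, the seed, and the larger run at the end are arbitrary choices for illustration.

```python
import random


def reroll_experiment(n_dice=100, n_groups=6, seed=0):
    """Group dice by their first roll, then compare group means on a second roll.

    Returns one (first_roll_mean, second_roll_mean) pair per group,
    ordered from the highest-scoring group to the lowest.
    """
    rng = random.Random(seed)
    first = [rng.randint(1, 6) for _ in range(n_dice)]
    second = [rng.randint(1, 6) for _ in range(n_dice)]

    # Rank the dice by first roll, highest first, and cut the ranking
    # into n_groups nearly equal categories.
    order = sorted(range(n_dice), key=lambda i: first[i], reverse=True)
    groups = [[] for _ in range(n_groups)]
    for rank, i in enumerate(order):
        groups[rank * n_groups // n_dice].append(i)

    return [
        (sum(first[i] for i in g) / len(g), sum(second[i] for i in g) / len(g))
        for g in groups
    ]


for first_mean, second_mean in reroll_experiment():
    print(f"first roll {first_mean:.2f} -> second roll {second_mean:.2f}")

# A larger run makes the convergence toward 3.5 easier to see.
big = reroll_experiment(n_dice=6000, seed=1)
```

Because the second roll is independent of the first, every category should end up near 3.5 on the re-roll, whatever it scored when the categories were drawn.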

The authors say that “the gap in prevalence between districts overall has narrowed” but they didn’t look at the actual raw variability of prevalence scores in 2008 compared to 2002, which is the crucial result, they just looked at the 2002 categories which is like the dice example. Or if they did it’s not in the paper.

Now it looks like you did look at the raw prevalence scores, in your last 3 images, and they show either no apparent convergence or a much weaker one than the authors found. So I am more convinced than ever that this result is regression to the mean...

Neuroskeptic,

Thanks, I'm glad you liked the post.

I took another look at the data set and I think you may be correct about regressing to the mean. It is a little hard to see what is going on because there are at least two different trends in the data and the use of means obscures the trends. But, when you use medians with standard deviations the picture becomes clearer.

I did this analysis rather quickly so I need to double check this, but the median of the 1st octile is heading down while the median of the entire data set is heading up. The two almost touch in 2008 before the 1st octile heads back up in 2009. The deviation of the whole set gets a little wider (although that could be insignificant) while the deviation of the 1st octile actually gets wider.

Specifically, in 2002 the median of the whole set was 3.5 with a stdev of 4.4 while in 2009 the median is 8.3 with a stdev of 6.2. I don't think the deviation going up is significant because it is likely being caused by more districts moving away from a zero prevalence. If you look at the -1 stdev line, it doesn't get above zero until 2006.

For the 1st octile, the median is heading downward, from 10.8 in 2002 to 8.7 in 2008 before bouncing back to 10 in 2009. The stdev of the 1st octile is expanding from 4.5 in 2002 to 8.7 in 2009.

The median is heading in the opposite direction of the mean. I think this is happening because the deviation is getting wider and the outlying schools are skewing the mean. But I have to take a closer look at this when I get a chance because something doesn't seem quite right.

Regardless, the median for the 1st octile starts out about 1.5 stdev from the median for the whole set in 2002 but by 2009 the median of the 1st is much closer to the median of the whole set. Which, if I am not mistaken, is something like what you are talking about.

The chart linked below shows the trends. The red line is the set as a whole, +1 std and +2 std. The grey dashes are the prevalence distribution points for the 1st octile. The blue lines are the median for the 1st octile and +/- 1 std.

I have limited the y axis so that any prevalence figure above 35 is cut off and the trends can be seen more clearly.

See here for the chart

http://bit.ly/bWqBSs