Comments on Autism Jabberwocky: Special Ed. Autism Prevalence in Wisconsin (updated 2022-11-21T03:30:40.922-05:00)<br />M.J. (http://www.blogger.com/profile/12033918835169823548)<br /><br />2010-11-21T20:03:25.837-05:00<br /><br />Neuroskeptic,
<br /><br />Thanks, I'm glad you liked the post.<br /><br />I took another look at the data set and I think you may be correct about regressing to the mean. It is a little hard to see what is going on because there are at least two different trends in the data and the use of means obscures the trends. But when you use medians with standard deviations, the picture becomes clearer.<br /><br />I did this analysis rather quickly so I need to double check this, but the median of the 1st octile is heading down while the median of the entire data set is heading up. The two almost touch in 2008 before the 1st octile heads back up in 2009. The deviation of the whole set gets a little wider (although that could be insignificant) while the deviation of the 1st octile actually gets wider.<br /><br />Specifically, in 2002 the median of the whole set was 3.5 with a stdev of 4.4, while in 2009 the median was 8.3 with a stdev of 6.2. I don't think the deviation going up is significant because it is likely being caused by more districts moving away from a zero prevalence. If you look at the -1 stdev line, it doesn't get above zero until 2006.<br /><br />For the 1st octile, the median is heading downward, from 10.8 in 2002 to 8.7 in 2008, before bouncing back to 10 in 2009. The stdev of the 1st octile is expanding, from 4.5 in 2002 to 8.7 in 2009.<br /><br />The median is heading in the opposite direction of the mean. I think this is happening because the deviation is getting wider and the outlying schools are skewing the mean. But I have to take a closer look at this when I get a chance because something doesn't seem quite right.<br /><br />Regardless, the median for the 1st octile starts out about 1.5 stdev from the median for the whole set in 2002, but by 2009 the median of the 1st octile is much closer to the median of the whole set.
Which, if I am not mistaken, is something like what you are talking about.<br /><br />The chart linked below shows the trends. The red lines are for the set as a whole, with +1 std and +2 std. The grey dashes are the prevalence distribution points for the 1st octile. The blue lines are the median for the 1st octile and +/- 1 std.<br /><br />I have limited the y axis so that any prevalence figure above 35 is cut off, which lets the trends be seen more clearly.<br /><br />See here for the chart:<br /><br /><a href="http://bit.ly/bWqBSs" rel="nofollow">http://bit.ly/bWqBSs</a><br /><br />M.J. (https://www.blogger.com/profile/12033918835169823548)<br /><br />2010-11-21T07:22:37.225-05:00<br /><br />Excellent post. There is a more fundamental problem with the data: the statistical trap called <a href="http://neuroskeptic.blogspot.com/2010/08/help-im-being-regressed-to-mean.html" rel="nofollow">'regression to the mean'.</a> This could lead to an apparent "convergence" even if there is no movement of ASD kids from one district to another, just pure random variation.<br /><br />As I posted over at Left Brain/Right Brain:<br /><br />Imagine you had 100 six-sided dice. You roll them all. You get about equal numbers of 1s, 2s, 3s, etc.<br /><br />You divide the dice into six categories, from the highest scoring to the lowest scoring. The top category contains mostly sixes. The bottom category contains mostly ones, a 6:1 ratio.<br /><br />Now you roll all the dice again. To your amazement, in the top category, the scores went down and in the bottom category they went up. All the categories have converged on an average score of about 3.5! The ratio between the highest category and the lowest is pretty much 1.<br /><br />Of course. 
You selected your categories based on the first scores, but they were random, so when you rolled the dice again, the categories stopped being meaningful – and the scores, therefore, converged…<br /><br />The authors say that “the gap in prevalence between districts overall has narrowed” but they didn’t look at the actual raw variability of prevalence scores in 2008 compared to 2002, which is the crucial result; they just looked at the 2002 categories, which is like the dice example. Or if they did, it’s not in the paper.<br /><br />Now it looks like you <i>did</i> look at the raw prevalence scores, in your last 3 images, and they show either no apparent convergence or a much weaker one than the authors found.<br /><br />So I am more convinced than ever that this result is regression to the mean...<br /><br />Neuroskeptic (https://www.blogger.com/profile/06647064768789308157)
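The dice thought experiment in Neuroskeptic's comment can be tried out with a short simulation. What follows is only a rough sketch under stated assumptions, not anything either commenter ran: the function name `reroll_by_category`, its parameters, and the choice to split the dice into near-equal groups by rank are all hypothetical. It rolls every die twice, groups the dice by their first score, and reports each group's average on both rolls.

```python
import random
from statistics import mean


def reroll_by_category(n_dice=100, n_categories=6, seed=None):
    """Roll dice twice; group them by FIRST score; compare group averages.

    Hypothetical helper for illustration only. Returns a list of
    (first_roll_mean, second_roll_mean) tuples, highest-scoring group first.
    """
    rng = random.Random(seed)
    first = [rng.randint(1, 6) for _ in range(n_dice)]
    second = [rng.randint(1, 6) for _ in range(n_dice)]

    # Rank the dice by their first roll, highest first.
    order = sorted(range(n_dice), key=lambda i: first[i], reverse=True)

    results = []
    for k in range(n_categories):
        # Near-equal slices of the ranking (assumption: equal-sized groups).
        start = k * n_dice // n_categories
        stop = (k + 1) * n_dice // n_categories
        group = order[start:stop]
        results.append((mean(first[i] for i in group),
                        mean(second[i] for i in group)))
    return results


if __name__ == "__main__":
    for rank, (before, after) in enumerate(reroll_by_category(seed=1), start=1):
        print(f"category {rank}: first roll {before:.2f}, second roll {after:.2f}")
```

Because the groups are defined by the first roll alone, the second-roll averages should all drift toward 3.5 regardless of group, which is the apparent "convergence" the comment describes. With only 100 dice the second-roll averages will still be noisy; a larger `n_dice` makes the pattern easier to see.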