1 Simulation Overview

In a previous vignette, we examined the efficacy of multiple imputation (MI) for dealing with missing scale score and student growth percentile (SGP) data. A simulation was conducted wherein observations were systematically removed from a synthetic data set (from the SGPdata R package; Betebenner et al., 2021). The results indicated that in many contexts, the cross-sectional L2PAN imputation method is a viable approach for creating “adjusted” scale scores and SGPs. Importantly, L2PAN generally performed best in conditions of (a) lower missingness percentages, (b) data missing completely at random, and (c) larger school sizes.

The simulation was replicated to incorporate a COVID-19 impact within the synthetic data from SGPdata, and this vignette summarizes the results of that “impact” simulation. As before, data were amputed under one of three missingness patterns: missing completely at random (MCAR), missing at random (MAR) based on status and growth, or MAR based on status and demographics. In addition, 30%, 50%, or 70% of the observations were systematically removed (although the missingness percentage could vary by school within each of these three levels). Six imputation methods were compared, with some slight differences from the previous “without impact” simulation:

Cross-sectional multi-level modeling with pan (L2PAN);
Longitudinal multi-level modeling with pan (L2PAN_LONG);
Quantile regression (RQ);
Random forests (RF);
One-level predictive mean matching (PMM); and
Multi-level modeling with predictive mean matching (L2PMM).

As in the previous simulation, we also compared these methods to the “Observed” condition, in which no imputation was done. All MI analyses were conducted using the mice package (van Buuren & Groothuis-Oudshoorn, 2011), with calls to the corresponding R packages (e.g., pan [Zhao & Schafer, 2018]). Here, we focus on how accurately these MI methods recover mean scale scores and mean SGPs.
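The MCAR and MAR amputation conditions described above can be illustrated with a minimal Python sketch. This is not the code used in the simulation (which relied on R), and it assumes, purely for illustration, that under MAR students with lower prior scores are more likely to be removed. The function names, the logistic weighting, the `strength` parameter, and the simulated scores are all hypothetical.

```python
import math
import random


def ampute_mcar(n, prop_missing, rng):
    """Return row indices removed completely at random (MCAR)."""
    k = round(n * prop_missing)
    return set(rng.sample(range(n), k))


def ampute_mar(prior_scores, prop_missing, rng, strength=1.0):
    """Return row indices removed with probability tied to an observed covariate.

    Illustrative assumption: lower prior scores are more likely to go missing.
    Rows are drawn by weighted sampling without replacement.
    """
    n = len(prior_scores)
    k = round(n * prop_missing)
    mean = sum(prior_scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in prior_scores) / (n - 1))
    # Logistic weight: larger for scores below the mean.
    weights = [1.0 / (1.0 + math.exp(strength * (s - mean) / sd))
               for s in prior_scores]
    # Key u^(1/w): the k largest keys form a weighted sample without replacement.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return {i for _, i in keys[:k]}


# Hypothetical data: 2,000 prior scale scores, 30% removed under each mechanism.
rng = random.Random(2021)
scores = [rng.gauss(500, 50) for _ in range(2000)]
mcar_rows = ampute_mcar(len(scores), 0.30, rng)
mar_rows = ampute_mar(scores, 0.30, rng)
```

Under MCAR the removed rows are a simple random subset, so the observed mean remains an unbiased estimate of the full-data mean. Under this MAR sketch the removed rows skew toward lower prior scores, which is the kind of selection that an imputation model has to correct for.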

This vignette structure largely mirrors the summary from the “without impact” simulation. We use the following three indices to operationalize MI performance:

Percent bias
- The absolute value of the ratio of the raw bias (i.e., the average difference between the imputed and true values) to the average true value, multiplied by 100.
- Ideally less than 5% (Miri et al., 2020; Qi et al., 2010).
Simplified confidence interval (CI) coverage rate
- The proportion of times that the simplified CI contains the average true score; the simplified CI was proposed by Vink and van Buuren (2014) for cases where the complete data set can be considered the population.
- Ideally around \(1 - \alpha\) (in this case, \(1 - 0.10 = 0.90\); Demirtas, 2004; Qi et al., 2010).
Simplified \(\mathbf{F}_1\) statistic
- Tests the null hypothesis that the true and imputed values are equivalent (van Buuren, 2018; Vink & van Buuren, 2014).
- A p-value greater than \(\alpha\) denotes a failure to reject the null hypothesis.
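The first two indices can be written out as a short Python sketch. It follows the verbal definitions above and is not the code used in the simulation. The function names and example values are hypothetical, and the confidence intervals are assumed to have been computed already (the construction of the simplified CI is not shown).

```python
def percent_bias(imputed, true):
    """100 * |raw bias / mean true value|, where raw bias is the
    average difference between imputed and true values."""
    raw_bias = sum(i - t for i, t in zip(imputed, true)) / len(true)
    mean_true = sum(true) / len(true)
    return abs(raw_bias / mean_true) * 100


def coverage_rate(intervals, true_value):
    """Proportion of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Hypothetical values: three imputed means compared with a true mean of 50.
pb = percent_bias([52, 48, 53], [50, 50, 50])

# Hypothetical intervals from three simulation replicates.
cr = coverage_rate([(49.0, 51.0), (50.5, 52.0), (48.0, 50.0)], 50.0)
```

In this toy example the raw bias is 1 against a true mean of 50, giving a percent bias of 2%, and two of the three intervals contain the true value. In the simulation, a coverage rate near 0.90 would be the target.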

2 Imputation Method Comparison

We first examine the results across the various design factors (e.g., type and percentage of missingness, grade and content area, and MI method). The aim is to determine whether one or more MI methods outperform the others in terms of lower bias and higher coverage rates.

2.1 Summary Tables

2.1.1 GC: MCAR

Table 2.1: Mean percent bias and confidence interval coverage rates (CR) for scale scores (SS) and student growth percentiles (SGPs) with MCAR data, at the grade/content area level
	L2PAN				L2PAN_LONG				RQ				RF				L2PMM				PMM				Observed
	Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR
Grade	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP
30% Missing
3	0.210		0.921		0.210		0.921		0.216		0.914		0.226		0.928		0.220		0.912		0.221		0.960		0.221		0.960
4	0.190		0.927		0.190		0.927		0.194		0.924		0.206		0.933		0.203		0.914		0.200		0.966		0.200		0.966
5	0.169	2.916	0.950	0.950	0.220	3.712	0.945	0.911	0.394	6.033	0.858	0.843	0.204	3.233	0.950	0.940	0.370	5.878	0.950	0.917	0.390	5.932	0.865	0.858	0.390	5.932	0.865	0.858
6	0.154	3.147	0.952	0.947	0.252	3.981	0.921	0.874	0.377	6.483	0.806	0.785	0.216	3.879	0.936	0.919	0.361	6.458	0.950	0.899	0.377	6.456	0.807	0.799	0.377	6.456	0.807	0.799
7	0.127	1.932	0.947	0.955	0.268	2.815	0.856	0.879	0.326	4.761	0.752	0.753	0.154	2.383	0.930	0.920	0.311	4.886	0.938	0.893	0.322	4.696	0.757	0.765	0.322	4.696	0.757	0.765
8	0.134	2.527	0.944	0.942	0.171	3.222	0.905	0.861	0.376	6.439	0.716	0.679	0.184	3.276	0.922	0.900	0.374	6.745	0.914	0.838	0.378	6.450	0.717	0.693	0.378	6.450	0.717	0.693
50% Missing
3	0.338		0.905		0.338		0.905		0.339		0.896		0.373		0.890		0.366		0.882		0.362		0.964		0.362		0.964
4	0.287		0.914		0.287		0.914		0.297		0.908		0.335		0.900		0.316		0.887		0.318		0.975		0.318		0.975
5	0.310	4.943	0.942	0.945	0.355	5.743	0.893	0.846	0.652	9.853	0.787	0.767	0.388	6.084	0.904	0.882	0.635	9.760	0.932	0.889	0.649	9.734	0.802	0.791	0.649	9.734	0.802	0.791
6	0.280	5.488	0.945	0.941	0.442	6.822	0.850	0.791	0.617	10.518	0.724	0.706	0.390	6.884	0.879	0.847	0.584	10.249	0.950	0.898	0.611	10.400	0.744	0.729	0.611	10.400	0.744	0.729
7	0.205	3.445	0.948	0.946	0.386	3.930	0.763	0.790	0.506	7.694	0.702	0.682	0.283	4.546	0.877	0.845	0.514	8.049	0.930	0.878	0.510	7.653	0.717	0.699	0.510	7.653	0.717	0.699
8	0.225	3.999	0.938	0.941	0.289	5.271	0.811	0.767	0.642	10.778	0.601	0.566	0.356	6.285	0.845	0.806	0.658	11.433	0.893	0.816	0.643	10.725	0.612	0.587	0.643	10.725	0.612	0.587
70% Missing
3	0.548		0.900		0.548		0.900		0.571		0.909		0.574		0.825		0.572		0.863		0.642		0.970		0.642		0.970
4	0.424		0.915		0.424		0.915		0.443		0.917		0.457		0.848		0.450		0.871		0.541		0.977		0.541		0.977
5	0.525	8.319	0.945	0.943	0.487	8.918	0.813	0.744	0.919	13.843	0.751	0.722	0.632	9.819	0.855	0.816	0.906	13.737	0.928	0.880	0.918	13.751	0.768	0.753	0.918	13.751	0.768	0.753
6	0.446	8.435	0.941	0.934	0.637	9.360	0.743	0.689	0.852	14.563	0.688	0.656	0.612	10.751	0.813	0.773	0.832	14.507	0.928	0.869	0.849	14.402	0.705	0.686	0.849	14.402	0.705	0.686
7	0.331	5.076	0.945	0.942	0.588	5.615	0.642	0.660	0.731	10.782	0.664	0.650	0.466	7.201	0.790	0.769	0.732	11.124	0.920	0.873	0.726	10.731	0.671	0.670	0.726	10.731	0.671	0.670
8	0.391	6.547	0.934	0.926	0.402	7.316	0.704	0.636	0.899	14.900	0.563	0.527	0.583	9.989	0.756	0.700	0.926	15.762	0.893	0.810	0.901	14.980	0.573	0.544	0.901	14.980	0.573	0.544

2.1.2 GC: Status with Demographics

Table 2.2: Mean percent bias and confidence interval coverage rates (CR) for scale scores (SS) and student growth percentiles (SGPs) with MAR data (using status with demographics), at the grade/content area level
	L2PAN				L2PAN_LONG				RQ				RF				L2PMM				PMM				Observed
	Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR
Grade	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP
30% Missing
3	0.606		0.848		0.606		0.848		0.598		0.832		0.693		0.793		0.599		0.831		0.620		0.904		0.620		0.904
4	0.524		0.847		0.524		0.847		0.528		0.837		0.636		0.779		0.536		0.825		0.521		0.923		0.521		0.923
5	0.264	4.456	0.943	0.902	0.483	6.118	0.865	0.726	0.405	6.390	0.840	0.819	0.289	4.526	0.918	0.886	0.481	6.502	0.958	0.906	0.395	6.230	0.855	0.854	0.395	6.230	0.855	0.854
6	0.222	4.238	0.928	0.889	0.381	5.773	0.849	0.706	0.366	6.527	0.800	0.789	0.260	4.714	0.907	0.878	0.416	6.708	0.966	0.913	0.365	6.572	0.816	0.812	0.365	6.572	0.816	0.812
7	0.221	3.916	0.917	0.890	0.426	5.616	0.740	0.691	0.344	5.642	0.766	0.728	0.217	3.422	0.896	0.870	0.435	5.947	0.949	0.875	0.345	5.577	0.776	0.748	0.345	5.577	0.776	0.748
8	0.207	3.466	0.924	0.911	0.324	5.063	0.810	0.691	0.446	7.391	0.680	0.664	0.266	4.799	0.860	0.823	0.487	7.830	0.930	0.849	0.446	7.396	0.697	0.696	0.446	7.396	0.697	0.696
50% Missing
3	1.102		0.799		1.102		0.799		1.104		0.775		1.248		0.681		1.106		0.775		1.163		0.882		1.163		0.882
4	0.924		0.797		0.924		0.797		0.941		0.788		1.095		0.670		0.939		0.773		0.947		0.907		0.947		0.907
5	0.420	6.936	0.946	0.921	0.878	9.262	0.772	0.667	0.652	10.177	0.788	0.768	0.529	7.742	0.868	0.842	0.812	10.324	0.954	0.903	0.642	10.043	0.810	0.801	0.642	10.043	0.810	0.801
6	0.375	6.879	0.924	0.898	0.609	9.333	0.768	0.603	0.608	10.568	0.722	0.704	0.462	8.006	0.837	0.809	0.671	10.619	0.958	0.886	0.607	10.605	0.744	0.730	0.607	10.605	0.744	0.730
7	0.359	5.764	0.917	0.888	0.671	8.276	0.629	0.566	0.545	8.666	0.700	0.673	0.364	5.586	0.839	0.811	0.671	8.780	0.959	0.885	0.546	8.628	0.714	0.700	0.546	8.628	0.714	0.700
8	0.343	5.702	0.925	0.909	0.557	8.325	0.692	0.560	0.685	11.446	0.601	0.579	0.452	7.883	0.776	0.742	0.778	12.047	0.935	0.829	0.696	11.527	0.620	0.610	0.696	11.527	0.620	0.610
70% Missing
3	1.715		0.772		1.715		0.772		1.754		0.746		1.963		0.574		1.741		0.747		1.797		0.865		1.797		0.865
4	1.452		0.782		1.452		0.782		1.491		0.768		1.689		0.587		1.503		0.750		1.549		0.899		1.549		0.899
5	0.644	9.981	0.956	0.938	1.428	12.162	0.664	0.617	0.916	14.049	0.765	0.729	0.830	11.344	0.809	0.800	1.175	13.956	0.949	0.901	0.904	13.905	0.782	0.764	0.904	13.905	0.782	0.764
6	0.575	10.122	0.925	0.907	0.865	13.694	0.657	0.485	0.862	14.388	0.678	0.655	0.714	11.610	0.773	0.747	0.971	14.324	0.958	0.884	0.851	14.396	0.698	0.677	0.851	14.396	0.698	0.677
7	0.529	8.140	0.914	0.896	0.862	10.936	0.538	0.450	0.781	12.022	0.659	0.629	0.588	8.402	0.768	0.741	0.876	11.539	0.958	0.876	0.783	11.995	0.683	0.652	0.783	11.995	0.683	0.652
8	0.565	9.367	0.911	0.895	0.857	12.834	0.556	0.445	0.943	15.907	0.584	0.544	0.720	11.890	0.699	0.651	1.045	15.983	0.935	0.813	0.956	15.865	0.588	0.563	0.956	15.865	0.588	0.563

2.1.3 GC: Status with Growth

Table 2.3: Mean percent bias and confidence interval coverage rates (CR) for scale scores (SS) and student growth percentiles (SGPs) with MAR data (using status with growth), at the grade/content area level
	L2PAN				L2PAN_LONG				RQ				RF				L2PMM				PMM				Observed
	Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR
Grade	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP
30% Missing
3	1.467		0.463		1.467		0.463		1.464		0.443		1.534		0.420		1.439		0.462		1.464		0.549		1.464		0.549
4	1.295		0.457		1.295		0.457		1.301		0.426		1.367		0.399		1.276		0.437		1.292		0.545		1.292		0.545
5	0.345	5.071	0.935	0.899	0.553	6.213	0.864	0.763	0.413	6.312	0.793	0.822	0.366	4.881	0.887	0.887	0.617	6.429	0.968	0.901	0.389	6.262	0.852	0.845	0.389	6.262	0.852	0.845
6	0.301	5.143	0.898	0.868	0.521	6.079	0.751	0.709	0.380	6.683	0.769	0.778	0.310	5.016	0.881	0.877	0.539	6.734	0.978	0.902	0.378	6.737	0.815	0.799	0.378	6.737	0.815	0.799
7	0.289	4.257	0.902	0.887	0.536	5.907	0.643	0.710	0.339	5.284	0.725	0.739	0.277	3.256	0.859	0.892	0.551	5.417	0.977	0.889	0.351	5.206	0.762	0.752	0.351	5.206	0.762	0.752
8	0.270	3.983	0.904	0.898	0.459	5.602	0.662	0.719	0.453	7.520	0.656	0.667	0.330	5.089	0.821	0.822	0.618	7.840	0.951	0.833	0.450	7.477	0.705	0.695	0.450	7.477	0.705	0.695
50% Missing
3	2.549		0.308		2.549		0.308		2.544		0.284		2.629		0.241		2.509		0.309		2.549		0.413		2.549		0.413
4	2.316		0.315		2.316		0.315		2.320		0.276		2.402		0.242		2.281		0.298		2.329		0.416		2.329		0.416
5	0.530	7.459	0.934	0.909	1.062	9.151	0.739	0.681	0.676	9.911	0.727	0.769	0.649	7.560	0.817	0.849	1.047	9.889	0.956	0.882	0.641	9.833	0.806	0.788	0.641	9.833	0.806	0.788
6	0.482	7.994	0.890	0.873	0.813	9.988	0.674	0.617	0.623	10.651	0.692	0.712	0.544	8.280	0.811	0.817	0.810	10.378	0.975	0.889	0.609	10.636	0.748	0.732	0.609	10.636	0.748	0.732
7	0.468	6.852	0.880	0.874	0.837	9.566	0.548	0.572	0.553	8.662	0.665	0.668	0.496	5.633	0.756	0.813	0.819	8.253	0.976	0.886	0.552	8.461	0.698	0.680	0.552	8.461	0.698	0.680
8	0.445	6.624	0.906	0.902	0.754	9.569	0.552	0.564	0.717	12.262	0.583	0.575	0.572	8.655	0.729	0.741	0.940	12.625	0.955	0.812	0.715	12.238	0.620	0.597	0.715	12.238	0.620	0.597
70% Missing
3	3.892		0.236		3.892		0.236		3.892		0.208		3.993		0.148		3.886		0.229		3.905		0.316		3.905		0.316
4	3.626		0.246		3.626		0.246		3.631		0.205		3.722		0.149		3.602		0.232		3.691		0.302		3.691		0.302
5	0.861	10.692	0.929	0.915	2.442	12.843	0.528	0.588	1.040	13.696	0.664	0.730	1.157	10.940	0.718	0.809	1.584	13.426	0.948	0.881	0.990	13.556	0.753	0.754	0.990	13.556	0.753	0.754
6	0.766	11.189	0.889	0.892	1.159	15.106	0.585	0.491	0.918	14.764	0.627	0.659	0.878	11.843	0.716	0.758	1.193	14.352	0.975	0.870	0.865	14.703	0.687	0.678	0.865	14.703	0.687	0.678
7	0.674	9.274	0.889	0.886	1.148	13.562	0.454	0.413	0.784	12.000	0.620	0.627	0.818	8.386	0.657	0.758	1.082	11.227	0.983	0.865	0.790	11.879	0.665	0.646	0.790	11.879	0.665	0.646
8	0.704	10.252	0.893	0.877	1.130	15.104	0.455	0.440	0.969	16.683	0.572	0.543	0.900	12.410	0.636	0.667	1.320	16.337	0.966	0.806	0.970	16.682	0.596	0.544	0.970	16.682	0.596	0.544

2.1.4 School Level

Table 2.4: Mean percent bias and confidence interval coverage rates (CR) for scale scores (SS) and student growth percentiles (SGPs) at the school level
	L2PAN				L2PAN_LONG				RQ				RF				L2PMM				PMM				Observed
	Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR		Percent Bias		CR
Percent Missing	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP	SS	SGP
MCAR
30% Missing	0.077	1.963	0.928	0.925	0.101	2.559	0.878	0.840	0.169	4.477	0.766	0.672	0.108	2.403	0.888	0.857	0.164	4.463	0.880	0.794	0.168	4.427	0.808	0.681	0.168	4.427	0.808	0.681
50% Missing	0.144	3.519	0.913	0.908	0.170	4.163	0.829	0.736	0.290	7.438	0.700	0.570	0.203	4.793	0.813	0.742	0.275	7.443	0.846	0.753	0.302	7.379	0.770	0.586	0.302	7.379	0.770	0.586
70% Missing	0.248	5.772	0.886	0.883	0.254	6.273	0.769	0.619	0.435	10.371	0.683	0.524	0.327	7.587	0.716	0.651	0.412	10.505	0.822	0.736	0.477	10.312	0.741	0.538	0.477	10.312	0.741	0.538
MAR (Status with Demographics)
30% Missing	0.291	3.051	0.720	0.852	0.388	4.769	0.597	0.597	0.319	4.899	0.601	0.659	0.362	3.374	0.574	0.789	0.424	5.132	0.669	0.783	0.316	4.878	0.696	0.674	0.316	4.878	0.696	0.674
50% Missing	0.513	5.044	0.635	0.848	0.696	7.427	0.472	0.487	0.563	7.895	0.503	0.579	0.646	5.915	0.429	0.703	0.725	7.998	0.598	0.766	0.567	7.911	0.611	0.601	0.567	7.911	0.611	0.601
70% Missing	0.808	7.426	0.607	0.852	1.098	10.027	0.394	0.426	0.879	10.847	0.434	0.536	1.020	8.633	0.326	0.629	1.101	10.757	0.539	0.756	0.886	10.870	0.534	0.556	0.886	10.870	0.534	0.556
MAR (Status with Growth)
30% Missing	0.669	3.446	0.371	0.843	0.737	4.549	0.286	0.639	0.693	4.879	0.276	0.650	0.760	3.500	0.263	0.794	0.873	4.985	0.353	0.776	0.668	4.892	0.340	0.667	0.668	4.892	0.340	0.667
50% Missing	1.197	5.548	0.295	0.825	1.353	7.421	0.186	0.523	1.227	7.885	0.181	0.567	1.345	5.946	0.163	0.706	1.495	7.793	0.267	0.754	1.188	7.887	0.234	0.585	1.188	7.887	0.234	0.585
70% Missing	1.905	7.853	0.249	0.828	2.308	10.771	0.129	0.410	1.948	10.834	0.132	0.522	2.152	8.667	0.086	0.638	2.312	10.597	0.206	0.734	1.905	10.818	0.159	0.534	1.905	10.818	0.159	0.534

2.2 Summary Figures: Grade/Content Area

2.2.1 Scale Scores

Figure 2.1: Scale score percent bias by imputation method, missingness percentage, and missingness type

Figure 2.2: Scale score coverage rate by imputation method, missingness percentage, and missingness type

Figure 2.3: Scatterplot of scale score percent bias as a function of grade/content area size

Figure 2.4: Scatterplot of scale score percent bias as a function of percentage missing at the grade/content area level

Figure 2.5: Scatterplot of scale score coverage rate as a function of grade/content area size

Figure 2.6: Proportion of times that the imputed SS was found to differ from the true value based on the simplified F1 statistic

Figure 2.7: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of grade/content area size

Figure 2.8: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of percent missing

2.2.2 SGPs

Figure 2.9: SGP percent bias by imputation method, missingness percentage, and missingness type

Figure 2.10: SGP coverage rate by imputation method, missingness percentage, and missingness type

Figure 2.11: Scatterplot of SGP percent bias as a function of grade/content area size

Figure 2.12: Scatterplot of SGP percent bias as a function of percentage missing at the grade/content area level

Figure 2.13: Scatterplot of SGP coverage rate as a function of grade/content area size

Figure 2.14: Proportion of times that the imputed SGP was found to differ from the true value based on the simplified F1 statistic

Figure 2.15: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of grade/content area size

Figure 2.16: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of percent missing

2.3 Summary Figures: School Level

We replicate the above figures for results aggregated to the school level.

2.3.1 Scale Scores

Figure 2.17: Scale score percent bias by imputation method, missingness percentage, and missingness type

Figure 2.18: Scale score coverage rate by imputation method, missingness percentage, and missingness type

Figure 2.19: Scatterplot of scale score percent bias as a function of school size

Figure 2.20: Scatterplot of scale score percent bias as a function of percentage missing at the school level

Figure 2.21: Scatterplot of scale score coverage rate as a function of school size

Figure 2.22: Proportion of times that the imputed SS was found to differ from the true value based on the simplified F1 statistic

Figure 2.23: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of school size

Figure 2.24: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of percent missing

2.3.2 SGPs

Figure 2.25: SGP percent bias by imputation method, missingness percentage, and missingness type

Figure 2.26: SGP coverage rate by imputation method, missingness percentage, and missingness type

Figure 2.27: Scatterplot of SGP percent bias as a function of school size

Figure 2.28: Scatterplot of SGP percent bias as a function of percentage missing at the school level

Figure 2.29: Scatterplot of SGP coverage rate as a function of school size

Figure 2.30: Proportion of times that the imputed SGP was found to differ from the true value based on the simplified F1 statistic

Figure 2.31: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of school size

Figure 2.32: Density plot of rejected null hypotheses for the simplified F1 statistic as a function of percent missing

2.4 Basic Regression Models

The models below are preliminary fixed-effects regressions of either raw or absolute bias on (a) school or grade/content area size, (b) percentage missing, (c) missingness type, and (d) imputation method. We include both additive and two-way interaction models, all fit using the fixest package (Bergé, 2018).

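Note: The \(R^2\) values for the subsequent models are relatively low. In Tables 2.5 and 2.6 they range from below 0.01 (SGP raw bias) to roughly 0.40 (scale score raw bias, interaction model). Inferences from these models should therefore be drawn with caution.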
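As a rough illustration of what the grade/content area fixed effects do in these models, the following Python sketch computes a within-group (fixed-effects) slope for a single predictor. It is a simplified stand-in under stated assumptions, not the fixest specification behind the tables: it handles one regressor only, omits clustered standard errors, and uses hypothetical group labels and made-up values.

```python
from collections import defaultdict


def within_slope(x, y, group):
    """Slope of y on x after subtracting group-specific means
    (the within, or fixed-effects, estimator for one predictor)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for xi, yi, gi in zip(x, y, group):
        s = sums[gi]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for xi, yi, gi in zip(x, y, group):
        sx, sy, n = sums[gi]
        xd = xi - sx / n  # predictor centered within its group
        yd = yi - sy / n  # outcome centered within its group
        num += xd * yd
        den += xd * xd
    return num / den


# Hypothetical data: two grade/content area groups with different intercepts
# (10 and -5) but the same slope of 2.
group = ["G5_MATH"] * 3 + ["G5_ELA"] * 3
x = [1.0, 2.0, 3.0, 2.0, 4.0, 9.0]
y = [12.0, 14.0, 16.0, -1.0, 3.0, 13.0]
slope = within_slope(x, y, group)
```

Because each group's mean is removed before the slope is estimated, differences in average bias between grade/content areas do not contribute to the coefficient. In this toy example the common within-group slope of 2 is recovered even though the two groups have very different intercepts.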

2.4.1 Grade/Content Area

Table 2.5: Linear fixed-effects regression models for raw bias at the grade/content area level
	Scale Scores: Additive	Scale Scores: Interaction	SGPs: Additive	SGPs: Interaction

N	0.0011 (0.0014)	0.0308** (0.0084)	0.0047* (0.0017)	0.0006 (0.0021)
MISS_PERC50%Missing	1.573*** (0.2930)	1.433*** (0.2609)	-0.1290* (0.0481)	-0.1505* (0.0562)
MISS_PERC70%Missing	3.645*** (0.6291)	2.857*** (0.5872)	-0.2664* (0.0849)	-0.2885** (0.0732)
MISS_TYPEDEMOG	3.764*** (0.5795)	7.324*** (0.3485)	-0.0401 (0.0638)	0.0451 (0.2141)
MISS_TYPEGROWTH	7.886*** (1.499)	12.38*** (0.6266)	0.0396 (0.0853)	0.2381 (0.3322)
i(var=IMP_METHOD,ref=“Observed”)L2PAN_LONG	-4.267*** (0.8048)	2.452*** (0.4414)	0.0022 (0.2487)	0.0467 (0.1721)
i(var=IMP_METHOD,ref=“Observed”)L2PAN	-4.787*** (0.9246)	2.328*** (0.3280)	-0.0994 (0.2040)	0.0329 (0.0785)
i(var=IMP_METHOD,ref=“Observed”)RQ	-4.838*** (0.9402)	2.034*** (0.2855)	-0.2937 (0.2580)	-0.3277 (0.1804)
i(var=IMP_METHOD,ref=“Observed”)RF	-4.224*** (0.8953)	2.080*** (0.2746)	-0.2165 (0.2073)	-0.1185 (0.1214)
i(var=IMP_METHOD,ref=“Observed”)L2PMM	-3.889*** (0.6558)	1.738*** (0.2426)	-0.4286 (0.2323)	-0.4464* (0.1862)
i(var=IMP_METHOD,ref=“Observed”)PMM	-4.934*** (0.9876)	2.128*** (0.3095)	-0.4117 (0.2600)	-0.3897. (0.1901)
N x MISS_PERC50%Missing		-0.0039* (0.0016)		0.0016. (0.0007)
N x MISS_PERC70%Missing		-0.0095* (0.0033)		0.0033* (0.0013)
N x MISS_TYPEDEMOG		-0.0103* (0.0035)		0.0007 (0.0011)
N x MISS_TYPEGROWTH		-0.0277** (0.0081)		5.25e-5 (0.0014)
N x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-0.0212*** (0.0038)		-0.0003 (0.0024)
N x i(IMP_METHOD,ref=“Observed”)L2PAN		-0.0152* (0.0050)		0.0009 (0.0014)
N x i(IMP_METHOD,ref=“Observed”)RQ		-0.0129* (0.0053)		0.0042 (0.0023)
N x i(IMP_METHOD,ref=“Observed”)RF		-0.0150** (0.0047)		0.0015 (0.0017)
N x i(IMP_METHOD,ref=“Observed”)L2PMM		-0.0107* (0.0038)		0.0046. (0.0021)
N x i(IMP_METHOD,ref=“Observed”)PMM		-0.0129* (0.0055)		0.0044 (0.0025)
MISS_PERC50%Missing x MISS_TYPEDEMOG		1.608*** (0.2596)		0.0029 (0.0367)
MISS_PERC70%Missing x MISS_TYPEDEMOG		3.498*** (0.5972)		-0.0876 (0.1074)
MISS_PERC50%Missing x MISS_TYPEGROWTH		3.273*** (0.6279)		-0.0061 (0.0446)
MISS_PERC70%Missing x MISS_TYPEGROWTH		7.555*** (1.338)		-0.0301 (0.1054)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-1.337** (0.3180)		0.0062 (0.0785)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-2.132* (0.6987)		0.0268 (0.1653)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-1.596** (0.3746)		-0.0518 (0.0733)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-3.133** (0.8251)		-0.1188 (0.1581)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-1.633** (0.3864)		-0.1083 (0.0914)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-3.141** (0.8506)		-0.2158 (0.2051)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RF		-1.399** (0.3562)		-0.0979 (0.0726)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RF		-2.581** (0.7695)		-0.1919 (0.1605)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-1.325*** (0.2875)		-0.1656. (0.0861)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-2.501** (0.6670)		-0.2828 (0.1611)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-1.677** (0.4093)		-0.1557 (0.0923)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-3.236** (0.9079)		-0.3005 (0.2025)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-5.343*** (0.7582)		-0.0751 (0.2407)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-7.942** (1.851)		-0.0362 (0.3300)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN		-5.929*** (0.8663)		-0.1344 (0.2245)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN		-8.249*** (1.858)		-0.2694 (0.3099)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RQ		-5.799*** (0.8946)		-0.1351 (0.2286)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RQ		-7.976** (1.809)		-0.2578 (0.3292)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RF		-5.195*** (0.9000)		-0.1042 (0.2076)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RF		-7.333** (1.677)		-0.1972 (0.3192)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PMM		-4.771*** (0.5485)		-0.1231 (0.2142)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PMM		-6.562*** (1.306)		-0.2610 (0.3138)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)PMM		-5.929*** (0.9740)		-0.1522 (0.2316)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)PMM		-8.270** (1.921)		-0.3071 (0.3324)
Fixed-Effects:	------------------	-------------------	-----------------	------------------
GRADE^CONTENT_AREA	Yes	Yes	Yes	Yes
________________________________________	__________________	___________________	_________________	__________________
S.E.: Clustered	by: GRA.^CON.	by: GRA.^CON.	by: GRA.^CON.	by: GRA.^CON.
Observations	97,146	97,146	52,542	52,542
R2	0.34190	0.40344	0.00445	0.00536
Within R2	0.28282	0.34988	0.00349	0.00440

Table 2.6: Linear fixed-effects regression models for absolute bias at the grade/content area level
	Scale Scores: Additive	Scale Scores: Interaction	SGPs: Additive	SGPs: Interaction

N	-0.0085*** (0.0014)	0.0179* (0.0065)	-0.0146*** (0.0012)	-0.0099** (0.0025)
MISS_PERC50%Missing	2.070*** (0.2204)	2.128*** (0.2390)	1.443*** (0.0610)	1.145*** (0.0262)
MISS_PERC70%Missing	4.670*** (0.5063)	4.365*** (0.5163)	3.047*** (0.1319)	2.665*** (0.0891)
MISS_TYPEDEMOG	2.683*** (0.4104)	5.761*** (0.3552)	0.7656*** (0.0493)	1.442*** (0.1116)
MISS_TYPEGROWTH	6.674*** (1.344)	10.80*** (0.6290)	1.046*** (0.0810)	2.505*** (0.1763)
i(var=IMP_METHOD,ref=“Observed”)L2PAN_LONG	-3.407*** (0.5026)	1.810*** (0.3659)	0.9601*** (0.1582)	0.6862*** (0.1235)
i(var=IMP_METHOD,ref=“Observed”)L2PAN	-4.251*** (0.7546)	1.997*** (0.2945)	0.0555 (0.0931)	0.6699*** (0.1036)
i(var=IMP_METHOD,ref=“Observed”)RQ	-3.606*** (0.5816)	2.552*** (0.4868)	1.682*** (0.2676)	2.207*** (0.2058)
i(var=IMP_METHOD,ref=“Observed”)RF	-3.786*** (0.7501)	2.127*** (0.3354)	0.3773. (0.1873)	1.085*** (0.1501)
i(var=IMP_METHOD,ref=“Observed”)L2PMM	-3.274*** (0.4794)	2.260*** (0.4296)	1.679*** (0.2785)	2.225*** (0.2329)
i(var=IMP_METHOD,ref=“Observed”)PMM	-3.569*** (0.6125)	2.624*** (0.4728)	1.679*** (0.2739)	2.195*** (0.2115)
N x MISS_PERC50%Missing		-0.0053*** (0.0012)		-0.0037*** (0.0005)
N x MISS_PERC70%Missing		-0.0123*** (0.0028)		-0.0082*** (0.0011)
N x MISS_TYPEDEMOG		-0.0042. (0.0023)		-0.0016* (0.0005)
N x MISS_TYPEGROWTH		-0.0191* (0.0071)		-0.0037** (0.0007)
N x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-0.0162*** (0.0023)		0.0021 (0.0016)
N x i(IMP_METHOD,ref=“Observed”)L2PAN		-0.0163** (0.0038)		0.0006 (0.0011)
N x i(IMP_METHOD,ref=“Observed”)RQ		-0.0138** (0.0031)		0.0012 (0.0023)
N x i(IMP_METHOD,ref=“Observed”)RF		-0.0174*** (0.0039)		-0.0004 (0.0018)
N x i(IMP_METHOD,ref=“Observed”)L2PMM		-0.0111** (0.0028)		0.0024 (0.0025)
N x i(IMP_METHOD,ref=“Observed”)PMM		-0.0144** (0.0033)		0.0012 (0.0025)
MISS_PERC50%Missing x MISS_TYPEDEMOG		1.137*** (0.2015)		0.2287*** (0.0262)
MISS_PERC70%Missing x MISS_TYPEDEMOG		2.451*** (0.4405)		0.3606** (0.0896)
MISS_PERC50%Missing x MISS_TYPEGROWTH		2.696*** (0.6013)		0.3029*** (0.0479)
MISS_PERC70%Missing x MISS_TYPEGROWTH		6.167*** (1.265)		0.5772** (0.1101)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-1.139*** (0.2159)		0.2995* (0.1037)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-1.827** (0.5001)		0.5383* (0.2028)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-1.482*** (0.3179)		-0.0259 (0.0575)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-2.735** (0.6957)		0.0064 (0.0866)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-1.223*** (0.2487)		0.6679*** (0.1114)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-2.316** (0.6035)		1.097** (0.2132)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RF		-1.245** (0.2945)		0.2673* (0.0951)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RF		-2.230** (0.6350)		0.5072* (0.1901)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-1.092*** (0.2144)		0.6307*** (0.1162)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-2.030** (0.5285)		0.9582** (0.2297)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-1.190*** (0.2681)		0.6675*** (0.1159)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-2.217** (0.6549)		1.097** (0.2181)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-4.116*** (0.4859)		0.0917 (0.1576)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-5.965*** (1.269)		-0.5213* (0.2102)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN		-4.801*** (0.6665)		-0.6722*** (0.1123)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN		-7.112** (1.604)		-1.273*** (0.1917)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RQ		-5.068*** (0.7507)		-1.298*** (0.1347)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RQ		-7.660** (1.753)		-2.276*** (0.1951)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RF		-4.506*** (0.7761)		-0.9273*** (0.1157)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RF		-6.971** (1.635)		-1.883*** (0.1616)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PMM		-4.754*** (0.6372)		-1.308*** (0.1173)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PMM		-6.948*** (1.475)		-2.383*** (0.1520)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)PMM		-5.091*** (0.7480)		-1.275*** (0.1372)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)PMM		-7.767*** (1.749)		-2.261*** (0.1950)
Fixed-Effects:	-------------------	-------------------	-------------------	-------------------
GRADE^CONTENT_AREA	Yes	Yes	Yes	Yes
________________________________________	___________________	___________________	___________________	___________________
S.E.: Clustered	by: GRA.^CON.	by: GRA.^CON.	by: GRA.^CON.	by: GRA.^CON.
Observations	97,146	97,146	52,542	52,542
R2	0.33276	0.38934	0.18317	0.19723
Within R2	0.29642	0.35607	0.17733	0.19149

We can also re-fit the scale score models using only observations from grades 5 through 8.

Table 2.7: Linear fixed-effects regression models for raw and absolute scale score bias when removing grades 3 and 4
	Scale Score Raw Bias: Additive	Scale Score Raw Bias: Interaction	Scale Score Absolute Bias: Additive	Scale Score Absolute Bias: Interaction

N	0.0019 (0.0014)	0.0067. (0.0029)	-0.0080*** (0.0013)	0.0002 (0.0018)
MISS_PERC50%Missing	0.7460*** (0.0896)	2.059*** (0.1201)	1.456*** (0.0690)	2.733*** (0.0768)
MISS_PERC70%Missing	1.854*** (0.2536)	4.262*** (0.2860)	3.254*** (0.1980)	5.647*** (0.2431)
MISS_TYPEDEMOG	2.147*** (0.1595)	8.023*** (0.2453)	1.548*** (0.0822)	6.570*** (0.2140)
MISS_TYPEGROWTH	3.656*** (0.3730)	13.10*** (0.5265)	2.917*** (0.1585)	11.67*** (0.5044)
i(var=IMP_METHOD,ref=“Observed”)L2PAN_LONG	-6.449*** (0.7746)	2.986*** (0.4177)	-4.823*** (0.3294)	2.392*** (0.4104)
i(var=IMP_METHOD,ref=“Observed”)L2PAN	-7.411*** (0.2719)	2.719*** (0.1721)	-6.383*** (0.2014)	2.460*** (0.1342)
i(var=IMP_METHOD,ref=“Observed”)RQ	-7.501*** (0.2374)	2.310*** (0.1512)	-5.227*** (0.2408)	3.721*** (0.2423)
i(var=IMP_METHOD,ref=“Observed”)RF	-6.774*** (0.2405)	2.358*** (0.1315)	-5.917*** (0.2000)	2.678*** (0.2268)
i(var=IMP_METHOD,ref=“Observed”)L2PMM	-5.747*** (0.2659)	2.008*** (0.1562)	-4.595*** (0.2846)	3.298*** (0.2499)
i(var=IMP_METHOD,ref=“Observed”)PMM	-7.725*** (0.2192)	2.483*** (0.1346)	-5.269*** (0.2473)	3.742*** (0.2406)
N x MISS_PERC50%Missing		0.0002 (0.0008)		-0.0023** (0.0005)
N x MISS_PERC70%Missing		-0.0009 (0.0020)		-0.0055** (0.0014)
N x MISS_TYPEDEMOG		-0.0019 (0.0015)		0.0009 (0.0009)
N x MISS_TYPEGROWTH		-0.0059. (0.0029)		-0.0007 (0.0017)
N x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-0.0117* (0.0043)		-0.0101*** (0.0015)
N x i(IMP_METHOD,ref=“Observed”)L2PAN		-0.0020 (0.0015)		-0.0059** (0.0012)
N x i(IMP_METHOD,ref=“Observed”)RQ		0.0012 (0.0013)		-0.0059* (0.0019)
N x i(IMP_METHOD,ref=“Observed”)RF		-0.0019 (0.0017)		-0.0070** (0.0015)
N x i(IMP_METHOD,ref=“Observed”)L2PMM		-0.0009 (0.0015)		-0.0044* (0.0018)
N x i(IMP_METHOD,ref=“Observed”)PMM		0.0018 (0.0013)		-0.0061* (0.0022)
MISS_PERC50%Missing x MISS_TYPEDEMOG		0.9041*** (0.1102)		0.5824*** (0.0411)
MISS_PERC70%Missing x MISS_TYPEDEMOG		1.845*** (0.1886)		1.221*** (0.1025)
MISS_PERC50%Missing x MISS_TYPEGROWTH		1.494*** (0.1892)		1.015*** (0.0560)
MISS_PERC70%Missing x MISS_TYPEGROWTH		3.759*** (0.4948)		2.593*** (0.2525)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-2.172*** (0.3319)		-1.732*** (0.1532)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-3.613* (1.072)		-2.998** (0.6656)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-2.652*** (0.1604)		-2.365*** (0.0976)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-5.464*** (0.3735)		-4.677*** (0.2763)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-2.716*** (0.1338)		-1.898*** (0.1300)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-5.535*** (0.2995)		-3.981*** (0.2943)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RF		-2.404*** (0.1338)		-2.069*** (0.1049)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RF		-4.757*** (0.3312)		-4.017*** (0.2755)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-2.117*** (0.1746)		-1.659*** (0.1598)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-4.368*** (0.3852)		-3.453*** (0.3944)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-2.821*** (0.1336)		-1.911*** (0.1427)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-5.795*** (0.3250)		-4.024*** (0.3083)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-7.293*** (0.8623)		-5.412*** (0.4781)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-12.95*** (1.745)		-9.545*** (0.8786)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN		-8.377*** (0.2375)		-6.678*** (0.1862)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN		-13.52*** (0.5859)		-11.66*** (0.4340)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RQ		-8.303*** (0.2112)		-7.187*** (0.1673)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RQ		-13.11*** (0.5569)		-12.64*** (0.4230)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RF		-7.748*** (0.2094)		-6.712*** (0.2066)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RF		-12.11*** (0.4668)		-11.63*** (0.4671)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PMM		-6.322*** (0.2525)		-6.566*** (0.2226)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PMM		-10.28*** (0.5286)		-11.16*** (0.4990)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)PMM		-8.652*** (0.1873)		-7.193*** (0.1601)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)PMM		-13.71*** (0.5055)		-12.72*** (0.3912)
Fixed-Effects:	——————	——————	——————-	——————-
GRADE^CONTENT_AREA	Yes	Yes	Yes	Yes
________________________________________	__________________	__________________	___________________	___________________
S.E.: Clustered	by: GRADE^CONTENT_AREA	by: GRADE^CONTENT_AREA	by: GRADE^CONTENT_AREA	by: GRADE^CONTENT_AREA
Observations	52,542	52,542	52,542	52,542
R2	0.24881	0.37143	0.28053	0.41316
Within R2	0.24476	0.36804	0.27626	0.40967

2.4.2 School Level

In these models, the data are aggregated at the school level.

Table 2.8: Linear fixed-effect regression models for raw bias at the school level
	Scale Scores: Additive	Scale Scores: Interaction	SGPs: Additive	SGPs: Interaction

(Intercept)	3.313*** (0.1268)	-2.369*** (0.2377)	-0.1926 (0.1315)	0.1976 (0.2784)
N	-0.0023*** (0.0001)	0.0028*** (0.0004)	0.0009*** (0.0001)	-0.0007. (0.0004)
MISS_PERC50%Missing	1.563*** (0.0875)	1.642*** (0.2521)	-0.1834* (0.0908)	-0.2607 (0.2953)
MISS_PERC70%Missing	3.657*** (0.0875)	3.611*** (0.2521)	-0.3536*** (0.0908)	-0.3676 (0.2953)
MISS_TYPEDEMOG	3.802*** (0.0875)	8.155*** (0.2521)	-0.0912 (0.0908)	-0.1054 (0.2953)
MISS_TYPEGROWTH	7.908*** (0.0875)	13.73*** (0.2521)	0.0677 (0.0908)	0.3147 (0.2953)
i(var=IMP_METHOD,ref=“Observed”)L2PAN_LONG	-4.773*** (0.1337)	2.662*** (0.3038)	-0.0923 (0.1387)	-0.2010 (0.3558)
i(var=IMP_METHOD,ref=“Observed”)L2PAN	-5.254*** (0.1337)	2.439*** (0.3038)	-0.1684 (0.1387)	-0.2914 (0.3558)
i(var=IMP_METHOD,ref=“Observed”)RQ	-5.332*** (0.1337)	2.030*** (0.3038)	-0.5141*** (0.1387)	-0.9991** (0.3558)
i(var=IMP_METHOD,ref=“Observed”)RF	-4.701*** (0.1337)	2.155*** (0.3038)	-0.3176* (0.1387)	-0.5147 (0.3558)
i(var=IMP_METHOD,ref=“Observed”)L2PMM	-4.367*** (0.1337)	1.723*** (0.3038)	-0.5913*** (0.1387)	-0.9403** (0.3558)
i(var=IMP_METHOD,ref=“Observed”)PMM	-5.423*** (0.1337)	2.128*** (0.3038)	-0.6258*** (0.1387)	-1.059** (0.3558)
N x MISS_PERC50%Missing		-0.0008** (0.0003)		0.0005 (0.0003)
N x MISS_PERC70%Missing		-0.0020*** (0.0003)		0.0009** (0.0003)
N x MISS_TYPEDEMOG		-0.0023*** (0.0003)		5.18e-5 (0.0003)
N x MISS_TYPEGROWTH		-0.0059*** (0.0003)		-0.0003 (0.0003)
N x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-0.0030*** (0.0004)		0.0005 (0.0005)
N x i(IMP_METHOD,ref=“Observed”)L2PAN		-0.0019*** (0.0004)		0.0009. (0.0005)
N x i(IMP_METHOD,ref=“Observed”)RQ		-0.0012** (0.0004)		0.0022*** (0.0005)
N x i(IMP_METHOD,ref=“Observed”)RF		-0.0018*** (0.0004)		0.0012* (0.0005)
N x i(IMP_METHOD,ref=“Observed”)L2PMM		-0.0009* (0.0004)		0.0019*** (0.0005)
N x i(IMP_METHOD,ref=“Observed”)PMM		-0.0013** (0.0004)		0.0022*** (0.0005)
MISS_PERC50%Missing x MISS_TYPEDEMOG		1.605*** (0.1897)		0.0484 (0.2222)
MISS_PERC70%Missing x MISS_TYPEDEMOG		3.477*** (0.1897)		-0.1010 (0.2222)
MISS_PERC50%Missing x MISS_TYPEGROWTH		3.267*** (0.1897)		0.1144 (0.2222)
MISS_PERC70%Missing x MISS_TYPEGROWTH		7.545*** (0.1897)		0.0560 (0.2222)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-1.516*** (0.2898)		-0.0591 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-2.769*** (0.2898)		-0.1091 (0.3395)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-1.763*** (0.2898)		-0.1108 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-3.737*** (0.2898)		-0.2709 (0.3395)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-1.808*** (0.2898)		-0.2136 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-3.750*** (0.2898)		-0.4343 (0.3395)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RF		-1.563*** (0.2898)		-0.1560 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RF		-3.177*** (0.2898)		-0.3275 (0.3395)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-1.495*** (0.2898)		-0.2673 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-3.114*** (0.2898)		-0.4810 (0.3395)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-1.844*** (0.2898)		-0.2588 (0.3395)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-3.831*** (0.2898)		-0.5061 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-5.981*** (0.2898)		0.0619 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-8.810*** (0.2898)		-0.0522 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN		-6.540*** (0.2898)		0.0012 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN		-9.037*** (0.2898)		-0.2240 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RQ		-6.413*** (0.2898)		-0.0047 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RQ		-8.774*** (0.2898)		-0.2285 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RF		-5.804*** (0.2898)		0.0190 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RF		-8.138*** (0.2898)		-0.2383 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PMM		-5.380*** (0.2898)		0.0168 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PMM		-7.366*** (0.2898)		-0.2680 (0.3395)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)PMM		-6.542*** (0.2898)		-0.0015 (0.3395)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)PMM		-9.061*** (0.2898)		-0.2591 (0.3395)
________________________________________	___________________	___________________	___________________	__________________
S.E. type	Standard	Standard	Standard	Standard
Observations	14,616	14,616	14,616	14,616
R2	0.46577	0.58271	0.00730	0.01095
Adj. R2	0.46537	0.58131	0.00655	0.00763
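
To make the interaction model in Table 2.8 easier to read, the following Python sketch sums the coefficients that apply to one condition. It is an illustration under stated assumptions, not part of the original analysis: it assumes that the omitted reference levels are 30% missing, MCAR, and “Observed”, and that \(N\) enters the model uncentered, neither of which is stated in the table. The function name and the school size of 200 are hypothetical, and the coefficients are rounded as reported.

```python
# Coefficients copied from the "Scale Scores: Interaction" column of Table 2.8
# (only the terms needed for 70% missingness, MAR on status and growth, L2PAN).
COEF = {
    "(Intercept)": -2.369,
    "N": 0.0028,
    "MISS_PERC70": 3.611,
    "MISS_TYPEGROWTH": 13.73,
    "L2PAN": 2.439,
    "N:MISS_PERC70": -0.0020,
    "N:MISS_TYPEGROWTH": -0.0059,
    "N:L2PAN": -0.0019,
    "MISS_PERC70:MISS_TYPEGROWTH": 7.545,
    "MISS_PERC70:L2PAN": -3.737,
    "MISS_TYPEGROWTH:L2PAN": -9.037,
}


def predicted_raw_bias(n, use_l2pan):
    """Predicted scale score raw bias for 70% missing, MAR (status and growth).

    Assumes reference levels of 30% missing, MCAR, and "Observed",
    and that N is the uncentered school size (both are assumptions).
    """
    value = (
        COEF["(Intercept)"]
        + COEF["N"] * n
        + COEF["MISS_PERC70"]
        + COEF["MISS_TYPEGROWTH"]
        + COEF["N:MISS_PERC70"] * n
        + COEF["N:MISS_TYPEGROWTH"] * n
        + COEF["MISS_PERC70:MISS_TYPEGROWTH"]
    )
    if use_l2pan:
        # Main effect of L2PAN plus every interaction that involves it.
        value += (
            COEF["L2PAN"]
            + COEF["N:L2PAN"] * n
            + COEF["MISS_PERC70:L2PAN"]
            + COEF["MISS_TYPEGROWTH:L2PAN"]
        )
    return value


observed = predicted_raw_bias(200, use_l2pan=False)
l2pan = predicted_raw_bias(200, use_l2pan=True)
print(round(observed, 3), round(l2pan, 3), round(l2pan - observed, 3))
```

Under these assumptions, the predicted raw bias for a school of 200 students is about 21.5 with no imputation and about 10.8 with L2PAN, a difference of about -10.7. The positive L2PAN main effect (2.439) applies only at the reference condition with \(N = 0\), so the interaction terms must be included before comparing methods.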

Table 2.9: Linear fixed-effect regression models for absolute bias at the school level
	Scale Scores: Additive	Scale Scores: Interaction	SGPs: Additive	SGPs: Interaction

(Intercept)	3.950*** (0.1186)	-1.758*** (0.2225)	1.170*** (0.0872)	0.5429** (0.1833)
N	-0.0024*** (0.0001)	0.0018*** (0.0003)	-0.0025*** (8.66e-5)	-0.0007* (0.0003)
MISS_PERC50%Missing	1.781*** (0.0819)	2.096*** (0.2359)	1.181*** (0.0602)	0.8555*** (0.1944)
MISS_PERC70%Missing	4.109*** (0.0819)	4.440*** (0.2359)	2.463*** (0.0602)	2.056*** (0.1944)
MISS_TYPEDEMOG	2.914*** (0.0819)	7.558*** (0.2359)	0.6693*** (0.0602)	1.334*** (0.1944)
MISS_TYPEGROWTH	6.925*** (0.0819)	12.99*** (0.2359)	0.7760*** (0.0602)	1.830*** (0.1944)
i(var=IMP_METHOD,ref=“Observed”)L2PAN_LONG	-4.405*** (0.1251)	2.158*** (0.2843)	1.175*** (0.0920)	0.7827*** (0.2342)
i(var=IMP_METHOD,ref=“Observed”)L2PAN	-5.100*** (0.1251)	2.513*** (0.2843)	0.4012*** (0.0920)	0.9131*** (0.2342)
i(var=IMP_METHOD,ref=“Observed”)RQ	-4.729*** (0.1251)	2.843*** (0.2843)	1.790*** (0.0920)	2.246*** (0.2342)
i(var=IMP_METHOD,ref=“Observed”)RF	-4.508*** (0.1251)	2.571*** (0.2843)	0.7818*** (0.0920)	1.276*** (0.2342)
i(var=IMP_METHOD,ref=“Observed”)L2PMM	-4.035*** (0.1251)	2.430*** (0.2843)	1.801*** (0.0920)	2.222*** (0.2342)
i(var=IMP_METHOD,ref=“Observed”)PMM	-4.756*** (0.1251)	2.942*** (0.2843)	1.799*** (0.0920)	2.233*** (0.2342)
N x MISS_PERC50%Missing		-0.0009*** (0.0003)		-0.0008*** (0.0002)
N x MISS_PERC70%Missing		-0.0021*** (0.0003)		-0.0018*** (0.0002)
N x MISS_TYPEDEMOG		-0.0013*** (0.0003)		-0.0007** (0.0002)
N x MISS_TYPEGROWTH		-0.0043*** (0.0003)		-0.0007*** (0.0002)
N x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-0.0017*** (0.0004)		-8.56e-5 (0.0003)
N x i(IMP_METHOD,ref=“Observed”)L2PAN		-0.0019*** (0.0004)		-0.0007* (0.0003)
N x i(IMP_METHOD,ref=“Observed”)RQ		-0.0012** (0.0004)		-0.0007* (0.0003)
N x i(IMP_METHOD,ref=“Observed”)RF		-0.0020*** (0.0004)		-0.0008* (0.0003)
N x i(IMP_METHOD,ref=“Observed”)L2PMM		-0.0009* (0.0004)		-0.0005 (0.0003)
N x i(IMP_METHOD,ref=“Observed”)PMM		-0.0013*** (0.0004)		-0.0007* (0.0003)
MISS_PERC50%Missing x MISS_TYPEDEMOG		1.178*** (0.1776)		0.1787 (0.1463)
MISS_PERC70%Missing x MISS_TYPEDEMOG		2.622*** (0.1776)		0.2929* (0.1463)
MISS_PERC50%Missing x MISS_TYPEGROWTH		2.790*** (0.1776)		0.2173 (0.1463)
MISS_PERC70%Missing x MISS_TYPEGROWTH		6.525*** (0.1776)		0.4116** (0.1463)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-1.445*** (0.2713)		0.4563* (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-2.661*** (0.2713)		0.7504*** (0.2235)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-1.739*** (0.2713)		0.2199 (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PAN		-3.513*** (0.2713)		0.3497 (0.2235)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-1.604*** (0.2713)		0.7751*** (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RQ		-3.250*** (0.2713)		1.235*** (0.2235)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)RF		-1.495*** (0.2713)		0.5149* (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)RF		-2.935*** (0.2713)		0.8630*** (0.2235)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-1.377*** (0.2713)		0.7340** (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)L2PMM		-2.777*** (0.2713)		1.149*** (0.2235)
MISS_PERC50%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-1.592*** (0.2713)		0.7823*** (0.2235)
MISS_PERC70%Missing x i(IMP_METHOD,ref=“Observed”)PMM		-3.192*** (0.2713)		1.243*** (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-5.621*** (0.2713)		0.2390 (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN_LONG		-8.104*** (0.2713)		-0.1777 (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PAN		-6.477*** (0.2713)		-0.5544* (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PAN		-9.054*** (0.2713)		-0.8419*** (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RQ		-6.948*** (0.2713)		-1.022*** (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RQ		-9.620*** (0.2713)		-1.586*** (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)RF		-6.072*** (0.2713)		-0.7431*** (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)RF		-8.538*** (0.2713)		-1.251*** (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)L2PMM		-6.065*** (0.2713)		-1.022*** (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)L2PMM		-8.193*** (0.2713)		-1.637*** (0.2235)
MISS_TYPEDEMOG x i(IMP_METHOD,ref=“Observed”)PMM		-7.022*** (0.2713)		-0.9921*** (0.2235)
MISS_TYPEGROWTH x i(IMP_METHOD,ref=“Observed”)PMM		-9.883*** (0.2713)		-1.553*** (0.2235)
________________________________________	___________________	___________________	____________________	___________________
S.E. type	Standard	Standard	Standard	Standard
Observations	14,616	14,616	14,616	14,616
R2	0.46193	0.57924	0.19303	0.20854
Adj. R2	0.46153	0.57782	0.19242	0.20588

2.5 Key Take-Aways

Before comparing the MI methods, a handful of general trends merit comment. First, percent bias tends to increase as the missingness percentage increases. This increase in percent bias (and the corresponding decrease in the simplified CI coverage rate) also appears to be larger for data MAR based on status and growth than for the other missingness types. Second, each of the three dependent variables is related to unit size. For example, percent bias and CI coverage rates vary more for smaller \(N\) values (at either the grade/content area or school level). Similarly, observations with smaller \(N\) values more often have a simplified \(F_1\) statistic indicating a significant difference between the imputed and true values.

The summary tables in Section 2.1 also highlight numerous conditions in which the CI coverage rates are relatively low. Specifically, the school-level summary shows many coverage rates below 0.50 when data are MAR, particularly MAR based on status and growth. In the grade/content area summaries, these low scale score coverage rates appear to be driven largely by grades 3 and 4 when data are MAR based on status and growth. These observations also tend to have higher scale score percent bias than those in the higher grades. In contrast, for SGP coverage rates, many imputation methods (e.g., RF, L2PMM) showed a negative relationship between coverage rate and grade level, particularly at higher missingness percentages.
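
As a reminder of what a coverage rate below 0.50 means in practice, the following Python sketch computes the share of units whose interval contains the true value. It assumes the conventional definition of coverage; the “simplified” coverage rate used in this vignette may be computed differently. The function name, intervals, and true means are hypothetical.

```python
def coverage_rate(intervals, true_values):
    """Share of units whose (lower, upper) interval contains the true value."""
    if len(intervals) != len(true_values):
        raise ValueError("intervals and true_values must have the same length")
    hits = sum(
        lower <= truth <= upper
        for (lower, upper), truth in zip(intervals, true_values)
    )
    return hits / len(true_values)


# Hypothetical 95% intervals for four schools' imputed mean scale scores.
intervals = [(498.0, 506.0), (510.0, 519.0), (487.0, 495.0), (501.0, 509.0)]
true_means = [503.0, 521.0, 490.0, 500.0]

rate = coverage_rate(intervals, true_means)
print(rate)  # 0.5: two of the four intervals contain the true mean
```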

Next, we compare results among the examined MI methods. Compared with the “without impact” simulation, there are fewer clear differences in MI efficacy among the imputation methods (holding other factors constant). Still, cross-sectional L2PAN tends to outperform the other methods. For example, compared to other MI methods, L2PAN often shows:

At the grade/content area level: larger scale score coverage rates, larger SGP coverage rates (particularly for data MAR and higher missingness percentages), and fewer significant \(F_1\) statistics for SGPs; and
At the school level: higher scale score and SGP coverage rates, fewer significant \(F_1\) statistics for scale scores when data are MCAR, and fewer significant SGP \(F_1\) statistics.

Still, there are many conditions where there is not a clear “winner” among the MI methods. For example, in certain cases, L2PAN and L2PMM have similar proportions of significant scale score \(F_1\) statistics at the grade/content area level. Moreover, random forest (RF) and L2PMM often appear to be viable MI options, sometimes showing similar results to L2PAN. However, RF and L2PMM showed some conditions with higher proportions of significant \(F_1\) statistics or lower coverage rates compared to L2PAN.

In the next two sections, we further examine the MI simulation results with either (a) cross-sectional L2PAN or (b) L2PMM. We include L2PMM here because in many cases, this method seemed to perform similarly to L2PAN.

3 Evaluating Cross-Sectional L2PAN

3.1 Scale Scores

3.1.1 Descriptive Statistics: Grade/Content Area

Figure 3.1: Average SS percent bias by grade/content area quantile, grade, and missingness characteristics

Figure 3.2: Average SS coverage rate by grade/content area quantile, grade, and missingness characteristics

Figure 3.3: Proportion of cases where a significant difference between the imputed and true SS value was found using the F1 statistic

3.1.2 Descriptive Statistics: School Level

Figure 3.4: Average SS percent bias by school size quantile and missingness characteristics

Figure 3.5: Average SS coverage rate by school size quantile and missingness characteristics

Figure 3.6: Proportion of cases where a significant difference between the imputed and true SS value was found using the F1 statistic

3.2 Student Growth Percentiles

3.2.1 Descriptive Statistics: Grade/Content Area

Figure 3.7: Average SGP percent bias by grade/content area quantile, grade, and missingness characteristics

Figure 3.8: Average SGP coverage rate by grade/content area quantile, grade, and missingness characteristics

Figure 3.9: Proportion of cases where a significant difference between the imputed and true SGP value was found using the F1 statistic

3.2.2 Descriptive Statistics: School Level

Figure 3.10: Average SGP percent bias by school size quantile and missingness characteristics

Figure 3.11: Average SGP coverage rate by school size quantile and missingness characteristics

Figure 3.12: Proportion of cases where a significant difference between the imputed and true SGP value was found using the F1 statistic

3.3 Key Take-Aways

Beginning with the imputed scale scores, we find (a) higher percent bias, (b) lower coverage rates, and (c) higher proportions of significant \(F_1\) statistics for lower grades, particularly when data are MAR based on status and growth. We similarly see worse performance when aggregating at the school level for conditions of high missingness with data MAR based on status and growth; these school-level results are likely driven by the scale score imputation for grades 3 and 4. Still, note that the percent bias did not exceed the “problematic” threshold of 5%.
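
The 5% threshold mentioned above can be checked directly once a percent bias value is available. The following Python sketch assumes the conventional definition (the absolute difference between the imputed and true means, expressed as a percentage of the true mean); the exact formula used in this vignette is given with the performance indices in the overview and may differ. The function name and the example means are hypothetical.

```python
def percent_bias(imputed_mean, true_mean):
    """Absolute difference between imputed and true means, as a % of the true mean."""
    return 100.0 * abs(imputed_mean - true_mean) / abs(true_mean)


PROBLEMATIC_THRESHOLD = 5.0  # percent, as referenced in the text

# Hypothetical (imputed mean, true mean) pairs for two grades.
cases = {"grade 3": (478.0, 500.0), "grade 7": (548.0, 550.0)}

results = {}
for label, (imputed, truth) in cases.items():
    pb = percent_bias(imputed, truth)
    results[label] = (round(pb, 2), pb > PROBLEMATIC_THRESHOLD)
    print(label, results[label])
```

In this made-up example, the lower grade shows a larger percent bias (4.4%) than the higher grade (0.36%), yet both remain under the 5% threshold.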

Furthermore, there is evidence of a negative relationship between the \(N\) quantile and scale score coverage rates, as well as a positive relationship between the \(N\) quantile and the proportion of significant \(F_1\) statistics, for grades 3 and 4 when data are MAR. In other words, when data are MAR, observations in grades 3 and 4 are more likely to have lower scale score coverage rates and more significant \(F_1\) statistics when the grade/content area size is in a higher quantile. These latter trends differ from observations in grades 5 and higher, where there is largely no clear relationship between the \(N\) quantile and the given dependent variable.

We next summarize the results for the SGPs imputed with cross-sectional L2PAN. Here, we find evidence of higher percent bias for lower grade/content area size quantiles, particularly among higher grades. For example, the average SGP percent bias reaches around 22% for grade 8 observations in the first \(N\) quantile with 70% of data missing at random based on status and growth. For SGP coverage rates, we do not find clear trends as a function of grade/content area quantile or grade level. However, SGP coverage rates are often lower when data are MAR, and the \(F_1\) statistic is more often significant when data are MAR based on status and growth.

Aggregating at the school level, SGP percent bias tends to increase as the missingness percentage increases and when data are MAR. Moreover, missingness type seems to have a stronger relationship with SGP coverage rates than missingness percentage does, with slightly lower SGP coverage rates among data MAR based on status and growth. Still, the SGP coverage rates do not fall below 0.80; recall that scale score coverage rates were as low as 0.10, likely because of the imputation difficulties with grades 3 and 4. Finally, at the school level, the only noticeable relationship with school size quantile is for SGP percent bias, which decreases as the \(N\) quantile increases.
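
The school size quantile summaries described above can be approximated by binning schools on \(N\) and averaging a metric within each bin. The following Python sketch assumes four equal-count groups (quartiles); the number of quantile groups and the binning rule used for the figures are not stated in this section, and the school sizes and percent bias values are hypothetical.

```python
from bisect import bisect_left
from statistics import mean, quantiles


def size_group(n, cuts):
    """Return the 1-based size group for n, given ascending cut points."""
    return bisect_left(cuts, n) + 1


# Hypothetical (school size, SGP percent bias) pairs.
schools = [
    (40, 9.0), (60, 7.0), (90, 6.0), (120, 5.0),
    (200, 4.0), (260, 3.0), (400, 2.5), (520, 1.5),
]

sizes = [n for n, _ in schools]
cuts = quantiles(sizes, n=4)  # three cut points separating four groups

by_group = {}
for n, pb in schools:
    by_group.setdefault(size_group(n, cuts), []).append(pb)

summary = {group: mean(values) for group, values in sorted(by_group.items())}
print(summary)
```

With these made-up values, the average percent bias falls from the smallest to the largest size group, which is the pattern described for the school-level SGP results.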

4 Evaluating L2PMM

4.1 Scale Scores

4.1.1 Descriptive Statistics: Grade/Content Area

Figure 4.1: Average SS percent bias by grade/content area quantile, grade, and missingness characteristics

Figure 4.2: Average SS coverage rate by grade/content area quantile, grade, and missingness characteristics

Figure 4.3: Proportion of cases where a significant difference between the imputed and true SS value was found using the F1 statistic

4.1.2 Descriptive Statistics: School Level

Figure 4.4: Average SS percent bias by school size quantile and missingness characteristics

Figure 4.5: Average SS coverage rate by school size quantile and missingness characteristics

Figure 4.6: Proportion of cases where a significant difference between the imputed and true SS value was found using the F1 statistic

4.2 Student Growth Percentiles

4.2.1 Descriptive Statistics: Grade/Content Area

Figure 4.7: Average SGP percent bias by grade/content area quantile, grade, and missingness characteristics

Figure 4.8: Average SGP coverage rate by grade/content area quantile, grade, and missingness characteristics

Figure 4.9: Proportion of cases where a significant difference between the imputed and true SGP value was found using the F1 statistic

4.2.2 Descriptive Statistics: School Level

Figure 4.10: Average SGP percent bias by school size quantile and missingness characteristics

Figure 4.11: Average SGP coverage rate by school size quantile and missingness characteristics

Figure 4.12: Proportion of cases where a significant difference between the imputed and true SGP value was found using the F1 statistic

4.3 Key Take-Aways

Many of the trends from the L2PAN results were replicated with L2PMM. Starting with the scale scores, there is evidence of higher percent bias for grades 3 and 4 when data are MAR based on status and growth, as well as a general increase in scale score percent bias as the missingness percentage increases. Moreover, we again see substantially lower scale score coverage rates for the lower grades under MAR based on status and growth; the coverage rates decrease for these observations as the grade/content area size quantile increases. For the scale score \(F_1\) statistics, a higher proportion are statistically significant for grades 3 and 4, particularly among higher grade/content area size quantiles.

Turning to the SGPs, we again see higher percent bias for higher grades. The SGP coverage rates decrease slightly as the missingness percentage increases, but this relationship is small to negligible. Furthermore, when data are MAR, grade 8 observations often have the highest proportion of significant \(F_1\) statistics for SGPs, particularly for smaller grade/content area size quantiles. At the school level, we find evidence of a negative relationship between SGP percent bias and school size quantile, as well as evidence of worse performance with L2PMM at higher missingness percentages (holding other factors constant). The school-level SGP coverage rates are often lowest in the fourth school size quantile.

5 Summary

The current study focused on the efficacy of multiple imputation for creating “adjusted” scale scores and SGPs among data with a simulated COVID-19 impact. To briefly summarize the above results, we find evidence that MI may be a plausible mechanism for dealing with missing data when:

Cross-sectional L2PAN is used with the mice R package;
Less than 50% of the data are missing (although note that some researchers posit upper thresholds of 40% for using MI; Jakobsen et al., 2017);
Data are not missing at random based only on status and growth; and
Grade/content area or school sizes are relatively large.

A clear trend throughout these results was that MI struggled to generate accurate scale scores and SGPs when imputing MAR data for grades 3 and 4, particularly when data were missing based on status and growth. In other instances, such as when imputing SGPs, there was higher percent bias for higher grades. These variations indicate that researchers and policymakers should examine MI performance at the grade level when evaluating the method’s accuracy.

It is important to highlight certain limitations of the present simulation study. Specifically, we cannot appropriately generalize our findings beyond the conditions examined in the simulation design. For example, it remains to be seen how L2PAN, L2PMM, and other MI methods perform when data are missing at random based on other characteristics, or are missing not at random (MNAR). If data are MNAR, Jakobsen and colleagues (2017) recommend that analyses be conducted using only the observed cases with an accompanying discussion of the missingness magnitudes (see Figure 1 in Jakobsen et al., 2017).

Although these results shed light on certain conditions wherein MI performs relatively well (in terms of percent bias, simplified CI coverage rate, and the simplified \(F_1\) statistic), it is difficult to clearly pinpoint generalizable thresholds for determining whether MI can be applied to a given data set. Rather, we recommend that descriptive analyses accompany any report on academic status and growth comparisons. These descriptives can include missingness patterns within larger participation analyses, as well as diagnostic checks after imputation to ensure that MI worked relatively well with the data.
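
One simple descriptive of the kind recommended above is the missingness rate within each school. The following Python sketch tabulates that rate and flags schools at or above 50% missing, the level referenced in the summary list. The record layout, the function name, and the data are hypothetical, and this is only an illustration rather than the procedure used in the simulation.

```python
from collections import defaultdict


def missingness_by_school(records):
    """Return {school_id: proportion of records with a missing score}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for school_id, score in records:
        totals[school_id] += 1
        if score is None:  # None marks a missing scale score
            missing[school_id] += 1
    return {school: missing[school] / totals[school] for school in totals}


# Hypothetical (school id, scale score) records.
records = [
    ("A", 512.0), ("A", None), ("A", 498.0), ("A", None),
    ("B", 530.0), ("B", 541.0), ("B", None), ("B", 525.0),
]

rates = missingness_by_school(records)
flagged = [school for school, rate in rates.items() if rate >= 0.5]
print(rates, flagged)
```

Here school A has half of its scores missing and is flagged, while school B (one of four missing) is not. Rates like these could be reported alongside any status and growth comparisons.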

6 References

Bergé, L. (2018). Efficient estimation of maximum likelihood models with multiple fixed-effects: the R package FENmlm. CREA Discussion Papers.
Betebenner, D. W., Van Iwaarden, A. R., & Domingue, B. (2021). SGPdata: Exemplar data sets for student growth percentile (SGP) analyses. R package version 25.1-0.0. https://centerforassessment.github.io/SGPdata/
Demirtas, H. (2004). Simulation driven inferences for multiply imputed longitudinal datasets. Statistica Neerlandica, 58(4), 466-482. https://doi.org/10.1111/j.1467-9574.2004.00271.x
Jakobsen, J. C., Gluud, C., Wetterslev, J., & Winkel, P. (2017). When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts. BMC Medical Research Methodology, 17, Article 162. https://doi.org/10.1186/s12874-017-0442-1
Miri, H. H., Hassanzadeh, J., Khaniki, S. H., Akrami, R., & Sirjani, E. (2020). Accuracy of five multiple imputation methods in estimating prevalence of Type 2 diabetes based on STEPS surveys. Journal of Epidemiology and Global Health, 10(1), 36-41. https://doi.org/10.2991/jegh.k.191207.001
Qi, L., Wang, Y.-F., & He, Y. (2010). A comparison of multiple imputation and fully augmented weighted estimators for Cox regression with missing covariates. Statistics in Medicine, 29(25), 2592-2604. https://doi.org/10.1002/sim.4016
van Buuren, S. (2018). Flexible imputation of missing data. CRC Press. https://stefvanbuuren.name/fimd/
van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1-67. https://www.jstatsoft.org/v45/i03/
Vink, G., & van Buuren, S. (2014). Pooling multiple imputations when the sample happens to be the population. arXiv Pre-Print 1409.8542.
Zhao, J. H., & Schafer, J. L. (2018). pan: Multiple imputation for multivariate panel or clustered data. R package version 1.6.

Comparing Multiple Imputation Methods when Simulating an Academic Impact of COVID-19

Allie Cooperman, Adam Van Iwaarden, and Damian Betebenner

June 16, 2021
