class: center, middle, my-title, title-slide

.title[
# Academic Impact Part 1
]

.subtitle[
## Data Visualization and Analysis in the Era of COVID-19
]

.author[
### Damian Betebenner, Adam VanIwaarden & Nathan Dadey
]

.institute[
### Center for Assessment
]

.date[
### February 18th, 2022 (updated: April 20th, 2022)
]

---
class: inverse, center, middle
# Data

---

## There are several data considerations that need to be addressed.

--

- We are going to utilize simulated annual state assessment data to investigate academic impact both in terms of status and growth.

--

- The data we will utilize contain 7 years of annual assessment results spanning 2016 to 2023, with 2020 missing due to the pandemic.
    - This situation is very similar to that found in most state assessment programs.

--

- In order to compare pre-pandemic performance (status or growth) to current performance (post-pandemic?), the assessments must be on the same scale.
    - The assessment data we will use have a common vertical scale across all of the years.

--

- It goes without saying that running analyses to assess pandemic-related academic impact assumes that the data are valid in the sense that they accurately describe student performance with regard to the standards being assessed.

---

# Data

- The data set we will utilize is called __sgpData_LONG_COVID__ and is part of the __SGPdata__ package you've installed.
- Load the package into your R environment:

```r
require("SGPdata")
```

```
## Loading required package: SGPdata
```

```r
dim(sgpData_LONG_COVID)
```

```
## [1] 583913     17
```

```r
names(sgpData_LONG_COVID)
```

```
##  [1] "VALID_CASE"                "CONTENT_AREA"              "YEAR"                             "ID"
##  [5] "GRADE"                     "SCALE_SCORE"               "SCALE_SCORE_without_COVID_IMPACT" "ACHIEVEMENT_LEVEL"
##  [9] "ETHNICITY"                 "FREE_REDUCED_LUNCH_STATUS" "ELL_STATUS"                       "IEP_STATUS"
## [13] "GENDER"                    "SCHOOL_NUMBER"             "SCHOOL_NAME"                      "DISTRICT_NUMBER"
## [17] "DISTRICT_NAME"
```

---

# Data

- The __sgpData_LONG_COVID__ data set is a LONG formatted data set with each case representing a unique student by content area by year assessment record.
- Missing data are a big issue due to lower participation rates in spring assessments during the pandemic. We will discuss missing data in great detail in separate sections of this training session.

---

# Status Based Academic Impact

- If you followed state reporting of spring 2021 assessment results, you know that the most prominent comparison employed was a status-based comparison to same grade/content area pre-pandemic cohorts of students.
    - For example, the percent proficient for Grade 5 students in 2021 was compared to the percent proficient for Grade 5 students in 2019.
- Pre-/post-pandemic comparisons of the same grade/content area imply a counterfactual:
    + The 2019 results represent where students in that grade and content area would have scored had the pandemic not occurred.
    + This presumes that the populations of students in 2019 and 2021 are roughly equivalent.
    + This presumes that there would have been no improvement in the system between 2019 and 2021 (which would lead to a higher outcome for the 2021 students).
- The comparison isn't "wrong", but it isn't the most nuanced comparison that can be used.
- Status comparisons have the advantage of being usable for all grades and content areas.
    - Growth comparisons (which we discuss later) are only available for grades/content areas where students have the requisite prior scores.

---

# Status Based Academic Impact

- We will begin by looking at some basic descriptive summaries to familiarize ourselves with the __sgpData_LONG_COVID__ data set.

```r
tmp_2019_m5 <- subset(sgpData_LONG_COVID, YEAR=="2019" & CONTENT_AREA=="MATHEMATICS" & GRADE=="5")
tmp_2021_m5 <- subset(sgpData_LONG_COVID, YEAR=="2021" & CONTENT_AREA=="MATHEMATICS" & GRADE=="5")
mean(tmp_2019_m5$SCALE_SCORE, na.rm=TRUE) - mean(tmp_2021_m5$SCALE_SCORE, na.rm=TRUE)
```
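
---

# Status Based Academic Impact

- Two natural extensions of the mean comparison above are (1) expressing the difference as a standardized effect size and (2) comparing achievement-level distributions, as in most state reporting of percent proficient.
- The sketch below is one way to do this in base R; it reuses the `tmp_2019_m5` and `tmp_2021_m5` objects from the previous slide. The achievement-level labels in your data may differ, so inspect the tables rather than assuming which categories count as proficient.

```r
## Standardized mean difference (2019 minus 2021), scaled by the 2019 standard deviation
(mean(tmp_2019_m5$SCALE_SCORE, na.rm=TRUE) - mean(tmp_2021_m5$SCALE_SCORE, na.rm=TRUE)) /
    sd(tmp_2019_m5$SCALE_SCORE, na.rm=TRUE)

## Achievement-level distributions (percent of students in each category);
## sum the proficient-and-above categories to get percent proficient in each year
round(100 * prop.table(table(tmp_2019_m5$ACHIEVEMENT_LEVEL)), 1)
round(100 * prop.table(table(tmp_2021_m5$ACHIEVEMENT_LEVEL)), 1)
```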
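
---

# Status Based Academic Impact

- Because status comparisons presume roughly equivalent populations in 2019 and 2021, it is worth checking how many assessment records exist in each year before interpreting the comparison.
- A quick, illustrative check is shown below (output not shown; exact counts depend on your installed version of __SGPdata__). Given the missing 2020 administration and lower pandemic-era participation, expect no 2020 rows and a smaller 2021 count.

```r
## Student assessment records by year
table(sgpData_LONG_COVID$YEAR)

## Year-by-grade counts for mathematics to see where data are thin
with(subset(sgpData_LONG_COVID, CONTENT_AREA=="MATHEMATICS"), table(YEAR, GRADE))
```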