Methodology

Opportunity Youth

Percent of 16-24 year olds in Arizona who are neither attending school nor working.

Source

2023 1-year Public Use Microdata Sample (PUMS) person file for Arizona from the U.S. Census Bureau

Included in this number

Arizona residents aged 16-24, inclusive, who are neither working nor attending school.

Not included in this number

Those outside the age range and those working or attending school. Those living in group quarters are excluded from poverty measures because their income is not calculated for the poverty statistic.

In brief

Opportunity Youth refers to people aged 16 through 24 who are neither working nor in school. PUMS data for 2023 was filtered to include only people within this age range, and this population was checked against both school enrollment and worker status to determine the percentage of Opportunity Youth.

Detailed methods

The data set was filtered to include only those between the ages of 16 and 24, inclusive. The Employment Status Recode (ESR) variable in the PUMS data was recoded into a new variable, ESR_Worker. This new variable takes the value 0 for those who are listed as unemployed or not in the labor force. All other values, including employed civilians and those in the armed forces, are coded as 1.

The PUMS variable SCH indicates whether the person has attended school within the last 3 months and, if so, whether the school was public or private. A dichotomous variable, InSchool, was created to convey school attendance status.

The variables ESR_Worker and InSchool were combined to create a new dichotomous variable, Disconnected, that identifies those who are neither in school nor working.
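The recoding steps above can be sketched in Python as follows. This is an illustration only, not the SPSS syntax used for the analysis. It assumes the PUMS code values ESR 3 (unemployed), ESR 6 (not in labor force), and SCH 1 (has not attended school in the last 3 months), which should be checked against the data dictionary for the year in use. The age variable AGEP and the person weight PWGTP are also assumptions, since the text above does not name them, and the sample records are made up.

```python
# Assumed ESR codes treated as "not working": 3 = unemployed, 6 = not in labor force.
NOT_WORKING_ESR = {3, 6}
# Assumed SCH code 1 = has not attended school in the last 3 months.
NOT_IN_SCHOOL_SCH = 1


def classify(person):
    """Return the ESR_Worker, InSchool, and Disconnected flags for one record."""
    esr_worker = 0 if person["ESR"] in NOT_WORKING_ESR else 1
    in_school = 0 if person["SCH"] == NOT_IN_SCHOOL_SCH else 1
    disconnected = 1 if (esr_worker == 0 and in_school == 0) else 0
    return {"ESR_Worker": esr_worker, "InSchool": in_school, "Disconnected": disconnected}


def percent_disconnected(people):
    """Weighted percent of 16-24 year olds who are neither working nor in school."""
    youth = [p for p in people if 16 <= p["AGEP"] <= 24]
    total = sum(p["PWGTP"] for p in youth)
    if total == 0:
        return None
    disconnected = sum(p["PWGTP"] for p in youth if classify(p)["Disconnected"] == 1)
    return 100.0 * disconnected / total


# Made-up records for illustration only (not real PUMS data).
sample = [
    {"AGEP": 17, "ESR": 6, "SCH": 2, "PWGTP": 100},  # in school, not working
    {"AGEP": 20, "ESR": 3, "SCH": 1, "PWGTP": 50},   # disconnected
    {"AGEP": 23, "ESR": 1, "SCH": 1, "PWGTP": 150},  # working, not in school
    {"AGEP": 24, "ESR": 6, "SCH": 1, "PWGTP": 100},  # disconnected
    {"AGEP": 30, "ESR": 6, "SCH": 1, "PWGTP": 500},  # outside the age range
]
print(percent_disconnected(sample))  # 37.5
```

In the sample, 150 of the 400 weighted youth are disconnected, so the sketch reports 37.5 percent. The 30-year-old is dropped by the age filter before any percentages are computed.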

For more on the processing of this data, please see the sections on PUMS Data and Survey Data.

PUMS data

Two indicators, attainment and opportunity youth, are drawn from Public-Use Microdata Sample (PUMS) data from the United States Census Bureau. PUMS data is a product of the Bureau’s American Community Survey (ACS), which is conducted annually and collects a wide variety of data from households across the nation.

PUMS data allows researchers to compile several attributes into custom tables to present data in new ways. For instance, there is a standard ACS data table that lists the number of people aged 16-19 that are neither in school nor working. However, the term ‘Opportunity Youth’ is defined as those aged 16-24 who are not working or in school. Furthermore, the ACS table does not break down these Opportunity Youth by other characteristics such as race, ethnicity, and disability status. PUMS data allows these breakdowns, within certain statistical limitations.

PUMS data is available in samples collected over a five-year period or over a single year. The five-year sample is more precise because it is based on more responses, but since the Progress Meter is looking for changes across time, the one-year sample is more appropriate for this use. The one-year sample for 2023 for Arizona was downloaded from the Census website (https://www.census.gov/programs-surveys/acs/news/data-releases/2023/release.html).

The data was then imported into SPSS, a statistical software package. Using SPSS, summary variables were created for race and ethnicity, age categories, limited English proficiency (LEP), poverty status, work status, school attendance, educational attainment, disability status, and county of residence.

Note that several counties with smaller populations are combined in the PUMS data to protect the privacy of survey respondents. The PUMS data is so detailed that it would be possible to identify individuals or families if the data were reported for a smaller geography. Populous counties are large enough that individual records are effectively masked, but data from smaller counties such as Mohave and La Paz are combined to create a larger population pool and protect identities.

An automated script file was then developed to produce the tables used by the Progress Meter. The tables contain the variable of interest broken down by the ten county-comparable geographies reported by PUMS, race and ethnicity, limited English proficiency (LEP), poverty, and disability status.
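As a rough illustration of this kind of breakdown, the Python sketch below computes a weighted percentage within each category of a grouping variable. It is not the SPSS script described above. The field names Region and PWGTP, the function name, and the records are all hypothetical.

```python
from collections import defaultdict


def percent_by_group(records, group_key, flag_key="Disconnected", weight_key="PWGTP"):
    """Weighted percent of records with flag_key == 1 within each group."""
    flagged = defaultdict(float)
    totals = defaultdict(float)
    for rec in records:
        group = rec[group_key]
        totals[group] += rec[weight_key]
        if rec[flag_key] == 1:
            flagged[group] += rec[weight_key]
    # Skip groups with no weighted population to avoid dividing by zero.
    return {g: 100.0 * flagged[g] / totals[g] for g in totals if totals[g] > 0}


# Made-up records for illustration only (not real PUMS data).
records = [
    {"Region": "Maricopa", "Disconnected": 1, "PWGTP": 40},
    {"Region": "Maricopa", "Disconnected": 0, "PWGTP": 160},
    {"Region": "Pima", "Disconnected": 1, "PWGTP": 30},
    {"Region": "Pima", "Disconnected": 0, "PWGTP": 90},
]
print(percent_by_group(records, "Region"))  # {'Maricopa': 20.0, 'Pima': 25.0}
```

The same function could be called with a different group_key (for example a race and ethnicity or disability summary variable) to produce each of the other breakdowns.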

These tables were then transferred to Microsoft Excel for further formatting, calculation of percentages, analysis of standard errors, and computation of 90% confidence intervals. Standard errors for the estimates and the derived proportions were calculated according to the formulas suggested by the Census Bureau (http://www2.census.gov/programs-surveys/acs/tech_docs/pums/accuracy/2016AccuracyPUMS.pdf). These calculations consider the size of the estimates, the size of the population from which the estimates were drawn, and the design factors used by the Census Bureau.
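A minimal Python sketch of the design-factor approach is shown below. It assumes the generalized formulas for 1-year PUMS data as commonly stated in the Census Bureau's accuracy documentation, in which the constant 99 reflects a roughly 1 percent sample. The exact formulas and the design factors, which vary by state, year, and characteristic, should be confirmed against the accuracy document for the year in use. The function names and the numbers in the example are hypothetical.

```python
import math


def se_of_total(estimate, population, design_factor):
    """Approximate standard error of a weighted count (assumed 1-year formula).

    estimate: weighted count of people with the characteristic
    population: weighted size of the population the count is drawn from
    design_factor: published design factor for the characteristic
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return design_factor * math.sqrt(99.0 * estimate * (1.0 - estimate / population))


def se_of_percent(percent, base, design_factor):
    """Approximate standard error of a percentage (assumed 1-year formula).

    percent: estimated percentage, on a 0-100 scale
    base: weighted count forming the denominator of the percentage
    design_factor: published design factor for the characteristic
    """
    if base <= 0:
        raise ValueError("base must be positive")
    return design_factor * math.sqrt((99.0 / base) * percent * (100.0 - percent))


# Hypothetical inputs: a 20% estimate on a base of 9,900 with a design factor of 1.5.
print(round(se_of_percent(20.0, 9900.0, 1.5), 2))  # 6.0
```

Both functions grow with the design factor and shrink as the base population grows, which is why estimates for small geographies or subgroups carry wider intervals.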

Values in the final Excel output tables were suppressed when the 90% confidence interval exceeded +/- 25 percent or encompassed either 0% or 100%.
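The suppression rule can be expressed as a short check, sketched here in Python. Two interpretations are assumptions rather than statements from the text: the +/- 25 percent threshold is read as a margin of error of 25 percentage points, and an interval is treated as encompassing 0% or 100% when a bound touches or passes that value. The function name is hypothetical.

```python
def should_suppress(percent, margin_of_error, max_margin=25.0):
    """Return True if an estimate should be suppressed under the assumed rule.

    percent: estimated percentage, on a 0-100 scale
    margin_of_error: half-width of the 90% confidence interval, in percentage points
    max_margin: largest acceptable half-width (assumed to be percentage points)
    """
    lower = percent - margin_of_error
    upper = percent + margin_of_error
    too_wide = margin_of_error > max_margin
    hits_boundary = lower <= 0.0 or upper >= 100.0
    return too_wide or hits_boundary


print(should_suppress(20.0, 5.0))   # False: interval is 15 to 25
print(should_suppress(3.0, 5.0))    # True: interval reaches below 0
print(should_suppress(50.0, 30.0))  # True: margin exceeds 25
```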

*Note. In the release of the 2021 PUMS 1-year American Community Survey microdata, the geographic region variable (REGION) was updated to align with the 2020 decennial Census regions, which had recently become available (previous analyses used the 2010 Census regions). See below for an overview of how geographies were categorized in 2021 and 2022.

2010 CENSUS REGIONS	2020 CENSUS REGIONS
0 ‘N/A’	0 ‘N/A’
100 ‘Maricopa’	100 ‘Maricopa’
200 ‘Pima’	400 ‘Gila, Graham, Greenlee Counties’
300 ‘Navajo-Apache’	500 ‘Coconino’
400 ‘Coconino’	600 ‘Mohave’
500 ‘Yavapai’	701 ‘La Paz-Yuma’
600 ‘Mohave-LaPaz’	850 ‘Pinal’
700 ‘Yuma’	900 ‘Cochise-Santa Cruz’
800 ‘E. Pinal-Gila-Graham-Greenlee’	1000 ‘Navajo-Apache’
850 ‘Pinal’	1101 ‘Yuma’
900 ‘Cochise-Santa Cruz’	1901 ‘Pima’
	2501 ‘Yavapai’

Because of these changes, comparing percentages over time is not recommended where different region variables were used to estimate educational attainment.

A supplemental spreadsheet will be provided with this document outlining the specific changes made from the 2010 to the 2020 Census regions.

Survey Data

The attainment and opportunity youth indicators are derived from the American Community Survey conducted by the Census Bureau. Since this data is drawn by sampling a small percentage of the overall population, there is a degree of uncertainty in the numbers.

Sampling error

Rather than treating these numbers as exact point values (the precise percent of adults with college degrees, for example), it is more accurate to view each one as the center of a 90% confidence interval. If the survey were repeated many times, about 90% of the intervals constructed this way would contain the ‘true’ percentage that would be found by interviewing everyone in Arizona.

This uncertainty is known as sampling error. It is an unavoidable consequence of the survey process. The width of the confidence interval is determined by the standard error of the estimate, which is used to monitor the quality of the estimate: a 90% confidence interval extends 1.645 standard errors on either side of the estimate.
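As a small worked example, the Python sketch below turns an estimate and its standard error into a 90% confidence interval using the standard normal multiplier of 1.645. The function name and the input values are hypothetical.

```python
Z_90 = 1.645  # standard normal multiplier for a 90% confidence level


def confidence_interval_90(estimate, standard_error):
    """Return (lower, upper) bounds of the 90% confidence interval."""
    margin_of_error = Z_90 * standard_error
    return estimate - margin_of_error, estimate + margin_of_error


# Hypothetical estimate of 12.0% with a standard error of 2.0 percentage points.
lower, upper = confidence_interval_90(12.0, 2.0)
print(round(lower, 2), round(upper, 2))  # 8.71 15.29
```

Here the margin of error is 1.645 x 2.0 = 3.29 percentage points, so the interval runs from about 8.71% to 15.29%.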

Non-sampling error

Inevitably, other errors creep into the data. Random errors, such as a respondent accidentally checking the wrong box on a survey form, do not bias the data in one direction or another but do affect the precision of the estimate by increasing the standard error.

Systematic errors, by contrast, push the data in a specific direction, for example through a poorly worded question, and can be a serious concern. However, the Census Bureau conducts rigorous, high-quality surveys that reduce systematic errors to a minimum.