value-added – Robert Kelchen

Yes, Student Characteristics Matter. But So Do Colleges.

It is no surprise to those in the higher education world that student characteristics and institutional resources are strongly associated with student outcomes. Colleges which attract academically elite students and have the ability to spend large sums of money on instruction and student support should be able to graduate more of their students than open-access, financially-strapped universities, even after holding factors such as teaching quality constant. But an article in today’s Inside Higher Ed shows that there is a great deal of interest in determining the correlation between inputs and outputs (such as graduation).

The article highlights two new studies that examine the relationship between inputs and outputs. The first, by the Department of Education’s Advisory Committee on Student Financial Assistance, breaks down graduation rates by the percentage of students who are Pell Grant recipients, per-student endowments, and ACT/SAT scores using IPEDS data. The second new study, by the president of Colorado Technical University, finds that four student characteristics (race, EFC, transfer credits, and full-time status) explain 74% of the variation in an unidentified for-profit college’s graduation rate. His conclusion is that “public [emphasis original] policy will not increase college graduates by focusing on institution characteristics.”

While these studies take different approaches (one using institutional-level data and the other using student-level data), they highlight the importance that student and institutional characteristics currently have in predicting student success rates. These studies are not novel or unique—they follow a series of papers in HCM Strategists’ Context for Success project in 2012 and even more work before that. I contributed a paper to the project (with Doug Harris at Tulane University) examining input-adjusted graduation rates using IPEDS data. We found R-squared values of approximately 0.74 using a range of student and institutional characteristics, although the predictive power varied by Carnegie classification. It is also worth noting that the ACSFA report calculated predicted graduation rates with an R-squared value of 0.80, but they control for factors (like expenditures and endowment) that are at least somewhat within an institution’s control and don’t allow for a look at cost-effectiveness.

This suggests the importance of taking a value-added approach in performance measurement. Just like K-12 education is moving beyond rewarding schools for meeting raw benchmarks and adopting a gain score approach, higher education needs to do the same. Higher education also needs to look at cost-adjusted models to examine cost-effectiveness, something which we do in the HCM paper and I have done in the Washington Monthly college rankings (a new set of which will be out later this month).

However, even if a regression model explains 74% of the variation in graduation rates, the remaining 26% is substantial and can be attributed either to omitted variables (such as motivation) or to institutional actions. The article by the Colorado Technical University president takes exactly the wrong approach, saying that “student graduation may have little to do with institutional factors.” If his statement were accurate, we would expect colleges’ predicted graduation rates to be equal to their actual graduation rates. But, as anyone who has spent time on college campuses should know, institutional practices and policies can play an important role in retention and graduation. The 2012 Washington Monthly rankings included a predicted vs. actual graduation rate component. While Colorado Tech basically hit its predicted graduation rate of 25% (with an actual graduation rate one percentage point higher), other colleges outperformed their prediction given student and institutional characteristics. For example, San Diego State University and Rutgers University-Newark, among others, outperformed their prediction by more than ten percentage points.
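As a rough illustration of the predicted vs. actual comparison described above (not the actual Washington Monthly or HCM model), here is a minimal Python sketch. The college names, the two input variables, and every number in it are invented for illustration. It fits an ordinary least squares regression of graduation rates on inputs and reports how far each college lands from its prediction.

```python
# Hypothetical data, invented for illustration only:
# (college, % Pell recipients, median ACT, actual graduation rate %)
colleges = [
    ("College A", 60, 18, 30),
    ("College B", 45, 22, 52),
    ("College C", 30, 23, 61),
    ("College D", 20, 27, 78),
    ("College E", 55, 21, 47),
    ("College F", 35, 22, 55),
    ("College G", 15, 29, 85),
    ("College H", 50, 19, 36),
]


def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend an intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for other in range(k):
            if other != col:
                factor = A[other][col]
                A[other] = [a - factor * b for a, b in zip(A[other], A[col])]
    return [A[i][k] for i in range(k)]  # [intercept, slope_1, slope_2, ...]


def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]


def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


X = [(pell, act) for _, pell, act, _ in colleges]
y = [grad for _, _, _, grad in colleges]

beta = fit_ols(X, y)
predicted = predict(beta, X)
# Positive differential: the college graduates more students than predicted.
differentials = [actual - pred for actual, pred in zip(y, predicted)]
r2 = r_squared(y, predicted)

for row, pred, diff in zip(colleges, predicted, differentials):
    print(f"{row[0]}: actual {row[3]}%, predicted {pred:.1f}%, differential {diff:+.1f}")
print(f"R-squared: {r2:.2f}")
```

The point of the sketch is the differential column. Even when the R-squared is high, individual colleges can sit well above or below their predicted rate, and that gap is where omitted variables and institutional practices show up.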

While incoming student characteristics do affect graduation rates (and I’m baffled by the amount of attention on this known fact), colleges’ actions do matter. Let’s highlight the colleges which appear to be doing a good job with their inputs (and at a reasonable price to students and taxpayers) and see what we can learn from them.

Bill Gates on Measuring Educational Effectiveness

The Bill and Melinda Gates Foundation has become a very influential force in shaping research in health and education policy over the past decade, both due to the large sums of money the foundation has spent funding research in these areas and because of the public influence that someone as successful as Bill Gates can have. (Disclaimer: I’ve worked on several projects which have received Gates funding.) In both the health and education fields, the Gates Foundation is focusing on the importance of being able to collect data and measure a program’s effectiveness. This is evidenced by the Gates Foundation’s annual letter to the public, which I recommend reading.

In the education arena, the Gates letter focuses on creating useful and reliable K-12 teacher feedback and evaluation systems. They have funded a project called Measures of Effective Teaching, which finds some evidence that it is possible to measure teacher effectiveness in a repeatable manner that can be used to help teachers improve. (A hat tip to my friend Trey Miller, who worked on the report.) To me, the important part of the MET report is that multiple measures of teacher effectiveness, including evaluations, observations, and student scores, need to be used when considering teaching effectiveness.

The Gates Foundation is also moving into performance measurement in higher education. I have been a part of one of Gates’s efforts in this arena—a project examining best practices in input-adjusted performance metrics. What this essentially means is that colleges should be judged based on some measure of their “value added” instead of the raw performance of their students. Last week, Bill Gates commented to a small group of journalists that college rankings are doing the exact opposite (as reported by Luisa Kroll of Forbes):

“The control metric shouldn’t be that kids aren’t so qualified. It should be whether colleges are doing their job to teach them. I bet there are community colleges and other colleges that do a good job in this area, but US News & World Report rankings pushes you away from that.”

The Forbes article goes on to mention that Gates would like to see metrics that focus on the performance of students from low-income families and the effectiveness of teacher education programs. Both of these measures are currently in progress, and are likely to continue moving forward given the Gates Foundation’s deep pockets and influence.

Predicting Student Loan Default Rates

Regular readers of this blog know that there are several concerns to using outcome measures in a higher education accountability system. One of my primary concerns is that outcomes must be adjusted to reflect a college’s inputs—in non-economist language, this means that colleges need to be assessed based on how well they do given their available resources. I have done quite a bit of work in this area with respect to graduation rates, but this same principle can be applied to many other areas in higher education.

The Education Sector also shares this concern, as evidenced by their recent blog post on the importance of input-adjusted graduation measures. In this post (at the Quick and the Ed), Andrew Gillen examines four-year colleges’ performance in student loan default rates. He adjusts for the percentage of Pell Grant recipients, the percentage of part-time students, and the average student loan size to get a measure of student default rate performance.

I repeat this estimate using the most recent loan default data (through 2009-10) and IPEDS data for the above characteristics for the 2009-10 academic year. This simple model does a fair job predicting loan default rates, with an R-squared value of 0.422. Figure 1 below shows actual vs. predicted loan default rates for 1876 four-year institutions with complete data:

The Education Sector analysis did not break down student default rate performance by important institutional characteristics, such as type of control (public, private not-for-profit, or for-profit) or the cost of attendance. Figures 2 and 3 below compare the performance of public universities with that of their private non-profit and for-profit peers:

Note: A positive differential means that default rates are higher than predicted. Negative numbers are good.

The default rate performances of public and private not-for-profit colleges do not differ in a meaningful way, but a significant number of for-profit colleges have substantially higher than predicted default rates. This difference is obscured when all colleges’ performances are combined.
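To make the breakdown by control type concrete, here is a small Python sketch. It assumes predicted default rates have already been produced by a regression like the one described above. The records and the three-percentage-point cutoff are invented for illustration and are not the data behind the figures.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration only:
# (control type, actual default rate %, predicted default rate %)
records = [
    ("public", 5.0, 6.0),
    ("public", 7.5, 7.0),
    ("private not-for-profit", 4.0, 5.0),
    ("private not-for-profit", 6.0, 5.5),
    ("for-profit", 14.5, 9.0),
    ("for-profit", 8.0, 10.5),
    ("for-profit", 15.0, 11.0),
    ("for-profit", 6.0, 12.0),
]

# Arbitrary cutoff (percentage points) for "substantially higher than predicted".
THRESHOLD = 3.0

# Differential = actual - predicted; positive means more defaults than predicted.
by_control = defaultdict(list)
for control, actual, predicted in records:
    by_control[control].append(actual - predicted)

summary = {
    control: {
        "mean_diff": mean(diffs),
        "share_well_above": sum(d > THRESHOLD for d in diffs) / len(diffs),
    }
    for control, diffs in by_control.items()
}

# Pooling every college together hides the sector pattern.
pooled_mean = mean(actual - predicted for _, actual, predicted in records)

for control, stats in summary.items():
    print(
        f"{control}: mean differential {stats['mean_diff']:+.2f}, "
        f"share more than {THRESHOLD} points above prediction "
        f"{stats['share_well_above']:.0%}"
    )
print(f"All colleges pooled: mean differential {pooled_mean:+.2f}")
```

In a regression with an intercept, the differentials average out to zero across the whole sample by construction. That is why a sector with many large positive differentials only becomes visible once the results are grouped.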

Finally, Figure 4 compares default rate performance by the net price of attendance (the sticker cost of attendance less grant aid) and finds no relationship between the net price and loan default rates:

Certainly, more work needs to be done before adopting input-adjusted student loan default rates as an accountability tool. But it does appear that a certain group of colleges tends to have a higher percentage of former students default, which is worth additional investigation.

The Limitations of “Data-Driven” Decisions

It’s safe to say that I am a data-driven person. I am an economist of education by training, and I get more than a little giddy when I get a new dataset that can help me examine an interesting policy question (and even more excited when I can get the dataset coded correctly). But there are limits to what quantitative analysis can tell us, which comes as no surprise to nearly everyone in the education community (but can be surprising to some other researchers). Given my training and perspectives, I found an Education Week article on the limitations of data-driven decisions by Alfie Kohn, a noted critic of quantitative analyses in education, interesting.

Kohn writes that our reliance on quantifiable measures (such as test scores) in education results in the goals of education being transformed to meet those measures. He also notes that educators and policymakers have frequently created rubrics to quantify performance that used to be more qualitatively assessed, such as writing assignments. These critiques are certainly valid and should be kept in mind at all times, but then his clear agenda against what is often referred to as data-driven decision making shows through.

Toward the end of his essay, he launches into a scathing criticism of the “pseudoscience” of value-added models, in which students’ gains on standardized tests or other outcomes are estimated over time. While nobody in the education or psychometric communities is (or should be) claiming that value-added models give us a perfect measure of student learning, they do provide us with at least some useful information. A good source for more information on value-added models and data-driven decisions in K-12 education is a book by my longtime mentor and dissertation committee member Doug Harris (with a foreword by the president of the American Federation of Teachers).

Like it or not, policy debates in education are increasingly being shaped by the available quantitative data in conjunction with more qualitative sources such as teacher evaluations. I certainly don’t put full faith in what large-scale datasets can tell us, but it is abundantly clear that the accountability movement at all levels of education is not going away anytime soon. If Kohn disagrees with the type of assessment going on, he should propose an actionable alternative; otherwise, his objections cannot be taken seriously.