Researcher Looks at 'Digital Traces' to Help Students
Every time University of Arizona students swipe their ID cards — at the student union, the rec center, the library — they leave a digital trace, showing exactly where they've been and when.
One UA researcher is tracking those digital traces to see what they reveal about students' routines and relationships, and what that means for their likelihood of returning to campus after their freshman year.
Sudha Ram, a professor of management information systems, directs the UA's INSITE: Center for Business Intelligence and Analytics in the Eller College of Management. The center focuses on harnessing the power of big data, using machine learning and network science, to help businesses and organizations make better-informed decisions.
The goal of Ram's Smart Campus research is to help educational institutions repurpose the data already being captured from student ID cards to identify those most at risk for not returning after their first year of college.
"By getting their digital traces, you can explore their patterns of movement, behavior and interactions, and that tells you a great deal about them," Ram said.
Freshman retention is an ongoing challenge for public universities nationwide. It's important not only for the obvious reason — that a university's goal is to educate students — but also because retention and graduation rates influence a university's reputation and national rankings. And students' first two years in college have been found to be critical to their overall likelihood of completing a degree.
Traditionally, universities have relied heavily on factors such as academic performance and demographic information to predict which students are most at risk of dropping out.
Ram's research takes a different approach, focusing on students' interactions and campus routines.
ID Cards as 'Embedded Sensors'
Every student at the UA is issued a CatCard student ID when they enroll. They use that card at numerous locations, including residence halls, the Student Recreation Center, various campus labs, the library and the Think Tank academic support center, just to name a few.
Many students also load cash onto the card for use in vending machines and to pay for food and services at the Student Union Memorial Center, putting the total number of campus locations that accept CatCards near 700.
"It's kind of like a sensor that's embedded in them, which can be used for tracking them," Ram said of the card. "It's really not designed to track their social interactions, but you can, because you have a timestamp and location information."
Working in partnership with UA Information Technology, Ram gathered and analyzed data on freshman CatCard usage over a three-year period. She then used that data to create large networks mapping which students interacted with one another and how often.
For example, if Student A, on multiple occasions, uses her CatCard at the same location at roughly the same time as Student B, it would suggest a social interaction between the two.
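The co-occurrence idea in that example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ram's actual method: the swipe records, the five-minute window and the two-repeat threshold are all hypothetical values chosen for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical swipe records: (anonymized student, location, minute of the week).
swipes = [
    ("A", "library", 600),
    ("B", "library", 603),
    ("C", "library", 700),
    ("A", "rec_center", 2040),
    ("B", "rec_center", 2042),
    ("C", "union", 2041),
]

def co_occurrences(swipes, window_minutes=5):
    """Count how often each pair of students swiped at the same location
    within `window_minutes` of each other (window size is an assumption)."""
    by_location = {}
    for student, location, minute in swipes:
        by_location.setdefault(location, []).append((minute, student))
    counts = Counter()
    for events in by_location.values():
        events.sort()  # chronological order, so t2 >= t1 in every pair below
        for (t1, s1), (t2, s2) in combinations(events, 2):
            if s1 != s2 and t2 - t1 <= window_minutes:
                counts[tuple(sorted((s1, s2)))] += 1
    return counts

def likely_ties(counts, min_repeats=2):
    """Keep only pairs seen together on multiple occasions."""
    return {pair: n for pair, n in counts.items() if n >= min_repeats}

pair_counts = co_occurrences(swipes)
ties = likely_ties(pair_counts)
print(ties)
```

Requiring repeated co-occurrence matters here: a single shared swipe could easily be coincidence, while a pattern of them is what would suggest a social tie.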
Ram also looked at how students' interactions changed over time, by constructing networks two weeks at a time over a 12-week period.
"There are several quantitative measures you can extract from these networks, like the size of their social circle, and we can analyze changes in these networks to see if their social circle is shrinking or growing, and if the strength of their connections is increasing or decreasing over time," she said.
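As a rough sketch of the measures Ram describes, the Python below splits a 12-week period into two-week windows and reports, for one student, the size of the social circle and the total strength of connections in each window. The interaction list and the function and field names are hypothetical, and the article does not specify how the study computed these quantities.

```python
from collections import defaultdict

# Hypothetical inferred interactions: (week number 1..12, student, student).
interactions = [
    (1, "A", "B"), (1, "A", "C"), (2, "A", "B"), (2, "A", "D"),
    (3, "A", "B"), (4, "A", "C"),
    (5, "A", "B"),
]

def windowed_measures(interactions, student, weeks_per_window=2, total_weeks=12):
    """For one student, return circle size (distinct contacts) and total
    tie strength (number of interactions) in each two-week window."""
    n_windows = total_weeks // weeks_per_window
    contacts = [defaultdict(int) for _ in range(n_windows)]
    for week, u, v in interactions:
        if student not in (u, v):
            continue
        other = v if u == student else u
        contacts[(week - 1) // weeks_per_window][other] += 1
    return [
        {"circle_size": len(c), "total_strength": sum(c.values())}
        for c in contacts
    ]

measures = windowed_measures(interactions, "A")
sizes = [m["circle_size"] for m in measures]
print(sizes)  # a steadily falling sequence would indicate a shrinking circle
```

In this made-up data, Student A's circle shrinks from window to window, which is the kind of trend the researchers say they look for.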
Ram additionally used the CatCard data to look at the regularity of students' routines and whether or not they had fairly established patterns of activity during the school week. She and her collaborators developed a machine learning algorithm to quantify these patterns.
Considered together with demographic information and other predictive measures of freshman retention, an analysis of students' social interactions and routines accurately identified 85 to 90 percent of the freshmen who would not return for a second year at the UA. Students with less-established routines and fewer social interactions were most at risk of leaving.
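The article does not describe how that algorithm works, so the following is not a reconstruction of it. It is a much simpler stand-in, offered only to show what "quantifying regularity" can mean: each week is reduced to a set of hypothetical (day, time block, location) slots, and regularity is the average overlap (Jaccard similarity) between consecutive weeks.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0  # two empty weeks are not treated as an established routine
    return len(a & b) / len(a | b)

def routine_regularity(weekly_slots):
    """Average week-to-week overlap of a student's activity slots."""
    pairs = list(zip(weekly_slots, weekly_slots[1:]))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical slots: (weekday, time block, location).
steady = [
    {("Mon", "am", "library"), ("Wed", "pm", "rec_center")},
    {("Mon", "am", "library"), ("Wed", "pm", "rec_center")},
    {("Mon", "am", "library"), ("Wed", "pm", "rec_center")},
]
erratic = [
    {("Mon", "am", "library")},
    {("Thu", "pm", "union")},
    {("Sat", "am", "rec_center")},
]

steady_score = routine_regularity(steady)
erratic_score = routine_regularity(erratic)
print(steady_score, erratic_score)
```

A score near 1 means the student's week looks much like the week before; a score near 0 means little or no repeated pattern.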
"Of all the students who drop out at the end of the first year, with our social integration measures, we're able to do a prediction at the end of the first 12 weeks of the semester with 85 to 90 percent recall," Ram said. "That means out of the 2,000 students who drop out, we're able to identify 1,800 of them."
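The arithmetic behind Ram's example is straightforward: recall is the share of actual dropouts that the model flags. The short sketch below simply restates her figures; the function name is illustrative.

```python
def recall(identified, missed):
    """Share of actual dropouts the model flagged."""
    return identified / (identified + missed)

dropouts = 2000     # students who leave, from Ram's example
identified = 1800   # of those, the number the model flags

upper = recall(identified, dropouts - identified)
print(f"{upper:.0%}")
```

Recall by itself says nothing about false alarms, meaning students who are flagged but stay. That is measured separately, as precision, and the article does not report it.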
Ram found that social integration and routine were stronger predictors than end-of-term grades, which are among the more traditionally used predictors of freshman retention in higher education.
The problem with relying solely on grades for making predictions is that national literature suggests freshmen who ultimately leave the university make the decision to do so in the first 12 weeks of the 16-week semester, and often as early as the first four weeks — long before final grades are posted, Ram said.
"A public university like ours is very large, and students can get lost," Ram said. "There are social science theories that indicate when these students come in, they need to establish a regular routine, learn how to manage their time, and they need to get socially integrated. Those are some of the reasons they tend to drop out — they're not socially integrated and they haven't established a regularity of routine on campus."
Ram presented the first phase of her Smart Campus research at the 2015 International Conference on Information Systems, and she has submitted additional findings for journal publication.
Retention: An Age-Old Issue in Higher Ed
Ram's research represents a new approach to an old problem, and she hopes it can eventually be used by the UA and other universities to supplement the predictive analytics work they are already doing.
"Student retention is something that's been studied for the last 30 or 40 years, but we never had the ability to track people's behavior and movement and extract their social integration patterns," Ram said. "We have also made great strides in developing machine learning and large-scale network analysis methods that help in analyzing such spatio-temporal data."
Ram's work comes at a time when universities nationwide, including the UA, are committing more resources to harnessing data analytics in ways that support student success.
"The kind of move that universities are making toward predictive analytics and using more data to understand the student experience allows us to look earlier and more often at some of the variables that we can get our hands on, and ask different sorts of questions than we were able to ask before about the freshman experience," said Vincent J. Del Casino Jr., UA vice president for Academic Initiatives and Student Success and a lecturer in the recent College of Science series on "Humans, Data and Machines."
The UA, which saw its freshman retention rate jump from 80.5 to 83.3 percent between 2016 and 2017, has for the past four years contracted with outside vendor Civitas Learning on data analysis related to retention and graduation rates.
The University now uses some 800 data points — related to everything from academic performance to financial aid to students' activity in the university's D2L course management system, among other things — to identify which students are most at-risk for leaving the UA. Those predictions, which do not include data from Ram's research, are about 73 percent accurate from the first day of classes, with the rate of accuracy improving over time, said Angela Baldasare, UA assistant provost for institutional research.
The University's current effort to predict first-year student success is a significant improvement over previous freshman retention prediction models, which relied largely on descriptive data from surveys and incoming freshmen's "academic index scores," which are predictions of students' first-year college GPA based on their high school GPA, SAT or ACT test scores, and the difficulty of the courses they took in high school.
The University now generates lists — twice a semester and twice in the summer — of the 20 percent most at-risk students in each college, based on those 800 data points. Those lists are shared with the colleges, with the intent that advisers will use them to reach out to students who may need additional support or guidance, Baldasare said.
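Mechanically, a list like this amounts to ranking students within each college by a risk score and keeping the top fifth. The sketch below assumes a single hypothetical risk score per student. The article does not say how the University or its vendor combines the 800 data points, so the scores, student IDs and rounding rule here are illustrative only.

```python
import math
from collections import defaultdict

# Hypothetical records: (student id, college, risk score between 0 and 1).
students = [
    ("s1", "Science", 0.91), ("s2", "Science", 0.15), ("s3", "Science", 0.40),
    ("s4", "Science", 0.22), ("s5", "Science", 0.05),
    ("e1", "Eller", 0.30), ("e2", "Eller", 0.72), ("e3", "Eller", 0.10),
    ("e4", "Eller", 0.55), ("e5", "Eller", 0.20),
]

def at_risk_lists(students, percent=20):
    """Return, per college, the ids of the top `percent` of students by risk."""
    by_college = defaultdict(list)
    for student_id, college, risk in students:
        by_college[college].append((risk, student_id))
    lists = {}
    for college, scored in by_college.items():
        scored.sort(reverse=True)                    # highest risk first
        k = math.ceil(len(scored) * percent / 100)   # round up (an assumption)
        lists[college] = [student_id for _, student_id in scored[:k]]
    return lists

lists = at_risk_lists(students)
print(lists)
```

Ranking within each college, rather than across the whole university, means every college's advisers receive a list sized to their own student population.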
"As early as the first day of classes, even for freshmen, these predictive analytics are creating highly accurate indicators that inform what we do to support students in our programs and practice," she said. "The lists of students are delivered in the fourth week of classes because we know students may already be making up their minds about staying or leaving UA."
The University also is preparing to launch an online dashboard where advisers can access key data and assess student risk in real time throughout the semester.
"We've really worked hard over the last year to work more with advisers and get advisers better data so we can facilitate their informed action with students," Baldasare said.
Predictive analytics provide merely a signal for underlying challenges students might be facing, Baldasare noted. It's ultimately up to the advisers on the ground to use that information to diagnose the problem and help students as best they can, with the understanding that it never will be possible to retain everyone.
"What all of this ultimately boils down to is individual students and individual needs and how we translate the big signals down into individual outreach," Baldasare said.
Big Data Allows Better Responses, Sooner
That individualized approach is at the heart of Ram's Smart Campus work, which Baldasare hopes may eventually be incorporated into the predictive work the UA is already doing.
Ram said generalized solutions are no longer enough, in any industry.
"We live in an era where you shouldn't be generalizing about 'groups of people,'" Ram said. "You should be personalizing solutions at the individual level."
That thinking underlies many of Ram's other projects as well, ranging from her health-care research to identify high-cost hospital patients at the point of admission, to her Smart Cities work in Fortaleza, Brazil, which aims to tackle public transportation challenges facing that city's residents. These research projects tackle societal grand challenges to produce solutions with policy and social implications, by harnessing the power of big data, large-scale network analysis and machine learning techniques.
Coming up with targeted solutions is possible today more than ever before, thanks to what Ram refers to as the "datafication phenomenon" — our ability to measure and quantify things that we couldn't in the past, from students' interactions on campus to the number of steps it takes to walk to work in the morning. That data is coming continuously, from many different sources, often with a precise time and location associated with it.
"My philosophy in doing prediction modeling is to combine signals from many different sources to understand the problem," Ram said. "We now live in a sensing environment. We live in an environment where everything is connected, so our whole world is even more than an 'internet of things' — it is an internet of people, data and things. It's all three connected to each other, and it's all about understanding those connections. This is where network science and machine learning interact and allow us to make better predictions."
Just as Amazon boasts of having a 360-degree view of the customer — being able to predict what an individual will buy before they do it — so too can other organizations leverage data to predict and respond to the evolving needs of those they serve, Ram said.
"We don't live in a static world. Our preferences change, our behavior changes, what we do changes, our whole life changes," Ram said. "When you look at data, you can see longer-term trends and follow what people are thinking and respond to that in real time in terms of offering new services and new products. And you can do interventions and see immediate reactions to these interventions and say, 'This was a good way to do it; this was not.' So you're able to make decisions in a lot shorter time — better decisions — and then you can measure whether your decisions are working or not."
In the case of freshman retention, the decision to intervene needs to take place early to be most effective, whether that intervention be an email from an adviser checking in, or perhaps an invitation from the adviser to a student to participate in a time management seminar or some activity that could boost social interaction.
"We think by doing these interventions by the 12th week, which is when students make up their mind, you're sort of doing what Amazon does: delivering items you didn't order but will be ordering in the future," Ram said.
Building the Future You Want
Predictive analytics get even better the more data you have, Ram said. For her Smart Campus research, she hopes to eventually be able to incorporate UA Wi-Fi data from the 8,000 Wi-Fi hubs on campus to get an even more accurate picture of students' movement and behavior.
Ram acknowledges that there can be privacy concerns when dealing with individuals' personal information. That's why the CatCard data she collected was completely anonymized so that she could not personally identify individual students by name, ID number or any other attributes. That information ultimately would be shared only with the students' adviser.
"Almost every prediction we make is personalized, without knowing who the individual is," Ram said.
In the end, Ram believes the potential benefits — getting students the individualized attention and support they need, while helping the institution meet its goals — make the process worthwhile.
The same goes for any industry leveraging the power of big data and predictive analytics to make better decisions for itself and those it serves.
"It's all about thinking about the future," Ram said. "It's about planning for the future and making sure you're doing things in a way that enables the future to happen the way you want it — for everyone's benefit."