Virtual Class Driven by Data to Succeed
With the help of the UA-based iPlant Collaborative, students in a revolutionary, two-university "ecoinformatics" course dug through unused open-access data to discover how variations in soil composition influence microbial life.

By Shelley Littin, iPlant Collaborative
Nov. 5, 2015

Modern technology has blurred the boundaries of place and time zone, as well as the line between students learning details and scientists leading discoveries.

Case in point: A revolutionary virtual class brought together undergraduate and graduate students at the University of Arizona and Western Michigan University in the spring of 2014.

With an innovative academic curriculum culminating in a student-authored research paper published in PLOS ONE, the students and professors of the class, titled "Ecoinformatics," have demonstrated that for education, collaboration and scientific discovery, there are no boundaries remaining.

"From the beginning, it was our goal that the students would write a paper. I don't think they believed us at first," said Rachel Gallery, a UA assistant professor in the School of Natural Resources and the Environment in the College of Agriculture and Life Sciences.

Gallery and Kathryn Docherty, an assistant professor of biological sciences at Western Michigan, pioneered the class, which integrated remote collaboration, data sharing and peer-review publication into a new form of learning that goes well beyond customary college coursework.

The class convened for lectures via videoconference.

"It was surprising how easy it was to treat it like a normal classroom," Gallery said. "In some ways, it was more helpful than a standard lecture, much more effective because we spent less time lecturing and more time having discussions."

Martha Gebhardt, a UA doctoral candidate and Ecoinformatics student, agreed.

"Virtual lectures facilitated class discussions," she said. "Both UA and Michigan students could immediately see what was being discussed and add their thoughts and input."

The class combined undergraduate and graduate students from a variety of educational and intellectual backgrounds, including entomology, soil sciences, environmental science and informatics.

"It was in many ways a professional development course," Gallery said. "We taught students concepts including how to write a manuscript, who deserves authorship and co-authorship, and how do you allocate those responsibilities?"

Mining 'Open-Access Data'

Before they could write a paper, the students had to learn to analyze large-scale datasets. Modern technology has fueled the big-data revolution with new and more powerful research tools generating huge amounts of data, often more than scientists have time or resources to study.

The massive volumes of unanalyzed data are funneled into so-called big-data repositories, science centers that store and catalog the datasets with the hope and intent that someday they will be used to pioneer new discoveries.

The result is "open-access data," free for anyone scientifically inclined to mine for answers to questions that often haven’t even been asked, leading to valuable new knowledge.

Researchers everywhere are talking about using open-access data, Gallery said, "so we tried training students on big-data questions, and had so much success that it resulted in a manuscript."

The students leveraged previously unanalyzed pilot data from the National Ecological Observatory Network, or NEON, an observation system designed to enable researchers to examine ecological variation over time, on a continental scale.

The class set out to analyze the effects of geography and temperature on soil bacteria communities in four different biomes. Biomes are ecological zones characterized by their ability to support distinct communities of life forms. The students selected datasets collected from biomes in Utah, Hawaii, Alaska and Florida, and began evaluating seasonal variation of terrestrial vegetation and comparing peak growing season values across the biomes.

To profile the microbial communities, the students used bacterial DNA and lipids, or oils, produced by bacteria and fungi in the soils, which provide an estimate of their growth.

Role of iPlant Collaborative

To securely store, share and analyze the massive volumes of data, the class turned to the iPlant Collaborative, a National Science Foundation-funded biotechnology project that provides computational resources for big-data storage, analysis and sharing.

The class, and its research, education and publication outcomes, would not have been possible without the collaborative, Gallery said. She and Docherty used online educational tools and services provided by iPlant to help the students learn how to analyze big data.

"We used iPlant for all the data sharing, so everybody could access the data," Gallery said. "And the students used the iPlant environment to send messages as they worked their way through the data."

The students even came up with the idea of creating YouTube videos to teach each other the various skills needed for success in the course, including how to use iPlant's services.

From their research and correspondence, the students determined that key properties, such as soil temperature, soil chemistry and vegetation, could explain most variation in soil bacteria across the four biomes. The research data from the course are available through the iPlant Collaborative, and are stored with corresponding metadata in a public data repository maintained by the National Center for Biotechnology Information, an initiative of the National Institutes of Health.

From these data, the students co-wrote what for many was their first scientific publication.

"It is first and foremost a research paper," Gallery said. "But we also talked about the usefulness of this approach for project-based learning."

"As a graduate student, having a publication really demonstrates commitment to your research, and ability to perform," said UA doctoral candidate Noelle Espinosa. "This course provided so much more than most courses. We were challenged to think and work as a collaborative group, to ask big questions and grapple with a big dataset. What I picked up from my peers will be invaluable for my future."

Said Docherty: "Oftentimes researchers feel limited to collaborating just with local researchers, but with these large datasets and today's communication and data-sharing systems, that is no longer a limitation."