Houston voices

Houston research: Why you need a data management plan

Every situation is unique and deserves a one-of-a-kind data management plan, not a one-size-fits-all solution. Graphic by Miguel Tovar/University of Houston

Why do you need a data management plan? It mitigates error, increases research integrity and allows your research to be replicated – despite the “replication crisis” that the research enterprise has been wrestling with for some time.

Error

There are many horror stories of researchers losing their data. You can just plain lose your laptop or an external hard drive. Devices are sometimes confiscated when you travel to another country, and you may not get them back. Some errors are more nuanced. For instance, a COVID-19 repository of contact-traced individuals was missing 16,000 results because Excel can’t exceed roughly 1 million rows per spreadsheet.
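A guard against that kind of silent truncation can be sketched in a few lines of Python. This is a minimal illustration, not a description of how the COVID-19 repository was built: it assumes the data lives in a CSV file and that the target is the modern .xlsx format, whose worksheet limit is 1,048,576 rows. The function names are hypothetical.

```python
import csv
import io

# Assumption: the modern .xlsx worksheet limit (header row included).
XLSX_MAX_ROWS = 1_048_576


def count_rows(csv_file):
    """Count the rows, header included, in an open CSV file object."""
    return sum(1 for _ in csv.reader(csv_file))


def fits_in_one_worksheet(row_count, max_rows=XLSX_MAX_ROWS):
    """Return True if every row would fit on a single worksheet."""
    return row_count <= max_rows


# Tiny in-memory sample standing in for a real export.
sample = io.StringIO("id,result\n1,positive\n2,negative\n")
rows = count_rows(sample)
if fits_in_one_worksheet(rows):
    print(f"{rows} rows: safe to open as a single worksheet")
else:
    print(f"{rows} rows: too large for one worksheet, rows would be dropped")
```

Running a check like this before opening or exporting a file turns a silent loss into a visible warning.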

Do you think a hard drive is the best repository? Keep in mind that 20 percent of hard drives fail within the first four years. Some researchers merely email their data back and forth and feel like it is “secure” in their inbox.

The human and machine error margins are wide. Continually backing up your results, while good practice, can’t ensure that you won’t lose invaluable research material.

Repositories

According to Reid Boehm, Ph.D., Research Data Management Librarian at the University of Houston Libraries, your best bet is to utilize research data repositories. “The systems and the administrators are focused on file integrity and preservation actions to mitigate loss and they often employ specific metadata fields and documentation with the content,” Boehm says of the repositories. “They usually provide a digital object identifier or other unique ID for a persistent record and access point to these data. It’s just so much less time and worry.”
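To give a sense of what a file integrity check involves, here is a minimal Python sketch that records a SHA-256 checksum for every file in a folder and later reports any file whose contents have changed or gone missing. It is an illustration under stated assumptions, not the method any particular repository uses; the function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's relative path to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files that are missing or differ from the saved manifest."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration in a throwaway folder with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "results.csv"
    data_file.write_text("id,result\n1,positive\n")
    saved = build_manifest(tmp)
    data_file.write_text("id,result\n1,negative\n")  # simulate corruption
    print(changed_files(tmp, saved))
```

Repositories run checks along these lines on a schedule, which is one reason handing the job to them means less time and worry.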

Integrity

Losing data or being hacked can challenge data integrity. Data breaches not only compromise research integrity; they can also be extremely expensive. According to Security Intelligence, the global average cost of a data breach in a 2019 study was $3.92 million, a 1.5 percent increase from the previous year’s study.

Sample size (how large or small a study was) is another example of how data integrity can affect a study. Retraction Watch tracks approximately 1,500 articles retracted annually from prestigious journals for “sloppy science.” One of the main reasons papers end up being retracted is that the sample size was too small to be representative.

Replication

Another metric for measuring data integrity is whether or not the experiment can be replicated. The ability to recreate an experiment is paramount to the scientific enterprise. A Nature article entitled “1,500 scientists lift the lid on reproducibility” reported that “73 percent said that they think that at least half of the papers can be trusted, with physicists and chemists generally showing the most confidence.”

However, according to Kelsey Piper at Vox, “an attempt to replicate studies from top journals Nature and Science found that 13 of the 21 results looked at could be reproduced.”

That's so meta

The archivist Jason Scott said, “Metadata is a love note to the future.” Learning how to keep data about data is a critical part of reproducing an experiment.

“While this will always be determined by a combination of project specifics and disciplinary considerations, descriptive metadata should include as much information about the process as possible,” said Boehm. Details of workflows, any standard operating procedures and parameters of measurement, clear definitions of variables, code and software specifications and versions, and many other signifiers ensure the data will be of use to colleagues in the future.

In other words, making data accessible, useable and reproducible is of the utmost importance. You make reproducing experiments that much easier if you are doing a good job of capturing metadata in a consistent way.
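The kinds of details Boehm lists can be captured in a small metadata file saved alongside the data. The Python sketch below is one hedged way to do it: the field names, the sample project and the version number are all illustrative assumptions, not a repository standard or a disciplinary schema.

```python
import json
import platform
from datetime import datetime, timezone


def build_metadata(title, variables, workflow_steps, software):
    """Assemble a descriptive metadata record as a plain dictionary."""
    return {
        "title": title,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "variables": variables,      # clear definitions of each variable
        "workflow": workflow_steps,  # ordered steps or operating procedures
        "software": software,        # code and software versions
        "python_version": platform.python_version(),
    }


# Hypothetical project details, for illustration only.
record = build_metadata(
    title="Example water-quality survey",
    variables={
        "ph": "Acidity of the sample, unitless",
        "temp_c": "Water temperature in degrees Celsius",
    },
    workflow_steps=["Collect sample", "Calibrate probe", "Record reading"],
    software={"analysis_script": "1.0.0"},
)

# Written as JSON so both people and programs can read it later.
print(json.dumps(record, indent=2))
```

Filling in the same fields for every dataset is what makes the metadata consistent, and consistency is what lets a colleague pick the work up years later.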

The Big Idea

A data management plan includes storage, curation, archiving and dissemination of research data. Your university’s digital librarian is an invaluable resource. They can answer other tricky questions as well, such as: Who does the data belong to? And when a postdoctoral researcher in your lab leaves the institution, can they take their data with them? Every situation is unique and deserves a one-of-a-kind data management plan, not a one-size-fits-all solution.

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.

Building Houston

Here's your roundup of energy innovation news coming out of Houston. Photo via Getty Images

Houston's energy innovation ecosystem has seen a busy spring season, with startup accelerator cohorts announced, expanded corporate partnerships, and recent funding raised.

In this roundup of short stories within Houston energy innovation, a startup enters into a strategic partnership, Greentown Labs announces a new accelerator, and more.

Syzygy taps global company to lead scaling for tech development 

Syzygy has brought on a new partner that's key to its future growth and tech production. Photo via Emerson

Houston-based Syzygy Plasmonics, which has developed a light-based catalyst reactor technology that originated out of Rice University, has selected global technology and software company Emerson (NYSE: EMR) to automate electrification of chemical production processes.

The reactor technology uses light instead of thermal energy for chemical manufacturing. The all-electric production method has the opportunity to replace fossil fuel-based combustion, making energy generation more sustainable. Syzygy estimates, according to the news release, that its reactor systems could eliminate 1 gigaton of CO2 emissions by 2040.

“We are excited to advance this opportunity with Emerson not only for its automation technologies and software but also its sustainability leadership and domain expertise in chemical engineering, electrification and hydrogen production,” says Syzygy CEO Trevor Best in the release. “As we expand beyond traditional paradigms of reactor technology and launch a new way to electrify chemical manufacturing, we wanted a technology partner who can help us scale our technology efficiently, safely and reliably.”

Emerson will provide its suite of hardware, software, and services for the Syzygy modular reactors.

"Emerson is excited to collaborate with Syzygy Plasmonics on such promising technology that could have a significant impact on industries that are some of the most challenging to decarbonize," says Peter Zornio, CTO at Emerson. “This aligns with Emerson’s culture of innovation that takes on our customers’ biggest challenges.”

Greentown Labs announces applications opening for Shell accelerator

Shell is seeking energy tech companies. Photo via greentownlabs.com

Greentown Labs, a climatetech incubator co-located in Houston and Boston, has teamed up with Shell for a Greentown Go program, geared toward accelerating startup-corporate partnerships, to focus on technologies for carbon utilization, storage, and traceability.

Greentown Go Make 2023 is zeroing in on alternative carbon feedstocks for carbon-intensive commodities; biogenic and nature-based solutions; and solutions for carbon storage and traceability, according to a news release.

Applications are open now, and the selected startups will have access to mentorship from Shell and Greentown's networks, desk space and membership within Greentown, $15,000 in non-dilutive grant funding, and educational workshops throughout the duration of the six-month program.

“Greentown Go brings together groundbreaking climatetech startups and the corporations that can help commercialize and scale their technologies,” says Kevin T. Taylor, interim CEO and CFO at Greentown Labs, in a news release. “Every Greentown Go program aims to drive climate impact and accelerate the energy transition. We look forward to working with Shell, a long-time Greentown partner, on this important program and supporting the latest innovations in carbon utilization, storage, and traceability.”

The program will help support Shell’s strategy through the development and scaling of technologies for carbon utilization, storage, and traceability across chemicals, carbon fuels, and more.

“Collaboration to accelerate technology development is critical to developing the energy solutions we need for a low-carbon energy future, and I am excited to see what novel technologies arise from startups participating in the Greentown Go Make 2023 program,” says Ed Holgate, commercial partnerships manager at Shell.

Chevron Technology Ventures adds Canadian startup to its Catalyst Program

Motive.io is using AI to optimize workforce training. Photo via Motive.io

Chevron Technology Ventures announced the addition of Vancouver-based Motive.io, which provides immersive training solutions that leverage virtual and augmented reality technologies, to its Catalyst program. The program seeks out and helps to grow breakthrough technologies and solutions that have the potential to disrupt the energy industry.

"We are honored and thrilled to be selected as part of Chevron Technology Ventures' Catalyst program," says Ryan Chapman, CEO of Motive.io, in a news release. "Selection for this program represents a tremendous opportunity for Motive.io to collaborate with Chevron Technology Ventures as we continue to advance our cutting-edge immersive training solutions for the energy sector."

Motive.io's technology, called the XR Management System, "aims to revolutionize how companies train their employees by providing realistic and interactive simulations that allow learners to practice their skills in a safe and controlled environment," according to a news release.
