Every situation is unique and deserves a one-of-a-kind data management plan, not a one-size-fits-all solution. Graphic by Miguel Tovar/University of Houston

Why do you need a data management plan? It mitigates error, increases research integrity and allows your research to be replicated – despite the “replication crisis” that the research enterprise has been wrestling with for some time.

Error

There are many horror stories of researchers losing their data. You can just plain lose your laptop or an external hard drive. Devices are sometimes confiscated when you travel to another country, and you may not get them back. Some errors are more nuanced. For instance, a COVID-19 repository of contact-traced individuals was missing 16,000 results because an Excel worksheet cannot hold more than a fixed number of rows (roughly 1 million in the current file format).
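The spreadsheet failure described above is easy to guard against before an export. The following is a minimal sketch, not the method used by any agency mentioned here. The row limits are Excel's documented worksheet limits; the function names and the single header row are assumptions made for illustration.

```python
# Documented worksheet row limits for the two Excel file formats.
XLSX_MAX_ROWS = 1_048_576  # .xlsx (Excel 2007 and later)
XLS_MAX_ROWS = 65_536      # legacy .xls


def rows_that_would_be_lost(n_records, max_rows=XLSX_MAX_ROWS, header_rows=1):
    """Return how many records would not fit on a single worksheet.

    header_rows is an assumption: one row reserved for column names.
    """
    capacity = max_rows - header_rows
    return max(0, n_records - capacity)


def check_before_export(n_records, max_rows=XLSX_MAX_ROWS):
    """Raise instead of silently truncating when the data will not fit."""
    lost = rows_that_would_be_lost(n_records, max_rows)
    if lost:
        raise ValueError(
            f"{lost} of {n_records} records would not fit on one worksheet; "
            "split the export or use a format without a row limit (e.g. CSV)."
        )
```

Calling `check_before_export(len(records))` just before writing a spreadsheet turns a silent loss into a visible error.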

Do you think a hard drive is the best repository? Keep in mind that 20 percent of hard drives fail within the first four years. Some researchers merely email their data back and forth and feel like it is “secure” in their inbox.

The human and machine error margins are wide. Continually backing up your results, while good practice, can’t ensure that you won’t lose invaluable research material.

Repositories

According to Reid Boehm, Ph.D., Research Data Management Librarian at the University of Houston Libraries, your best bet is to utilize research data repositories. “The systems and the administrators are focused on file integrity and preservation actions to mitigate loss and they often employ specific metadata fields and documentation with the content,” Boehm says of the repositories. “They usually provide a digital object identifier or other unique ID for a persistent record and access point to these data. It’s just so much less time and worry.”
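Repositories handle file integrity for you, but the same idea can be applied to working copies. As a rough sketch (the function names and the manifest layout are assumptions, not any repository's actual procedure), a checksum manifest records a SHA-256 hash per file so that later changes or losses can be detected:

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """List files from the manifest that are now missing or have different content."""
    current = build_manifest(data_dir)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

Saving the manifest alongside a backup and re-running `changed_files` later gives a quick check that nothing was corrupted or dropped in between.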

Integrity

Losing data or being hacked can challenge data integrity. Data breaches not only compromise research integrity; they can also be extremely expensive. According to Security Intelligence, the global average cost of a data breach in a 2019 study was $3.92 million. That is a 1.5 percent increase from the previous year’s study.

Sample size (how large or small a study was) is another example of how data integrity can affect a study. Retraction Watch tracks approximately 1,500 articles that are retracted annually from prestigious journals for “sloppy science.” One of the main reasons the papers end up being retracted is that the sample size was too small to be a representative group.

Replication

Another metric for measuring data integrity is whether or not the experiment can be replicated. The ability to recreate an experiment is paramount to the scientific enterprise. A Nature article entitled “1,500 scientists lift the lid on reproducibility” reported that “73 percent said that they think that at least half of the papers can be trusted, with physicists and chemists generally showing the most confidence.”

However, according to Kelsey Piper at Vox, “an attempt to replicate studies from top journals Nature and Science found that 13 of the 21 results looked at could be reproduced.”

That's so meta

The archivist Jason Scott said, “Metadata is a love note to the future.” Learning how to keep data about data is a critical part of reproducing an experiment.

“While this will always be determined by a combination of project specifics and disciplinary considerations, descriptive metadata should include as much information about the process as possible,” said Boehm. Details of workflows, any standard operating procedures and parameters of measurement, clear definitions of variables, code and software specifications and versions, and many other signifiers ensure the data will be of use to colleagues in the future.
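One lightweight way to capture those details is a metadata "sidecar" file stored next to the data. The sketch below is only an illustration: the field names and the JSON layout are assumptions, not a standard endorsed by Boehm or by any repository, and real projects should follow their discipline's metadata schema where one exists.

```python
import json
import platform
from datetime import datetime, timezone


def build_metadata(title, variables, workflow_steps, software=None):
    """Collect descriptive metadata in one dictionary.

    variables: mapping of variable name -> plain-language definition and unit.
    workflow_steps: ordered list describing how the data were produced.
    software: optional mapping of extra tool names -> versions.
    """
    versions = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    versions.update(software or {})
    return {
        "title": title,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "variables": variables,
        "workflow": workflow_steps,
        "software": versions,
    }


def write_sidecar(metadata, path):
    """Write the metadata as human-readable JSON next to the data file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metadata, fh, indent=2, sort_keys=True)
```

Writing the sidecar at the moment the data are generated, rather than reconstructing it later, is what keeps the record consistent.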

In other words, making data accessible, useable and reproducible is of the utmost importance. You make reproducing experiments that much easier if you are doing a good job of capturing metadata in a consistent way.

The Big Idea

A data management plan includes storage, curation, archiving and dissemination of research data. Your university’s digital librarian is an invaluable resource. They can answer other tricky questions as well, such as: Who does data belong to? And when a post-doctoral researcher in your lab leaves the institution, can they take their data with them? Every situation is unique and deserves a one-of-a-kind data management plan, not a one-size-fits-all solution.

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.

Here's your university research data management checklist. Graphic by Miguel Tovar/University of Houston

Tips for optimizing data management in research, from a UH expert

Houston voices

A data management plan is invaluable to researchers and to their universities. "You should plan at the outset for managing output long-term," said Reid Boehm, research data management librarian at University of Houston Libraries.

At the University of Houston, research data generated while individuals are pursuing research studies as faculty, staff or students of the University of Houston are to be retained by the institution for a period of three years after submission of the final report. That means there is a lot of data to be managed. But researchers are in luck – there are many resources to help navigate these issues.

Take inventory

Is your data:

  • Active (constantly changing) or Inactive (static)
  • Open (public) or Proprietary (for monetary gain)
  • Non-identifiable (no human subjects) or Sensitive (containing personal information)
  • Preservable (to save long term) or To discard in 3 years (not for keeping)
  • Shareable (ready for reuse) or Private (not able to be shared)

The more you understand the kind of data you are generating the easier this step, and the next steps, will be.
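The inventory above can also be written down in a structured form so that it travels with the project. This is a hypothetical sketch: the class, its field names and the two consistency checks are illustrative assumptions, not UH policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DatasetInventory:
    """One record per dataset, mirroring the five inventory questions."""
    name: str
    active: bool        # constantly changing (True) or static (False)
    open_access: bool   # public (True) or proprietary (False)
    sensitive: bool     # contains personal information
    preserve: bool      # keep long term (True) or discard after retention (False)
    shareable: bool     # ready for reuse (True) or private (False)

    def review_notes(self) -> List[str]:
        """Flag combinations that probably deserve a second look (assumed rules)."""
        notes = []
        if self.sensitive and self.open_access:
            notes.append("sensitive data is marked open; confirm it can be public")
        if self.sensitive and self.shareable:
            notes.append("sensitive data is marked shareable; check de-identification")
        return notes
```

For example, `DatasetInventory("survey", False, True, True, True, False).review_notes()` returns a single note about sensitive data being marked open.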

Check first

When you are ready to write your plan, the first thing to determine is whether your funders or the university have data management plan policies and guidelines. The University of Houston, for instance, does.

It is also important to distinguish between types of planning documents. For example:

A Data Management Plan (DMP) is a comprehensive, formal document that describes how you will handle your data during the course of your research and at the conclusion of your study or project.

In some instances, funders or institutions may require a more targeted plan, such as a Data Sharing Plan (DSP), which describes how you plan to disseminate your data at the conclusion of a research project.

Consistent questions that DMPs ask include:

  • What is generated?
  • How is it securely handled? and
  • How is it maintained and accessed long-term?

However it's worded, data is critical to every scientific study.

Pre-proposal

Pre-proposal planning resources and support at UH Libraries include a consultation with Boehm. "Each situation is unique and in my role I function as an advocate for researchers to talk through the contextual details, in connection with funder and institutional requirements," stated Boehm. "There are a lot of aspects of data management and dissemination that can be made less complex and more functional long term with a bit of focused planning at the beginning."

When you get started writing, visit the Data Management Plan Tool. This platform helps by providing agency-specific templates and guidance, working with your institutional login and allowing you to submit plans for feedback.

Post-project

Post-project resources and support involve the archiving, curation and sharing of information. The UH Data Repository archives, preserves and helps to disseminate your data. The repository, the data portion of the institutional repository Cougar ROAR, is open access, free to all UH researchers, provides data sets with a digital object identifier and allows up to 10 GB per project. Most federal funding agencies already require this type of documentation (NSF, NASA, USGS and EPA). The NIH will require DMPs by 2023.

Start out strong

Remember, although documentation is due at the beginning of a project/grant proposal, sustained adherence to the plan and related policies is a necessity. We may be distanced socially, but our need to come together around research integrity remains constant. Starting early, getting connected to resources, and sharing as you can through avenues like the data repository are ways to strengthen ourselves and our work.

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.


Houston biopharma company launches equity crowdfunding campaign

money moves

A clinical-stage company headquartered in Houston has opened an online funding campaign.

FibroBiologics, which is developing fibroblast cell-based therapeutics for chronic diseases, launched a campaign with equity crowdfunding platform StartEngine. The platform lets anyone, regardless of their net worth or income level, invest in securities issued by startups.

The funding, according to a press release, will be used to support ongoing operations of FibroBiologics and advance its clinical programs in multiple sclerosis, degenerative disc disease, wound care, extension of life, and cancer.

"We're excited to partner with StartEngine on this campaign. StartEngine has over 600,000 investors as part of their community and has raised over half a billion dollars for its clients," says FibroBiologics' Founder and CEO Pete O'Heeron, in the release.

"This is an exciting time at FibroBiologics as we continue progressing our clinical pipeline and developing innovative therapies to treat chronic diseases," he continues. "This new funding will fuel our growth in the lab and bring us one step closer to commercialization."

The campaign, launched this week, already has over 100 investors, at the time of publication, and has raised nearly $2 million, according to the page. The minimum investment is set at around $500, and the company's indicated valuation is $252.57 million.

In 2021, FibroBiologics announced its intention of going public. Last year, O'Heeron discussed the company's growth plans and the specifics of its technology with InnovationMap on the Houston Innovators Podcast.

Only two types of cells, stem cells and fibroblasts, can be used in cell therapy for a regenerative treatment, which is when specialists take healthy cells from a patient and inject them into the part of the body that needs them most. As O'Heeron explains in the podcast, fibroblasts can do it more effectively and more cheaply than stem cells.

"(Fibroblasts) can essentially do everything a stem cell can do, only they can do it better," says O'Heeron. "We've done tests in the lab and we've seen them outperform stem cells by a low of 50 percent to a high of about 220 percent on different disease paths."


Texas ranks as a top state for female entrepreneurs

women in business

Texas dropped three spots in Merchant Maverick’s annual ranking of the top 10 states for women-led startups.

The Lone Star State landed at No. 5 thanks in part to its robust venture capital environment for women entrepreneurs. Last year, Texas ranked second, up from its No. 6 showing in 2021.

Merchant Maverick, a product comparison site for small businesses, says Texas “boasts the strongest venture capital scene” for women entrepreneurs outside California and the Northeast. The state ranked fourth in that category, with $6.5 billion invested in the past five years.

Other factors favoring Texas include:

  • Businesses led solely by women employ 22 percent of all employees working for a business in Texas (No. 4).
  • Texas lacks a state income tax (tied for No. 1).

However, Texas didn’t fare well in terms of the unemployment rate (No. 36) and the rate of business ownership by women (No. 29). Other Texas data includes:

  • Average income for women business owners, $52,059 (No. 19).
  • Early startup survival rate, 81.9 percent (No. 18).

Appearing ahead of Texas in the 2023 ranking are No. 1 Colorado, No. 2 Washington, No. 3 California, and No. 4 Arizona.

Another recent ranking, this one from NorthOne, an online bank catering to small businesses, puts Texas at No. 7 among the 10 best states for women entrepreneurs.

NorthOne says Texas provides “a ton of opportunities” for women entrepreneurs. For instance, it notches one of the highest numbers of women-owned businesses in the country at 1.4 million, 2.1 percent of which have at least 500 employees.

In this study, Texas is preceded by Colorado at No. 1, Nevada at No. 2, Virginia at No. 3, Maryland at No. 4, Florida at No. 5, and New Mexico at No. 6. The rankings are based on eight metrics, including the percentage of women-owned businesses and the percentage of women-owned businesses with at least 500 employees.