houston voices

Houston expert: Navigating dark data within research and innovation

Let's talk about dark data: what it means and how to navigate it. Graphic by Miguel Tovar/University of Houston

Is it necessary to share ALL your data? Is transparency a good thing, or does it make researchers “vulnerable,” as author Nathan Schneider suggests in the Chronicle of Higher Education article “Why Researchers Shouldn’t Share All Their Data”?

Dark Data Defined

Dark data is defined as the universe of information an organization collects, processes and stores – oftentimes for compliance reasons. Dark data never makes it to the official publication part of the project. According to the Gartner Glossary, “storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

This topic is reminiscent of the file drawer effect, in which a study’s results influence whether or not it is published. Negative results can be just as important as results that support a hypothesis.

Publication bias, the pressure to publish only positive results that support the PI’s hypothesis, is arguably not good science. In an article in the Indian Journal of Anaesthesia, Priscilla Joys Nagarajan et al. wrote: “It is speculated that every significant result in the published world has 19 non-significant counterparts in file drawers.” That’s one definition of dark data.

Total Transparency

But what to do with all your excess information that did not make it to publication, most likely because of various constraints? Should everything, meaning every little tidbit, be readily available to the research community?

Schneider doesn’t think it should be. In his article, he writes that he hides some findings in a paper notebook or behind a password, and he keeps interviews and transcripts offline altogether to protect his sources.

Open-source

Open-source software communities tend to regard total transparency as inherently good. What are the advantages of total transparency? You may make connections between projects that you wouldn’t have otherwise. You can easily reproduce a peer’s experiment. You can even become more meticulous in your note-taking and experimental methods since you know it’s not private information. Similarly, journalists will recognize this thought pattern as the recent, popular call to engage in “open journalism.” Essentially, an author’s entire writing and editing process can be recorded, step by step.

TMI

This trend has led researchers to tools like Jupyter and GitHub, which can record the changes made along a project’s timeline. Are unorganized, excessive amounts of unpublishable data really what transparency means? Or do they confuse those looking for meaningful research that is meticulously curated?

The Big Idea

And what about the “vulnerability” claim? Sharing every edit and every new direction taken opens a scientist up to scoffers and even harassment. Dark data in industry can even involve publishing salaries, which can feel unfair to underrepresented, marginalized populations.

In Model View Culture, Ellen Marie Dash wrote: “Let’s give safety and consent the absolute highest priority, with openness and transparency prioritized explicitly below those. This means digging deep, properly articulating in detail what problems you are trying to solve with openness and transparency, and handling them individually or in smaller groups.”

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.

Building Houston

Houston innovators podcast episode 140

What Houston can expect from its rising innovation district

Sam Dike of Rice Management Company joins the Houston Innovators Podcast to discuss the past, present, and future of Houston's rising Ion Innovation District. Photo via rice.edu

Last month, the Ion Houston welcomed the greater Houston community to showcase the programs and companies operating within the Ion Innovation District, and the week-long Ion Activation Festival spotlighted just the beginning.

The rising district — anchored by the Ion — is a 16-acre project in Midtown Houston owned and operated by Rice Management Company, an organization focused on managing Rice University's $8.1 billion endowment.

"We're chiefly responsible for stewarding the university's endowment and generating returns to support the academic mission of the university," says Samuel Dike, manager of strategic initiatives at RMC, on this week's episode of the Houston Innovators Podcast. "Part of those returns go to support student scholarships and student success — as well as many of the other academic programs."

"The university sees a dual purpose behind the investing," Dike continues. In addition to generating returns, RMC's mission is "also to be a valuable partner in Houston's ecosystem and pushing Houston as a global 21st century city."

RMC saw an opportunity a few years back to invest in Houston's nascent innovation and tech ecosystem, and announced plans for the Ion, a 266,000-square-foot innovation hub in a renovated and rehabilitated Sears building.

"In some ways innovation is not necessarily about creating something completely new — it's oftentimes building upon something that exists and making it better," Dike says. "I think that's what we've done with the building itself.

"We took something that had really strong bones and a strong identity here in Houston," he continues, "and we did something that's often atypical in Houston and preserved and repurposed it — not an easy logistical or financial decision to make, but we believed it was the best for Houston and for the project."

Now, the Ion District includes the Ion as the anchor, as well as Greentown Houston, which moved into a 40,000-square-foot space in the former Fiesta Mart building, just down the street. While RMC has announced a few other initiatives, the next construction project to be delivered is a 1,500-space parking garage that will serve the district.

"It is not your typical parking garage," Dike says. "The garage will feature a vegetated facade with ground-floor retail and gallery space, as well as EV charging spaces and spaces to feature display spaces for future tech. It's going to be a nice addition to the district."

The new garage will free up surface parking lots for future construction projects, Dike explains.

He shares more about the past, present, and future of the Ion and the district as a whole on the podcast. Listen to the interview wherever you stream your podcasts, and subscribe for weekly episodes.
