Let's talk about dark data — what it means and how to navigate it. Graphic by Miguel Tovar/University of Houston

Is it necessary to share ALL your data? Is transparency a good thing, or does it make researchers “vulnerable,” as author Nathan Schneider suggests in the Chronicle of Higher Education article “Why Researchers Shouldn’t Share All Their Data”?

Dark Data Defined

Dark data is defined as the universe of information an organization collects, processes and stores – oftentimes for compliance reasons. Dark data never makes it to the official publication part of the project. According to the Gartner Glossary, “storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

This topic is reminiscent of the file drawer effect, a phenomenon in which the results of a study influence whether or not the study is published. Negative results can be just as important as results that confirm a hypothesis.

Publication bias, the pressure to publish only positive research that supports the PI’s hypothesis, is arguably not good science. In an article in the Indian Journal of Anaesthesia, Priscilla Joys Nagarajan et al. wrote: “It is speculated that every significant result in the published world has 19 non-significant counterparts in file drawers.” That’s one definition of dark data.

Total Transparency

But what to do with all your excess information that did not make it to publication, most likely because of various constraints? Should everything, meaning every little tidbit, be readily available to the research community?

Schneider doesn’t think it should be. In his article, he writes that he hides some findings in a paper notebook or behind a password, and he keeps interviews and transcripts offline altogether to protect his sources.

Open-source

Open-source software communities tend to regard total transparency as inherently good. What are the advantages of total transparency? You may make connections between projects that you wouldn’t have otherwise. You can easily reproduce a peer’s experiment. You can even become more meticulous in your note-taking and experimental methods since you know it’s not private information. Similarly, journalists will recognize this thought pattern as the recent, popular call to engage in “open journalism.” Essentially, an author’s entire writing and editing process can be recorded, step by step.

TMI

This trend has led researchers to tools like Jupyter and GitHub, which are widely used for open-source work. Version control tools such as Git, on which GitHub is built, record every change that occurs along a project’s timeline. Are unorganized, excessive amounts of unpublishable data really what transparency means? Or do they confuse those looking for meaningful research that is meticulously curated?

The Big Idea

And what about the “vulnerability” claim? Sharing every edit and every new direction taken opens a scientist up to scoffers and even harassment. In industry, calls for total transparency even extend to publishing salaries, which can feel unfair to underrepresented, marginalized populations.

In Model View Culture, Ellen Marie Dash wrote: “Let’s give safety and consent the absolute highest priority, with openness and transparency prioritized explicitly below those. This means digging deep, properly articulating in detail what problems you are trying to solve with openness and transparency, and handling them individually or in smaller groups.”

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.


Biosciences startup becomes Texas' first decacorn after latest funding

A Dallas-based biosciences startup whose backers include millionaire investors from Austin and Dallas has reached decacorn status — a valuation of at least $10 billion — after hauling in a series C funding round of $200 million, the company announced this month. Colossal Biosciences is reportedly the first Texas startup to rise to the decacorn level.

Colossal, which specializes in genetic engineering technology designed to bring back or protect various species, received the $200 million from TWG Global, an investment conglomerate led by billionaire investors Mark Walter and Thomas Tull. Walter is part owner of Major League Baseball’s Los Angeles Dodgers, and Tull is part owner of the NFL’s Pittsburgh Steelers.

Among the projects Colossal is tackling is the resurrection of three extinct animals — the dodo bird, Tasmanian tiger and woolly mammoth — through the use of DNA and genomics.

The latest round of funding values Colossal at $10.2 billion. Since launching in 2021, the startup has raised $435 million in venture capital.

In addition to Walter and Tull, Colossal’s investors include prominent video game developer Richard Garriott of Austin and private equity veteran Victor Vescovo of Dallas. The two millionaires are known for their exploits as undersea explorers and tourist astronauts.

Aside from Colossal’s ties to Dallas and Austin, the startup has a Houston connection.

The company teamed up with Baylor College of Medicine researcher Paul Ling to develop a vaccine for elephant endotheliotropic herpesvirus (EEHV), the deadliest disease among young elephants. In partnership with the Houston Zoo, Ling’s lab at Baylor College of Medicine has set up a research program that focuses on diagnosing and treating EEHV, and on developing a vaccine to protect elephants against the disease. Ling and Baylor College of Medicine are members of the North American EEHV Advisory Group.

Colossal operates research labs in Dallas, Boston and Melbourne, Australia.

“Colossal is the leading company working at the intersection of AI, computational biology, and genetic engineering for both de-extinction and species preservation,” Walter, CEO of TWG Global, said in a news release. “Colossal has assembled a world-class team that has already driven, in a short period of time, significant technology innovations and impact in advancing conservation, which is a core value of TWG Global.”

Well-known genetics researcher George Church, co-founder of Colossal, calls the startup “a revolutionary genetics company making science fiction into science fact.”

“We are creating the technology to build de-extinction science and scale conservation biology,” he added, “particularly for endangered and at-risk species.”

Houston investment firm names tech exec as new partner

new hire

Houston tech executive Robert Kester has joined Houston-based Veriten, an energy-focused research, investment and strategy firm, as technology and innovation partner.

Kester most recently served as chief technology officer for emissions solutions at Honeywell Process Solutions, where he worked for five years. Honeywell International acquired Houston-based oil and gas technology company Rebellion Photonics, where Kester was co-founder and CEO, in 2019.

Honeywell Process Solutions shares offices in Houston with the global headquarters of Honeywell Performance Materials and Technologies. Honeywell, a Fortune 100 conglomerate, employs more than 850 people in Houston.

“We are thrilled to welcome Robert to the Veriten team,” founder and CEO Maynard Holt said in a statement, “and are confident that his technical expertise and skills will make a big contribution to Veriten’s partner and investor community. He will [oversee] every aspect of what we do, with the use case for AI in energy high on the 2025 priority list.”

Kester earned a doctoral degree in bioengineering from Rice University, a master’s degree in optical sciences from the University of Arizona and a bachelor’s degree in laser optical engineering technology from the Oregon Institute of Technology. He holds 25 patents and has more than 25 patents pending.

Veriten celebrated its third anniversary on January 10, the day that the hiring of Kester was announced. The startup launched with seven employees.

“With the addition of Dr. Kester, we are a 26-person team and are as enthusiastic as ever about improving the energy dialogue and researching the future paths for energy,” Holt added.

Kester spoke on the Houston Innovators Podcast in 2021.