Let's talk about dark data — what it means and how to navigate it.

Is it necessary to share ALL your data? Is transparency a good thing, or does it make researchers “vulnerable,” as author Nathan Schneider suggests in the Chronicle of Higher Education article “Why Researchers Shouldn’t Share All Their Data”?

Dark Data Defined

Dark data is the universe of information an organization collects, processes and stores, oftentimes purely for compliance reasons. It is the data that never makes it into the official publication for a project. According to the Gartner Glossary, “storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

This topic is reminiscent of the file drawer effect, the tendency for a study’s results to determine whether or not the study gets published at all. Negative results can be just as important as results that confirm a hypothesis.

Publication bias, the pressure to publish only positive research that supports the PI’s hypothesis, is arguably not good science. In an article in the Indian Journal of Anaesthesia, Priscilla Joys Nagarajan et al. wrote: “It is speculated that every significant result in the published world has 19 non-significant counterparts in file drawers.” That’s one definition of dark data.

Total Transparency

But what should you do with all the excess information that did not make it to publication, most likely because of various constraints? Should everything, meaning every little tidbit, be readily available to the research community?

Schneider doesn’t think it should be. In his article, he writes that he hides some findings in a paper notebook or behind a password, and he keeps interviews and transcripts offline altogether to protect his sources.

Open-source

Open-source software communities tend to regard total transparency as inherently good. What are the advantages of total transparency? You may make connections between projects that you wouldn’t have otherwise. You can easily reproduce a peer’s experiment. You can even become more meticulous in your note-taking and experimental methods since you know it’s not private information. Similarly, journalists will recognize this thought pattern as the recent, popular call to engage in “open journalism.” Essentially, an author’s entire writing and editing process can be recorded, step by step.

TMI

This trend has led researchers to open-source tools like Jupyter and platforms like GitHub, which record every change that occurs along a project’s timeline. But are unorganized, excessive amounts of unpublishable data really what transparency means? Or do they confuse those looking for meaningful research that is meticulously curated?

The Big Idea

And what about the “vulnerability” claim? Sharing every edit and every new direction taken opens a scientist up to scoffers and even harassment. In industry, transparency can extend as far as publishing salaries, which can feel unfair to underrepresented, marginalized populations.

In Model View Culture, Ellen Marie Dash wrote: “Let’s give safety and consent the absolute highest priority, with openness and transparency prioritized explicitly below those. This means digging deep, properly articulating in detail what problems you are trying to solve with openness and transparency, and handling them individually or in smaller groups.”

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.


MD Anderson forms AI partnership to advance precision oncology

AI Oncology

Few experts will disagree that data-driven medicine is one of the most certain ways forward for our health. Actually adopting it, however, comes with a steep learning curve. But what if using the technology were democratized?

This is the question that SOPHiA GENETICS has been seeking to answer since 2011 with its universal AI platform, SOPHiA DDM. The cloud-native system analyzes and interprets complex health care data across technologies and institutions, allowing hospitals and clinicians to gain clinically actionable insights faster and at scale.

The University of Texas MD Anderson Cancer Center has just announced its official collaboration with SOPHiA GENETICS to accelerate breakthroughs in precision oncology. Together, they are developing a novel sequencing oncology test, as well as creating several programs targeted at the research and development of additional technology.

That technology will allow the hospital to develop new ways to chart the growth and changes of tumors in real time, pick the best clinical trials and medications for patients and make genomic testing more reliable. Shashikant Kulkarni, deputy division head for Molecular Pathology, and Dr. J. Bryan, assistant professor, will lead the collaboration on MD Anderson’s end.

“Cancer research has evolved rapidly, and we have more health data available than ever before. Our collaboration with SOPHiA GENETICS reflects how our lab is evolving and integrating advanced analytics and AI to better interpret complex molecular information,” Dr. Donna Hansel, division head of Pathology and Laboratory Medicine at MD Anderson, said in a press release. “This collaboration will expand our ability to translate high-dimensional data into insights that can meaningfully advance research and precision oncology.”

SOPHiA GENETICS is based in Switzerland and France, and has its U.S. offices in Boston.

“This collaboration with MD Anderson amplifies our shared ambition to push the boundaries of what is possible in cancer research,” Dr. Philippe Menu, chief product officer and chief medical officer at SOPHiA GENETICS, added in the release. “With SOPHiA DDM as a unifying analytical layer, we are enabling new discoveries, accelerating breakthroughs in precision oncology and, most importantly, enabling patients around the globe to benefit from these innovations by bringing leading technologies to all geographies quickly and at scale.”

Houston company plans lunar mission to test clean energy resource

Lunar Power

Houston-based natural resource and lunar development company Black Moon Energy Corporation (BMEC) announced that it is planning a robotic mission to the surface of the moon within the next five years.

The company has engaged NASA’s Jet Propulsion Laboratory (JPL) and Caltech to carry out the mission’s robotic systems, scientific instrumentation, data acquisition and mission operations. Black Moon will lead mission management, resource-assessment strategy and large-scale operations planning.

The goal of the year-long expedition will be to gather data and perform operations to determine the feasibility of a lunar Helium-3 supply chain. Helium-3 is abundant on the surface of the moon, but extremely rare on Earth. BMEC believes it could be a solution to the world's accelerating energy challenges.

Helium-3 fusion releases 4 million times more energy than the combustion of fossil fuels and four times more energy than traditional nuclear fission in a “clean” manner with no primary radioactive products or environmental issues, according to BMEC. Additionally, the company estimates that there is enough lunar Helium-3 to power humanity for thousands of years.

"By combining Black Moon's expertise in resource development with JPL and Caltech's renowned scientific and engineering capabilities, we are building the knowledge base required to power a new era of clean, abundant, and affordable energy for the entire planet," David Warden, CEO of BMEC, said in a news release.

The company says that information gathered from the planned lunar mission will support potential applications in fusion power generation, national security systems, quantum computing, radiation detection, medical imaging and cryogenic technologies.

Black Moon Energy was founded in 2022 by David Warden, Leroy Chiao, Peter Jones and Dan Warden. Chiao served as a NASA astronaut for 15 years. The other founders have held positions at Rice University, Schlumberger, BP and other major energy and space organizations.

Houston co. makes breakthrough in clean carbon fiber manufacturing

Future of Fiber

Houston-based Mars Materials has made a breakthrough in turning stored carbon dioxide into everyday products.

In partnership with the Textile Innovation Engine of North Carolina and North Carolina State University, Mars Materials turned its CO2-derived product into a high-quality raw material for producing carbon fiber, according to a news release. The company says the product works "exactly like" the traditional chemical, derived from oil and coal, that is used to create carbon fiber.

Testing showed the end product met the high standards required for high-performance carbon fiber. Carbon fiber finds its way into aircraft, missile components, drones, racecars, golf clubs, snowboards, bridges, X-ray equipment, prosthetics, wind turbine blades and more.

The successful test “keeps a promise we made to our investors and the industry,” Aaron Fitzgerald, co-founder and CEO of Mars Materials, said in the release. “We proved we can make carbon fiber from the air without losing any quality.”

“Just as we did with our water-soluble polymers, getting it right on the first try allows us to move faster,” Fitzgerald added. “We can now focus on scaling up production to accelerate bringing manufacturing of this critical material back to the U.S.”

Mars Materials, founded in 2019, converts captured carbon into resources, such as carbon fiber and wastewater treatment chemicals. Investors include Untapped Capital, Prithvi Ventures, Climate Capital Collective, Overlap Holdings, BlackTech Capital, Jonathan Azoff, Nate Salpeter and Brian Andrés Helmick.

---

This article originally appeared on our sister site, EnergyCapitalHTX.com.