Let's talk about dark data — what it means and how to navigate it. Graphic by Miguel Tovar/University of Houston

Is it necessary to share ALL your data? Is transparency a good thing, or does it make researchers “vulnerable,” as author Nathan Schneider suggests in the Chronicle of Higher Education article “Why Researchers Shouldn’t Share All Their Data”?

Dark Data Defined

Dark data is defined as the universe of information an organization collects, processes and stores – oftentimes for compliance reasons. Dark data never makes it to the official publication part of the project. According to the Gartner Glossary, “storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

This topic is reminiscent of the file drawer effect, a phenomenon in which the results of a study influence whether or not the study is published. Negative results can be just as important as results that support a hypothesis.

Publication bias, along with the pressure to publish only positive research that supports the PI’s hypothesis, is arguably not good science. In an article in the Indian Journal of Anaesthesia, authors Priscilla Joys Nagarajan, et al., wrote: “It is speculated that every significant result in the published world has 19 non-significant counterparts in file drawers.” That’s one definition of dark data.

Total Transparency

But what to do with all your excess information that did not make it to publication, most likely because of various constraints? Should everything, meaning every little tidbit, be readily available to the research community?

Schneider doesn’t think it should be. In his article, he writes that he hides some findings in a paper notebook or behind a password, and he keeps interviews and transcripts offline altogether to protect his sources.

Open-source

Open-source software communities tend to regard total transparency as inherently good. What are the advantages of total transparency? You may make connections between projects that you wouldn’t have otherwise. You can easily reproduce a peer’s experiment. You can even become more meticulous in your note-taking and experimental methods since you know it’s not private information. Similarly, journalists will recognize this thought pattern as the recent, popular call to engage in “open journalism.” Essentially, an author’s entire writing and editing process can be recorded, step by step.

TMI

This trend has led researchers to tools like Jupyter and GitHub, which can record every change that occurs along a project’s timeline. Are unorganized, excessive amounts of unpublishable data really what transparency means? Or do they confuse those looking for meaningful research that is meticulously curated?

The Big Idea

And what about the “vulnerability” claim? Sharing every edit and every new direction taken opens a scientist up to scoffers and even harassment. Dark data in industry can also involve publishing salaries, which can feel unfair to underrepresented, marginalized populations.

In Model View Culture, Ellen Marie Dash wrote: “Let’s give safety and consent the absolute highest priority, with openness and transparency prioritized explicitly below those. This means digging deep, properly articulating in detail what problems you are trying to solve with openness and transparency, and handling them individually or in smaller groups.”

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.

Austin company to bring AI-powered school to The Woodlands

AI education

Austin-based Alpha School, which operates AI-powered private schools, is opening its first Houston-area location in The Woodlands.

The 8,000-square-foot school, scheduled to be ready for the 2026-27 academic year, initially will serve students in kindergarten through eighth grade. Alpha says the school will offer “open workshop spaces and innovative classrooms that support personalized instruction, core academics, leadership development, and real-world life skills.”

Alpha sets aside two hours each school day for the AI-driven, self-paced study of core subjects like math, reading and science. The rest of each school day consists of life-skills workshops focusing on topics such as leadership and financial literacy.

Alpha’s school in The Woodlands has begun accepting applications for the 2026-27 school year. Annual tuition costs $40,000.

“The Woodlands is one of the most dynamic, forward-thinking communities in Texas, and Alpha is proud to bring an innovative educational model that complements its strong academic foundation,” says Rachel Goodlad, head of expansion for Alpha.

Founded in 2014, Alpha School combines adaptive technology-driven instruction with immersive life-skills workshops. Its model emphasizes mastery-based learning in core subjects alongside development of communication, critical thinking, financial literacy and leadership skills. It operates more than 15 schools across the country.

Elsewhere in Texas, Alpha operates schools in Austin, Brownsville, Fort Worth and Plano. Alpha also operates 12 Texas Sports Academy campuses in Texas, including locations in Houston, Pearland and Richmond, along with a NextGen Academy esports school in Austin, a school for gifted students in Georgetown, and lower-cost Nova Academy campuses in Austin and Bastrop.

Alpha has fans and critics. While supporters tout students’ high achievement rates, detractors complain about the high tuition and the AI-influenced depersonalization of education.

“Students and our country need to be in relationship with other human beings,” Randi Weingarten, president of the American Federation of Teachers, a teachers union, tells The New York Times. “When you have a school that is strictly AI, it is violating that core precept of the human endeavor and of education.”

Alpha co-founder MacKenzie Price, a podcaster and social media influencer, doesn’t share Weingarten’s views.

“Parents and teachers: We need to embrace this change,” Price wrote after President Trump signed an executive order promoting AI in schools.

The Times notes that Alpha doesn’t employ AI as a tutor or a supplement. Rather, the newspaper says, AI is “the school’s primary educational driver to move students through academic content.”

Houston researcher secures $1.7M to develop drug for aggressive form of breast cancer

cancer research

A University of Houston researcher has joined a $3.2 million effort to develop a new drug designed to attack a cancer-driving protein commonly found in triple-negative breast cancer.

Triple-negative breast cancer (TNBC) is one of the most difficult-to-treat forms of cancer and accounts for 10 percent to 15 percent of all breast cancer cases. The disease gets its name because tumors associated with it test negative for estrogen receptors, progesterone receptors and excess HER2 protein, making it difficult to target. Due to this, TNBC is often treated with general chemotherapy, which can come with negative side effects and drug resistance, according to UH.

UH College of Pharmacy research associate professor Wei Wang is developing a drug that can target the disease more specifically. The drug will target MDM2, a protein often overproduced in TNBC that also contributes to faster tumor growth.

Wang is working on a team led by Wei Li, director of the University of Tennessee Health Science Center College of Pharmacy’s Drug Discovery Center. Wang has received $1.7 million to support the research.

Wang and UH professor of pharmacology and toxicology Ruiwen Zhang have discovered a compound that can break down MDM2. In early laboratory models, the compound has shown the ability to shrink tumors.

Wang and Zhang will focus on understanding how the treatment works and monitoring its effectiveness in models that closely mirror human disease.

“We will study how the drug targets MDM2 and evaluate the most promising drug candidates to determine effective dosing, understand how the drug behaves in the body, compare it with existing treatments and assess early safety,” Wang said in a news release.

Li’s team at the University of Tennessee will be working on the chemistry and drug design end of the project.

“This work could lead to an entirely new class of therapies for triple-negative breast cancer,” Li added in the release. “We’re hopeful that by directly removing the MDM2 protein from cancer cells, we can help more patients respond to treatment regardless of their tumor type.”