Let's talk about dark data: what it means and how to navigate it. Graphic by Miguel Tovar/University of Houston

Is it necessary to share ALL your data? Is transparency a good thing, or does it make researchers “vulnerable,” as author Nathan Schneider suggests in the Chronicle of Higher Education article “Why Researchers Shouldn’t Share All Their Data”?

Dark Data Defined

Dark data is defined as the universe of information an organization collects, processes and stores – oftentimes for compliance reasons. Dark data never makes it to the official publication part of the project. According to the Gartner Glossary, “storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

This topic is reminiscent of the file drawer effect, the phenomenon in which a study's results influence whether or not the study is published. Negative results can be just as important as results that support a hypothesis.

Publication bias, the pressure to publish only positive research that supports the PI’s hypothesis, is arguably not good science. In an article in the Indian Journal of Anaesthesia, Priscilla Joys Nagarajan et al. wrote: “It is speculated that every significant result in the published world has 19 non-significant counterparts in file drawers.” That’s one definition of dark data.

Total Transparency

But what to do with all your excess information that did not make it to publication, most likely because of various constraints? Should everything, meaning every little tidbit, be readily available to the research community?

Schneider doesn’t think it should be. In his article, he writes that he hides some findings in a paper notebook or behind a password, and he keeps interviews and transcripts offline altogether to protect his sources.

Open-source

Open-source software communities tend to regard total transparency as inherently good. What are the advantages of total transparency? You may make connections between projects that you wouldn’t have otherwise. You can easily reproduce a peer’s experiment. You can even become more meticulous in your note-taking and experimental methods since you know it’s not private information. Similarly, journalists will recognize this thought pattern as the recent, popular call to engage in “open journalism.” Essentially, an author’s entire writing and editing process can be recorded, step by step.

TMI

This trend has led researchers to tools like Jupyter and GitHub, which are widely used in open-source work. Used with version control, such tools can detail every change that occurs along a project’s timeline. But are unorganized, excessive amounts of unpublishable data really what transparency means? Or do they confuse those looking for meaningful research that is meticulously curated?

The Big Idea

And what about the “vulnerability” claim? Sharing every edit and every new direction taken opens a scientist up to scoffers, and even to harassment. Dark data in industry even involves publishing salaries, which can feel unfair to underrepresented, marginalized populations.

In Model View Culture, Ellen Marie Dash wrote: “Let’s give safety and consent the absolute highest priority, with openness and transparency prioritized explicitly below those. This means digging deep, properly articulating in detail what problems you are trying to solve with openness and transparency, and handling them individually or in smaller groups.”

------

This article originally appeared on the University of Houston's The Big Idea. Sarah Hill, the author of this piece, is the communications manager for the UH Division of Research.


Early-stage accelerator names 9th Houston cohort

ready to grow

For the ninth time, gBETA is incubating five early-stage Houston startups providing innovative solutions across skincare, human resources, and more.

Global organization gener8tor, along with Downtown Launchpad, started its ninth gBETA Houston cohort last month. The free seven-week, no-equity accelerator program selected five Houston-based founders to provide helpful programming, support, and connections to mentors, customers, corporate partners, and investors.

“We're thrilled to continue fostering innovation in Houston and are thankful for our collaboration with Downtown Launchpad as we launch the ninth cohort of gBETA Houston,” says Vanessa Huerta, vice president of gBETA at gener8tor, in a statement.

The program has accelerated 40 Houston companies since its launch in Houston a few years ago. The companies have gone on to raise over $8.6 million in funding and created more than 70 jobs.

“With each new cohort, we witness the power of innovation unleashed,” Muriel Foster, gBETA Houston director, says in the release. “The Spring 2024 gBETA Houston cohort embodies the spirit of relentless creativity and boundless ambition.”

The gBETA Houston Spring 2024 Cohort includes:

  • Cosnetix is innovating within personalized skincare, leveraging genetic and microbial skin profiling to offer users custom skincare product recommendations. The platform has been developed through over 100 customer discovery interviews and is headed for beta-testing.
  • Kannect has created an innovative community engagement platform — already used by 20 organizations — to streamline communication, foster collaboration, and enhance member engagement. The tools can be used by nonprofits, associations, religious institutions, and beyond as a digital dashboard to manage memberships, organize events, and facilitate meaningful interactions.
  • Targeting college grads and career pivoters, No Experience Jobs helps users find entry-level jobs that don’t require experience. In the first three months after its launch, NoExperienceJobs.io received more than 72,000 unique monthly visitors, gained over 1,300 newsletter subscribers, and generated more than 700,000 social media engagements, and it is already revenue-generating.
  • The Roo App partners with bars and restaurants to connect designated drivers to those who need designated driver services. The company is currently operating on a web-based platform with over 1,500 current visitors, but plans to launch the mobile application later this year.
  • Yuyo.love is changing the fitness game by providing bilingual fitness classes spanning yoga, pilates, dance, fitness, nutrition, and meditation. The company's hybrid classes have over 150 participants per class, and it plans to launch the platform this quarter.

Houston organization introduces inaugural cancer-fighting cohort of data science experts

new to hou

The University of Texas MD Anderson Cancer Center is one step closer to ending cancer thanks to its new institute that's focused on data science.

MD Anderson’s goal with the new Institute for Data Science in Oncology (IDSO) is to advance collaborative projects that will bring the power of data science to every decision made at the hospital. And now, the IDSO has announced its inaugural cohort of 33 scientists, clinicians, and staff that will bring it to life, joining the already appointed leadership and focus area co-leads.

“By engaging diverse expertise across all of our mission areas, we will enhance the rich and productive data science ecosystem at MD Anderson to deliver transformational impact for patients,” David Jaffray, Ph.D., director of IDSO and chief technology and digital officer at MD Anderson, says in a press release.

The focus areas for the IDSO are quantitative pathology and medical imaging; single-cell analytics; computational modeling for precision medicine; decision analytics for health; and safety, quality, and access.

The IDSO Affiliates, as they are known, are a mix of existing contributors to the IDSO and team members who were recruited specifically for their expertise in data science. The affiliates were chosen to fulfill a two-year term, during which they will focus on IDSO projects related to the focus areas above. The diverse roster of professionals includes:

“Our affiliates bring expertise, perspectives and commitment from across the institution to foster impactful data science in order to tackle the most urgent needs of our patients and their families,” said Caroline Chung, M.D., director of Data Science Development and Implementation for IDSO and chief data officer at MD Anderson. “People and community are at the heart of our efforts, and establishing the IDSO Affiliates is an exciting step in growing the most impactful ecosystem for data science in the world.”

Houston initiative selected for DOE program developing hubs for clean energy innovation

seeing green

Houston has been selected as one of the hubs backed by a new program from the United States Department of Energy that's developing communities for clean energy innovation.

The DOE's Office of Technology Transitions announced the first phase of winners of the Energy Program for Innovation Clusters, or EPIC, Round 3. The local initiative is one of 23 incubators and accelerators that were each awarded $150,000 to support programming for energy startups and entrepreneurs.

The Houston-based participant is called "Texas Innovates: Carbon and Hydrogen Innovation and Learning Incubator," or CHILI, and it's a program meant to feed startups into the DOE-recognized HyVelocity program and other regional decarbonization efforts.

EPIC was launched to drive innovation at a local level and to inspire commercial success of energy startups. It's the third year of the competition that wraps up with a winning participant negotiating a three-year cooperative agreement with OTT worth up to $1 million.

“Incubators and Accelerators are uniquely positioned to provide startups things they can't get anywhere else -- mentorship, technology validation, and other critical business development support," DOE Chief Commercialization Officer and Director of OTT Vanessa Z. Chan says in a news release. “The EPIC program allows us to provide consistent funding to organizations who are developing robust programming, resources, and support for innovative energy startups and entrepreneurs.”

CHILI, the only participant in Texas, now moves on to the second phase of the competition, where it will design a project continuation plan and programming for the next seven months, to be submitted in September.

Phase 2 also includes two national pitch competitions with a total of $165,000 in cash prizes up for grabs for startups. The first EPIC pitch event for 2024 will be in June at the 2024 Small Business Forum & Expo in Minneapolis, Minnesota.

Last fall, the DOE selected the Gulf Coast's project, HyVelocity Hydrogen Hub, as one of the seven regions to receive a part of the $7 billion in Bipartisan Infrastructure Law funding. The hub was announced to receive up to $1.2 billion, the most any hub will get.

The DOE's OTT selections are nationwide. Photo via energy.gov

------

This article originally ran on EnergyCapital.