Paco Xander Nathan
Feb 1, 2021

Glad that helps :) This area is really taking off in industry, with related work in academia and research.

In fact, there are resources, depending on which domain you're working in. For example, in geosciences in the US there are precisely those kinds of "Wikipedia of datasets" metadata resources for NOAA, NASA, USGS, etc., covering climate science, drought, wildfires, flooding, hurricane watch, tsunami early warning, agriculture, ice flows, etc., and there are strict federal laws which mandate making that metadata available. Outside of government, private research organizations such as https://firststreet.org/ put that data and other sources to good use for important use cases.

For the social sciences, among the US statistical agencies and the state and local agencies who partner with them (e.g., Census, USDA, NFS, HHS, BoL, etc.) there are similar resources and legal mandates in this area regarding https://www.data.gov/ and other relatively standardized practices.

For work in industry, especially w.r.t. industrial supply chain management, there is also a wide array of resources, although much of it is proprietary to the parties directly involved.

Certainly in areas of life sciences research, there are also great resources. This is probably more sophisticated in the EU and UK due to their pioneering lead in the field over the years, establishing standards and tying research funding to metadata practices for grantees, although in the US we're seeing Sloan, CZI, and other institutional funders push researchers to level up their practices.

Overall, there are also excellent vendors who specialize in this kind of data/metadata service, such as https://data.world/

I'd likely caution against a Wikipedia-esque approach, overall. Different domains have different needs, and a one-size-fits-all (OSFA) approach can introduce severe liabilities. For example, there's consensus among researchers that while the Wikidata process and metadata are relatively good, the Wikipedia process and metadata are nearly disastrous. Case in point, there's a Wikipedia page about me as an individual, with almost no correct information, and moreover part of it got hijacked by a guy in Italy whom I've never met, who appointed himself co-author on some of my published materials. Frankly, I'd be terrified if pandemic research worldwide depended on data + metadata that followed Wikipedia's lax tech-bro "rules" for data governance and accountability! :)

Written by Paco Xander Nathan

evil mad scientist @ Senzing ; https://derwen.ai/paco ; @pacoid.bsky.social ; lives on an apple orchard in the coastal redwoods /|\
