Nom de Ome: A pseudonym for your genome

How common will genomic pseudonyms be in 25 years? When might a person choose to use a Nom de Ome?

In some sense, the Human Genome Project’s human genome reference sequence has a Nom de Ome (which is “human genome reference sequence”). This sequence was generated mostly from a tissue sample donated by an anonymous male from Buffalo, NY. This volunteer was likely recruited through a newspaper article that ran in the Buffalo News on March 23, 1997. Here are the opening words from that article:


Wanted: Two Buffalo area residents — a man and a woman — willing to donate their DNA for the world’s biggest science project — deciphering the genetic blueprint for human life.

The pay is small; the benefits — to humanity — are great
Scientists around the globe will study your genes, the stuff of heredity, for years to come. They will decode your blueprint and put it on the Internet for everyone to see.

That’s no joke. The advertisement appears today in The Buffalo News. [Jason: There was an advertisement that accompanied this article]

But forget about thoughts of fame or fortune. Don’t call an agent to sell the rights to your genetic story, even if it includes some interesting mutations.

One man and one woman from the Buffalo area will be chosen, a distinction few other communities can claim. But they must remain anonymous. Not even the researchers who pick them will know who they are.

“We joke that the donors could become pop culture stars if they were identified. But everyone has some defective genes. It’s possible the information could cause them problems if it becomes public,” said Pieter J. de Jong, the Roswell Park Cancer Institute scientist who placed the ad…

[Jason: the article goes on for ~6 more pages]

These researchers used anonymity as a device to protect the volunteer (and the Human Genome Project itself to some degree). Seeking protection by obscuring identity is widely practiced in many walks of life. Authors often adopt pen names, or noms de plume, for a variety of reasons. The practice has carried over into blogging as well. Some well-known bloggers, like Tyler Cowen, have “secret blogs”. The Wikipedia entry for nom de plume lists a few motivations behind their use:

  1. “to replace a long, difficult, or uninteresting name”
  2. “avoid overexposure”
  3. “Authors who write in different styles use pen names to avoid confusing regular readers”
  4. “Some female authors have used male pen names to ensure that their works were accepted by publishers or taken seriously” [Jason: Ben Franklin used one so his big brother would take his writing more seriously and publish his work, i.e. to avoid age discrimination]
  5. “A pseudonym may also be used to protect the writer, in cases of exposé books about espionage or crime.”

Many of these seem vaguely applicable to personal genomics. In recent weeks, I’ve been dedicating a fair number of brain cycles to the practice of data de-identification (and re-identification). De-identification is the process of scrubbing data, such as medical records, in order to obscure who the data is about; re-identification is the reverse. The practice is commonly used in medical research to improve the odds of maintaining the confidentiality of patient data. The most obvious identifiers are names and social security numbers. HIPAA describes 18 personal identifiers.
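To make the scrubbing idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names, the `deidentify` function, and the choice of a keyed hash to stand in for the person's identity are mine, not anything HIPAA prescribes, and the set of identifiers is far shorter than the real list.

```python
import hashlib
import hmac

# Hypothetical field names; a real identifier list is much longer.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone"}

def deidentify(record, secret_key):
    """Return a copy of record without direct identifiers, plus a pseudonym."""
    # Keyed hash of the SSN: the same person always gets the same pseudonym,
    # but it cannot be recomputed by someone who lacks the secret key.
    digest = hmac.new(secret_key, record["ssn"].encode("utf-8"),
                      hashlib.sha256).hexdigest()
    scrubbed = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    scrubbed["pseudonym"] = digest[:12]
    return scrubbed

# Made-up example record.
record = {"name": "Jane Doe", "ssn": "000-00-0000",
          "zip": "14203", "diagnosis": "asthma"}
print(deidentify(record, b"keep-this-key-private"))
```

Note that the zip code and diagnosis survive the scrub. Fields like these, left behind, are exactly what re-identification works from.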

A Nom de Ome might be considered one type of de-identification, insofar as it might be a practical means of obscuring the owner or source of a DNA sequence (although it is a very weak method: remember the 15-year-old boy who tracked down his “anonymous” sperm donor father?).

A real genome with a fake name is a Nom de Ome; I think that works. What about a fake genome with a real name? What is that called? A pseudome? A pseudogenome? A genomequin?

I could imagine someone putting a fake genome sequence out on the web to obscure a real genome sequence. Or someone might put a bunch of fake genome sequences on the web, which is sort of like a denial of service (DoS) attack, except that the goal is to deny identity and it is not necessarily as insidious as an attack. A Google search for Jason Bobe’s genome might turn up 1000 different genomes. Which one is real?

One area where things start to get dicey is the practice of fabricating a human genome sequence completely, calling it real, saying that it belongs to you, and fooling a researcher into believing you. Such a practice might create a serious problem for genomic research if there aren’t good authentication practices in place. What if NCBI’s GenBank were filled with thousands or millions of human genome sequences that were fake? Not good.

Why else might someone use a Nom de Ome? What other implications would these practices have? Does anyone know of a good social history of the use of pen names by authors? What would a sudden spike in the use of pen names, in newspaper editorials for example, be an indicator of?

The man from Buffalo described above never had a choice in the matter of using a Nom de Ome. The researchers decided for him that he needed one (and that it would be “human genome reference sequence”). As more people choose to get personal genome sequencing, they will need to decide: what do I want to call my genome?

P.S. Does anyone have a scanned image of the advertisement that solicited volunteers for the Human Genome Project, which accompanied the article quoted above (waaay back in the March 23, 1997 Buffalo News)? I would love to see it.


