Pseudonym Generation

entici distinguishes between internal and external pseudonyms. The internal pseudonym is the default pseudonym, generated whenever a new resource is pseudonymized. Each resource therefore has exactly one internal pseudonym (a 1:1 relation).

In addition, an arbitrary number of external pseudonyms may be generated for one resource. This mechanism enables project-based pseudonymization. Therefore, the relation between resource and external pseudonym is 1:N.

All pseudonyms are generated via a cryptographically strong random number generator (RNG). In the unlikely case of a collision, entici retries with a new random value, so every stored pseudonym is unique (see Collision Handling).

Internal Pseudonyms

An internal pseudonym is always generated when a resource is pseudonymized for the first time. It is a Base64-encoded random string of fixed length 11 and is stored permanently for the lifetime of the resource.
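As a rough illustration of where the length 11 could come from: 8 random bytes (64 bits) encode to 12 Base64 characters including one padding character, so 11 remain once the padding is stripped. The sketch below assumes exactly that; the byte count, the standard (not URL-safe) Base64 alphabet, and the function name `generate_internal_pseudonym` are assumptions for illustration, not confirmed details of entici.

```python
import base64
import secrets


def generate_internal_pseudonym() -> str:
    """Return an 11-character Base64 string built from 8 random bytes.

    Assumption: 8 bytes of input and standard Base64 with the trailing
    '=' padding removed. entici's actual encoding may differ.
    """
    raw = secrets.token_bytes(8)  # cryptographically strong randomness
    encoded = base64.b64encode(raw).decode("ascii")  # 12 chars, ends in '='
    return encoded.rstrip("=")  # 11 chars


if __name__ == "__main__":
    psn = generate_internal_pseudonym()
    print(psn, len(psn))
```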

Internal pseudonyms are returned in the response as an identifier with system urn:fdc:difuture.de:trustcenter.plain and use SECONDARY.

External Pseudonyms

External pseudonyms are generated on demand by specifying a targetSystem in the pseudonymization request. They are created in addition to the internal pseudonym and are specific to the given target system. The same resource can have multiple external pseudonyms — one per target system.

Alphabet Configuration

The format and length of external pseudonyms are configured per target system via the LEVEL_1_PSNS_ALPHABETS environment variable. The value is a semicolon-separated list of entries, where each entry has the form:

<target-system>:<length>:<characters>

Component        Description
target-system    Name of the target system this alphabet applies to
length           Length of the generated pseudonym
characters       Set of characters to draw from during generation

Example:

LEVEL_1_PSNS_ALPHABETS="NUMCodex:20:ACDEFGHJKLMNPQRTUVWX1234567890;test-system:10:0123456789ABCDEF"

This configures two target systems:

  • NUMCodex: pseudonyms of length 20 drawn from a 30-character alphanumeric set that omits the ambiguous letters I, O, B, and S
  • test-system: pseudonyms of length 10 using hexadecimal characters
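To make the entry format concrete, here is a minimal sketch of how a value of LEVEL_1_PSNS_ALPHABETS could be parsed. It is not entici's own parser: the names `AlphabetConfig` and `parse_alphabets` are hypothetical, and it assumes that target system names contain no colon.

```python
from typing import Dict, NamedTuple


class AlphabetConfig(NamedTuple):
    length: int      # length of the generated pseudonym
    characters: str  # characters to draw from


def parse_alphabets(value: str) -> Dict[str, AlphabetConfig]:
    """Parse '<target-system>:<length>:<characters>' entries separated by ';'."""
    configs: Dict[str, AlphabetConfig] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        # Assumption: the target system name itself contains no ':'.
        parts = entry.split(":", 2)
        if len(parts) != 3:
            raise ValueError(f"malformed entry: {entry!r}")
        name, length_text, characters = parts
        length = int(length_text)
        if length <= 0 or not characters:
            raise ValueError(f"invalid length or empty alphabet: {entry!r}")
        configs[name] = AlphabetConfig(length, characters)
    return configs


if __name__ == "__main__":
    example = (
        "NUMCodex:20:ACDEFGHJKLMNPQRTUVWX1234567890;"
        "test-system:10:0123456789ABCDEF"
    )
    for system, cfg in parse_alphabets(example).items():
        print(system, cfg.length, len(cfg.characters))
```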

Character Set Recommendations

Choose a character set that avoids visually ambiguous characters (e.g. 0/O, 1/I/l, B/8, S/5) to reduce transcription errors when pseudonyms are entered manually. A length of 15–20 characters drawn from a set of about 30 distinct characters yields between 30^15 ≈ 1.4 × 10^22 and 30^20 ≈ 3.5 × 10^29 possible pseudonyms, which makes collisions negligible in practice.
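To judge whether a given alphabet and length are large enough, the pseudonym space and a rough collision probability can be estimated. The sketch below uses the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / (2·N)), for n pseudonyms drawn uniformly from N possible values. The function names and the figure of one million pseudonyms are illustrative assumptions, not values taken from entici.

```python
import math


def pseudonym_space(alphabet_size: int, length: int) -> int:
    """Number of distinct pseudonyms: alphabet size to the power of length."""
    return alphabet_size ** length


def collision_probability(count: int, space: int) -> float:
    """Birthday approximation of the chance that `count` uniformly drawn
    pseudonyms contain at least one duplicate."""
    exponent = -count * (count - 1) / (2 * space)
    return -math.expm1(exponent)  # 1 - exp(exponent), accurate for tiny values


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of pseudonyms per target system
    for length in (10, 15, 20):
        space = pseudonym_space(30, length)
        print(length, space, collision_probability(n, space))
```

Note that this is the probability of any collision occurring before retries; with the retry mechanism in place, a collision costs one extra generation attempt and does not produce a duplicate pseudonym.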

Collision Handling

If the RNG produces a pseudonym that already exists in the database, entici automatically retries with a new random value. This retry loop is transparent to the caller. Given a sufficiently large pseudonym space (alphabet size raised to the power of pseudonym length), the probability of a collision is negligible in practice, and the loop terminates in one iteration in the vast majority of cases.
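The retry loop described above can be sketched as follows. This is an illustration under stated assumptions, not entici's implementation: an in-memory set stands in for the database lookup, and the `max_attempts` bound and the name `generate_unique_pseudonym` are hypothetical.

```python
import secrets
from typing import Set


def generate_unique_pseudonym(
    characters: str,
    length: int,
    existing: Set[str],
    max_attempts: int = 10,  # hypothetical safety bound
) -> str:
    """Draw random pseudonyms until one is not yet in `existing`."""
    for _ in range(max_attempts):
        # secrets.choice draws from a cryptographically strong RNG
        candidate = "".join(secrets.choice(characters) for _ in range(length))
        if candidate not in existing:  # stand-in for the database check
            existing.add(candidate)
            return candidate
    raise RuntimeError("no unique pseudonym found; pseudonym space may be too small")


if __name__ == "__main__":
    stored: Set[str] = set()
    alphabet = "ACDEFGHJKLMNPQRTUVWX1234567890"
    print(generate_unique_pseudonym(alphabet, 20, stored))
```

In a real deployment the uniqueness check and the insert would need to be atomic, for example through a unique constraint in the database, so that two concurrent requests cannot store the same value.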