Pseudonym Generation
entici distinguishes between internal and external pseudonyms. The internal pseudonym is the default pseudonym: it is generated whenever a new resource is pseudonymized, so each resource has exactly one internal pseudonym (1:1).
In addition, an arbitrary number of external pseudonyms may be generated for one resource, which enables project-based pseudonymization. The relation between resource and external pseudonym is therefore 1:N.
All pseudonyms are generated via a cryptographically strong random number generator (RNG). In the unlikely case of a collision, a fall-back mechanism is in place to guarantee unique pseudonyms.
Internal Pseudonyms
An internal pseudonym is always generated when a resource is pseudonymized for the first time. It is a Base64-encoded random string of fixed length 11 and is stored permanently for the lifetime of the resource.
Internal pseudonyms are returned in the response as an identifier with system urn:fdc:difuture.de:trustcenter.plain and use SECONDARY.
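To illustrate the format described above, here is a minimal Python sketch of how an 11-character Base64 string can be drawn from a cryptographically strong RNG. It is not entici's actual implementation: the function name, the choice of 8 random bytes, and the URL-safe Base64 variant are assumptions made for the example.

```python
import base64
import secrets


def generate_internal_pseudonym() -> str:
    """Return an 11-character Base64 string drawn from a CSPRNG (illustrative)."""
    # Assumption: 8 random bytes. Base64 encodes 8 bytes as 12 characters,
    # the last of which is '=' padding, so stripping it leaves 11.
    raw = secrets.token_bytes(8)
    # Assumption: URL-safe alphabet ('-' and '_' instead of '+' and '/').
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


print(generate_internal_pseudonym())
```

Eight bytes give 64 bits of randomness, which is where the fixed length of 11 comes from in this sketch.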
External Pseudonyms
External pseudonyms are generated on demand by specifying a targetSystem in the pseudonymization request. They are created in addition to the internal pseudonym and are specific to the given target system. The same resource can have multiple external pseudonyms — one per target system.
Alphabet Configuration
The format and length of external pseudonyms are configured per target system via the LEVEL_1_PSNS_ALPHABETS environment variable. The value is a semicolon-separated list of entries, where each entry has the form:
`<target-system>:<length>:<characters>`

| Component | Description |
|---|---|
| target-system | Name of the target system this alphabet applies to |
| length | Length of the generated pseudonym |
| characters | Set of characters to draw from during generation |
Example:
`LEVEL_1_PSNS_ALPHABETS="NUMCodex:20:ACDEFGHJKLMNPQRTUVWX1234567890;test-system:10:0123456789ABCDEF"`

This configures two target systems:
- NUMCodex: pseudonyms of length 20 using an alphanumeric character set (without ambiguous characters such as I, O, B, S)
- test-system: pseudonyms of length 10 using hexadecimal characters
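The entry format above can be parsed roughly as follows. This is a sketch for readers who want to validate a configuration value before deploying it, not the parser entici uses. The names `AlphabetConfig` and `parse_alphabets` are hypothetical, and the sketch assumes that target-system names contain no colon.

```python
from typing import NamedTuple


class AlphabetConfig(NamedTuple):
    length: int
    characters: str


def parse_alphabets(value: str) -> dict[str, AlphabetConfig]:
    """Parse ';'-separated '<target-system>:<length>:<characters>' entries."""
    configs: dict[str, AlphabetConfig] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        # Assumption: the target-system name contains no ':'; split on the
        # first two colons only.
        parts = entry.split(":", 2)
        if len(parts) != 3:
            raise ValueError(f"malformed entry: {entry!r}")
        target, length_text, characters = parts
        length = int(length_text)  # raises ValueError if not a number
        if length <= 0 or not characters:
            raise ValueError(f"invalid length or empty alphabet: {entry!r}")
        configs[target] = AlphabetConfig(length, characters)
    return configs


example = "NUMCodex:20:ACDEFGHJKLMNPQRTUVWX1234567890;test-system:10:0123456789ABCDEF"
configs = parse_alphabets(example)
for target, cfg in configs.items():
    print(target, cfg.length, len(cfg.characters))
```

In a deployment the value would come from the LEVEL_1_PSNS_ALPHABETS environment variable, for example via `os.environ`.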
Character Set Recommendations
Choose a character set that avoids visually ambiguous characters (e.g. 0/O, 1/I/l, B/8, S/5) to reduce transcription errors when pseudonyms are entered manually. A length of 15–20 characters from a set of ~30 distinct characters provides a large enough pseudonym space to make collisions negligible in practice.
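To put numbers on this recommendation, the following sketch estimates the size of the pseudonym space and the chance of any collision. It assumes pseudonyms are drawn uniformly at random and uses the standard birthday-bound approximation. The function names and the figure of one billion issued pseudonyms are illustrative only.

```python
import math


def pseudonym_space_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy in one uniformly random pseudonym."""
    return length * math.log2(alphabet_size)


def collision_probability(alphabet_size: int, length: int, issued: int) -> float:
    """Approximate probability of at least one collision among `issued` pseudonyms."""
    space = alphabet_size ** length
    # Birthday bound: 1 - exp(-n(n-1) / (2N)). expm1 keeps precision
    # when the resulting probability is very small.
    return -math.expm1(-issued * (issued - 1) / (2 * space))


for length in (15, 20):
    bits = pseudonym_space_bits(30, length)
    p = collision_probability(30, length, 10**9)
    print(f"length {length}: {bits:.1f} bits, collision probability {p:.2e}")
```

With 30 characters, lengths of 15 and 20 correspond to roughly 74 and 98 bits respectively.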
Collision Handling
If the RNG produces a pseudonym that already exists in the database, entici automatically retries with a new random value. This retry loop is transparent to the caller. Given a sufficiently large pseudonym space (alphabet size raised to the power of pseudonym length), the probability of a collision is negligible in practice, and the loop terminates in one iteration in the vast majority of cases.
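The retry behaviour described above can be sketched as follows. This is an illustration rather than entici's code: an in-memory set stands in for the database lookup, and the `max_attempts` bound is an added safeguard that the description above does not mention.

```python
import secrets


def generate_unique_pseudonym(alphabet: str, length: int, existing: set[str],
                              max_attempts: int = 100) -> str:
    """Draw random pseudonyms until one is not yet in `existing`."""
    for _ in range(max_attempts):
        # secrets.choice draws from the operating system's CSPRNG.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if candidate not in existing:
            existing.add(candidate)  # stand-in for persisting to the database
            return candidate
    # Assumption: give up after a bounded number of retries instead of looping forever.
    raise RuntimeError("no free pseudonym found; the pseudonym space may be exhausted")


issued: set[str] = set()
print(generate_unique_pseudonym("ACDEFGHJKLMNPQRTUVWX1234567890", 20, issued))
```

In a real system the existence check and the insert would need to be atomic, for example through a unique constraint in the database, so that two concurrent requests cannot store the same pseudonym.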