Work in progress
by Gérald Jean Francis Banon
June 2023
Updated in September 2025
The integrity of digital information is an old problem [1]. One feature that determines information integrity and still deserves special attention for archiving purposes is the insertion of hyperlinks in a Web resource. Such a hyperlink must work forever without the need to be edited again in the Web resource, also called here the source resource.
Currently, a widely used model for a hyperlink in a source resource is the "absolute hyperlink", which has three components: a persistent, location-independent destination resource identifier; the domain name of the identifier resolver that redirects the user to the current location of the requested destination resource; and a URL scheme. Such a hyperlink is usually called a persistent hyperlink because it is based on a persistent, location-independent identifier.
A sample of a persistent hyperlink, found in a journal article published by Elsevier [2], is:
https://doi.org/10.1016/j.rse.2021.112667.
The corresponding hyperlink source code is:
<a href="https://doi.org/10.1016/j.rse.2021.112667" target="_blank">https://doi.org/10.1016/j.rse.2021.112667</a>
In this example:
The URL scheme is: https
The resolver domain name is: doi.org
The destination resource identifier is: 10.1016/j.rse.2021.112667
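The decomposition above can be reproduced mechanically. The following is a minimal sketch using Python's standard urllib.parse module; the function name split_persistent_hyperlink is hypothetical and introduced only for illustration.

```python
from urllib.parse import urlsplit


def split_persistent_hyperlink(url: str) -> dict:
    """Split an absolute persistent hyperlink into its three components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                  # the URL scheme
        "resolver": parts.netloc,                # the resolver domain name
        "identifier": parts.path.lstrip("/"),    # the destination resource identifier
    }


components = split_persistent_hyperlink("https://doi.org/10.1016/j.rse.2021.112667")
print(components)
```

Of the three values returned, only the identifier is expected to stay stable; the other two are the parts of the hyperlink that may have to change over time.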
The disadvantage of the so-called persistent hyperlink in a Web resource (such as the hyperlink above on this page) is that, of its three components, only the destination resource identifier is unlikely to change. On the other hand, there is no guarantee that the resolver's domain name and the scheme (the communication protocol) will remain unchanged forever.
In other words, the integrity of the hyperlink and, consequently, of the Web resource, which depends on the persistence of the resolver domain name and the scheme, cannot be guaranteed in the long term.
The purpose of this note is to illustrate, by means of three examples, the existence of a digital service that overcomes the potential risk to information integrity when using the usual persistent hyperlinks.
The solution adopted here is to use, as URIs [3], some kind of uniform resource name in the hyperlink, rather than some kind of uniform resource locator.
To implement this solution and make the name resolvable, a "relative hyperlink" is used instead of an "absolute hyperlink".
In this way, the result of resolving the relative hyperlink is an absolute URI, specifically a URL, containing in its path another URI.
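The resolution step can be sketched with Python's urllib.parse.urljoin, which applies the standard relative-reference resolution rules. The page URL (archive.example.org) and the exact form of the href value are assumptions made for illustration only; they are not taken from the URLib platform.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical location of the source resource on a data provider.
PAGE_URL = "https://archive.example.org/col/page.html"
# Assumed relative hyperlink carrying a name-like URI.
HREF = "./doi:10.1016/j.rse.2021.112667"

# Resolve the relative hyperlink against the page URL, as a browser would.
absolute = urljoin(PAGE_URL, HREF)

# The directory of the page is the prefix shared by both paths.
base_dir = urlsplit(urljoin(PAGE_URL, ".")).path
# What remains of the resolved path is the embedded URI.
embedded = urlsplit(absolute).path[len(base_dir):]

print(absolute)
print(embedded)
```

The resolved URL inherits its scheme and domain name from the page that contains the hyperlink, so neither of them needs to be written in the hyperlink itself.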
Although the resolver's domain name is made implicit, the proposed solution still relies on the existence of a global resolver. For this reason, the relative hyperlinks presented below are not called fully persistent but almost fully persistent. For a solution that does not necessarily require the use of a global resolver, see [4].
The HTML hyperlinks in this section are said to be "almost fully persistent"† in the sense that each one uses a persistent, location-independent destination resource identifier that is resolved without explicitly mentioning the domain name of the respective global resolver (urlib.net in the first example using IBI, doi.org in the second example using DOI, and n2t.net in the third example using ARK). This property can be verified by looking at the value of the respective href attribute in the source code of this HTML page.
Observation 1: The above "magic" works because this page (the Web resource containing the hyperlinks) has been deposited in an Archive (Digital Repository - Data Provider) hosted on an experimental computational platform called the URLib and is thereby part of what is called the IBI network of Archives. Each of these Archives operates as a data provider specially configured to work with relative hyperlinks.
In this implementation, the information about the URL scheme and the resolver domain name, which the usual persistent hyperlink carries, migrates from the source resource to an appropriate cgi-script running on the data provider.
This way, any possible future changes to this information will have no impact on the source resource and can be easily implemented in such a cgi-script.
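The idea of keeping the scheme and the resolver domain name on the server side can be sketched as follows. This is not the URLib cgi-script, whose code is not shown here; the RESOLVERS table, the redirect_target function, and the "doi:" and "ark:" input forms are hypothetical and serve only to show where a future change would be made.

```python
# Hypothetical lookup table kept on the data provider:
# name scheme -> (URL scheme, resolver domain name, path template).
RESOLVERS = {
    "doi": ("https", "doi.org", "{identifier}"),
    "ark": ("https", "n2t.net", "ark:{identifier}"),
}


def redirect_target(embedded_uri: str) -> str:
    """Build the URL to redirect to for a name-like URI (illustrative only)."""
    name_scheme, sep, identifier = embedded_uri.partition(":")
    key = name_scheme.lower()
    if not sep or key not in RESOLVERS:
        raise ValueError(f"no resolver configured for {embedded_uri!r}")
    url_scheme, domain, path_template = RESOLVERS[key]
    path = path_template.format(identifier=identifier)
    return f"{url_scheme}://{domain}/{path}"


print(redirect_target("doi:10.1016/j.rse.2021.112667"))
```

If a resolver were to change its domain name or its scheme, only the corresponding entry of the table would be edited; every deposited source resource would remain untouched.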
Observation 2: The upn URI scheme still has to be registered with IANA. The acronym upn stands for Uniform Package Name. Initially, this scheme was designed to identify any specified naming system used to identify Archival Information Packages; nevertheless, it can be used to identify any type of naming system in order to turn a local namespace into a global one. To create and register a new UPN namespace ID, point to upn:4CR88AP-:QABCDSTQQW/4DMTTQE and send an e-mail to urlibservice@gmail.com stating the resolver's domain name for the created UPN namespace ID.
‡To be interpreted as a relative hyperlink by the browser, the URI must be preceded by a period (.) followed by a slash mark (/).
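The effect of the leading "./" can be checked with Python's urllib.parse.urljoin, whose behaviour for this case is expected to match that of browsers. The page URL and the "doi:" form of the URI are assumptions made for illustration.

```python
from urllib.parse import urljoin

# Hypothetical location of the page containing the hyperlink.
PAGE = "https://archive.example.org/col/page.html"
# Assumed name-like URI used as the hyperlink target.
URI = "doi:10.1016/j.rse.2021.112667"

# Without the prefix, the text before the colon is read as a URI scheme,
# so the reference is treated as absolute and the page URL is ignored.
without_prefix = urljoin(PAGE, URI)

# With the prefix, the first segment starts with a period, so the whole
# reference is read as a relative path and resolved against the page URL.
with_prefix = urljoin(PAGE, "./" + URI)

print(without_prefix)
print(with_prefix)
```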