Example of two almost fully persistent HTML hyperlinks

by Gérald Jean Francis Banon
June 2023
Updated in June 2025

Introduction

The integrity of digital information is an old problem [1]. One feature that determines information integrity and still deserves special attention for archiving purposes is the insertion of hyperlinks in a Web resource. Such a hyperlink must work forever without the need to be edited again in the Web resource, also called here the source resource.

Currently, a widely used model for a hyperlink in a source resource is the "absolute hyperlink", which has three components: a persistent, location-independent destination resource identifier; the domain name of the identifier resolver, which redirects the user to the current location of the requested destination resource; and a URL scheme. Such a hyperlink is usually called a persistent hyperlink because it is based on a persistent, location-independent identifier.

A sample of a persistent hyperlink, found in a journal article published by Elsevier [2], is:
https://doi.org/10.1016/j.rse.2021.112667.
The corresponding hyperlink source code is:
<a href="https://doi.org/10.1016/j.rse.2021.112667" target="_blank">https://doi.org/10.1016/j.rse.2021.112667</a>

In this example:
The URL scheme is: https
The domain name is: doi.org
The destination resource identifier is: 10.1016/j.rse.2021.112667
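
The decomposition above can be reproduced mechanically. The following is a minimal Python sketch using the standard library function urllib.parse.urlsplit; the variable names are illustrative only and do not come from the cited article.

```python
from urllib.parse import urlsplit

# The persistent hyperlink quoted above.
href = "https://doi.org/10.1016/j.rse.2021.112667"
parts = urlsplit(href)

scheme = parts.scheme                 # the URL scheme
domain = parts.netloc                 # the domain name of the identifier resolver
identifier = parts.path.lstrip("/")   # the destination resource identifier

print("The URL scheme is:", scheme)
print("The domain name is:", domain)
print("The destination resource identifier is:", identifier)
```

Of these three values, only the last one is designed to be persistent; the first two are properties of the resolver service.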

The disadvantage of the so-called persistent hyperlink in a Web resource (such as the hyperlink above on this page) is that, of its three components, only the destination resource identifier is unlikely to change. There is no guarantee that the resolver's domain name and the scheme (the communication protocol) will remain unchanged forever.

In other words, the integrity of the hyperlink and, consequently, of the Web resource depends on the persistence of the resolver's domain name and of the scheme, and therefore cannot be guaranteed for the long term.

The purpose of this note is to illustrate, by means of two examples, the existence of a digital service that overcomes the potential risk to information integrity posed by the usual persistent hyperlinks.

The solution consists of identifying the destination resource via a URI [3], and hiding the identifier resolver's domain name and the URL scheme by using a "relative hyperlink" instead of an "absolute hyperlink". Thus, the result of resolving the relative hyperlink is an absolute URI, specifically an absolute URL, containing in its path another URI.
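
As a rough illustration of this resolution step, the sketch below uses Python's urllib.parse.urljoin to resolve the same relative hyperlink against two different base URLs. Both base URLs (archive.example and mirror.example.net, with the path /notes/page.html) are hypothetical stand-ins for wherever the source resource happens to be served; they are assumptions, not actual locations of this page.

```python
from urllib.parse import urljoin

# Relative hyperlink as stored in the source resource; it never changes.
relative_href = "./urn:doi:10.17487/RFC8141"

# Hypothetical locations of the source resource at two points in time.
base_today = "https://archive.example/notes/page.html"
base_later = "http://mirror.example.net/notes/page.html"

# The scheme and domain name are inherited from the base URL, and the
# destination URI ends up inside the path of the resulting absolute URL.
resolved_today = urljoin(base_today, relative_href)
resolved_later = urljoin(base_later, relative_href)

print(resolved_today)
print(resolved_later)
```

In this sketch the scheme and the domain name differ between the two results, while the stored hyperlink and the URI embedded in the path stay the same.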

Although the resolver's domain name is made implicit, the proposed solution still relies on the existence of a global resolver. For this reason, the relative hyperlinks presented below are not called fully persistent but almost fully persistent. For a solution that does not necessarily require the use of a global resolver, see [4].

Examples

The HTML hyperlinks in this section are said to be "almost fully persistent" in the sense that each one uses a persistent, location-independent destination resource identifier that is resolved without explicitly mentioning the domain name of the respective global resolver (urlib.net in the first example using IBI, and doi.org in the second example using DOI). This property can be verified by looking at the value of the respective href attribute in the source code of this HTML page.

  1. Almost fully persistent HTML hyperlink using an IBI identifier (IBI stands for Internet Based Identifier)
    Title of the destination resource: The Internet Based Identifier (IBI) and the IBI Network.
    Available from: upn:4CR88AP:8JMKD3MGP3W34R/44C25PS
    Hyperlink source code:
    <a href="./upn:4CR88AP:8JMKD3MGP3W34R/44C25PS" target="_blank">upn:4CR88AP:8JMKD3MGP3W34R/44C25PS</a>

    The hyperlink consists of the URI of the destination resource preceded by ./. The string upn is the URI scheme and 4CR88AP is the upn namespace identifier for IBI.

  2. Almost fully persistent HTML hyperlink using a DOI identifier (DOI stands for Digital Object Identifier)
    Title of the destination resource: Uniform Resource Names (URNs).
    Available from: urn:doi:10.17487/RFC8141
    Hyperlink source code:
    <a href="./urn:doi:10.17487/RFC8141" target="_blank">urn:doi:10.17487/RFC8141</a>
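
The two source-code lines above follow the same pattern: the href value is the destination URI preceded by ./ and the link text is the URI itself. The following Python sketch builds such an anchor element from a URI; the function name almost_persistent_anchor is a hypothetical helper introduced only for illustration and is not part of the URLib platform.

```python
from html import escape


def almost_persistent_anchor(uri: str) -> str:
    """Return HTML source for a relative hyperlink to `uri` (hypothetical helper)."""
    # The "./" prefix makes the reference relative instead of absolute.
    href = "./" + uri
    return '<a href="{0}" target="_blank">{1}</a>'.format(
        escape(href, quote=True),  # escape characters that are special in attributes
        escape(uri),               # escape characters that are special in text
    )


ibi_link = almost_persistent_anchor("upn:4CR88AP:8JMKD3MGP3W34R/44C25PS")
doi_link = almost_persistent_anchor("urn:doi:10.17487/RFC8141")

print(ibi_link)
print(doi_link)
```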

Observation 1: The above "magic" works because this page (the Web resource containing the hyperlinks) has been deposited in an Archive (Digital Repository - Data Provider) hosted on a special computational platform called the URLib, and is thereby part of what is called the IBI network of Archives. Each of these Archives operates as a data provider specially configured to work with relative hyperlinks.

Observation 2: When coded in HTML, the two hyperlinks above work with Firefox, Chrome and Edge. The only downside at present is that, when coded in PDF, they work only with Firefox.

Observation 3: The upn URI scheme has not yet been registered with IANA. The acronym upn stands for Uniform Package Name. Initially, this scheme was designed to identify Archival Information Packages; nevertheless, it can be used to identify any type of resource. To create and register a new UPN namespace ID, point to upn:4CR88AP-:QABCDSTQQW/4DMTTQE and send an e-mail to urlibservice@gmail.com stating the resolver's domain name for the created UPN namespace ID.

References

[1] WATERS, D. and GARRETT, J. Preserving Digital Information: Report of the Task Force on Archiving of Digital Information. Washington, DC: CLIR, May 1996. Available from: https://www.clir.org/wp-content/uploads/sites/6/pub63watersgarrett.pdf.

[2] MACIEL, D. A.; PAHLEVAN, N.; BARBOSA, C. C. F.; MARTINS, V. S.; SMITH, B.; O'SHEA, R. E.; BALASUBRAMANIAN, S. V.; SARANATHAN, A. M. and NOVO, E. M. L. M. Towards global long-term water transparency products from the Landsat archive. Remote Sensing of Environment, v. 299, p. e113889, Dec. 2023. DOI: <10.1016/j.rse.2023.113889>. Available from: urn:doi:10.1016/j.rse.2023.113889.

[3] BERNERS-LEE, T.; FIELDING, R. and MASINTER, L. Uniform Resource Identifier (URI): Generic Syntax. RFC 3986. Available from: https://datatracker.ietf.org/doc/html/rfc3986#section-3.3.

[4] BANON, G. J. F. Example of two fully persistent HTML hyperlinks. [S.l.] Deposited in the URLib collection, 2023. IBI: . Available from: upn:4CR88AP:QABCDSTQQW/4A86BJH.
 

An "almost fully persistent hyperlink" is necessarily a "relative hyperlink".

To be interpreted as a relative hyperlink by the browser, the URI must be preceded by a period (.) followed by a slash (/).
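
The need for the ./ prefix follows from URI syntax: a reference whose first path segment contains a colon would otherwise be read as starting with a scheme name. The sketch below illustrates this with Python's urllib.parse, which is used here only as a stand-in for a browser's URL parser; the base URL https://archive.example/page.html is a hypothetical location of the source resource.

```python
from urllib.parse import urljoin, urlsplit

uri = "upn:4CR88AP:8JMKD3MGP3W34R/44C25PS"
base = "https://archive.example/page.html"  # hypothetical location of this page

# Without the prefix, the text before the first colon is parsed as a scheme,
# so the reference is treated as absolute and the base URL is ignored.
scheme_without_prefix = urlsplit(uri).scheme
resolved_without_prefix = urljoin(base, uri)

# With the prefix, no scheme is detected and the reference is resolved
# relative to the base URL.
scheme_with_prefix = urlsplit("./" + uri).scheme
resolved_with_prefix = urljoin(base, "./" + uri)

print(repr(scheme_without_prefix), resolved_without_prefix)
print(repr(scheme_with_prefix), resolved_with_prefix)
```

Only the prefixed form produces an absolute URL that borrows its scheme and domain name from the page containing the hyperlink.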