Example of three almost fully persistent HTML hyperlinks

Work in progress


by Gérald Jean Francis Banon
June 2023
Updated in September 2025

Introduction

The integrity of digital information is an old problem [1]. One feature that determines information integrity and still deserves special attention for archiving purposes is the insertion of hyperlinks in a Web resource. Such a hyperlink must work forever without the need to be edited again in the Web resource, also called here the source resource.

Currently, a widely used model for a hyperlink in a source resource is the "absolute hyperlink", which has three components: a persistent, location-independent destination resource identifier, the domain name of the identifier resolver that redirects the user to the current location of the requested destination resource, and a URL scheme. Such a hyperlink is usually called a persistent hyperlink because it is based on a persistent, location-independent identifier.

A sample of a persistent hyperlink, found in a journal article published by Elsevier [2], is:
https://doi.org/10.1016/j.rse.2021.112667.
The corresponding hyperlink source code is:
<a href="https://doi.org/10.1016/j.rse.2021.112667" target="_blank">https://doi.org/10.1016/j.rse.2021.112667</a>

In this example:
The URL scheme is: https
The resolver domain name is: doi.org
The destination resource identifier is: 10.1016/j.rse.2021.112667
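
The decomposition above can be reproduced with a few lines of Python. This is only an illustrative sketch using the standard urllib.parse module; the variable names are chosen here for readability and are not part of any hyperlink model.

```python
from urllib.parse import urlsplit

# The persistent hyperlink quoted above.
href = "https://doi.org/10.1016/j.rse.2021.112667"

parts = urlsplit(href)
scheme = parts.scheme                # the URL scheme
resolver = parts.netloc              # the resolver domain name
identifier = parts.path.lstrip("/")  # the destination resource identifier

print("URL scheme:", scheme)
print("Resolver domain name:", resolver)
print("Destination resource identifier:", identifier)
```

Of the three values printed, only the last one is expected to stay stable over time.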

The disadvantage of the so-called persistent hyperlink in a Web resource (such as the hyperlink above on this page) is that, of its three components, only the destination resource identifier is unlikely to change. On the other hand, there is no guarantee that the resolver's domain name and the scheme (the communication protocol) will remain unchanged forever.

In other words, the integrity of the hyperlink and, consequently, of the Web resource, which depends on the persistence of the resolver domain name and the scheme, cannot be guaranteed for the long term.

The purpose of this note is to illustrate, by means of three examples, the existence of a digital service that overcomes the potential risk to information integrity posed by the usual persistent hyperlinks.

The solution adopted here is to use, as URIs [3], some kind of uniform resource name in the hyperlink, rather than some kind of uniform resource locator.

To implement this solution and make the name resolvable, a "relative hyperlink" is used instead of an "absolute hyperlink".

In this way, the result of resolving the relative hyperlink is an absolute URI, specifically a URL, containing in its path another URI.
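
As a rough illustration of this resolution step, the sketch below applies the relative-reference resolution implemented by urljoin in Python's standard urllib.parse module. The base URL http://archive.example/collection/page.html is a hypothetical location for the source resource, assumed here only for the example; it is not the address of any actual Archive.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical location of the source resource (an assumption for illustration).
BASE = "http://archive.example/collection/page.html"
BASE_DIR = "/collection/"

# A relative hyperlink: a URI preceded by "./".
href = "./urn:doi:10.17487/RFC8141"

# Resolve the relative hyperlink against the location of the source resource.
absolute = urljoin(BASE, href)
print(absolute)

# The path of the resulting URL contains the original URI.
embedded = urlsplit(absolute).path[len(BASE_DIR):]
print(embedded)
```

The scheme and the domain name of the resulting URL are inherited from the location of the source resource, so neither has to be written in the hyperlink itself.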

Although the resolver's domain name is made implicit, the proposed solution still relies on the existence of a global resolver. For this reason, the relative hyperlinks presented below are not called fully persistent but almost fully persistent. For a solution that does not necessarily require the use of a global resolver, see [4].

Examples

The HTML hyperlinks in this section are said to be "almost fully persistent" in the sense that each one uses a persistent, location-independent destination resource identifier that is resolved without explicitly mentioning the respective global resolver's domain name (urlib.net in the first example using IBI, doi.org in the second example using DOI, and n2t.net in the third example using ARK). This property can be verified by looking at the value of the respective href attribute in the source code of this HTML page.

  1. Almost fully persistent HTML hyperlink using an IBI identifier (IBI stands for Internet Based Identifier)
    Title of the destination resource: The Internet Based Identifier (IBI) and the IBI Network.
    Available from: upn:4CR88AP:8JMKD3MGP3W34R/44C25PS
    Hyperlink source code:
    <a href="./upn:4CR88AP:J8LNKB5R7W/3NSP3DL" target="_blank">upn:4CR88AP:J8LNKB5R7W/3NSP3DL</a>

    The hyperlink consists of the URI of the destination resource preceded by ./. The string upn is the URI scheme and 4CR88AP is the upn namespace identifier for IBI.

  2. Almost fully persistent HTML hyperlink using a DOI identifier (DOI stands for Digital Object Identifier)
    Title of the destination resource: Uniform Resource Names (URNs).
    Available from: urn:doi:10.17487/RFC8141
    Hyperlink source code:
    <a href="./urn:doi:10.17487/RFC8141" target="_blank">urn:doi:10.17487/RFC8141</a>
    Alternatively, available from: doi:10.17487/RFC8141
    (the name doi has been registered by the Internet Assigned Numbers Authority (IANA) as both a URI scheme and a URN namespace identifier)

  3. Almost fully persistent HTML hyperlink using an ARK identifier (ARK stands for Archival Resource Key)
    Title of the destination resource: The ARK Identifier Scheme.
    Available from: ark:13030/c7cv4br18
    (the name ark has been registered by the Internet Assigned Numbers Authority (IANA) as a URI scheme)
    Hyperlink source code:
    <a href="./ark:13030/c7cv4br18" target="_blank">ark:13030/c7cv4br18</a>
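
The property stated at the beginning of this section, namely that no href value mentions a resolver's domain name, can also be checked programmatically. The following is a minimal sketch using Python's standard html.parser and urllib.parse modules; the class name HrefCollector is hypothetical, and the SOURCE string simply repeats the three hyperlink source codes listed above.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# The three hyperlink source codes of the examples above.
SOURCE = """
<a href="./upn:4CR88AP:J8LNKB5R7W/3NSP3DL" target="_blank">upn:4CR88AP:J8LNKB5R7W/3NSP3DL</a>
<a href="./urn:doi:10.17487/RFC8141" target="_blank">urn:doi:10.17487/RFC8141</a>
<a href="./ark:13030/c7cv4br18" target="_blank">ark:13030/c7cv4br18</a>
"""

collector = HrefCollector()
collector.feed(SOURCE)

for href in collector.hrefs:
    parts = urlsplit(href)
    # A relative hyperlink has neither a scheme nor a domain name of its own.
    is_relative = parts.scheme == "" and parts.netloc == ""
    print(href, "relative:", is_relative)
```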

Observation 1: The above "magic" works because this page (the Web resource containing the hyperlinks) has been deposited in an Archive (Digital Repository - Data Provider) hosted on an experimental computational platform called the URLib and is thereby part of what is called the IBI network of Archives. Each of these Archives operates as a data provider specially configured to work with relative hyperlinks.
In this implementation, the information about the URL scheme and the resolver domain name, which the usual persistent hyperlink carries, migrates from the source resource to an appropriate CGI script running on the data provider.
This way, any future change to this information will have no impact on the source resource and can be easily implemented in the CGI script.
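
To make the idea of Observation 1 more concrete, the sketch below shows one way such a script could map the URI found in the request path to a redirect target. It is not the URLib implementation: the RESOLVERS table, the function name redirect_target, the https scheme and the shape of the target URLs are all assumptions made for illustration. Only the three resolver domain names come from the text above.

```python
# Hypothetical table kept on the data provider, not in the source resource.
# Keys are URI prefixes; values are assumed resolver URL prefixes.
RESOLVERS = {
    "urn:doi:": "https://doi.org/",
    "doi:": "https://doi.org/",
    "ark:": "https://n2t.net/ark:",
    "upn:4CR88AP:": "https://urlib.net/",
}


def redirect_target(uri: str) -> str:
    """Return the absolute URL to which a request for `uri` could be redirected."""
    for prefix, base in RESOLVERS.items():
        if uri.startswith(prefix):
            # Keep the identifier, replace the prefix by the resolver URL prefix.
            return base + uri[len(prefix):]
    raise ValueError(f"no resolver configured for {uri!r}")


print(redirect_target("urn:doi:10.17487/RFC8141"))
print(redirect_target("ark:13030/c7cv4br18"))
```

Under these assumptions, a change of scheme or of resolver domain name would only require editing the table, while every deposited source resource stays untouched.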

Observation 2: The upn URI scheme still has to be registered with IANA. The acronym upn stands for Uniform Package Name. Initially, this scheme was designed to identify any specified naming system used to identify Archival Information Packages; nevertheless, it can be used to identify any type of naming system in order to turn a local namespace into a global one. To create and register a new UPN namespace ID, point to upn:4CR88AP-:QABCDSTQQW/4DMTTQE and send an e-mail to urlibservice@gmail.com stating the resolver's domain name for the created UPN namespace ID.

References

[1] WATERS, D. and GARRETT, J. Preserving Digital Information: Report of the Task Force on Archiving of Digital Information. Washington, DC: CLIR, May 1996. Available from: https://www.clir.org/wp-content/uploads/sites/6/pub63watersgarrett.pdf.

[2] MACIEL, D. A.; PAHLEVAN, N.; BARBOSA, C. C. F.; MARTINS, V. S.; SMITH, B.; O'SHEA, R. E.; BALASUBRAMANIAN, S. V.; SARANATHAN, A. M. and NOVO, E. M. L. M. Towards global long-term water transparency products from the Landsat archive. Remote Sensing of Environment, v. 299, p. e113889, Dec. 2023. DOI: <10.1016/j.rse.2023.113889>. Available from: urn:doi:10.1016/j.rse.2023.113889.

[3] BERNERS-LEE, T.; FIELDING, R. and MASINTER, L. Uniform Resource Identifier (URI): Generic Syntax. RFC 3986. Available from: https://datatracker.ietf.org/doc/html/rfc3986.

[4] BANON, G. J. F. Example of two fully persistent HTML hyperlinks. [S.l.] Deposited in the URLib collection, 2023. IBI: . Available from: upn:4CR88AP:QABCDSTQQW/4A86BJH.
 

An "almost fully persistent hyperlink" is necessarily a "relative hyperlink".

To be interpreted as a relative hyperlink by the browser, the URI must be preceded by a period (.) followed by a slash mark (/).
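
The role of the leading ./ can be observed with Python's standard urllib.parse module, which is used here as a stand-in for the reference resolution performed by a browser. The base URL is again a hypothetical location of the source resource, assumed only for this sketch.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical location of the source resource (an assumption for illustration).
BASE = "http://archive.example/collection/page.html"

with_prefix = "./ark:13030/c7cv4br18"
without_prefix = "ark:13030/c7cv4br18"

# Without the prefix, the text before the first colon is read as a URI scheme,
# so the reference is treated as absolute and is left unchanged.
print(repr(urlsplit(without_prefix).scheme))
print(urljoin(BASE, without_prefix))

# With the prefix, no scheme is detected and the reference is resolved
# relative to the location of the source resource.
print(repr(urlsplit(with_prefix).scheme))
print(urljoin(BASE, with_prefix))
```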