Example of three almost fully persistent HTML hyperlinks

Work in progress


by Gérald Jean Francis Banon
June 2023
Updated in October 2025

Introduction

The integrity of digital information is an old problem [1]. One feature that determines information integrity and still deserves special attention for archiving purposes is the insertion of hyperlinks in a Web resource. Such a hyperlink must work forever without the need to be edited again in the Web resource, also called here the source resource.

Currently, a widely used model for a hyperlink in a source resource is the "absolute hyperlink", which has three components: a persistent, location-independent identifier of the destination resource; the domain name of the identifier resolver, which redirects the user to the current location of the requested destination resource; and a URL scheme. Such a hyperlink is usually called a persistent hyperlink because it is based on a persistent, location-independent identifier.

A sample of a persistent hyperlink, found in a journal article published by Elsevier [2], is:
https://doi.org/10.1016/j.rse.2021.112667.
The corresponding hyperlink source code is:
<a href="https://doi.org/10.1016/j.rse.2021.112667" target="_blank">https://doi.org/10.1016/j.rse.2021.112667</a>

In this example:
The URL scheme is: https
The resolver domain name is: doi.org
The destination resource identifier is: 10.1016/j.rse.2021.112667
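
The decomposition above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name decompose and the keys of the returned dictionary are assumptions made for this example and are not part of any resolver API.

```python
from urllib.parse import urlsplit


def decompose(hyperlink):
    """Split a persistent hyperlink into its three components."""
    parts = urlsplit(hyperlink)
    return {
        "scheme": parts.scheme,                # e.g., https
        "resolver": parts.netloc,              # e.g., doi.org
        "identifier": parts.path.lstrip("/"),  # e.g., 10.1016/...
    }


components = decompose("https://doi.org/10.1016/j.rse.2021.112667")
print(components)
```

Only the last of the three values is expected to stay stable over time; the first two are the components that the rest of this note removes from the source resource.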

The disadvantage of the so-called persistent hyperlink in a Web resource (such as the hyperlink above on this page) is that, of its three components, only the destination resource identifier is unlikely to change. There is no guarantee, however, that the resolver's domain name and the scheme (the communication protocol) will remain unchanged forever.

In other words, the integrity of such a hyperlink and, consequently, of the Web resource containing it depends on the persistence of the resolver domain name and the scheme, and therefore cannot be guaranteed in the long term.

The purpose of this note is to illustrate, by means of three examples, the existence of a digital service that overcomes the potential risk to information integrity posed by the usual persistent hyperlinks.

The solution adopted here is to use, as URIs [3], some kind of uniform resource global name in a "relative hyperlink", rather than some kind of uniform resource locator in an "absolute hyperlink".

Here, a uniform resource global name consists of a global namespace prefix followed by a name within the scope of that namespace. The prefix is global in the sense that it may have been registered with the Internet Assigned Numbers Authority (IANA).

Finally, the Archive (Digital Repository) hosting the source resource is adapted to also serve as a proxy resolver in the sense that it directs, based on the namespace prefix, the client's resolution request to the appropriate resolver and ultimately returns the URL of the destination resource to the client's browser for redirection.

Although the resolver's domain name is made implicit, the proposed solution still relies on the existence of a global resolver. For this reason, the relative hyperlinks presented below are called almost fully persistent rather than fully persistent. For a solution that does not necessarily require a global resolver, see [4].

Examples

The HTML hyperlinks in this section are said to be "almost fully persistent" in the sense that each one uses a persistent, location-independent destination resource identifier that is resolved without explicitly mentioning the respective resolver's domain name (n2t.net in the first example, using ARK; doi.org in the second, using DOI; and urlib.net in the third, using IBI). This property can be verified by looking at the value of the respective href attribute in the source code of this HTML page.

  1. Almost fully persistent HTML hyperlink using an ARK identifier (ARK stands for Archival Resource Key)
    Title of the destination resource: The ARK Identifier Scheme.
    Available from: ark:13030/c7cv4br18
    (the name ark has been registered by IANA as a URI scheme)
    Hyperlink source code:
    <a href="./ark:13030/c7cv4br18" target="_blank">ark:13030/c7cv4br18</a>
    The hyperlink consists of the URI of the destination resource preceded by ./.
  2. Almost fully persistent HTML hyperlink using a DOI identifier (DOI stands for Digital Object Identifier)
    Title of the destination resource: The QAA-RGB: ... ACOLITE.
    Available from: urn:doi:10.1016/j.rse.2021.112667
    Hyperlink source code:
    <a href="./urn:doi:10.1016/j.rse.2021.112667" target="_blank">urn:doi:10.1016/j.rse.2021.112667</a>
    The string urn is the URI scheme and doi is the URN namespace identifier for DOI.
    Alternatively, the above resource is available from: doi:10.1016/j.rse.2021.112667
    (the name doi has been registered by IANA as both a URI scheme and a URN namespace identifier).

  3. Almost fully persistent HTML hyperlink using an IBI identifier (IBI stands for Internet Based Identifier)
    Title of the destination resource: The Internet Based Identifier (IBI) and the IBI Network.
    Available from: ibi:8JMKD3MGP3W34R/44C25PS
    Hyperlink source code:
    <a href="./ibi:8JMKD3MGP3W34R/44C25PS" target="_blank">ibi:8JMKD3MGP3W34R/44C25PS</a>
[[ if [[catch {Execute {urlib.net 800} [[list ReturnArrayValueFromFile urlib.net/www/2025/09.08.04.08 namespacePrefixXresolverURLarray.tcl namespacePrefixXresolverURLarray ark]] 1} arkResolverURL]] { global errorInfo set errorInfo } if [[catch {Execute {urlib.net 800} [[list ReturnArrayValueFromFile urlib.net/www/2025/09.08.04.08 namespacePrefixXresolverURLarray.tcl namespacePrefixXresolverURLarray urn:doi]] 1} doiResolverURL]] { global errorInfo set errorInfo } if [[catch {Execute {urlib.net 800} [[list ReturnArrayValueFromFile urlib.net/www/2025/09.08.04.08 namespacePrefixXresolverURLarray.tcl namespacePrefixXresolverURLarray ibi]] 1} ibiResolverURL]] { global errorInfo set errorInfo } ]]

Observation: The above examples work because this page (the Web resource containing the hyperlinks) has been deposited in an Archive hosted on an experimental computational platform called the URLib. On this platform, each Archive serves as a proxy resolver.

Table 1 illustrates how the Archive $localSite works as a proxy resolver to resolve the above three examples. As with any resolver, the URL path component (e.g., ark:13030/c7cv4br18; see Line 1 of Table 1) is the URI of the destination resource.

The character string of this URI up to the last colon (:) (e.g., ark) is the namespace prefix that identifies a possible resolver for the resolution of the destination resource identifier (e.g., 13030/c7cv4br18).
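
The splitting rule just described can be sketched in Python as follows. The function name split_uri is hypothetical and chosen only for this illustration; the rule itself (split at the last colon) is the one stated above.

```python
def split_uri(uri):
    """Split a URI at its last colon into (namespace prefix, identifier)."""
    prefix, colon, identifier = uri.rpartition(":")
    if not colon:
        # No colon at all: the string carries no namespace prefix.
        raise ValueError(f"no namespace prefix in {uri!r}")
    return prefix, identifier


for uri in (
    "ark:13030/c7cv4br18",
    "urn:doi:10.1016/j.rse.2021.112667",
    "ibi:8JMKD3MGP3W34R/44C25PS",
):
    print(split_uri(uri))
```

Splitting at the last colon rather than the first is what yields urn:doi, and not just urn, as the prefix of the second example.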

In turn, a possible resolver (e.g., $arkResolverURL for ark — see Line 2 of Table 1) is assigned to each namespace prefix (e.g., ark) thus forming a mapping called "Prefix-Resolver".

Finally, based on the output of the prefix-resolver mapping for a given namespace prefix, the proxy resolver triggers the appropriate resolver (e.g., see Line 3 of Table 1).

In this implementation, the part of the usual persistent hyperlink formed by the URL scheme and the resolver domain name migrates from the source resource to the prefix-resolver mapping, which is accessed by a CGI script running in the Archive.

This way, the integrity of the source resource can be preserved over time, even if that information changes in the future. Such changes have no impact on the source resource and can easily be reflected in the prefix-resolver mapping.
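
The proxy resolver operation can be sketched in Python as follows. This is a minimal illustration, not the CGI script running in the Archive: the function name global_resolver_url is hypothetical, and the resolver URLs in PREFIX_RESOLVER are assumed values built from the domain names mentioned in the Examples section (the actual mapping values are obtained from the Archive and may differ).

```python
# Assumed prefix-resolver mapping; the URL schemes in particular are guesses.
PREFIX_RESOLVER = {
    "ark": "https://n2t.net",
    "urn:doi": "https://doi.org",
    "ibi": "http://urlib.net",
}


def global_resolver_url(uri, mapping):
    """Return the global resolver URL to which the proxy forwards a URI."""
    prefix, colon, _identifier = uri.rpartition(":")
    if not colon or prefix not in mapping:
        raise LookupError(f"no resolver registered for {uri!r}")
    # As in Line 3 of Table 1: resolver URL, a slash, then the whole URI.
    return f"{mapping[prefix]}/{uri}"


before = global_resolver_url("ark:13030/c7cv4br18", PREFIX_RESOLVER)

# A future change of resolver only touches the mapping, never the hyperlink.
updated = dict(PREFIX_RESOLVER, ark="https://resolver.example")  # hypothetical
after = global_resolver_url("ark:13030/c7cv4br18", updated)

print(before)
print(after)
```

The URI passed to the function, which is the only part stored in the source resource, is identical in both calls; only the mapping differs.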

Table 1 - Proxy resolver operation

 Proxy resolver          http://$localSite/ark:13030/c7cv4br18
 Mapping value for ARK   ark        $arkResolverURL
 ARK global resolver     $arkResolverURL/ark:13030/c7cv4br18
 Proxy resolver          http://$localSite/urn:doi:10.1016/j.rse.2021.112667
 Mapping value for DOI   urn:doi    $doiResolverURL
 DOI global resolver     $doiResolverURL/urn:doi:10.1016/j.rse.2021.112667
 Proxy resolver          http://$localSite/ibi:8JMKD3MGP3W34R/44C25PS
 Mapping value for IBI   ibi        $ibiResolverURL
 IBI global resolver     $ibiResolverURL/ibi:8JMKD3MGP3W34R/44C25PS

References

[1] WATERS, D. and GARRETT, J. Preserving Digital Information: Report of the Task Force on Archiving of Digital Information. Washington, DC: CLIR, May 1996. Available from: https://www.clir.org/wp-content/uploads/sites/6/pub63watersgarrett.pdf.

[2] MACIEL, D. A.; PAHLEVAN, N.; BARBOSA, C. C. F.; MARTINS, V. S.; SMITH, B.; O'SHEA, R. E.; BALASUBRAMANIAN, S. V.; SARANATHAN, A. M. and NOVO, E. M. L. M. Towards global long-term water transparency products from the Landsat archive. Remote Sensing of Environment, v. 299, p. e113889, Dec. 2023. DOI: <10.1016/j.rse.2023.113889>. Available from: urn:doi:10.1016/j.rse.2023.113889.

[3] BERNERS-LEE, T.; FIELDING, R. and MASINTER, L. Uniform Resource Identifier (URI): Generic Syntax. RFC 3986. Available from: https://datatracker.ietf.org/doc/html/rfc3986.

[4] BANON, G. J. F. Example of two fully persistent HTML hyperlinks. [S.l.] Deposited in the URLib collection, 2023. IBI: . Available from: ibi:QABCDSTQQW/4A86BJH.
 

An "almost fully persistent hyperlink" is necessarily a "relative hyperlink".

To be interpreted by the browser as a relative hyperlink, the URI must be preceded by a period (.) followed by a slash (/).
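
The effect of the leading ./ can be checked with the reference resolution implemented in Python's standard library, which approximates what a browser does with an href value. In this sketch the base URL http://archive.example/page.html is a hypothetical location of the source resource, and browser_target is a name chosen for this illustration.

```python
from urllib.parse import urljoin

# Hypothetical URL of the source resource deposited in an Archive.
BASE = "http://archive.example/page.html"


def browser_target(href, base=BASE):
    """Resolve an href value against the URL of the page that contains it."""
    return urljoin(base, href)


# With "./" the value is a relative reference, so it is resolved against the
# Archive and the request reaches the proxy resolver.
with_prefix = browser_target("./ark:13030/c7cv4br18")

# Without "./" the text before the first colon is read as a URI scheme, so
# the value is treated as an absolute URI and the Archive is never contacted.
without_prefix = browser_target("ark:13030/c7cv4br18")

print(with_prefix)
print(without_prefix)
```

The first result has the same form as the proxy resolver entries of Table 1, while the second is returned unchanged.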