Network Working Group C. Lynch
Request for Comments: 2288 Coalition for Networked Information
Category: Informational C. Preston
Preston & Lynch
R. Daniel
Los Alamos National Laboratory
February 1998
Using Existing Bibliographic Identifiers
as
Uniform Resource Names
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1998). All Rights Reserved.
Abstract
A system for Uniform Resource Names (URNs) must be capable of
supporting identifiers from existing widely-used naming systems.
This document discusses how three major bibliographic identifiers
(the ISBN, ISSN and SICI) can be supported within the URN framework
and the currently proposed syntax for URNs.
1. Introduction
The ongoing work of several IETF working groups, most recently in the
Uniform Resource Names working group, has culminated in the development
of a syntax for Uniform Resource Names (URNs). The functional
requirements and overall framework for Uniform Resource Names are
specified in RFC 1737 [Sollins & Masinter] and the specification for
the URN syntax is RFC 2141 [Moats].
As part of the validation process for the development of URNs the
IETF working group has agreed that it is important to demonstrate
that the current URN syntax proposal can accommodate existing
identifiers from well established namespaces. One such
infrastructure for assigning and managing names comes from the
bibliographic community. Bibliographic identifiers function as names
for objects that exist both in print and, increasingly, in electronic
formats. This memo demonstrates the feasibility of supporting three
Lynch, et. al. Informational [Page 1]
RFC 2288 Bibligraphic Identifiers February 1998
representative bibliographic identifiers within the currently
proposed URN framework and syntax.
Note that this document does not purport to define the "official"
standard way of moving these bibliographic identifiers into URNs; it
merely demonstrates feasibility. It has not been developed in
consultation with the standards bodies and maintenance agencies
that oversee the existing bibliographic identifiers. Any actual
Internet standard for encoding these bibliographic identifiers as
URNs will need to be developed in consultation with the responsible
standards bodies and maintenance agencies.
In addition, there are several open questions with regard to the
management and registry of Namespace Identifiers (NIDs) for URNs.
For purposes of illustration, we have used the three NIDs "ISBN",
"ISSN" and "SICI" for the three corresponding bibliographic
identifiers discussed in this document. While we believe this to be
the most appropriate choice, it is not the only one. The NIDs could
be based on the standards body and standard number (e.g.,
"US-ANSI-NISO-Z39.56-1997" rather than "SICI"). Alternatively, one could lump
all bibliographic identifiers into a single "BIBLIOGRAPHIC" name
space, and structure the namespace-specific string to specify which
identifier is being used. Any final resolution of this must wait for
the outcome of namespace management discussions in the working group
and the broader IETF community.
For the purposes of this document, we have selected three major
bibliographic identifiers (national and international) to fit within
the URN framework. These are the International Standard Book Number
(ISBN) [ISO1], the International Standard Serials Number (ISSN)
[NISO1, ISO2, ISO3], and the Serial Item and Contribution Identifier
(SICI) [NISO2]. An ISBN is used to identify a monograph (book). An
ISSN is used to identify serial publications (journals, newspapers)
as a whole. A SICI augments the ISSN in order to identify
individual issues of serial publications, or components within those
issues (such as an individual article, or the table of contents of a
given issue). The ISBN and ISSN are defined in the United States by
standards issued by the National Information Standards Organization
(NISO) and also by parallel international standards issued under the
auspices of the International Organization for Standardization (ISO).
NISO is the ANSI-accredited standards body serving libraries,
publishers and information services. The SICI code is defined by a
NISO document in the United States and does not have a parallel
international standards document at present.
Many other bibliographic identifiers are in common use (for example,
CODEN, numbers assigned by major bibliographic utilities such as OCLC
and RLG, national library numbers such as the Library of Congress
Control Number) or are under development. While we do not discuss
them in this document, many of these will also need to be supported
within the URN framework as it moves to large scale implementation.
The issues involved in supporting those additional identifiers are
anticipated to be broadly similar to those involved in supporting
ISBNs, ISSNs, and SICIs.
2. Identification vs. Resolution
It is important to distinguish between the resource identified by a
URN and the resources that a URN resolver can reasonably return when
attempting to resolve an identifier. For example, the ISSN 0040-781X
identifies the popular magazine "Time" -- all of it, every issue
from the start of publication to the present. Resolving such an
identifier should not result in the equivalent of hundreds of
thousands of pages of text and photos being dumped to the user's
machine. It is more reasonable for ISSNs to resolve to a
navigational system, such as an HTML-based search form, so the user
may select issues or articles of interest. ISBNs and SICIs, on the
other hand, do identify finite, manageably-sized objects, but these
objects may still be large enough that resolution to a hierarchical
system is appropriate.
In addition, the materials identified by an ISSN, ISBN or SICI may
exist only in printed or other physical form, not electronically.
The best that a resolver may be able to offer is information about
where to get the physical resource, such as library holdings or a
bookstore or publisher order form. The URN Framework provides
resolution services that may be used to describe any differences
between the resource identified by a URN and the resource that would
be returned as a result of resolving that URN.
3. International Standard Book Numbers
3.1 Overview
An International Standard Book Number (ISBN) identifies an edition of
a monographic work. The ISBN is defined by the standard
NISO/ANSI/ISO 2108:1992 [ISO1].
Basically, an ISBN is a ten-digit number (actually, the last digit
can be the letter "X" as well, as described below) which is divided
into four variable length parts usually separated by hyphens when
printed. The parts are as follows (in this order):
* a group identifier which specifies a group of publishers, based on
national, geographic or some other criteria,
* the publisher identifier,
* the title identifier,
* and a modulus 11 check digit, using X instead of 10.
The group and publisher number assignments are managed in such a way
that the hyphens are not needed to parse the ISBN unambiguously into
its constituent parts. However, the ISBN is normally transmitted and
displayed with hyphens to make it easy for human beings to recognize
these parts without having to make reference to or have knowledge of
the number assignments for group and publisher identifiers.
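The modulus 11 check digit mentioned above can be illustrated with a
short Python sketch. The weighting scheme (weights 10 down to 2 applied
to the first nine digits) is taken from the ISBN standard rather than
from this memo, and the function names are hypothetical.

```python
DIGITS = "0123456789"


def isbn10_check_digit(first_nine: str) -> str:
    """Return the check character for the first nine digits of an ISBN."""
    if len(first_nine) != 9 or any(c not in DIGITS for c in first_nine):
        raise ValueError("expected exactly nine digits")
    # Weights run from 10 down to 2 across the nine digits.
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    # The check value brings the weighted sum to a multiple of 11.
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """Check a ten-character ISBN, written with or without hyphens."""
    compact = isbn.replace("-", "").upper()
    if len(compact) != 10:
        return False
    body, check = compact[:9], compact[9]
    if any(c not in DIGITS for c in body):
        return False
    return isbn10_check_digit(body) == check
```

Because the hyphens carry no information needed for validation, the
sketch simply discards them before checking.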
3.2 Encoding Considerations and Lexical Equivalence
Embedding ISBNs within the URN framework presents no particular
encoding problems, since all of the characters that can appear in an
ISBN are valid in the identifier segment of the URN. %-encoding, as
described in [Moats], is never needed.
Example: URN:ISBN:0-395-36341-1
For the ISBN namespace, some additional equivalence rules are
appropriate. Prior to comparing two ISBN URNs for equivalence, it is
appropriate to remove all hyphens, and to convert any occurrences of
the letter X to upper case.
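The equivalence rules just described might be sketched in Python as
follows. The function names are hypothetical. Treating the leading
"urn:" and the namespace identifier as case-insensitive follows the
general URN syntax [Moats]; the hyphen and "X" handling are the
namespace-specific rules proposed here.

```python
def normalize_isbn_urn(urn: str) -> str:
    """Return a canonical form of an ISBN URN, for comparison only."""
    # Split into "urn", the namespace identifier, and the remainder.
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or nid.lower() != "isbn":
        raise ValueError("not an ISBN URN: " + urn)
    # Namespace-specific rules: drop hyphens, convert x to upper case.
    nss = nss.replace("-", "").replace("x", "X")
    return "urn:isbn:" + nss


def isbn_urns_equivalent(a: str, b: str) -> bool:
    """Compare two ISBN URNs after normalization."""
    return normalize_isbn_urn(a) == normalize_isbn_urn(b)
```

The normalized string is intended only as a comparison key; it is not
meant to replace the hyphenated form shown to human readers.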
3.3 Additional considerations
The ISBN standard and related community implementation guidelines
define when different versions of a work should be assigned the same
or differing ISBNs. In actuality, however, practice varies somewhat
depending on publisher as to whether different ISBNs are assigned for
paperbound vs. hardbound versions of the same work, electronic vs.
printed versions of the same work, or versions of the same work
distinguished in some other way (e.g., published in the US and in
Europe). The choice of whether to assign a new ISBN or to
reuse an existing one when publishing a revised printing of an
existing edition of a work or even a revised edition of a work is
somewhat subjective. Practice varies from publisher to publisher
(indeed, the distinction between a revised printing and a new edition
is itself somewhat subjective). The use of ISBNs within the URN
framework simply reflects these existing practices. Note that it is
likely that an ISBN URN will often resolve to many instances of the
work (many URLs).
4. International Standard Serials Numbers
4.1 Overview
International Standard Serials Numbers (ISSN) identify a work that is
published on a continued basis in issues; they identify the entire
(often open-ended, in the case of an actively published serial) work. ISSNs
are defined by the international standards ISO 3297:1986 [ISO2] and
ISO/DIS 3297 [ISO3] and within the United States by NISO Z39.9-1992
[NISO1]. The ISSN International Centre is located in Paris and
coordinates a network of regional centers. The National Serials Data
Program within the Library of Congress is the US Center of this
network.
ISSNs have the form NNNN-NNNN, where N is a digit; the last digit may
be an upper case X as the result of the check character calculation.
Unlike the ISBN the ISSN components do not have much structure;
blocks of numbers are passed out to the regional centers and
publishers.
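As an illustration of the NNNN-NNNN form and the check character, the
following Python sketch validates an ISSN. The calculation (weights 8
down to 2 over the first seven digits, modulus 11, with X standing for
10) is taken from the ISSN standard and is not spelled out in this
memo; the function names are hypothetical.

```python
import re

# Four digits, a hyphen, three digits, then a digit or upper case X.
ISSN_FORM = re.compile(r"([0-9]{4})-([0-9]{3})([0-9X])")


def issn_check_character(first_seven: str) -> str:
    """Return the check character for the first seven digits of an ISSN."""
    if len(first_seven) != 7 or any(c not in "0123456789" for c in first_seven):
        raise ValueError("expected exactly seven digits")
    # Weights run from 8 down to 2 across the seven digits.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(issn: str) -> bool:
    """Check the printed form of an ISSN, including its check character."""
    match = ISSN_FORM.fullmatch(issn)
    if match is None:
        return False
    return issn_check_character(match.group(1) + match.group(2)) == match.group(3)
```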
4.2 Encoding Considerations and Lexical Equivalence
Again, there is no problem representing ISSNs in the namespace-
specific string of URNs since all characters valid in the ISSN are
valid in the namespace-specific URN string, and %-encoding is never
required.
Example: URN:ISSN:1046-8188
Supplementary comparison rules are also appropriate for the ISSN
namespace. Just as for ISBNs, hyphens should be dropped prior to
comparison and occurrences of 'x' normalized to uppercase.
4.3 Additional Considerations
The ISSN standard and related community implementation guidelines
specify when new ISSNs should be assigned vs. continuing to use an
existing one. There are some publications where practice within the
bibliographic community varies from institution to institution, such
as annuals or annual conference proceedings. In some cases these are
treated as serials and ISSNs are used, and in some cases they are
treated as monographs and ISBNs are used. For example, SIGMOD Record
volume 24, number 2 (June 1995) contains the Proceedings of the 1995 ACM
SIGMOD International Conference on Management of Data. If you
subscribe to the journal (ISSN 0163-5808) this is simply the June
issue. On the other hand you may have acquired this volume as the
conference proceedings (a monograph) and as such would use the ISBN
0-89791-731-6 to identify the work. There are also varying practices
within the publishing community as to when new ISSNs are assigned due
to the change in the name of a periodical (e.g. Atlantic becomes
Atlantic Monthly); or when a periodical is published both in printed
and electronic versions (e.g. The New York Times). The use of ISSNs
in URNs will reflect these judgments and practices.
5. Serial Item and Contribution Identifiers
5.1 Overview
The standard for Serial Item and Contribution Identifiers (SICI)
codes, which has recently been extensively revised, is defined by
NISO/ANSI Z39.56-1997 [NISO2]. The maintenance agency for the SICI
code is the UnCover Corporation.
SICI codes can be used to identify an issue of a serial, or a
specific contribution (e.g., an article, or the table of contents)
within an issue of a serial. SICI codes are not assigned, they are
constructed based on information about the issue or issue component
in question.
The complete syntax for the SICI code will not be discussed here; see
NISO/ANSI Z39.56-1997 [NISO2] for details. However, an example and
brief review of the major components is needed to understand the
relationship with the ISSN and how this identifier differs from an
ISSN. An example of a SICI code is:
0015-6914(19960101)157:1<62:KTSW>2.0.TX;2-F
The first nine characters are the ISSN identifying the serial title.
The second component, in parentheses, is the chronology information
giving the date the particular serial issue was published. In this
example that date was January 1, 1996. The third component, 157:1,
is enumeration information (volume, number) for the particular issue
of the serial. These three components comprise the "item segment" of
a SICI code. By augmenting the ISSN with the chronology and/or
enumeration information, specific issues of the serial can be
identified. The next segment, <62:KTSW>, identifies a particular
contribution within the issue. In this example we provide the
starting page number and a title code constructed from the initial
characters of the title. Identifiers assigned to a contribution can
be used in the contribution segment if page numbers are
inappropriate. The rest of the identifier is the control segment,
which includes a check character. Interested readers are encouraged
to consult the standard for an explanation of the fields in that
segment.
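The component breakdown described above can be sketched in Python. This
is a deliberately simplified pattern that matches only the overall
shape of the examples in this memo; it does not implement the full
syntax of the standard [NISO2], and the names used are hypothetical.

```python
import re
from typing import NamedTuple


class SiciParts(NamedTuple):
    issn: str          # serial title, e.g. "0015-6914"
    chronology: str    # date of the issue, from inside the parentheses
    enumeration: str   # volume and number, e.g. "157:1"
    contribution: str  # text between "<" and ">", possibly empty
    control: str       # control segment, including the check character


# Assumed shape: ISSN, "(chronology)", enumeration, "<contribution>", control.
SICI_SHAPE = re.compile(
    r"(?P<issn>[0-9]{4}-[0-9]{3}[0-9X])"
    r"\((?P<chronology>[^)]*)\)"
    r"(?P<enumeration>[^<]*)"
    r"<(?P<contribution>[^>]*)>"
    r"(?P<control>.+)"
)


def split_sici(sici: str) -> SiciParts:
    """Split a SICI code into its major components."""
    match = SICI_SHAPE.fullmatch(sici)
    if match is None:
        raise ValueError("unrecognized SICI shape: " + sici)
    return SiciParts(**match.groupdict())
```

Applied to the example above, the sketch yields the ISSN "0015-6914",
the chronology "19960101", the enumeration "157:1", the contribution
"62:KTSW" and the control segment "2.0.TX;2-F".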
5.2 Encoding Considerations and Lexical Equivalence
The character set for SICIs is intended to be email-transport-
transparent, so it does not present major problems. However, all
printable excluded and reserved characters from the URN syntax are
valid in the SICI character set and must be %-encoded.
Example of a SICI for an issue of a journal:
URN:SICI:1046-8188(199501)13:1%3C%3E1.0.TX;2-F
For an article contained within that issue:
URN:SICI:1046-8188(199501)13:1%3C69:FTTHBI%3E2.0.TX;2-4
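The %-encoding used in these examples might be sketched in Python as
follows. The set of characters left unencoded is our reading of the
URN syntax [Moats] (letters, digits and the "other" punctuation class);
every other character is encoded as the %-escaped bytes of its UTF-8
form. The function and constant names are hypothetical.

```python
import string

# Punctuation that the URN syntax allows to appear literally.
URN_OTHER = "()+,-.:=@;$_!*'"
URN_LITERAL = set(string.ascii_letters + string.digits + URN_OTHER)


def sici_to_urn(sici: str) -> str:
    """Embed a SICI code in a URN, %-encoding characters as needed."""
    pieces = []
    for ch in sici:
        if ch in URN_LITERAL:
            pieces.append(ch)
        else:
            # Encode each UTF-8 byte as "%" followed by two hex digits.
            pieces.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "URN:SICI:" + "".join(pieces)
```

In the two examples above, only the angle brackets around the
contribution segment require encoding.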
Equivalence rules for SICIs are not appropriate for definition as
part of the namespace and incorporation in areas such as cache
management algorithms. Such testing is best left to resolver systems, which
try to determine whether two SICIs refer to the same content. Consequently,
we do not propose any specific rules for equivalence testing through
lexical manipulation.
5.3 Additional Considerations
Since the serial is identified by an ISSN, some of the ambiguity
currently found in the assignment of ISSNs carries over into SICI
codes. In cases where an ISSN may refer to a serial that exists in
multiple formats, the SICI contains a qualifier that specifies the
format type (for example, print, microform, or electronic). SICI
codes may be constructed from a variety of sources (the actual issue
of the serial, a citation or a record from an abstracting service)
and, as such, are based on the principle of using all available
information, so there may be multiple SICI codes representing the
same article [NISO2, Appendix D]. For example, one code might be
constructed with access to both chronology and enumeration (that is,
date of issue and volume, issue and page number), another code might
be constructed based only on enumeration information and without
benefit of chronology. Systems that use SICI codes employ complex
matching algorithms to try to match SICI codes constructed from
incomplete information to SICI codes constructed with the benefit of
all relevant information.
6. Security Considerations
This document proposes means of encoding several existing
bibliographic identifiers within the URN framework. This document
does not discuss resolution; thus questions of secure or
authenticated resolution mechanisms are out of scope. It does not
address means of validating the integrity or authenticating the
source or provenance of URNs that contain bibliographic identifiers.
Issues regarding intellectual property rights associated with objects
identified by the various bibliographic identifiers are also beyond
the scope of this document, as are questions about rights to the
databases that might be used to construct resolvers.
7. References
[ISO1] NISO/ANSI/ISO 2108:1992 Information and documentation
-- International standard book number (ISBN)
[ISO2] ISO 3297:1986 Documentation -- International standard
serial numbering (ISSN)
[ISO3] ISO/DIS 3297 Information and documentation --
International standard serial numbering (ISSN) (Revision of ISO
3297:1986)
[Moats] Moats, R., "URN Syntax", RFC 2141, May 1997.
[NISO1] NISO/ANSI Z39.9-1992 International standard serial
numbering (ISSN)
[NISO2] NISO/ANSI Z39.56-1997 Serial Item and Contribution
Identifier
[Sollins & Masinter] Sollins, K., and L. Masinter, "Functional
Requirements for Uniform Resource Names", RFC 1737, December
1994.
8. Authors' Addresses
Clifford Lynch
Executive Director
Coalition for Networked Information
21 Dupont Circle
Washington, DC 20036
EMail: cliff@cni.org
Cecilia Preston
Preston & Lynch
PO Box 8310
Emeryville, CA 94662
EMail: cecilia@well.com
Ron Daniel Jr.
Advanced Computing Lab, MS B287
Los Alamos National Laboratory
Los Alamos, NM, 87545
EMail: rdaniel@acl.lanl.gov
9. Full Copyright Statement
Copyright (C) The Internet Society (1998). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.