Introduction

This ONIX for Books Implementation and Best Practice Guide is intended to be read in conjunction with the ONIX for Books Product Information Format Specification, and section numbering is compatible between the two documents.

This guide documents ‘best practice’, not ‘common practices’, and its aim is to improve the quality of communication using ONIX, industry-wide, by ensuring ONIX users implement the standard in a consistent and effective fashion.

‘Best practice’ is dynamic: as data providers and recipients improve their capabilities, as business models develop, and as market conditions change, the nature and extent of these best practice guidelines are liable to evolve. The guidance in this document is not limited to the data content of individual ONIX elements: it also includes some comments on reasonable expectations for how that data should be treated by both data suppliers and recipients.

Within this Implementation and Best Practice Guide, the ONIX for Books message Header and each Block or Group of a Product record is considered in turn. Each of these sections begins with a boxed summary of the best practice, and is followed by one or more extended and annotated examples, and more detailed discussion of each composite or data element.

Throughout this guide, Reference names are used in text descriptions, diagrams and discussions of best practice. In all cases, equivalent Short tags are equally good practice, and examples given show both alternatives.

It is recognized that this Implementation and Best Practice Guide may be – for some ONIX users – somewhat beyond the current capabilities of their internal IT systems, and initial implementations of ONIX 3.0 may not meet all aspects of this best practice. However, implementors are encouraged to plan to comply with all aspects of best practice as a minimum, even if this plan involves a multi-phase implementation.

This guide does not set out to define a ‘minimum’ set of data elements for valid ONIX – technically, that minimum set is inherent in the ONIX schema, and is all but useless. In reality, the minimum set of elements is the set that meets your business requirements – and real business requirements will inevitably require use of a larger set of data elements than any technical minimum. These best practices go well beyond any technical minimum, and outline a set of data elements that are likely to be the most commercially relevant in a broad range of circumstances.

Omission of any data element from this Implementation and Best Practice Guide does not imply that all data providers should simply omit the data element from their ONIX messages – the uses to which ONIX data is put by data recipients vary widely, and the range of potentially useful data content that might be carried in ONIX messages should be discussed between providers and recipients. For certain types of product and particular trading arrangements, individual data elements may be of great importance, yet may not be covered in this guide. For example, it may be highly advantageous for the publisher to supply details of the product classification using the UNSPSC (UN Standard Product and Service Classification), WCO (World Customs Organization) Harmonized Commodity Coding or another commodity coding scheme to overseas distributors as an indication of the customs or tax status of the product. Equally, this best practice guide does not cover use of the <ReligiousText> composite, which of course would be critical to a publisher or reseller specializing in religious publications. However, this document provides guidance on those data elements that will be found most commonly useful to the broad range of supply chain partners – the ‘core content’ of an ONIX message.

Although ONIX 2.1 is considered a legacy format, this Guide can provide useful help for 2.1 in areas where there are no differences, or only modest changes, between 2.1 and 3.0. An appendix lists these changes.

Introduction to Release 3.0 revision 1 (3.0.1)

This revision of the Implementation and Best Practice Guide includes a number of updates taking into account additions made in Release 3.0 revision 1 (3.0.1) of the ONIX for Books Product Information Format Specification. Many of these changes are intended primarily for use with East Asian writing systems, or with multilingual metadata, but other additions are of more general use. The earliest release of the codelists suitable for use with version 3.0.1 is Issue 16.

Introduction to Release 3.0 revision 2 (3.0.2)

This revision of the Guide includes a number of updates taking into account additions made in Release 3.0 revision 2 (3.0.2) of the ONIX for Books Product Information Format Specification. Some data elements introduced in version 3.0.2 require use of Codelists Issue 24 or later.

Document history

14 April 2011
Working draft.
30 June 2011
Initial release.
26 July 2011
Added appendix listing key changes between ONIX 2.1 and ONIX 3.0.
Added stronger recommendation to use UTF‑8 encoding, and a note on omission of BOM with UTF‑16 encodings.
Added advice against including a DOCTYPE declaration or any reference to local schema files in transmitted messages.
Added note on use of the textformat attribute, and either CDATA or escaping of < characters in HTML.
Added guidance on preferred format for <SentDateTime> and datestamp.
Added note on sequence numbering for <UnnamedPersons> element.
Minor changes for clarity and style.
22 August 2011
Added note about use cases.
Added appendix checklist for new data feeds.
Added ‘At a glance’ diagrams.
Fixed section P.19.14.
Minor changes for clarity and style.
13 Sept 2011
Added appendix on codelists and internationalization.
Added notes about delivery of collateral resources.
Correction to <Header> At a glance diagram.
Correction to <Tax> composite example.
Minor changes for clarity and style.
4 Oct 2011
Added diagram illustrating semantics of List 64.
Minor changes for clarity and style.
21 Oct 2011
Added example illustrating use of <ProductFormFeature> for describing accessibility features of an e-book.
Minor updates to take into account other additional codes in Codelists Issue 15.
Added example showing use of publisher and publishing group.
Update of XML namespace used by ONIX messages.
Added notes on governance.
Minor changes for clarity and style.
13 Dec 2011
Corrections to Collection and Sender ‘At a glance’ diagrams.
Added appendix on development and support lifecycle.
Minor changes for clarity and style.
27 Jan 2012
Revised release, to accompany Release 3.0 revision 1 of ONIX for Books.
Added notes and examples related to provision of textual metadata in multiple ‘parallel’ languages, provision of transliterated names, inclusion of glosses in Chinese and Japanese, and provision of phonetic information for sorting names and titles in non-alphabetic writing systems.
Added notes relating to the <CollectionSequence> composite, the <SequenceNumber> within the <TitleElement> composite and the <ProductContact> composite.
Modified ‘At a glance’ diagrams to include new data elements introduced in ONIX 3.0.1.
Added examples showing wholesale and temporary ‘special offer’ pricing, and provision of multiple publishing dates for an e-book.
Minor changes for clarity and style.
10 Feb 2012
Added notes to some ‘At a glance’ diagrams.
Change to appendix to reflect the ONIX International Steering Committee decision and EDItEUR’s announcement that ONIX 2.1 support will be reduced at the end of 2014.
Minor changes for clarity and style.
18 Apr 2012
Added example showing versioned e-publication file format.
Updated best practice for ISBN-A.
Minor changes for clarity and style.
25 Apr 2012
Additions in P.7 to recommend use of the newly-published ISNI.
14 Aug 2012
Added paragraphs in Section 2 on the ‘metadata supply chain’, on the use of EDItX or EDI P&A update messages, and on user interfaces.
Added typical schema references and DOCTYPE declarations for validation.
Clarifications in use of the <TitleElement> composite and <ImprintIdentifier>.
Corrected minor errors in examples showing use of <Language> and <NameIDType>.
Added notes on use of rolling and instantaneous datetimes.
Added notes on CI/RI/CX/RX combinations in <Territory>.
Added example showing multiple, qualified prices.
Updated diagrams to reflect recent tightening of schema requirements.
Other minor changes for clarity and style.
19 Oct 2012
Added further clarification on naming of imprints and publishers.
Documented subset of List 65 likely to be applicable to digital products.
Added further clarification of best practices for tax-inclusive and tax-exclusive pricing.
Strengthened advice on use of narrow audience age ranges.
Extended example showing usage constraints in P.3.
Added example of review quote in P.14.
Added Sales outlet ID to example in P.21.
Added note on 2-series GTIN-13s.
Added note on RTL scripts.
Minor updates for clarity and style, and to align with Codelists Issue 19.
Fixed SVG to work around a problem with Firefox 17 (and later).
25 Jan 2013
Added sections on <Barcode> and <ContributorPlace> for North America.
Added section on specifying rental prices using <PriceCondition>.
Added examples of <RelatedWork> for an omnibus edition, and of use of et al.
Added diagrams for common packaging.
CSS improvements for printing.
Other minor changes for clarity and style, and updates to align with Codelists Issue 20.
26 Apr 2013
Added note on overseas territories and dependencies.
Added tables showing associations between <ProductFormDetail> and <ProductFormFeature>, and <ProductForm>.
Expanded notes on best practice with hierarchical subject categorization.
Added note on sort order.
Added example differentiating perpetual licenses and rentals of e-books.
Other minor changes for clarity and style, and updates to align with Codelists Issue 21.
19 Jul 2013
Added notes on encoding of versions, version history, extended XHTML recommended subset to include <dl>.
Added note on relative URIs in <ResourceLink>.
Added note on avoiding special characters in <RecordReference>.
Other minor changes for clarity and style, and updates to align with Codelists Issue 22.
Fixed error in labelling of ‘At a glance’ diagram in P.14.
18 Oct 2013
Added example of use of track count for extent of an audiobook.
Added example of retailer exclusion.
Added section on use of <ProductClassification>.
Other minor updates to align with Codelists Issue 23.
24 Jan 2014
Revised release, to accompany Release 3.0 revision 2 of ONIX for Books.
Added further notes on message sequence numbers, added para and example on <NoProduct/> and modified ‘At a glance’ diagram.
Added section on <EpubLicense>.
Added para on <NoPrefix/>, modified ‘At a glance’ diagram and updated examples.
Added note to <ContributorPlace>, modified ‘At a glance’ diagram and updated examples.
Added para on <PrizeStatement>, modified ‘At a glance’ diagram and updated example.
Modified section on <SalesRestriction>, modified ‘At a glance’ diagram and updated examples.
Added section on <CopyrightStatement>, added ‘At a glance’ diagram and added example.
Added section on <PriceIdentifier>, modified ‘At a glance’ diagram and updated examples.
Added para and example on linked prices in <PriceCondition>.
Added note on contrast between 3.0 handling of sales rights, markets, suppliers and prices, and equivalent in 2.1.
Added appendix sections on character sets and encodings, and on XHTML tagging.
Other minor updates to align with Codelists Issue 24.
29 Mar 2014
Corrected diagrammed cardinality of <AgentRole> in P.25 (it is mandatory).
Corrected second example in P.26.
Added definition of market publication date.
Added para on gaming the keywords.
Added note on conventions for decimal places in <PriceAmount>.
Minor updates to align with Codelists Issue 25.
11 Jul 2014
Added example to illustrate US ‘Common Core’ curriculum alignment in <Subject>.
Added example to illustrate use of <SupplierOwnCoding>.
Added example illustrating rental extensions and upgrades.
Modified ‘At a glance’ diagrams to highlight deprecated elements with pink tint.
Added section on <Stock> composite.
Added examples and diagrams to illustrate use of contributor dates and professional affiliations.
Corrections to ‘At a glance’ diagram showing structure of <ProductClassification> and of Block 6.
Extended notes on whitespace normalization, CDATA and escaping HTML in <BiographicalNote>.
Extended notes on <ProductRelationCode> relationships.
Other minor changes for clarity, and updates to align with Codelists Issue 26.
17 Oct 2014
Added aside on manifestations and works.
Other minor changes and updates to align with Codelists Issue 27.
24 Jan 2015
Added ‘DOI per chapter’ example in <ContentDetail>.
Added notes on digital exclusivity.
Elaborated example showing change of publisher of a product.
Noted that there is no longer a need to remove the xmlns attribute from <ONIXMessage> prior to DTD validation.
Added references to new ONIX Acknowledgement Message.
Other minor changes and updates to align with Codelists Issue 28 and the sunset date for ONIX 2.1.
26 Mar 2015
Added example of windowing around subscription services in P.24.
Added some text on Complexity measures.
Added appendix section on integer and real number data elements.
Other minor changes and updates to align with Codelists Issue 29.
29 Jul 2015
Added example of back cover copy as separately-downloadable supporting resource.
Added example of POD information, including order time.
Added example of revenue-share market using Unpriced item type.
Other minor changes and updates to align with Codelists Issue 30.

The ONIX for Books framework

ONIX for Books provides a standardized framework for transfer of rich bibliographic and product information between computer systems within the book and e-book supply chains.

The framework consists of an XML-based message specification – the ONIX for Books Product Information Format Specification (‘the Specification’), a message acknowledgement specification (‘the Acknowledgement Specification’), and the accompanying XML schemas – plus a set of regularly-updated controlled vocabularies (the ‘Codelists’) used in conjunction with the specifications, and various guidelines for implementation and best practice including this document (‘the Guide’). In addition, some national book trade bodies provide further guidance or certification schemes to denote compliance with the framework and with the wider needs of the supply chain.

The ONIX for Books documentation and various XML software tools are downloadable from the EDItEUR website.

Ongoing development and maintenance of the ONIX framework is managed by EDItEUR, in response to business requirements identified by its stakeholders. Although EDItEUR is a membership-based organization, ‘stakeholders’ encompasses all participants in the global book and e-book supply chains.

EDItEUR has established a governance model for ONIX that ensures a balance between responsiveness to changing business requirements and stability of the standards. ONIX for Books standards are developed and maintained in consultation with a network of ONIX National Groups – essentially committees of ONIX users and other stakeholders such as trade associations – that have been set up in many countries. National Groups are each represented on the ONIX International Steering Committee (ISC), which meets twice per year at the London and Frankfurt International Book Fairs. The ISC provides overall direction for development of the framework. Terms of reference for National Groups and for the ISC can be found on the EDItEUR website.

Change requests or proposals for new developments can be submitted by any stakeholder, either via a National Group or direct to EDItEUR. Most such requests can be met through supplementing the ONIX codelists. More rarely, a change to the Specification and associated XML schemas is required. In either case, requests are reviewed by EDItEUR and the National Groups, and major developments must be approved by the ISC. After approval, major developments are carried through in line with the ONIX development and support lifecycle.

Codelists are revised and extended on a regular basis (three or four times per year), and from time to time, minor corrections and revisions are incorporated into documentation and schemas. Such updates are announced via the EDItEUR website and the ONIX_implement e-mail listserver.

ONIX is founded partly on principles developed within the <indecs> project and upon the EPICS data dictionary, but is firmly rooted in real-world use cases and the practices of book supply chains in many countries. Typical use cases for ONIX messages include:

Although all of these supply chain partners are likely to maintain databases of product information, ONIX is primarily a message format, and is not in itself a specification for a database. ONIX specifies a range of key data elements or fields that might be stored in such a database. But the data structures defined by the ONIX specification are designed for communication of data. They are not necessarily a good match for either storage or management of product data. Databases optimized for management of bibliographic data would ideally have a more normalized (relational) or hierarchical structure to reduce the duplication of data (and thus the duplication of data management effort). Nonetheless, the ONIX specification is often used to inform the design of a database schema. And while ONIX codelists are often used as controlled vocabularies or lists of options in a database-backed application, the way ONIX encodes options (usually as numbers drawn from the codelists) is not ideal for presentation.

Implicit in the design of ONIX is the idea of a ‘metadata supply chain’. Product information is created alongside the product itself (though not necessarily by the same staff) and flows to the eventual retailer, possibly via one or more intermediary organizations – much as the products themselves do. But the metadata may take a different route, and may be managed and enhanced along the chain: no one department at the publisher, and no single organization in the chain, controls all the metadata elements. For example, within a publisher, product information may be drawn from editorial, production, advertising and promotion, the contracts or rights departments, and from sales. The distributor or wholesaler may contribute further information, and the eventual distribution of the data is sometimes accomplished by a data aggregator or other intermediary. Complete, accurate and timely metadata is the result of good business process and clear ‘ownership’ of each data element, and collating the metadata efficiently from the disparate sources may depend on a high level of process and application integration within an organization.

The metadata supply chain may be complex: publishers may need to provide data direct to retailers, indirectly to retailers via intermediaries such as data aggregators, distributors and wholesalers, and to other service providers such as e-book conversion vendors and online marketing organizations.

And because ONIX is about communication – unambiguous and consistent computer-to-computer communication suitable to support automation of business processes – precise semantics are important. Each party in the metadata supply chain must understand the meaning of the data supplied. A price is not simply ‘the price’: buyers and sellers must know whether that price includes or excludes relevant taxes, whether it is a fixed price or may be reduced by the retailer, and whether it applies to all classes of customer or is a special price for one specific class (such as for schools). This semantic precision should allow a single ONIX data feed to meet the needs of multiple recipients – if one party wants the extent of a book expressed as ‘the number of physical pages including unnumbered pages and blanks’ and another demands the extent as ‘the highest page number’, then both can be included in the ONIX message. And of course, to maintain this semantic precision, senders should never put data into a field simply because they want it displayed in a particular way by a specific recipient (for example putting a publisher name into a series or collection title field so that the recipient displays it more prominently).

In principle, a single ONIX message that meets the best practices described here should be suitable for any recipient able to accept ONIX 3.0. In practice, of course, business sensitivities may mean that one recipient should not receive details of a particular product that is exclusive to a competitor, or e-book retailers may wish not to receive details of physical products they cannot retail – which means that the content of messages can become recipient- or channel-specific. However, this specificity is limited to the selection of Product records included in the message, and the exact content of each Product record should not need to be varied.

Every ONIX recipient should be able to receive and process any correctly-constructed ONIX message, irrespective of whether they use every particular data element. For data senders, there should be no need to omit or include elements or whole composites on a recipient-by-recipient basis. As a simple example, if a particular retailer requires the book’s subject expressed using a code from a particular subject classification scheme, it should not reject messages that also contain codes from other schemes. At the very least, a recipient should be able to ignore elements and composites it does not wish to process. In a more complex case, there are some elements of ONIX data that may apparently be communicated in two different ways, but with the ‘correct’ option depending on the exact circumstances: for example, a collection title might be included in Group P.5 or in Group P.6. Suppliers and recipients should ensure they enable both options, rather than supporting only one and thus forcing business partners to use an incorrect option.
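
The subject classification example above can be sketched in code. The following is a minimal illustration, not a reference implementation: a recipient that uses only one subject scheme picks out the codes it wants and skips the rest without rejecting the record. The sample message is hypothetical and heavily cut down, and the scheme identifiers 10 and 93 are assumed to be the List 27 codes for BISAC and Thema respectively (check the current Codelists issue). The function name subject_codes is invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace for Reference names, as used in the message start examples in this Guide.
NS = {"onix": "http://ns.editeur.org/onix/3.0/reference"}

# Hypothetical, heavily cut-down message for illustration only (no <Header>,
# most mandatory elements omitted), carrying subject codes from two schemes.
SAMPLE = b"""<ONIXMessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/reference">
  <Product>
    <RecordReference>com.example.0001</RecordReference>
    <DescriptiveDetail>
      <Subject>
        <SubjectSchemeIdentifier>10</SubjectSchemeIdentifier>
        <SubjectCode>FIC000000</SubjectCode>
      </Subject>
      <Subject>
        <SubjectSchemeIdentifier>93</SubjectSchemeIdentifier>
        <SubjectCode>FB</SubjectCode>
      </Subject>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>"""


def subject_codes(product, wanted_scheme):
    """Return codes from the wanted scheme, silently skipping all other schemes."""
    codes = []
    for subject in product.iterfind("onix:DescriptiveDetail/onix:Subject", NS):
        scheme = subject.findtext(
            "onix:SubjectSchemeIdentifier", default="", namespaces=NS
        )
        if scheme != wanted_scheme:
            continue  # not an error: simply a scheme this recipient does not use
        code = subject.findtext("onix:SubjectCode", namespaces=NS)
        if code:
            codes.append(code)
    return codes


product = ET.fromstring(SAMPLE).find("onix:Product", NS)
print(subject_codes(product, "93"))
```

The point of the design is that an unrecognized scheme is treated as data to ignore, never as a reason to fail the whole Product record or message.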

High-level message structure and conformance

ONIX for Books is an XML-based standard, and it is an overriding requirement that ONIX messages are well-formed according to the XML syntax rules, valid according to the ONIX schema, and conformant with the ONIX for Books Product Information Format Specification. Conformance includes any ‘business rules’ described in the Specification but not expressed in the DTD, XSD or RNG schemas (some of these rules may be enforced by future versions of the schemas). However, within those constraints there is huge scope for variation in what is considered valid ONIX for Books data. This document is intended to guide implementors, narrowing the range of variation between implementations (particularly between different countries) and making it simpler for data senders and recipients to exchange ONIX data. Adherence to agreed guidelines and best practice means less need for testing when establishing new data relationships, less tailoring of messages for individual recipients, and fewer ‘special cases’ built into recipients’ systems, all of which lower costs.

Consistent use of the ONIX data elements – putting the right data in the right boxes – is clearly paramount. And consistent use of codes drawn from the ONIX codelists is vital in this regard.

All ONIX for Books release 3.0 messages must begin with a standard XML declaration and message start tag. The <ONIXMessage> start tag must of course be balanced by an </ONIXMessage> tag at the end of the message. Between the two, there must be a <Header> and either one or more <Product> records or a single <NoProduct/> tag. The message header contains information about the message itself – which organization sent it, when and to whom – and each ‘Product record’ contains information describing a single product.

[Diagram: highest-level structure of an ONIX message. The start circle represents <ONIXMessage>. It is followed by <Header>, then by either one or more <Product> composites (also known as ‘Product records’; the arrows show the composite is repeatable) or a single <NoProduct/> element. The arrows show that the <Product> composite and the <NoProduct/> element are mutually exclusive. The end circle represents </ONIXMessage>.]

This diagram shows the highest-level composites and data elements within an ONIX message – other similar diagrams show lower-level composites and elements within specific parts of the message. Composites have a darker blue background, individual data elements have a pale blue background, and most are clickable – they link either to a diagram showing the internal structure of the composite, or to detailed notes about the composite or data element. Elements or composites that are not included in EDItEUR’s ‘best practice’ guidelines (though they may still be needed to meet specific business requirements – for example the need to send supplier-specific codes using <SupplierOwnCode> or ‘empty update’ delta messages using the <NoProduct/> element), are grey rather than blue. The small number of deprecated elements such as <DateFormat> are tinted pink.

The arrows indicate the required sequence of data elements and composites within the message, and whether a particular element is optional (the arrow bypasses the element) or repeatable (the arrow loops back). Note that the diagram shows more information than the simple cardinality statement in the Specification – the cardinality of <Product> is nominally 0…n, but the diagram clearly indicates that if <Product> is omitted then <NoProduct/> must be included, and that they may not both occur in the same message.
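
The top-level rule shown in the diagram can also be expressed as a small check. This is a sketch under stated assumptions rather than a substitute for schema validation: it handles Reference names only, looks only at the immediate children of <ONIXMessage>, and checks the count but not the full ordering of elements. The function names local_name and top_level_problems are invented for the example.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return element.tag.rsplit("}", 1)[-1]


def top_level_problems(xml_bytes):
    """Return a list of problems with the top-level message structure."""
    root = ET.fromstring(xml_bytes)
    names = [local_name(child) for child in root]
    problems = []
    if local_name(root) != "ONIXMessage":
        problems.append("root element is not <ONIXMessage>")
    if names.count("Header") != 1 or names[0] != "Header":
        problems.append("exactly one <Header> is required, as the first child")
    products = names.count("Product")
    no_products = names.count("NoProduct")
    if products and no_products:
        problems.append("<Product> and <NoProduct/> are mutually exclusive")
    elif not products and no_products != 1:
        problems.append("need one or more <Product> records or one <NoProduct/>")
    return problems


# An 'empty update' delta message: a Header and a single NoProduct element.
empty_update = b'<ONIXMessage release="3.0"><Header/><NoProduct/></ONIXMessage>'
print(top_level_problems(empty_update))
```

A check of this kind is cheap to run before full validation, and it reports the mutual exclusivity that the simple 0…n cardinality statement does not convey.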

message start, including optional (but recommended) encoding and namespace declarations
using Reference names (upper case M)
<?xml version="1.0" encoding="UTF-8"?>
<ONIXMessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/reference">
using Short tags (lower case m)
<?xml version="1.0" encoding="UTF-8"?>
<ONIXmessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/short">
The choice between <ONIXMessage> (upper case M) and <ONIXmessage> (lower case m) in the message start dictates the use of either Reference names or Short tags throughout the remainder of the message.
message end
using Reference names
</ONIXMessage>
using Short tags
</ONIXmessage>
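
Because the root element name signals the tagging style, a recipient can decide how to read the rest of the message from the start tag alone. The sketch below is illustrative only: the function name tag_style is invented, and treating a mismatch between root element name and namespace as an error is a design assumption of the example, not a rule quoted from the Specification.

```python
import xml.etree.ElementTree as ET

REFERENCE_NS = "http://ns.editeur.org/onix/3.0/reference"
SHORT_NS = "http://ns.editeur.org/onix/3.0/short"


def tag_style(xml_bytes):
    """Return 'reference' or 'short' according to the message start tag."""
    root = ET.fromstring(xml_bytes)
    # ElementTree reports namespaced tags as '{namespace}name'.
    namespace, _, name = root.tag.rpartition("}")
    namespace = namespace.lstrip("{")
    # The namespace declaration is optional, so an empty namespace is accepted.
    if name == "ONIXMessage" and namespace in ("", REFERENCE_NS):
        return "reference"
    if name == "ONIXmessage" and namespace in ("", SHORT_NS):
        return "short"
    raise ValueError(f"unexpected message start: {root.tag}")


print(tag_style(b'<ONIXmessage release="3.0"/>'))
```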

The encoding declaration is technically optional, but is effectively mandatory if your message does not use UTF‑8 or -16, and it is best practice to include it even if it does. UTF‑8 is the recommended encoding for ONIX data, although recipients should also accept other encodings in common use in particular markets, for example ISO 8859‑15 (ISO Latin‑9) or Windows‑1252 (Windows codepage 1252). If a message uses UTF‑16, the encoding should be declared explicitly as UTF‑16BE or UTF‑16LE, and a byte order mark should not be included at the beginning of the data file.
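
For senders that do use UTF‑16, the recommendation above (explicit byte order in the declaration, no byte order mark) might be implemented along the following lines. This is a minimal sketch: the function names are invented, and the one-element message is a placeholder rather than a valid ONIX message.

```python
import codecs

UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def has_utf16_bom(data):
    """True if the byte string starts with a UTF-16 byte order mark."""
    return data.startswith(UTF16_BOMS)


def encode_utf16le(message_text):
    """Encode as little-endian UTF-16 without a byte order mark.

    Python's 'utf-16-le' codec never writes a BOM, whereas the plain
    'utf-16' codec always prepends one.
    """
    return message_text.encode("utf-16-le")


# Placeholder message: the declaration names the byte order explicitly.
message = '<?xml version="1.0" encoding="UTF-16LE"?>\n<ONIXMessage release="3.0"/>'
data = encode_utf16le(message)
print(has_utf16_bom(data))
```

The same has_utf16_bom check could be run by a recipient on an incoming file before parsing, to detect feeds that do not follow the recommendation.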

The XML namespace declaration is also optional, but may be included as an xmlns attribute of the <ONIXMessage> element. The namespace declaration is usually required for validation of the ONIX message using the XSD or RNG schemas. It is not required for validation using the DTD, but if present it may remain in place. (Versions of the DTD dated before 2015-01-24 required any xmlns attribute be removed, but this is no longer necessary.)

ONIX 3.0 messages as transmitted may include the namespace declaration (the xmlns attribute), but should not include any reference to a schema file location (the xsi:schemaLocation attribute), since the referenced file is likely to be inaccessible to the recipient of the ONIX data. In particular – and unlike earlier versions of ONIX – transmitted messages should not include a DOCTYPE declaration. Of course such references may need to be added when files are validated and processed internally. (For validation with the DTD, a DOCTYPE declaration may need to be added, and with versions of the DTD dated before 2015-01-24 any xmlns namespace declaration would also need to be removed. For validation with an XSD schema, the xmlns attribute must be present, and for some validation tools, the xmlns:xsi and xsi:schemaLocation attributes also need to be added to the <ONIXMessage> element, giving the local network location of the schema file.)

message start with namespace declaration and reference to location of the XSD schema file, for validation purposes only
using Reference names
<?xml version="1.0" encoding="UTF-8"?>
<ONIXMessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/reference" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ns.editeur.org/onix/3.0/reference http://intranet/onix/ONIX_BookProduct_3.0_reference.xsd">
using Short tags
<?xml version="1.0" encoding="UTF-8"?>
<ONIXmessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/short" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ns.editeur.org/onix/3.0/short http://intranet/onix/ONIX_BookProduct_3.0_short.xsd">
The actual location of the XSD schema file, which is the second part of the value of the xsi:schemaLocation attribute – a location on the local intranet in the example – needs to be adjusted depending on local circumstance. The reference style required for the RNG schema is different. The xmlns:xsi and xsi:schemaLocation attributes should be removed before transmission of the message, since the location of the schema file will be different for any recipient.
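
Removal of the validation-only attributes before transmission can be automated. The following is a sketch using the Python standard library, offered as one possible approach rather than a recommended tool: the function name strip_schema_location is invented, and the approach re-serializes the whole message, so it also discards comments and any DOCTYPE declaration, and it changes a process-wide ElementTree setting via register_namespace.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_LOCATION = f"{{{XSI_NS}}}schemaLocation"


def strip_schema_location(xml_bytes):
    """Return the message as UTF-8 bytes without xsi:schemaLocation."""
    root = ET.fromstring(xml_bytes)
    if root.tag.startswith("{"):
        # Keep the ONIX namespace as the default namespace (plain xmlns="...")
        # on output, instead of ElementTree's generated 'ns0:' prefixes.
        onix_ns = root.tag[1:].split("}", 1)[0]
        ET.register_namespace("", onix_ns)
    root.attrib.pop(SCHEMA_LOCATION, None)
    # With no xsi: attribute left in use, xmlns:xsi is not written either.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


internal = b"""<?xml version="1.0" encoding="UTF-8"?>
<ONIXMessage release="3.0" xmlns="http://ns.editeur.org/onix/3.0/reference" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ns.editeur.org/onix/3.0/reference http://intranet/onix/ONIX_BookProduct_3.0_reference.xsd"><Header/><NoProduct/></ONIXMessage>"""
outgoing = strip_schema_location(internal)
print(outgoing.decode("utf-8"))
```

The xmlns namespace declaration is deliberately retained, since transmitted messages may include it.
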
message start with DOCTYPE reference and location of the DTD file, for validation purposes only
using Reference names
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ONIXMessage SYSTEM "http://intranet/onix/ONIX_BookProduct_3.0_reference.dtd">
<ONIXMessage release="3.0">
using Short tags
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ONIXmessage SYSTEM "http://intranet/onix/ONIX_BookProduct_3.0_short.dtd">
<ONIXmessage release="3.0">
The actual location of the DTD file – on the local intranet in the example – needs to be adjusted depending on local circumstances. The entire DOCTYPE line should be removed before transmission.
For more on validating XML – the technical process of checking ONIX messages meet the syntactic requirements of the standard – see DIY Schema Validation for Workmanlike ONIX, a comprehensive set of instructions for validation beginners by Tom Richardson of BookNet Canada. While it’s focused on ONIX 2.1, the same principles apply to ONIX 3.0.
21 ONIX XML hints and tips

Organization of data delivery

ONIX data files are sent from a ‘sender’ to a ‘recipient’ – for example from a publisher to a retailer. One data file, or ‘message’, may contain information about many products, and may form part of a sequence or feed of ONIX messages.

There is a separate – and entirely optional – Acknowledgement message which may be returned from recipient to sender, to confirm receipt and processing of the data. Use of the Acknowledgement message should be agreed between parties involved in a message exchange.

A single ONIX data file or ‘message’ includes a snapshot of the sender’s data about a range of products at the moment the message was created – and that snapshot can be transferred into the recipient’s system. But within the sender’s in-house systems, that data is subject to change. For example, as a forthcoming product approaches publication, more comprehensive bibliographic data becomes available, and previously collected data is corrected or refined. Thus exchanges of ONIX data are better thought of as ‘data feeds’ consisting of an ongoing sequence of messages – either a series of complete snapshots of the entire set of data, or (more likely) a series of changes between one snapshot and the next.
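
The difference between a full snapshot and a series of changes can be illustrated with a small comparison routine. This is a sketch of one way a sender might work out what belongs in the next message, not a procedure defined by ONIX. It assumes each snapshot is held as a mapping from a record's <RecordReference> to its serialized content, and the function name snapshot_delta is invented. How to notify recipients about records that have left the selection is outside the scope of the sketch.

```python
def snapshot_delta(previous, current):
    """Compare two snapshots keyed by RecordReference.

    Returns (new_or_changed, dropped): the records to include in the next
    message, and the references present before but no longer selected.
    """
    new_or_changed = {
        ref: record
        for ref, record in current.items()
        if previous.get(ref) != record
    }
    dropped = sorted(ref for ref in previous if ref not in current)
    return new_or_changed, dropped


# Hypothetical snapshots: record content abbreviated to short strings.
last_week = {"com.example.0001": "price 9.99", "com.example.0002": "price 4.99"}
this_week = {"com.example.0001": "price 8.99", "com.example.0003": "price 12.00"}
to_send, no_longer_selected = snapshot_delta(last_week, this_week)
print(sorted(to_send), no_longer_selected)
```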

For a published product, aside from fixing any genuine errors, some elements of an ONIX Product record are effectively ‘frozen’ – the title or author cannot really change post-publication. However, other parts of the record – particularly collateral, price and availability elements that are mostly in Block 2 and Block 6 – remain subject to change throughout the life of the product. A subsequent message would include any new or updated data.

It is best practice to begin to deliver data about a planned product to supply chain partners several months in advance of actual availability. However, an initial Product record, produced eight or six months prior to publication, is not expected to be complete or final in all respects: at that point in the product’s lifecycle, many parts of the Product record would be subject to change – or missing altogether. Data suppliers should continue to update the Product record as information is confirmed or becomes available, and it is best practice for senders to ensure that all key information is available to recipients at least four months prior to publication. ONIX National Groups may operate compliance or certification schemes that specify more detailed (and perhaps earlier) targets for the delivery of comprehensive metadata prior to publication.

Certain information may only become available or be confirmed after manufacture of a physical product – an exact weight, for example. Data senders should continue to send updates and incorporate the latest available data in their data feed. And of course, although much of the bibliographic data in a Product record is in effect ‘frozen’ as the product is published, price and availability, collateral material such as reviews, relations to other works and products and other data elements should continue to be updated post-publication and throughout the product lifecycle.

Any exchange of ONIX data must be based (at least implicitly) on an agreed set of criteria for selecting a range of Product records from the data supplier’s in-house system – an agreed ‘catalog’ of products. These criteria may vary – how far in advance of publication should products be included, and should out of print products be included? Should the catalog include all markets and all types of product, or be tailored to include only those markets and products suitable for a specific class of recipient (or even a single recipient)? Each distinct set of selection criteria and catalog creates a distinct ‘data feed’, a sequence of messages, and in some circumstances, a particular set of selection criteria may be unique to a single ONIX data recipient.

Data senders should be particularly careful about products that initially fall within the selection criteria, where a subsequent change such as postponement of publication date means they fall outside the criteria – such products should continue to be included in the feed. It is a common error to omit them from the subsequent feed, thus leaving the old (and incorrect) data stranded in the recipient’s system.

ONIX for Books envisions three different ‘modes’ for a data feed:

  • Full update – supply of the complete set of Product records that fall within the feed’s selection criteria;
    • any ongoing exchange of ONIX data would at least have to begin with a full data feed;
    • each Product record carries all available information about that product, even if the entire Product record is unchanged from a previous message;
    • implementing a data feed which is simply a series of full updates is simple as there is no need to track when updates are made in the sender’s system, and no need for the recipient to ensure that all messages are received and processed in order. Missed messages, and messages processed out of order are ‘self-correcting’ when the next message is dealt with;
    • recipients should first match each received record with existing records on the basis of the <RecordReference>, then replace any and all existing data previously associated with each product with the new data in the latest message (this means that any data previously held for a matched product, but not included in the new full update, should be deleted);
    • full update files can be large, and the repeated supply of unchanged Product records places an unnecessary processing burden on recipients;
  • Delta update – supply of the set of Product records that fall within the feed’s selection criteria and where there has been a change to some data element within the record since a previous full feed or delta update. The message itself has the same structure as a full feed, and the method of processing is identical to a full feed;
    • each Product record carries all available information about that product;
    • but the message size is significantly reduced because unchanged Product records are omitted entirely, which greatly decreases the processing burden for recipients;
    • the sender must maintain a modification date or a full journal of changes for each Product record so unchanged records can be suppressed from each message, and must also manage a <MessageNumber> for each distinct data feed;
    • recipients should first match each received record with existing records on the basis of the <RecordReference>. The data in matched records should be replaced with the new data in the delta update message. Unmatched records are new and should be added. Existing records that are not matched should be left unchanged;
    • each recipient must ensure every message is parsed and processed in the order they were sent;
    • omission of a previously-supplied record from a delta update carries the implication that the previously-supplied data is still correct, and so any datestamps attached to the old records in the recipient’s system should be updated;
  • Block-level update – a refined type of delta update, new in ONIX 3.0. The body of a Product record consists of six more or less independent groups of data elements, or ‘blocks’. There are blocks dedicated to describing the nature of the product itself, to marketing collateral, to publishing details and territorial rights, etc. These Blocks 1–6 are preceded by data elements in Groups P.1 and P.2 containing information about the record itself and the identifiers (often ISBNs) for the product it describes. This preamble is sometimes informally termed ‘block zero’. A Block-level update message contains only those Product records where data has changed, and for those records, includes only Groups P.1 and P.2 (‘block zero’) plus whichever of Blocks 1–6 contain data that has changed:
    • this minimizes the message size because unchanged Blocks are omitted;
    • but note that updates cannot be more granular than ‘whole block’ – you cannot send an update that consists of just the data element that has changed;
    • block-level updates are more complex to implement for both sender and recipient – for example, journalling of changes must be more granular than for ordinary delta updates;
    • many recipients forget that a block update of one or more blocks also signifies no change in all other blocks, which means that the <SentDateTime> of the block update in fact applies to the modified record as a whole. Omission of a block carries the implication that previously-supplied data is still correct, and so any datestamps attached to the old blocks in the recipient’s system should be updated;
    • note that block-level updates and ‘full record’ updates can be freely mixed within the same delta update message, as they are differentiated at record level (see <NotificationType> within Group P.1);
    • it is likely that some initial ONIX 3.0 implementations will not support block-level updates, so senders and recipients should discuss their capabilities prior to using block-level updates, to ensure that the integrity of the data can be maintained;
    • previous versions of ONIX for Books specified a compact Supply Update message format, which was intended to make delivery of price and availability updates more efficient. For ONIX 3.0, a simple price and availability update is not a distinct message type, but a normal block-level update message consisting of Groups P.1 and P.2, plus Block 6.
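
The recipient-side processing described for the three modes can be sketched as follows. This is a minimal illustration only, assuming the recipient holds each product as a dictionary of blocks keyed by its <RecordReference>. The function name, the dictionary keys and the ‘partial’ flag (standing in for whatever <NotificationType> indicates) are hypothetical and not part of the ONIX standard.

```python
from typing import Dict, List

# Hypothetical in-memory catalog: RecordReference -> {block name -> data}
Catalog = Dict[str, Dict[str, str]]


def apply_message(catalog: Catalog, records: List[dict]) -> None:
    """Apply one parsed ONIX message to the recipient's catalog."""
    for record in records:
        ref = record["record_reference"]
        if record["partial"]:
            # Block-level update: replace only the blocks supplied;
            # omitted blocks keep their previously supplied data.
            catalog.setdefault(ref, {}).update(record["blocks"])
        else:
            # Full or delta update: the new record replaces the old one
            # outright, so data not repeated in the new record is dropped.
            catalog[ref] = dict(record["blocks"])
```

Records that are not matched by anything in the message are left untouched, and full-record and block-level updates can be mixed in one message because the choice is made record by record.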

An important caveat: where recipients are receiving ONIX data (about the same products) from more than one source, then ‘replacement’ of old data with new should be conditional on both the message source and the relative age of the data, as indicated by the sourcetype, sourcename and datestamp attributes (where they are supplied), or by both the <Sender> identity and <SentDateTime>, as it is not always the case that the most reliable data is the latest to arrive. Recipients should generally maintain a ‘hierarchy of trust’ that guides whether data from a particular source should overwrite earlier data from a different and possibly more trusted source – and this hierarchy may differ from data element to element.
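
One possible way to express such a ‘hierarchy of trust’ is sketched below. The source names, their ranking and the rule that a more trusted source always wins regardless of age are assumptions made purely for illustration; a real recipient would define its own policy, possibly per data element.

```python
from typing import Optional, Tuple

# Hypothetical ranking: a lower number means a more trusted source.
TRUST_RANK = {"publisher": 0, "distributor": 1, "aggregator": 2}

# (value, source, datestamp as YYYYMMDD)
Stored = Tuple[str, str, str]


def should_overwrite(existing: Optional[Stored], incoming: Stored) -> bool:
    """Decide whether incoming data may replace what is already held."""
    if existing is None:
        return True
    _, old_source, old_date = existing
    _, new_source, new_date = incoming
    # Unknown sources are ranked below every known source.
    old_rank = TRUST_RANK.get(old_source, len(TRUST_RANK))
    new_rank = TRUST_RANK.get(new_source, len(TRUST_RANK))
    if new_rank != old_rank:
        return new_rank < old_rank
    # Same level of trust: YYYYMMDD strings sort chronologically.
    return new_date >= old_date
```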

And special requirements may apply to Block 6: a recipient may receive price and availability information from multiple sources in separate ONIX feeds, and may wish to maintain them in parallel. For example a retailer may wish to track stock availability from multiple wholesalers.

When dealing with large numbers of products, full update ONIX files can be large. The comprehensive sample message included in the ONIX Specification is a little over 16 kilobytes for a single Product record: comprehensive Product records in languages other than English and non-Latin based scripts may be larger. And ONIX messages containing tens of thousands of Product records are not unknown. However, use of delta updates greatly diminishes the typical size and complexity of files: most products are simply omitted from most messages, because they have not been updated (the frequency of updates to a single product is much less than the frequency of update messages). Block-level updates reduce the amount of data per record and per message even more. ‘Size’ here is a rough proxy for ‘effort required to parse and ingest the data into the recipient’s database’, or the number of insert and update operations that need to be performed on the recipient’s database.

If the concern is filesize itself (rather than the burden of parsing and ingestion), then using Short tags also helps a little – it usually reduces the message size by around a third – but zipping files before transfer to the recipient is more effective. Zipping can reduce the size by a further factor of three or more. A zipped version of the sample message with short tags is only a little over 4 kilobytes. Ratios are a little different in languages other than English and non-Latin based scripts, and of course neither short tags nor zipping significantly affect the difficulty of the parsing and ingestion process.

Reference names are more commonly used than Short tags, but both are fully supported in ONIX 3.0.
There is a Tagname Converter available, a short XSLT script to translate an ONIX message between Reference names and Short tags (or vice versa). Many XML software tools are available to process the XSLT. The Converter is available from www.editeur.org/93/Release-3.0-Downloads/#Tools.

ONIX messages are normally exchanged via FTP, with the data sender depositing files on a server maintained by the recipient (‘push’), or by the sender maintaining a file on a server that one or multiple recipients download. The latter ‘pull’ approach is considerably more difficult to combine with delta and block-level updates, as the sender has no control over the frequency that fresh data is downloaded by the recipient – though it might be possible by providing a full file at the beginning of each month, then retaining that file plus all subsequent delta or block-level updates on the server until the end of the month, and only deleting all the updates and adding a replacement full update file at the beginning of the next month.

As an alternative to FTP, transferring messages as e-mail attachments may be suitable for messages with small numbers of Product records (fewer than about 100 records).

The mime type for such e-mail attachments should typically be ‘application/xml’, or ‘application/xml+zip’ if the file is zipped before sending.

The naming of ONIX files transferred from sender to recipient is not defined by the standard. Clearly, naming schemes have to be agreed between sender and recipient up front, but a practical naming scheme would probably combine a name for the sender (since the recipient may be receiving files from many senders), plus an element that indicates the order in which multiple files must be parsed, such as:

  • a year and ordinal day number (a number from 1 to 365 or 366). This cannot be used if intra-day updates might be required;
  • the Message sequence and Repeat numbers (the <MessageNumber> and <MessageRepeat> data elements included in the message header);
  • the Message creation date and time (the <SentDateTime> data element from the message header).

ONIX file names commonly include the ‘.xml’, ‘.onix’ or ‘.onx’ suffix (the first two are preferred).

RHGUK2010345.xml (a file sent by Random House UK, on 11th Dec [day 345] of 2010)
Templar_286_2.onix (second repeat of file number 286 from Templar Publishing)
LBBG20101211.xml (file from Little Brown Book Group sent on 11th December 2010)
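
The three example file names above could be generated along the following lines. This is a sketch only: the function names are hypothetical, and the exact pattern (separators, suffix) is whatever sender and recipient have agreed.

```python
from datetime import date


def name_by_ordinal_day(sender: str, sent: date) -> str:
    # %j is the day of the year, zero-padded to three digits
    return f"{sender}{sent:%Y%j}.xml"


def name_by_message_number(sender: str, number: int, repeat: int) -> str:
    # Mirrors <MessageNumber> and <MessageRepeat> from the message header
    return f"{sender}_{number}_{repeat}.onix"


def name_by_sent_date(sender: str, sent: date) -> str:
    # Mirrors the date part of <SentDateTime>
    return f"{sender}{sent:%Y%m%d}.xml"
```

For instance, name_by_ordinal_day("RHGUK", date(2010, 12, 11)) gives RHGUK2010345.xml.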

In addition to transferring the ONIX message data from sender to recipient, supply chain partners need to exchange supporting resources such as cover images (see Group P.16). The ideal method here is the ‘pull’ model, in which the recipient downloads the collateral resource from a URL provided within the ONIX Product record (and the download would be triggered based on the ‘last updated’ date attached to the resource metadata in the ONIX). In practice, it is more common for senders to push collateral resources to an FTP server maintained by the recipient – in effect a separate feed, perhaps with the filename provided in the ONIX metadata.

Sender and recipient may also need to agree a method for exchange of post-publication price and availability information. While this can be carried very effectively within a continuing series of ONIX messages – and clearly ONIX implementors are encouraged to use this mechanism – it is also common for post-publication price and availability updates to use other message standards. EDItEUR maintains a family of XML-based message formats under the EDItX banner, and a family of older EDIFACT EDI formats. Other EDI formats such as ANSI X.12 or Tradacoms might also be used. Great care is needed to ensure that if EDItX or an EDI method is used for price and availability updates, there is no conflict with information in a post-publication ONIX message sent to update information other than price or availability – in many cases, because the risk of conflict cannot be eliminated, when EDItX or EDI is in use for post-publication price and availability, Block 6 should not be sent in ONIX records sent post-publication.

Implied license to use ONIX data

Certain data included within, cited by or linked from an ONIX message – for example, excerpts from the product, tables of contents, cover or contributor images and other resources – is likely to be subject to copyright. The data in an ONIX message may also be the subject of other property rights, for example a sui generis database right.

ONIX data sent to a recipient is sometimes accompanied by an explicit and formal license covering use of that data. However – even in the absence of an explicit license – when a data supplier provides an ONIX message to a data recipient, there is a clear invitation extended to the recipient with an ‘implied license’ to use the product data supplied for the purposes of cataloging, trading in, merchandising, promoting and selling the products described, both internally within the recipient organization and in customer-facing applications. ONIX data recipients should treat the implied license as non-transferable, non-sub-licensable, and should not redistribute or provide access to the data to third parties either commercially or non-commercially without express permission. ONIX recipients should also treat compliance with Embargo, Valid From/Until and Announcement dates, Audience limitations, display of required credits and the like as a duty of the implied license, and should ensure reasonable efforts are made to process data updates from the supplier in a timely way (including requests to remove the data from public view in the case of legal or other issues). A typical retailer might not require anything beyond this implied license.

For the avoidance of doubt, it is strongly recommended that recipients wishing to make modifications to any data (as opposed to using the data as supplied) and/or intending to redistribute the data to a third party in any manner should secure an explicit license from the data supplier rather than relying solely on the implied license.

Formal agreements such as would be required for commercial or non-commercial redistribution of data supplied in an ONIX message (eg by a data aggregator) or for provision of third-party access to the data (eg via an API) should be concluded separately between data supplier and recipient. These agreements might cover both permitted use of the supplied data and the duties and service level expectations placed upon data supplier and recipient.

Checking your messages meet best practice

If you are an experienced XML developer, then you will have a range of software tools to validate the structure of any ONIX file against one or other of the ONIX schemas (these are available in XSD, RNG and DTD format, though the XSD and RNG are strongly recommended in preference to the DTD). EDItEUR uses the oXygen XML editor but other, similar software tools are equally suitable. Validation is an essential discipline.

But even XSD and RNG schemas check only the structure of the ONIX file and the content of those data elements which use controlled vocabularies from the codelists – they do not validate the content of free text data elements or the check digits of identifiers, nor can such schemas check for potential internal inconsistencies in the Product record. Validation with a schema is not – in itself – enough.

EDItEUR is developing an extended schema (using Schematron) that checks identifier check digits and internal inconsistencies (‘co-occurrence constraints’), and provides warnings based on some of the best practice outlined in this Guide.
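
As an illustration of a check that lies beyond XSD or RNG validation, the sketch below verifies the check digit of a 13-digit identifier. ISBN-13, GTIN-13 and GLN all use the same GS1 modulo-10 calculation. The function names are hypothetical, and this is not the Schematron approach mentioned above.

```python
def gs1_check_digit(first_twelve: str) -> int:
    """GS1 modulo-10 check digit for the first twelve digits."""
    # Weights alternate 1, 3, 1, 3 ... starting from the leftmost digit
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first_twelve))
    return (10 - total % 10) % 10


def is_valid_gtin13(value: str) -> bool:
    """True if value is 13 ASCII digits with a correct final check digit."""
    return (
        len(value) == 13
        and value.isascii()
        and value.isdigit()
        and gs1_check_digit(value[:12]) == int(value[12])
    )
```

Applied to identifiers used in this guide's examples, both 9780007232833 and 5030670137992 pass the check.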

If you are not experienced with XML, then it can be difficult to judge whether your ONIX data is structured correctly and contains data in line with these best practices. However, XML is only a text file: in principle, it can be read and edited by plain text editors such as Notepad++ (on Windows), or TextWrangler (on Mac OS), and it can be viewed (but not edited) in a web browser. Even the simple built-in Notepad (Windows) or TextEdit (Mac OS) applications can be used – but do not be tempted to use a word processor. If you do aim to check your ONIX data this way, it is best to stick to a very small ONIX file – no more than a couple of dozen Product records.

Opening a small ONIX file in a modern web browser (Firefox 4+, Chrome 9+, Safari 5.1+, IE9+) or text editor (Notepad++, TextWrangler) produces a display like this:

<Product>
    <RecordReference>com.globalbookinfo.onix.01734529</RecordReference>
    <NotificationType>03</NotificationType>
    <RecordSourceType>04</RecordSourceType>
    <RecordSourceIdentifier>
        <RecordSourceIDType>06</RecordSourceIDType>
        <IDValue>0614141800001</IDValue>
    </RecordSourceIdentifier>
    <RecordSourceName>Global Bookinfo</RecordSourceName>
    <ProductIdentifier>
        <ProductIDType>03</ProductIDType>
        <IDValue>9780007232833</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780007232833</IDValue>
    </ProductIdentifier>
    <DescriptiveDetail>
        <ProductComposition>00</ProductComposition>
        <ProductForm>BC</ProductForm>
        <ProductFormDetail>B105</ProductFormDetail>
        <Measure> … </Measure>
        <Measure>
            <MeasureType>02</MeasureType>
            <Measurement>130</Measurement>
            <MeasureUnitCode>mm</MeasureUnitCode>
        </Measure>

This highlights the markup (colored), the data (black) and the nesting structure (via indenting). Clicking on the triangle displayed beside a composite element hides or reveals its contents – the first of the two <Measure> composites has been collapsed in the example above.

Web browsers add the indentation automatically, and often appear to show extra new lines within data elements containing text with XHTML markup:
   <BiographicalNote textformat="05">
       <p>
            <strong>Maj Sjöwall</strong>
            was born in Stockholm…
These extra new lines are not present in the ONIX data, and are only added for display purposes. The original data most likely reads:
    <BiographicalNote textformat="05"><p><strong>Maj Sjöwall</strong> was born in Stockholm…

Text editors like Notepad++ and TextWrangler do similar syntax coloring and collapsing of composites (sometimes called ‘code folding’), and they also helpfully number each line of the XML. They do not (usually) add automatic indentation to highlight the XML structure or add misleading line breaks within text elements using XHTML. The basic built-in Notepad and TextEdit applications do not do coloring, folding, line numbering or automatic indentation.

This very basic technique makes it straightforward – though laborious – to check your ONIX file against the examples given in this guide, or to discover how data elements in a publisher’s internal application are expressed in the ONIX messages that the application sends.

It’s often simple to associate database fields in an internal application with XML data elements in an ONIX message. If the application’s user interface for a contributor contains several data fields like those shown below, the mapping from user interface field to ONIX data element is quite clear: Title goes into <TitlesBeforeNames>, Given name goes into <NamesBeforeKey>, Family name is used to populate <KeyNames> and the Role field is used to work out the code carried in the ONIX <ContributorRole>. The application might also calculate the <PersonName> and <PersonNameInverted> data elements by combining the parts of the name in the correct order.

[User interface mock-up – fields: Title ‘Prof.’, Given name ‘Steve’, Prefix (empty), Family name ‘Jones’, Suffix (empty), Role ‘(author)’, with an Add (+) button]

This application could generate the following ONIX XML:

<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <PersonName>Prof. Steve Jones</PersonName>
    <PersonNameInverted>Jones, Prof. Steve</PersonNameInverted>
    <TitlesBeforeNames>Prof.</TitlesBeforeNames>
    <NamesBeforeKey>Steve</NamesBeforeKey>
    <KeyNames>Jones</KeyNames>
</Contributor>
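
The mapping described above might be implemented roughly as follows. This is a sketch under stated assumptions: the function name and the role lookup table are hypothetical, only the ‘author’ to A01 mapping is taken from the example, and the Prefix and Suffix fields are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup from the application's Role field to ONIX role codes
ROLE_CODES = {"author": "A01"}


def contributor_xml(sequence: int, title: str, given: str, family: str, role: str) -> str:
    """Build a <Contributor> composite from simple user interface fields."""
    before_key = " ".join(part for part in (title, given) if part)
    contributor = ET.Element("Contributor")
    # Elements are added in the order shown in the example above
    ET.SubElement(contributor, "SequenceNumber").text = str(sequence)
    ET.SubElement(contributor, "ContributorRole").text = ROLE_CODES[role]
    ET.SubElement(contributor, "PersonName").text = f"{before_key} {family}".strip()
    ET.SubElement(contributor, "PersonNameInverted").text = f"{family}, {before_key}"
    if title:
        ET.SubElement(contributor, "TitlesBeforeNames").text = title
    ET.SubElement(contributor, "NamesBeforeKey").text = given
    ET.SubElement(contributor, "KeyNames").text = family
    return ET.tostring(contributor, encoding="unicode")
```

Calling contributor_xml(1, "Prof.", "Steve", "Jones", "author") yields the same elements as the example, serialized on a single line.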

However this simple user interface also suggests that the application might not properly support Hungarian or east Asian names, since there is no apparent way to populate the relevant <NamesAfterKey> ONIX data element. The application may not be able to differentiate corporate from personal names. And the application may impose a limitation of one role per contributor that prevents the ONIX from complying with best practice when, for example, a single contributor is both author and illustrator.

What can complicate such seemingly-straightforward mappings between parts of an application’s interface and the ONIX messages that the application might send or receive is that many applications use a relational or hierarchical structure to make managing metadata more efficient. Changing a single data field in the user interface (say the Family name) may affect many ONIX Product records, not just a single record. And since ONIX messages may contain data owned or managed by several different departments within a sender organization, the ONIX may be assembled from data taken from several different applications.

Mapping between data elements in an ONIX Product record and database columns in an internal application applies to ONIX recipients too: each distributor, data aggregator or retailer has to parse the ONIX data it receives, and insert the data from its suppliers into the internal database (product catalog). The structure of a retailer’s internal product catalog is likely to be very different from the structure of the database the publisher used to create the ONIX data: since there is usually much less need to manage the data actively, a retailer is less likely to use a highly normalized database structure and may opt for judicious denormalization to gain some performance improvements.

It’s important to understand that creating usable ONIX for Books data is not simply a matter of creating a file that conforms to the structural requirements of ONIX. Ensuring the right data is embedded in each data element, ensuring the data is accurate and that there are no internal inconsistencies, and ensuring that each Product record contains all the data that ONIX recipients require – and that it’s delivered in a timely fashion – are all vital. The business process and workflow challenge inherent in introducing ONIX to an organization is usually much greater than the technical challenge.

ONIX for Books Message header

Header composite

[Diagram: the <Header> composite contains Sender, Addressee, MessageNumber, MessageRepeat, SentDateTime, MessageNote, DefaultLanguageOfText, DefaultPriceType and DefaultCurrencyCode]
sending a <Header>
using Reference names
<Header>
    <Sender>
        <SenderIdentifier>
            <SenderIDType>06</SenderIDType> <!-- GLN -->
            <IDValue>5030670137992</IDValue>
        </SenderIdentifier>
        <SenderIdentifier>
            <SenderIDType>07</SenderIDType> <!-- SAN -->
            <IDValue>0137995</IDValue>
        </SenderIdentifier>
        <SenderName>HarperCollins UK</SenderName>
        <ContactName>Jane King</ContactName>
        <EmailAddress>jbk@hcp.co.uk</EmailAddress>
    </Sender>
    <Addressee>
        <AddresseeIdentifier>
            <AddresseeIDType>06</AddresseeIDType> <!-- GLN -->
            <IDValue>5030670090754</IDValue>
        </AddresseeIdentifier>
        <AddresseeIdentifier>
            <AddresseeIDType>07</AddresseeIDType> <!-- SAN -->
            <IDValue>0090751</IDValue>
        </AddresseeIdentifier>
        <AddresseeName>Gardners Books</AddresseeName>
    </Addressee>
    <MessageNumber>262</MessageNumber>
    <SentDateTime>20100510</SentDateTime>
</Header>
using Short tags
<header>
    <sender>
        <senderidentifier>
            <m379>06</m379>
            <b244>5030670137992</b244>
        </senderidentifier>
        <senderidentifier>
            <m379>07</m379>
            <b244>0137995</b244>
        </senderidentifier>
        <x298>HarperCollins UK</x298>
        <x299>Jane King</x299>
        <j272>jbk@hcp.co.uk</j272>
    </sender>
    <addressee>
        <addresseeidentifier>
            <m380>06</m380>
            <b244>5030670090754</b244>
        </addresseeidentifier>
        <addresseeidentifier>
            <m380>07</m380>
            <b244>0090751</b244>
        </addresseeidentifier>
        <x300>Gardners Books</x300>
    </addressee>
    <m180>262</m180>
    <x307>20100510</x307>
</header>
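
The correspondence between the two versions of the example can be illustrated with a small conversion sketch. The tag table below is taken only from the two Header examples above, so it is far from complete, and the function name is hypothetical. The sketch also assumes the XML carries no namespace declaration.

```python
import xml.etree.ElementTree as ET

# Reference name -> Short tag, as seen in the two Header examples
REFERENCE_TO_SHORT = {
    "Header": "header", "Sender": "sender",
    "SenderIdentifier": "senderidentifier", "SenderIDType": "m379",
    "IDValue": "b244", "SenderName": "x298", "ContactName": "x299",
    "EmailAddress": "j272", "Addressee": "addressee",
    "AddresseeIdentifier": "addresseeidentifier", "AddresseeIDType": "m380",
    "AddresseeName": "x300", "MessageNumber": "m180", "SentDateTime": "x307",
}


def to_short_tags(xml_text: str) -> str:
    """Rename every element from its Reference name to its Short tag."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # A KeyError here flags a tag missing from the partial table above
        element.tag = REFERENCE_TO_SHORT[element.tag]
    return ET.tostring(root, encoding="unicode")
```

Only the element names change; the data carried inside each element is identical in both forms.
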
Sender composite

The <Sender> composite is mandatory, and identifies the sender of the ONIX message file.

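[Diagram: the <Sender> composite contains SenderIdentifier, SenderName, ContactName and EmailAddress (must not omit both SenderIdentifier and SenderName). The <SenderIdentifier> composite contains SenderIDType, IDTypeName (must include if ID is proprietary, otherwise omit) and IDValue]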

It is best practice to include:

  • any applicable identifiers for the sender organization in one or more repeats of the <SenderIdentifier> composite;
  • the business name of the sender organization in the <SenderName> element;
    • in many cases, the same identifier and name are repeated in <RecordSourceIdentifier> and <RecordSourceName>, where the message as a whole is created by the organization that is also responsible for the data within each Product record;
  • the name and e-mail address of a contact person who is responsible for the technical and operational aspects of the ONIX message, in the <ContactName> and <EmailAddress> elements.

The preferred identifier for international use is the GLN (Global Location Number), although in many countries, there are also local and industry-specific identifiers that are widely used, and these should also be supplied – for example, in many Anglophone countries, the SAN (Standard Address Number) is well established and in other countries, company or tax registration numbers are commonly used. The GLN is a global, cross-industry scheme administered by GS1 for identifying physical addresses. The SAN is an earlier, publishing-specific identifier for physical addresses maintained by Bowker, and is likely to be superseded by the GLN over the next few years. However, SANs are built into many widely-used electronic trading systems, and it is best practice to provide both wherever possible. Note with both SAN and GLN, the referent is an address rather than an organization. The organization is identified only by implication or proxy, as operating at that address. A single organization may have several address identifiers, each identifying a different physical location (for example the publishing office and a distribution warehouse), and care should be taken to ensure the correct identifiers are used.

SANs and GLNs are – to some degree – interoperable. Some SANs may be prefixed with a six-digit magic number, then the check digit recalculated to create a valid GLN. If you have a SAN but no GLN, contact your SAN Agency to discuss conversion.
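
The conversion described above can be sketched as follows. The prefix 503067 used in the usage note is not stated anywhere in this guide: it is inferred from the SAN and GLN pairs in the Header example and is shown for illustration only. The function names are hypothetical, and the correct prefix (and whether conversion applies at all) should be confirmed with the SAN Agency.

```python
def gs1_check_digit(first_twelve: str) -> int:
    """GS1 modulo-10 check digit for the first twelve digits of a GLN."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first_twelve))
    return (10 - total % 10) % 10


def san_to_gln(san: str, prefix: str) -> str:
    """Prefix a seven-character SAN and recalculate the final check digit."""
    digits = san.replace("-", "")
    if len(digits) != 7 or len(prefix) != 6:
        raise ValueError("expected a 7-character SAN and a 6-digit prefix")
    # The SAN's own check character is dropped and replaced by a GS1 one
    body = prefix + digits[:6]
    return body + str(gs1_check_digit(body))
```

With the assumed prefix, san_to_gln("0137995", "503067") gives 5030670137992, matching the sender identifiers in the Header example.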

The business name of the sender organization should not include suffixes such as ‘Inc’, ‘SA’ or ‘Ltd’, unless they are required to distinguish similarly-named organizations. However, where a single organization has several business units that send ONIX messages independently, a suffix should be added to indicate which business unit is responsible for the message.

Providing a contact name and e-mail address is important as this may be used for automated notification of receipt of the message, or when a recipient has a query about the content of the message.

Addressee composite

The <Addressee> composite is optional. However, it is best practice to include it when ONIX messages are prepared for specific recipients – for example where the message contains details of retailer-specific products – and it can be repeated if the message is intended for a specific set of recipients (eg for several bibliographic data aggregators, or for multiple wholesalers). The composite should be omitted if a single message file is made generally available, for example if it is placed on an FTP site for downloading by any supply chain partner.

[Diagram: the <Addressee> composite contains AddresseeIdentifier, AddresseeName, ContactName and EmailAddress (must not omit both AddresseeIdentifier and AddresseeName). The <AddresseeIdentifier> composite contains AddresseeIDType, IDTypeName (must include if ID is proprietary, otherwise omit) and IDValue]

If one or more <Addressee> composites are included, then each should include:

  • any applicable identifiers for the recipient organization in one or more repeats of the <AddresseeIdentifier> composite;
  • the business name of the recipient organization in the <AddresseeName> element.

These data elements mirror those in the <Sender> composite, and similar recommendations apply.

H.13 Message sequence number

A message sequence number is optional. However, for practical purposes, it must be included in any data exchange where delta files are used – that is, where ONIX messages contain only those Product records where some change has occurred since the previous message – and its inclusion is strongly encouraged even when full (not delta) data files are used. It is vital that the recipient processes received messages in the correct order, and is able to identify whenever a message has been missed.

The message sequence number is a simple integer that increments by 1 for each new message in a particular data feed. Note that where an organization is generating multiple messages tailored for specific recipients or sets of recipients, the message sequence number for each recipient or set of recipients must be managed independently, and the <Addressee> composite used consistently to identify individual data feeds. Thus a publisher may send out a sequence of ‘generic’ messages without any <Addressee> composites, plus a sequence of tailored messages with an <Addressee> composite identifying retailer X, using the following message sequence numbers:

  • message 123 sent to all recipients
  • message 261 sent to X
  • message 124 to all
  • message 125 to all
  • message 262 to X
  • message 126 to all
  • message 263 to X

In this way, any recipient can identify whenever a message is missing or duplicated. While this is complex, using a simpler scheme based on, say, the day number lacks clarity when events like public holidays intervene – was there meant to be a message on day 125 (Cinco de Mayo)? – or when messages are sent irregularly. A sequence number makes it clear that if message 126 arrives while the recipient is expecting 125, then a message has been missed. And if 125 arrives when 126 is expected, it’s a duplicate message. Similarly, if updates are sent to a regular schedule but the sender deliberately skips a message (perhaps because no records need updating), upon receipt of the ‘next’ regular update, the sequence numbers make it clear that the recipient did not miss a message.
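The gap and duplicate checks described above might be sketched as follows. This is only an illustration, not part of ONIX: the function name `classify_message` and the feed keys are hypothetical, and it assumes the recipient tracks each data feed (generic, or tailored to a particular addressee) under its own key.

```python
def classify_message(expected_next: dict, feed: str, number: int) -> str:
    """Compare a received message number with the next one expected on a feed.

    `expected_next` maps a hypothetical feed key (for example "all" or "X")
    to the next sequence number the recipient expects on that feed.
    """
    expected = expected_next.get(feed)
    if expected is None or number == expected:
        # First message seen on this feed, or exactly the one expected.
        expected_next[feed] = number + 1
        return "in-sequence"
    if number < expected:
        # Already processed earlier: leave the counter alone.
        return "duplicate"
    # number > expected: at least one message has been missed. The counter is
    # left unchanged so the missing message can be requested and processed first.
    return "gap"
```

Keeping one counter per feed mirrors the requirement that sequence numbers for each recipient or set of recipients are managed independently.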

If there is any reason to ensure a regular schedule of messages is maintained, even when no Product records require updating, it is possible to send an ‘empty update’ – a delta file containing no Product records. In addition to the normal <Header> elements, this uses the <NoProduct/> element to provide a positive indication that the message is solely intended as a placeholder. The message sequence number should be incremented as normal, and the only net effects of the empty update upon the recipient system should be to confirm that all previously-received data remains correct, and to increment the expected value of the next message number in the sequence of messages.
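As a rough sketch, an empty update could be generated along these lines. The helper name `build_empty_update` is hypothetical, the XML namespace declaration is omitted for brevity (an assumption; a production message would normally declare it), and only a bare minimum of <Header> elements is shown.

```python
import xml.etree.ElementTree as ET


def build_empty_update(sender_name: str, message_number: int, sent: str) -> str:
    """Return a minimal ONIX 3.0 message with a Header and <NoProduct/> only."""
    message = ET.Element("ONIXMessage", {"release": "3.0"})
    header = ET.SubElement(message, "Header")
    sender = ET.SubElement(header, "Sender")
    ET.SubElement(sender, "SenderName").text = sender_name
    # The sequence number is incremented as normal, even with no records.
    ET.SubElement(header, "MessageNumber").text = str(message_number)
    ET.SubElement(header, "SentDateTime").text = sent
    # Positive indication that the message is intentionally empty.
    ET.SubElement(message, "NoProduct")
    return ET.tostring(message, encoding="unicode")
```

A call such as `build_empty_update("XYZ Publishers", 127, "20120814T1910Z")` yields a message that a recipient can use to advance its expected sequence number without changing any stored Product data.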

Senders sometimes maintain a data feed consisting of a sequence of delta files interspersed with occasional full feed files, perhaps daily deltas and a full file once every four weeks. This ensures that if anything goes amiss with the processing of the deltas, it is automatically corrected at the next full feed. In this case, the sequence numbers of the delta files should increment as normal. If the full feed at the end of the four-week period takes the place of one of the deltas (ie there are 27 deltas and one full feed in a 28 day period, and the full feed must be processed because it may contain updates that are not included in any delta) then it should be given the next sequential number. If the full feed serves only to confirm (ie there are 28 daily deltas plus one full feed in a 28 day period, and the full feed does not contain any updates that are not included in the deltas), the full feed should be given the same sequence number as the last delta. This emphasizes that the end result should be the same whether the recipient processes the full file, or processes all deltas including the last.

H.15 Message creation date/time

The <SentDateTime> element is mandatory.

Where a sender sends out no more than one message per day (in any particular data feed), and the recipient receives data exclusively from the sender, then the format YYYYMMDD is usually adequate. Where a sender may send multiple messages per day, or the recipient needs to aggregate data from multiple sources, then a more precise time should be included. Either of the formats:

  • YYYYMMDDThhmmZ (exact time expressed as UTC)
  • YYYYMMDDThhmm±hhmm (exact time, with a timezone offset)

are recommended. Similar formats with second precision (YYYYMMDDThhmmss±hhmm or YYYYMMDDThhmmssZ) are available, but times to the nearest second are unlikely to be needed. Formats including a time but without timezone information (such as YYYYMMDDThhmm) can be ambiguous and are not recommended. These considerations also apply to the datestamp attribute that acts as a datestamp on individual data elements (a datestamp should be included when the ‘age and reliability’ of a particular data element is significantly different from that of the overall message, such as when data from disparate sources is compiled into an aggregated data feed).

sent on 14th August at 3:10pm in a time zone four hours behind UTC, for example Toronto on Eastern Daylight Time (Eastern Standard Time, with Daylight Saving in operation)
using Reference names
<SentDateTime>20120814T1510-0400</SentDateTime>
using Short tags
<x307>20120814T1910Z</x307>
The two times here are the same – 3:10pm EDT is 19:10 UTC. Use of UTC times is preferred.

No dateformat attribute is required or allowed with the <SentDateTime> element or with a datestamp attribute, and the possible date and time formats are a small subset of those that may be used in other parts of an ONIX message (see List 55). Dates within <SentDateTime> and datestamp (ie the YYYYMMDD part) must always use the Gregorian calendar, even if other dates within the message use other calendars. Times within <SentDateTime> and datestamp (ie the hhmm or hhmmss part) can be specified in any timezone (the ±hhmm part), but using UTC is strongly preferred (the time is suffixed with Z).
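The minute-precision formats recommended for <SentDateTime> can be produced and read with a few lines of code. The sketch below assumes Python 3.7 or later (where the `%z` directive of `strptime` accepts both `Z` and `±hhmm`); the function names are illustrative only.

```python
from datetime import datetime, timezone

# Matches YYYYMMDDThhmmZ and YYYYMMDDThhmm±hhmm.
ONIX_MINUTE_FORMAT = "%Y%m%dT%H%M%z"


def format_sent_datetime(moment: datetime) -> str:
    """Render a timezone-aware datetime as YYYYMMDDThhmmZ (UTC)."""
    if moment.tzinfo is None:
        # A time without timezone information would be ambiguous.
        raise ValueError("a timezone-aware datetime is required")
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%MZ")


def parse_sent_datetime(value: str) -> datetime:
    """Parse either recommended minute-precision format into an aware datetime."""
    return datetime.strptime(value, ONIX_MINUTE_FORMAT)
```

Parsing both values from the example above (`20120814T1510-0400` and `20120814T1910Z`) gives datetimes that compare equal, which is why normalizing to UTC before comparing messages from several sources is a reasonable design choice.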

Note that in the absence of datestamp information attached to individual data elements or composites, the <SentDateTime> element may be used to indicate when the data in the message was correct (according to the sender), and it is assumed there is no significant delay between generation of the message and its availability to the recipient. In general, in any data feed, the order of messages indicated by <SentDateTime> will be the same as the order indicated by <MessageNumber>. However, in a scenario where delta files are used, <SentDateTime> cannot be used by a recipient to identify when a message has been missed, so <MessageNumber> should be used for ensuring messages are processed in the correct sequence.

H.17 to H.19 Message defaults

It is best practice to omit any <DefaultLanguageOfText>, <DefaultPriceType> and <DefaultCurrencyCode> elements from the header, and to ensure the equivalent information is included within each Product record instead.

ONIX for Books Product record

Product composite

Every ONIX for Books message must contain at least one repeat of the <Product> composite. Each <Product> composite – a ‘Product record’ – contains a mandatory preamble comprising data element Groups P.1 and P.2 that together identify the record itself and the product to which it refers (the preamble is sometimes informally called ‘block zero’). This is followed by some or all of Blocks 1–5 (each Block is optional), plus an optional and repeatable Block 6.

[Diagram: structure of the <Product> composite. Preamble (‘block zero’): <RecordReference>, <NotificationType>, <DeletionText>, <RecordSourceType>, <RecordSourceIdentifier>, <RecordSourceName>, <ProductIdentifier>, <Barcode>. Block 1: <DescriptiveDetail>. Block 2: <CollateralDetail>. Block 3: <ContentDetail>. Block 4: <PublishingDetail>. Block 5: <RelatedMaterial>. Block 6: <ProductSupply>.]

There are two typical types of Product record, distinguished by their <NotificationType> element. The first and (currently) most common type contains all applicable information for the product: thus it contains Groups P.1 and P.2 plus most or all of the Blocks 1–6. This type of Product record may be supplied in the context of a ‘full’ data feed where the ONIX message contains Product records for all the supplier’s products, or it may be supplied in the context of a ‘delta’ feed where a message contains only Product records which contain data that has been updated since a previous message was sent. In either case, all previously-supplied data for the product should be entirely replaced by the new data in the message. Data from any previously-supplied records relating to products not included in a delta message should continue to be used unchanged.

Note that, for recipients, the treatment of the records in the message is the same, whether the message is a full-feed or a delta – the difference between the two is simply in the selection of records included in the message.

The second type, as yet not in widespread use, is a ‘block-level partial update’ (or simply, ‘block update’) record that contains Groups P.1 and P.2 (which are sometimes informally referred to as ‘block zero’) plus only those of Blocks 1–6 which contain data that has changed since a previous Product record was supplied. If a particular Product record contains only Groups P.1, P.2 and Block 4, it means that recipients should update information based on the Groups and Blocks supplied, and any previously-supplied data from Blocks 1–3, 5 and 6 should continue to be used unchanged. This type of Product record may be supplied only in the context of a ‘delta’ feed, and must also be distinguished with <NotificationType> code 04.

Note that for a block-level partial update, if any of the data in a block has changed, the whole block must be supplied. Granularity of updates does not extend below block level.

This update strategy means that if a previously-supplied data element is not supplied in any update (full or partial), then the previously-supplied data should be deleted from the recipient’s systems. For example, if a previous message included three contributors, and the update contains only two, the third contributor should be deleted (and in fact, it’s more correct to say that all three previous contributors should be deleted, then the newly-supplied two should be re-inserted).

Block 6 consists of the repeatable <ProductSupply> composite: for the purposes of sending block-level partial updates, a record containing several repeats of <ProductSupply> has only one Block 6. If one particular repeat must be updated, all repeats must be included in the block-level update. But for recipients, Block 6 is exceptional: a recipient may receive supply detail information from several sources (eg a retailer may receive data from a distributor and from several wholesalers). For Blocks 1–5, one particular source of data is likely to be definitive or more trusted. But for Block 6, the recipient should probably treat each feed separately – the wholesaler’s Block 6 may need to be retained in parallel with the Block 6 received from the distributor.

P.1 Record reference, type and source

identifying a Product record (using a meaningless internal ID)
using Reference names
<!-- record 71 of 246 -->
<RecordReference>com.xyzpublishers.onix.32032</RecordReference>
<NotificationType>02</NotificationType>
<RecordSourceType>01</RecordSourceType>
<RecordSourceName>XYZ Publishers</RecordSourceName>
using Short tags
<!-- record 71 of 246 -->XML comment
<a001>com.xyzpublishers.onix.32032</a001>Row ID from internal database
<a002>02</a002>Advance notification
<a194>01</a194>From publisher
<a197>XYZ Publishers</a197>
identifying a Product record (using a UUID)
using Reference names
<!-- record 31 of 46 -->
<RecordReference>f3a85abd-f29e-4e0b-92cc-2fa6a0833022</RecordReference>
<NotificationType>04</NotificationType>
<RecordSourceType>01</RecordSourceType>
<RecordSourceName>XYZ Publishers</RecordSourceName>
using Short tags
<!-- record 31 of 46 -->
<a001>f3a85abd-f29e-4e0b-92cc-2fa6a0833022</a001>Type 4 UUID
<a002>04</a002>Block-level update
<a194>01</a194>From publisher
<a197>XYZ Publishers</a197>
identifying a Product record sent for testing purposes
using Reference names
<!-- record 1 of 1 (test) -->
<RecordReference>com.xyzpublishers.onix.9932032</RecordReference>
<NotificationType>89</NotificationType>
<RecordSourceType>01</RecordSourceType>
<RecordSourceName>XYZ Publishers</RecordSourceName>
using Short tags
<!-- record 1 of 1 (test) -->
<a001>com.xyzpublishers.onix.9932032</a001>Test ID should not match any previously-sent live ID
<a002>89</a002>Test record
<a194>01</a194>From publisher
<a197>XYZ Publishers</a197>
deleting a Product record that was issued in error
using Reference names
<RecordReference>com.a2b.wms.01234567</RecordReference>
<NotificationType>05</NotificationType>
<DeletionText>Record issued in error – use record com.a2b.wms.01234574 instead</DeletionText>
<RecordSourceType>03</RecordSourceType>
<RecordSourceName>A2B Logistics</RecordSourceName>
using Short tags
<a001>com.a2b.wms.01234567</a001>
<a002>05</a002>Deletion
<a199>Record issued in error – use record com.a2b.wms.01234574 instead</a199>
<a194>03</a194>From wholesaler
<a197>A2B Logistics</a197>
In addition, a complete Product record for a deletion requires a <ProductIdentifier> composite, but no other part of the Product record is necessary.

P.1.1 Record reference

The <RecordReference> element is mandatory in every Product record. It serves to identify the Product record itself, so it must obviously be unique within a particular ONIX message file, and must not change if the Product record (with either identical or modified data) is resupplied in a subsequent message. Note the record reference does not identify the product, but rather the collection of metadata that describes the product. For this reason, the record reference should not be (or look like) a product identifier such as an ISBN. However, the record reference may include a product identifier within some longer string of characters, where for example the publisher’s internal product management system relies on the ISBN as a key field.

Reliance on an external public identifier as a key field in an internal database is generally considered bad practice. Database keys should be unique and persistent of course, but ideally also meaningless (ie have low affordance). In an internal database design, ISBNs are best treated as attributes of each record, and arbitrary row IDs or UUIDs generated by the database, or purely internal product numbers would be ideal key fields – this maximizes the flexibility of the database design, and of course ensures that records can be added to the database prior to ISBN assignment.

Data recipients may receive Product records relating to a single product from multiple sources – for example, a retailer may receive different records for the same product from several wholesalers. In order to ensure these records can be distinguished and – if necessary – managed separately, record references should as far as possible be globally unique: to ensure uniqueness, it is a good practice to use a record reference that includes a reversed Internet domain name (for example, ‘com.xyzpublisher’). Ideally this should be suffixed with an internal product identifier, or with a product identifier such as the ISBN. An arbitrary internal identifier (such as a record ID in an internal database) is somewhat better than an ISBN, as it will remain unchanged in the rare case where an ISBN may need to be corrected. Alternatively, a UUID (Universally Unique Identifier) constructed in accordance with RFC 4122 would make a good Record reference (types 1 or 4 are best).

When creating unique record references in this way, avoid the characters /, \, $, *, %, ? plus the space and colon characters. Why? Because the record reference can be used to construct filenames for supporting resources (see Group P.16) and these characters often have special meanings or are not allowed in filenames.

Where a single organization has more than one IT system that generates ONIX messages, the record reference should also include an indication of which system the record was generated by.
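One possible way to mint record references that follow these recommendations is sketched below. The function names, and the idea of passing the system indicator as a separate argument, are assumptions made for illustration; only the character restrictions and the reversed-domain and UUID patterns come from the guidance above.

```python
import re
import uuid

# Characters the guidance advises against: / \ $ * % ? plus space and colon.
_DISCOURAGED = re.compile(r"[/\\$*%?: ]")


def make_record_reference(reversed_domain: str, system: str, internal_id: str) -> str:
    """Combine a reversed domain, a system indicator and an internal ID."""
    reference = f"{reversed_domain}.{system}.{internal_id}"
    if _DISCOURAGED.search(reference):
        raise ValueError(f"discouraged character in record reference: {reference!r}")
    return reference


def make_uuid_reference() -> str:
    """Alternative: a random (version 4) UUID in the RFC 4122 text form."""
    return str(uuid.uuid4())
```

For instance, `make_record_reference("com.xyzpublishers", "onix", "32032")` gives `com.xyzpublishers.onix.32032`, matching the pattern used in the examples in this Group. Whichever scheme is chosen, the reference must be stored and reused when the record is resupplied, not regenerated.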

See P.1.4 Record source type code for details of how data aggregators/redistributors should handle record references.

P.1.2 Notification or update type code

The mandatory <NotificationType> element specifies the ‘type’ of Product record – whether the Product record contains pre-publication data that is subject to significant change, or whether it contains data confirmed upon or subsequent to publication.

The notification type changes in Product records issued at different times in the product’s lifecycle. Later Product records may be matched with their predecessors by matching the <RecordReference> element. An initial ONIX record may be issued a few months in advance of expected publication, with <NotificationType> 01 or 02 (values are taken from List 1). This may be followed by successive updated records with the matching <RecordReference> and with <NotificationType> 02 or 04 (the former for a full update of the whole record, the latter for a block-level partial update containing P.1, P.2 plus only those of Blocks 1–6 that have been updated). It is best practice to issue a full update with type 03 to confirm publication, even when block-level updates are used otherwise, and even when nothing other than the <PublishingStatus> in Block 4 and the <ProductAvailability> in Block 6 have changed. Any post-publication updates – for example price and availability updates – would be types 03 or 04.

Note that List 1 contains other notification type codes for various special purposes: code 05 should be used to indicate a record (as identified by its <RecordReference>) should be deleted (but also see <DeletionText> in P.1.3), and codes 08 and 09 should be used to provide notice of sale and acquisition of products (as identified by a product identifier in P.2, not the <RecordReference>, since it is unlikely the two publishers concerned will use the same Record reference). Code 08, a notice of sale that is sent by the former publisher, should if possible be dealt with as a block-level update by senders and recipients, since best practice for this record type is to include P.1, P.2 and Block 4 only. Code 09, a notice of acquisition, should be treated as a full update, and should be sent by the acquiring publisher. Records containing notification type codes 88 and 89 should always be ignored if they appear in a live data feed – they are intended for use in test messages only, and it is highly inadvisable to mix test and live records in the same ONIX message.

The diagram shows the <NotificationType> codes of two possible sequences of matching product records within a series of ONIX messages, the left timeline starting with an initial message sent more than six months in advance of expected publication and using only ‘full update’ Product records, and the right timeline starting with an initial message sent fewer than six months in advance of publication and using block-level ‘partial update’ Product records.

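[Diagram: two timelines of <NotificationType> codes, with markers at about six months prior to publication date, at publication date and at out of print date. Left timeline: Type 01, Type 02, Type 02, Type 03, Type 03, Type 03. Right timeline: Type 02, Type 04, Type 03, Type 04, Type 04.]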

When using a potentially very long series of block updates, it is still a useful practice to provide a full update (type 03) at the time of publication for confirmation purposes. This ensures that the record is corrected around publication time, even if earlier updates were applied wrongly for any reason.

Note that the very first inclusion of a particular product in a data feed does not necessarily use <NotificationType> 01 – the notification type indicates the ‘age’ relative to the publication date, and thus provides some measure of confidence in the data. Data in records with <NotificationType> 01 should be treated as highly provisional, and some recipients may choose not to use type 01 records in consumer-facing contexts. If the first notification is later than six months prior to publication, confidence in the data should be higher, and type 02 should be used immediately even if this is the first time the record has been included in a feed. Type 03 should indicate a very high degree of confidence in the data – though this still does not preclude future post-publication correction of genuine errors, updates of data elements such as price and availability, or enhancement of the data by addition of extra information.
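For full-record updates, the choice between types 01, 02 and 03 might be reduced to a rule like the one below. This is a simplified sketch: the function name is hypothetical, ‘about six months’ is approximated as 183 days (an assumption), and a real sender would also weigh how confident it is in the data.

```python
from datetime import date, timedelta

# Assumption: 'about six months' approximated as a fixed number of days.
SIX_MONTHS = timedelta(days=183)


def suggest_notification_type(today: date, expected_pub_date: date, published: bool) -> str:
    """Suggest a List 1 code for a full (not block-level) Product record."""
    if published:
        return "03"  # confirmed on or after publication
    if expected_pub_date - today > SIX_MONTHS:
        return "01"  # early notification: treat as highly provisional
    return "02"      # advance notification, even if first inclusion in the feed
```

Note that the rule depends only on the time remaining before publication, not on whether the record has been sent before, which reflects the point that a first notification may legitimately use type 02.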

For recipients, the treatment of Product records with notification types 01, 02 and 03 is identical. First check whether a Product record with a matching Record reference has been received earlier. If so, it is a record update, and all data in the existing record should be replaced by the newly-received information (this might be through deleting the old record and creating a new one). Otherwise, a new record should be created using the newly-received information. For a record with notification type 04, only certain parts of the existing record should be updated, according to which Blocks are newly-received.
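The recipient-side logic just described can be sketched as follows. The data model here is purely hypothetical: each Product record is represented as a dictionary keyed by element or Block name, and the store is a dictionary keyed by Record reference. Handling of other notification types (such as 05, 08 and 09) is deliberately left out.

```python
def apply_record(store: dict, record: dict) -> None:
    """Apply a received Product record to a store keyed by RecordReference."""
    reference = record["RecordReference"]
    notification_type = record["NotificationType"]
    if notification_type in ("01", "02", "03"):
        # Full update: everything previously held is replaced, so any data
        # element missing from the new record is effectively deleted.
        store[reference] = dict(record)
    elif notification_type == "04":
        existing = store.get(reference)
        if existing is None:
            raise KeyError(f"block update for unknown record {reference!r}")
        # Block-level update: replace each supplied Group or Block whole,
        # and leave Blocks that were not supplied untouched.
        for name, value in record.items():
            if name != "NotificationType":
                existing[name] = value
    else:
        raise ValueError(f"notification type {notification_type} not handled in this sketch")
```

Because a supplied Block always replaces the stored Block in its entirety, there is no merging below Block level, which is consistent with the rule that granularity of updates does not extend below block level.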

Note that Product records with differing <NotificationType> codes can be freely mixed within a single ONIX for Books message. However, senders should check with recipients before introducing any block-level partial updates, since not all recipients can cope with records of this type. And it is inadvisable to mix live and test records in a single message.

P.1.3 Reason for deletion

When the <NotificationType> code is 05, the reason for deletion should be included. Deletion refers to deletion of a metadata record, not to ‘deletion’ of a product – for example, deletion might be used to resolve a problem where two records have inadvertently been issued for the same product. Abandonment or cancellation of a planned and announced product, or declaring an existing product out of print are not reasons for deletion of a Product record from recipient systems (they should be treated as changes of <PublishingStatus> in Group P.20 or <MarketPublishingStatus> in Group P.25).

Deletions should be extremely rare, and are likely to be processed manually by data recipients.

<DeletionText> is one of a number of textual data elements that are repeatable in order that parallel text may be provided in multiple languages. See the notes with <BiographicalNote> for a fuller explanation.

P.1.4 Record source type code

<RecordSourceType>, <RecordSourceName> and – if possible – <RecordSourceIdentifier> should be set, particularly when they indicate that the record source is different from the <Sender> organization. This happens when data from multiple sources is aggregated and redistributed, for example by a distributor, wholesaler or bibliographic data supplier.

Where an organization aggregates and redistributes product data, the Product records within the redistributed data might retain the same <RecordReference> as the data supplied to the aggregator, or the aggregator may attach a new <RecordReference> to each record. The choice here should be guided by whether the aggregator simply resends exactly what is received, or whether the data is actively managed before redistribution. In the former case, where aggregation is a purely technical exercise, the original record reference should be retained, and the original source of each record should be indicated in <RecordSourceType>, <RecordSourceName> and possibly <RecordSourceIdentifier> data elements. The aggregator should use the <Sender> information, or the <RecordSourceType>, <RecordSourceName> and <RecordSourceIdentifier> elements, from the received records to populate these data elements. In addition, the <SentDateTime> from the received records may be reused as a datestamp attribute attached to the <Product> element in the sent data to indicate the age of the data.

Conversely, if the aggregator actively manages the data prior to redistribution, the aggregator assumes effective responsibility for the data. As a result, it should use a new <RecordReference> and record source information on each record. This clearly indicates that the aggregator is responsible for the redistributed data. This way, the combination of <Sender> information in the <Header> of a message containing redistributed data and the Record source information within each Product record indicate where responsibility for the data content lies.

[Diagram: structure of the <RecordSourceIdentifier> composite. It contains <RecordSourceIDType>, <IDTypeName> (must include if ID is proprietary, otherwise omit) and <IDValue>.]

The preferred <RecordSourceIDType> for international use is the GLN (code 06 from List 44).

P.2 Product numbers

identifying a product with an ISBN and proprietary SKU
using Reference names
<ProductIdentifier>
    <ProductIDType>03</ProductIDType>GTIN-13
    <IDValue>9780001234567</IDValue>
</ProductIdentifier>
<ProductIdentifier>
    <ProductIDType>15</ProductIDType>ISBN
    <IDValue>9780001234567</IDValue>
</ProductIdentifier>
<ProductIdentifier>
    <ProductIDType>01</ProductIDType>Proprietary
    <IDTypeName>BookPoint Wholesale SKU</IDTypeName>
    <IDValue>BP0054321</IDValue>
</ProductIdentifier>
using Short tags, with additional ISBN-A
<productidentifier>
    <b221>03</b221>GTIN-13
    <b244>9780001234567</b244>
</productidentifier>
<productidentifier>
    <b221>15</b221>ISBN
    <b244>9780001234567</b244>
</productidentifier>
<productidentifier>
    <b221>01</b221>Proprietary
    <b233>BookPoint Wholesale SKU</b233>
    <b244>BP0054321</b244>
</productidentifier>
<productidentifier>
    <b221>06</b221>DOI
    <b244>10.978.000/1234567</b244>
</productidentifier>
<productidentifier>
    <b221>26</b221>ISBN-A
    <b244>10.978.000/1234567</b244>
</productidentifier>

Product identifier composite

At least one repeat of the <ProductIdentifier> composite is mandatory. Typically this will carry a standard identifier such as the ISBN that can be used for orders or sales reports within a trading relationship, but the whole composite is repeatable and other standard identifiers, and proprietary identifiers or SKUs may be sent in addition.

[Diagram: structure of the <ProductIdentifier> composite. It contains <ProductIDType>, <IDTypeName> (must include if ID is proprietary, otherwise omit) and <IDValue>.]

Do not omit <IDTypeName> if <ProductIDType> is proprietary (code 01). Conversely, omit <IDTypeName> for non-proprietary identifier types. This applies to proprietary identifiers in similar composites throughout ONIX 3.0.

For any item that carries an ISBN, it is best practice to include the identifier twice, in separate repeats of the <ProductIdentifier> composite with <ProductIDType> codes 03 and 15. Why? Because some recipients are looking specifically for an ISBN, others for any GTIN-13. The same applies to a product carrying an ISMN (use codes 03 and 25). If instead the product carries a GTIN-13 that is not an ISBN or ISMN, this should be carried in a composite with <ProductIDType> code 03. ISBNs, ISMNs and GTIN-13s should not include hyphens or spaces.
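The ‘send it twice’ rule can be illustrated with a short sketch. The function names are hypothetical and the result is a simple list of (type code, value) pairs, not a full composite. The check digit test is the standard GTIN-13 calculation, and ISMNs are assumed to be recognizable by their 9790 prefix.

```python
def gtin13_check_digit_ok(number: str) -> bool:
    """Validate a 13-digit number using alternating weights of 1 and 3."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(number[:12]))
    return (10 - total % 10) % 10 == int(number[12])


def standard_identifiers(raw: str) -> list:
    """Return (ProductIDType, IDValue) pairs for an ISBN, ISMN or other GTIN-13."""
    number = raw.replace("-", "").replace(" ", "")  # no hyphens or spaces in ONIX
    if not gtin13_check_digit_ok(number):
        raise ValueError(f"not a valid 13-digit identifier: {raw!r}")
    if number.startswith("9790"):
        return [("03", number), ("25", number)]  # GTIN-13 plus ISMN
    if number.startswith(("978", "979")):
        return [("03", number), ("15", number)]  # GTIN-13 plus ISBN
    return [("03", number)]                      # GTIN-13 only
```

Each pair in the returned list would become one repeat of the <ProductIdentifier> composite, so recipients looking for either an ISBN or any GTIN-13 find what they expect.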

Very occasionally, a product has both an ISBN and a separate GTIN-13 that is not an ISBN. While the assignment of a separate GTIN to an item with an ISBN is entirely unnecessary (the ISBN is by definition also a valid GTIN-13), it sometimes happens that a stationery item which has a GTIN-13 enters the books supply chain where some organization requires an ISBN. If this is unavoidable, the ISBN should be carried with <ProductIDType> code 15 and the ‘other’ GTIN-13 should be carried with code 03.

If the product’s ISBN is also registered as an ISBN-A (‘Actionable ISBN’), this should be carried in two further composites with type code 06 (ie as a DOI) and code 26 (ie specifically as an ISBN-A) – just as with ISBN and GTIN-13. And note that for a DOI or ISBN-A, the period and slash (/) characters plus any hyphens and other punctuation characters are an essential part of the identifier. A <ProductIdentifier> composite carrying an ISBN-A should always be accompanied by a composite carrying the matching ISBN – thus it is best practice to include three <ProductIdentifier> composites alongside a composite carrying an ISBN-A (the ISBN-A itself, plus the generic DOI, the ISBN and the generic GTIN-13). All, of course, carry essentially the same number.

In some trading relationships, the GTIN-14 identifier is useful. Broadly, the GTIN-14 identifies trade items, whereas the GTIN-13 identifies retail items. So where an ONIX record describes multiple retail items in a trade-only pack, the pack may be identified with a GTIN-14. The GTIN-14 may be specified in the metadata record (for the pack) using <ProductIDType> code 14 (and within this record, the GTIN-13 of the items in the pack should be carried inside <ProductPart>). There may also be a separate metadata record for the individual retail items in the pack, and that record should not include the GTIN-14. Note that although any GTIN-13 may be expressed as a GTIN-14 simply by prefixing it with a zero, an ONIX record should not carry a GTIN-14 constructed ‘artificially’ in this way, unless the product actually carries a GTIN-14 barcode (which uses a different barcode ‘symbology’ from the GTIN-13).

The continued use of obsolete identifiers such as the ISBN-10 is strongly discouraged, but they may still be needed by certain recipients within the context of a specific trading relationship. If provided, they should be carried alongside an ISBN or GTIN-13.

Use of non-product specific ‘identifiers’ such as price-point GTIN-12s (formerly called UPCs or UCC-12s) as the only identifier is not sufficient as these do not identify specific tradeable items. However, item-specific GTIN-12s or UPCs do identify tradeable items, and should be included in the metadata when (and only when) the product carries a UPC number or barcode, and should use <ProductIDType> code 04. Although any GTIN-12 may be expressed as a GTIN-13 by prefixing the GTIN-12 with a zero, an ONIX record should not carry a GTIN-13 constructed in this way (as for example it would imply the use of a different barcode symbology).

If a proprietary identifier such as a publisher’s internal product identifier, a catalog number or a wholesaler’s SKU is used (<ProductIDType> code 01), then a ‘namespace’ – a consistent name for that identifier – should always be included in the <IDTypeName> element. This applies to all uses of proprietary identifiers or proprietary coding schemes within ONIX for Books – a name for the scheme or identifier should always be included. When minting such names, the data provider should try to promote uniqueness by (for example) including an appropriate organization or domain name, and some indication of what is being identified. ‘PubID’ and ‘SKU’ are not good names – too likely to be used by others and thus likely not to be unique – whereas ‘Olive Press SKU’ or ‘com.duckhouse.product.id’ are much better. The <IDTypeName> data element has a suggested maximum length of 50 characters.
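A data provider could check these <IDTypeName> rules before sending with something like the sketch below. The function name and the wording of the messages are illustrative assumptions; the rules themselves (name required for code 01, omitted otherwise, suggested maximum of 50 characters) are those given above.

```python
def check_id_type_name(product_id_type: str, id_type_name: str = "") -> list:
    """Return a list of problems with the IDTypeName for one identifier."""
    problems = []
    if product_id_type == "01":
        if not id_type_name:
            problems.append("proprietary identifier needs an IDTypeName")
        elif len(id_type_name) > 50:
            problems.append("IDTypeName exceeds the suggested 50 characters")
    elif id_type_name:
        problems.append("IDTypeName should be omitted for non-proprietary types")
    return problems
```

An empty list means the composite passes these particular checks. Whether a name is likely to be unique (for example ‘Olive Press SKU’ as opposed to ‘SKU’) still needs human judgment.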

Many supply chain organizations (correctly) allocate so-called ‘2-series’ GTIN-13s as proprietary identifiers. These numbers (thirteen digits beginning with a 2 and ending with the usual GTIN calculated check digit) come from a range of GTINs designated ‘for internal use only’. Such proprietary identifiers should not usually be included in ONIX records, since they are not intended for communication between organizations. If they are included, in the context of an agreed one-to-one exchange of ONIX data between parties, they must be treated as proprietary IDs, with <ProductIDType> 01 and a suitable, likely-to-be-unique name in <IDTypeName>, and not as GTIN-13s with <ProductIDType> 03.
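Where 2-series numbers do have to be sent within an agreed one-to-one exchange, the type code could be selected as in this sketch. The function name and the dictionary layout (which simply mirrors the element names of the composite) are hypothetical, and check digit validation is omitted to keep the example short.

```python
def gtin13_identifier(number: str, proprietary_name: str = "") -> dict:
    """Build a dictionary mirroring a <ProductIdentifier> composite."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        raise ValueError("expected exactly 13 digits")
    if number.startswith("2"):
        # '2-series' number, for internal use only: never send as type 03.
        if not proprietary_name:
            raise ValueError("a 2-series number needs a name for IDTypeName")
        return {"ProductIDType": "01",
                "IDTypeName": proprietary_name,
                "IDValue": number}
    return {"ProductIDType": "03", "IDValue": number}
```

Requiring a name whenever the number begins with 2 makes it hard to send an internal number as if it were a globally unique GTIN-13 by accident.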

The <ProductIdentifier> composite is also used to deliver various identifiers that are not used for supply chain transactions, but are important to particular groups of data users – for example library identifiers such as Library of Congress control numbers, OCLC identifiers and national legal deposit numbers. Where these are included in the Product record, they should be accompanied by one or more standard product identifiers.

Barcode composite

This composite specifies whether the product carries a printed barcode, and if so, its symbology and optionally its position. Although in most countries the use of GTIN-13 barcodes on physical products is ubiquitous and need not be stated, in North America the continued use of GTIN-12s means that the symbology is valuable information for retailers. The composite should be repeated where a product carries more than one barcode.

The composite can also be used to provide a positive indication that the product does not carry a barcode, even when it might be expected (perhaps because the product is of unusually small size).

[Diagram: structure of the <Barcode> composite. It contains <BarcodeType> and <PositionOnProduct>.]

standard ISBN (GTIN-13) barcode on back cover
using Reference names
<Barcode>
    <BarcodeType>02</BarcodeType>
    <PositionOnProduct>01</PositionOnProduct>
</Barcode>
using Short tags
<barcode>
    <x312>02</x312>GTIN-13 symbology (no extension)
    <x313>01</x313>on Cover 4 (outside back cover)
</barcode>
no barcode
using Reference names
<Barcode>
    <BarcodeType>00</BarcodeType>
</Barcode>
using Short tags
<barcode>
    <x312>00</x312>No barcode on product
</barcode>

Block 1: Product description

Descriptive detail composite

Block 1 of the Product record is intended to carry the core information about the physical or digital nature of the product, its title and authorship, content and subject matter, and its target audience.

[Diagram: <DescriptiveDetail> composite (Groups P.3 and P.4), containing ProductComposition, ProductForm, ProductFormDetail, ProductFormFeature, ProductPackaging, ProductFormDescription, TradeCategory, PrimaryContentType, ProductContentType, Measure, CountryOfManufacture, EpubTechnicalProtection, EpubUsageConstraint, EpubLicense, MapScale, ProductClassification and ProductPart; continued in Group P.5]

P.3 Product form

Additional guidance on the description of digital products in ONIX 3.0 will be found in a separate document ONIX for Books Product Information Message: How to Describe Digital Products in ONIX 3.

describing the form of a simple paperback book product
using Reference names, with measurements provided in both Imperial and metric units for broad international use including North America
<ProductComposition>00</ProductComposition>
<ProductForm>BC</ProductForm>
<ProductFormDetail>B101</ProductFormDetail>Rack-size paperback
<Measure>7⅛ × 4¼ in, ⅝ in spine
    <MeasureType>01</MeasureType>
    <Measurement>7.125</Measurement>
    <MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>02</MeasureType>
    <Measurement>4.25</Measurement>
    <MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>03</MeasureType>
    <Measurement>0.625</Measurement>
    <MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>Weight 6⅞ oz
    <MeasureType>08</MeasureType>
    <Measurement>6.875</Measurement>
    <MeasureUnitCode>oz</MeasureUnitCode>
</Measure>
<Measure>Same measurements
    <MeasureType>01</MeasureType>using mm and gr
    <Measurement>181</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>02</MeasureType>
    <Measurement>108</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>03</MeasureType>
    <Measurement>16</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>08</MeasureType>
    <Measurement>195</Measurement>
    <MeasureUnitCode>gr</MeasureUnitCode>
</Measure>
<CountryOfManufacture>US</CountryOfManufacture>Manufactured in USA
using Short tags, with measurements provided in metric units only, for international use excluding North America
<x314>00</x314>Single-item product
<b012>BC</b012>Paperback
<b333>B106</b333>Trade paperback (UK usage)
<measure>Traditional ‘Demy’ size
    <x315>01</x315>216 × 138 mm, 19 mm spine
    <c094>216</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>02</x315>
    <c094>138</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>03</x315>
    <c094>19</c094>
    <c095>mm</c095>
</measure>
<measure>Weight 345 gr
    <x315>08</x315>
    <c094>345</c094>
    <c095>gr</c095>
</measure>
<x316>GB</x316>Made in UK
US usage of ‘trade paperback’ is close to UK usage of ‘B-format’. UK usage of ‘trade paperback’ indicates a paperback produced in a larger size (typically a Demy or Royal size).
describing a packaged multi-item product (small book and audio CD in a blister pack)
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SA</ProductForm>
<ProductPackaging>22</ProductPackaging>
<Measure>
    <MeasureType>01</MeasureType>
    <Measurement>145</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>02</MeasureType>
    <Measurement>195</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>03</MeasureType>
    <Measurement>11</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>08</MeasureType>
    <Measurement>195</Measurement>
    <MeasureUnitCode>gr</MeasureUnitCode>
</Measure>
<!-- P.4 Product parts must be included, with -->
<!-- separate Country of Manufacture per item -->
using Short tags
<x314>10</x314>Multi-item product
<b012>SA</b012>Packaging unspecified
<b225>22</b225>Blister pack
<measure>
    <x315>01</x315>Pack is 145 × 195 mm,
    <c094>145</c094>11 mm thick
    <c095>mm</c095>
</measure>
<measure>Note shape is ‘landscape’
    <x315>02</x315>
    <c094>195</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>03</x315>
    <c094>11</c094>
    <c095>mm</c095>
</measure>
<measure>Weight 195 gr
    <x315>08</x315>
    <c094>195</c094>
    <c095>gr</c095>
</measure>
<!-- P.4 Product parts must be included, -->
<!-- with separate x316 per item -->
providing e-book details for an enhanced EPUB format product, with usage constraints enforced by technical protection measures (‘DRM’)
using Reference names
<ProductComposition>00</ProductComposition>
<ProductForm>ED</ProductForm>
<ProductFormDetail>E101</ProductFormDetail>
<ProductFormDetail>E201</ProductFormDetail>
<ProductFormDetail>E202</ProductFormDetail>
<ProductFormFeature>
    <ProductFormFeatureType>15</ProductFormFeatureType>
    <ProductFormFeatureValue>101B</ProductFormFeatureValue>
</ProductFormFeature>
<PrimaryContentType>10</PrimaryContentType>
<ProductContentType>06</ProductContentType>
<ProductContentType>13</ProductContentType>
<EpubTechnicalProtection>03</EpubTechnicalProtection>
<EpubUsageConstraint>
    <EpubUsageType>05</EpubUsageType>
    <EpubUsageStatus>01</EpubUsageStatus>
</EpubUsageConstraint>
<EpubUsageConstraint>
    <EpubUsageType>03</EpubUsageType>
    <EpubUsageStatus>02</EpubUsageStatus>
    <EpubUsageLimit>
        <Quantity>10</Quantity>
        <EpubUsageUnit>05</EpubUsageUnit>
    </EpubUsageLimit>
</EpubUsageConstraint>
<EpubUsageConstraint>
    <EpubUsageType>06</EpubUsageType>
    <EpubUsageStatus>02</EpubUsageStatus>
    <EpubUsageLimit>
        <Quantity>1</Quantity>
        <EpubUsageUnit>10</EpubUsageUnit>
    </EpubUsageLimit>
    <EpubUsageLimit>
        <Quantity>14</Quantity>
        <EpubUsageUnit>09</EpubUsageUnit>
    </EpubUsageLimit>
</EpubUsageConstraint>
<EpubUsageConstraint>
    <EpubUsageType>02</EpubUsageType>Printing
    <EpubUsageStatus>03</EpubUsageStatus>Is disallowed
</EpubUsageConstraint>
using Short tags (additionally, printing is allowed but restricted)
<x314>00</x314>Single-item product
<b012>ED</b012>Digital download
<b333>E101</b333>EPUB format
<b333>E201</b333>Fixed format
<b333>E202</b333>Does not require network connection (the video and audio resources are packaged within the EPUB itself)
<productformfeature>
    <b334>15</b334>File format version code
    <b335>101B</b335>EPUB 3.0
</productformfeature>
<x416>10</x416>Eye-readable text
<b385>06</b385>Enhanced with video
<b385>13</b385>and audio content
<x317>03</x317>ACS4 DRM
<epubusageconstraint>
    <x318>05</x318>Text to speech
    <x319>01</x319>Is unrestricted
</epubusageconstraint>
<epubusageconstraint>
    <x318>03</x318>Copy/paste
    <x319>02</x319>Is limited
    <epubusagelimit>
        <x320>10</x320>Ten
        <x321>05</x321>Percent
    </epubusagelimit>
</epubusageconstraint>
<epubusageconstraint>
    <x318>06</x318>Lending
    <x319>02</x319>Is limited
    <epubusagelimit>
        <x320>1</x320>Only one
        <x321>10</x321>Occasion
    </epubusagelimit>
    <epubusagelimit>
        <x320>14</x320>For fourteen
        <x321>09</x321>Days
    </epubusagelimit>
</epubusageconstraint>
<epubusageconstraint>
    <x318>02</x318>Printing
    <x319>02</x319>Is limited
    <epubusagelimit>
        <x320>64</x320>Max of 64 pages
        <x321>04</x321>
    </epubusagelimit>
    <epubusagelimit>
        <x320>72</x320>Max of 72dpi
        <x321>21</x321>
    </epubusagelimit>
</epubusageconstraint>
providing both folded and flat dimensions, plus scales, for a road map with larger-scale inset city map panels
using Reference names
<ProductComposition>00</ProductComposition>
<ProductForm>CB</ProductForm>
<Measure>
    <MeasureType>01</MeasureType>
    <Measurement>245</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>02</MeasureType>
    <Measurement>140</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>10</MeasureType>
    <Measurement>980</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>11</MeasureType>
    <Measurement>1120</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>08</MeasureType>
    <Measurement>125</Measurement>
    <MeasureUnitCode>gr</MeasureUnitCode>
</Measure>
<CountryOfManufacture>ES</CountryOfManufacture>
<MapScale>500000</MapScale>
<MapScale>25000</MapScale>
using Short tags
<x314>00</x314>Single-item product
<b012>CB</b012>Sheet map, folded
<measure>Folded size
    <x315>01</x315>245 × 140 mm
    <c094>245</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>02</x315>
    <c094>140</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>10</x315>980 × 1120 mm flat
    <c094>980</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>11</x315>
    <c094>1120</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>08</x315>Weight 125 gr
    <c094>125</c094>
    <c095>gr</c095>
</measure>
<x316>ES</x316>Manufactured in Spain
<b063>500000</b063>Scale 1cm to 5km
<b063>25000</b063>Insets at 4cm to 1km
trade pack containing multiple books
using Reference names
<ProductComposition>30</ProductComposition>
<ProductForm>XL</ProductForm>
<Measure>
    <MeasureType>01</MeasureType>
    <Measurement>394</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>02</MeasureType>
    <Measurement>260</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>03</MeasureType>
    <Measurement>216</Measurement>
    <MeasureUnitCode>mm</MeasureUnitCode>
</Measure>
<Measure>
    <MeasureType>08</MeasureType>
    <Measurement>1950</Measurement>
    <MeasureUnitCode>gr</MeasureUnitCode>
</Measure>
<ProductPart>
    <ProductIdentifier>
        <ProductIDType>03</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductForm>BC</ProductForm>
    <ProductFormDetail>B105</ProductFormDetail>
    <NumberOfCopies>48</NumberOfCopies>
    <CountryOfManufacture>GB</CountryOfManufacture>
</ProductPart>
using Short tags
<x314>30</x314>Multi-item trade pack
<b012>XL</b012>Shrink-wrapped pack
<measure>
    <x315>01</x315>197 × 130 mm, 18mm spine,
    <c094>394</c094>packed 2 × 2 × 12
    <c095>mm</c095>(48 copies total)
</measure>
<measure>
    <x315>02</x315>
    <c094>260</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>03</x315>
    <c094>216</c094>
    <c095>mm</c095>
</measure>
<measure>
    <x315>08</x315>
    <c094>1950</c094>
    <c095>gr</c095>
</measure>
<productpart>
    <productidentifier>
        <b221>03</b221>GTIN-13 and…
        <b244>9780001234567</b244>
    </productidentifier>
    <productidentifier>
        <b221>15</b221>ISBN of the book inside the pack
        <b244>9780001234567</b244>(the ISBN or GTIN-14 of the pack must be different)
    </productidentifier>
    <b012>BC</b012>B-format paperback
    <b333>B105</b333>
    <x323>48</x323>48 copies
    <x316>GB</x316>
</productpart>
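
The pack measurements in the Short tags example follow from the annotation (books of 197 × 130 mm with an 18 mm spine, packed 2 × 2 × 12). A minimal sketch of that arithmetic is given below. The function name is hypothetical, and the calculation assumes the copies are arranged two along the height, two along the width and twelve deep, and that the thickness of the shrink-wrap is negligible.

```python
def pack_dimensions(height, width, spine, rows, columns, layers):
    """Outer size of a pack of identical books, ignoring wrapping thickness.

    Returns (pack height, pack width, pack thickness, number of copies).
    """
    copies = rows * columns * layers
    return (height * rows, width * columns, spine * layers, copies)


# 197 x 130 mm books with an 18 mm spine, packed 2 x 2 x 12
print(pack_dimensions(197, 130, 18, 2, 2, 12))  # (394, 260, 216, 48)
```
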
book-as-toy with safety warnings (EU Toy Safety Directive)
using Reference names
<ProductComposition>00</ProductComposition>
<ProductForm>BK</ProductForm>
<ProductFormDetail>B213</ProductFormDetail>
<ProductFormDetail>B215</ProductFormDetail>
<ProductFormFeature>
    <ProductFormFeatureType>13</ProductFormFeatureType>
    <ProductFormFeatureValue>01</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>13</ProductFormFeatureType>
    <ProductFormFeatureValue>07</ProductFormFeatureValue>
    <ProductFormFeatureDescription>http://www.livrescoccinelle.fr/4123/CEdéclaration.pdf</ProductFormFeatureDescription>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>13</ProductFormFeatureType>
    <ProductFormFeatureValue>03</ProductFormFeatureValue>
    <ProductFormFeatureDescription>Not suitable for children under 36 months, due to small parts</ProductFormFeatureDescription>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>13</ProductFormFeatureType>
    <ProductFormFeatureValue>05</ProductFormFeatureValue>
    <ProductFormFeatureDescription>Not tested on animals</ProductFormFeatureDescription>
</ProductFormFeature>
<!-- <Measure> omitted for brevity -->
<CountryOfManufacture>CN</CountryOfManufacture>
using Short tags, with safety warning text provided in two parallel languages (English and French)
<x314>00</x314>Single item product
<b012>BK</b012>Novelty book
<b333>B213</b333>Book-as-toy
<b333>B215</b333>Fuzzy felt book
<productformfeature>
    <b334>13</b334>EU Toy Safety
    <b335>01</b335>Carries ‘CE’ logo
</productformfeature>
<productformfeature>
    <b334>13</b334>EU Toy Safety
    <b335>07</b335>‘CE’ Declaration of Conformity
    <b336>http://www.livrescoccinelle.fr/4123/CEdéclaration.pdf</b336>URL of full document
</productformfeature>
<productformfeature>
    <b334>13</b334>EU Toy Safety hazard warning
    <b335>03</b335>Carries ‘0–3’ logo and warning
    <b336 language="eng">Not suitable for children under 36 months, due to small parts</b336>Publisher’s exact wording of warning
    <b336 language="fre">Ne convient pas aux enfants de moins de 3 ans, contient de petites pièces susceptibles d’être ingérées</b336>
</productformfeature>
<productformfeature>
    <b334>13</b334>EU Toy Safety
    <b335>05</b335>Associated non-warning text
    <b336 language="eng">Not tested on animals</b336>Publisher’s exact wording of associated text
    <b336 language="fre">Non testé sur les animaux</b336>
</productformfeature>
<!-- <measure> omitted for brevity -->
<x316>CN</x316>Made in China
Where text is provided in parallel languages, the language attribute must be included for each language. Where only a single language is provided, the attribute is optional.
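
A recipient might check the language attribute rule with something like the following minimal sketch. The function name is hypothetical, and the sketch assumes a single <ProductFormFeature> composite supplied as a standalone XML fragment without a namespace declaration (a full ONIX message would normally declare one).

```python
import xml.etree.ElementTree as ET


def descriptions_missing_language(feature_xml):
    """Return the text of repeated descriptions that lack a language attribute."""
    feature = ET.fromstring(feature_xml)
    # Accept either the Reference name or the Short tag for the description.
    descriptions = [el for el in feature
                    if el.tag in ("ProductFormFeatureDescription", "b336")]
    if len(descriptions) < 2:
        return []  # with a single language, the attribute is optional
    return [el.text or "" for el in descriptions if not el.get("language")]
```

For a composite that carries English and French text where only the English is labelled, the function returns the unlabelled French text; for a composite with a single description it returns an empty list.
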
P.3.1 Product composition

The <DescriptiveDetail> composite must begin with a mandatory <ProductComposition> element, to specify whether the product:

  • is a solus item (codes 00 or 20 from List 2);
  • contains multiple items (for example, several volumes in a set retailed as one product, or a ‘classroom pack’ of textbooks, with code 10);
  • contains multiple items that are expected to be retailed individually (code 30).

Code 11 is an exception, in that it describes a collection that is not itself a product.

If the product is not a solus item (ie <ProductComposition> codes 10, 11 or 30) then P.4 Product part information should always be included in the Product record.
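
This rule lends itself to an automated check. The following is a minimal sketch only: the function name is hypothetical, it handles Reference names but not Short tags, and it assumes the <DescriptiveDetail> composite is supplied as a standalone fragment without a namespace declaration.

```python
import xml.etree.ElementTree as ET

# <ProductComposition> codes that call for P.4 Product part information
MULTI_ITEM_CODES = {"10", "11", "30"}


def lacks_required_product_part(descriptive_detail_xml):
    """True if the composition is multi-item but no <ProductPart> is present."""
    detail = ET.fromstring(descriptive_detail_xml)
    composition = detail.findtext("ProductComposition")
    has_part = detail.find("ProductPart") is not None
    return composition in MULTI_ITEM_CODES and not has_part
```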

Note that products such as an audiobook that consists of multiple CDs are usually not treated as a multi-item retail product (ie it is common to use <ProductComposition> code 00, and the number of discs may be included in <ProductFormDescription>). This is somewhat inconsistent, as a multi-volume collection available as a single item – for example, a boxed set – is generally treated as a multiple-item retail product (ie use <ProductComposition> code 10). However, individual discs are almost never available separately, have essentially no value without the remainder of the discs, and do not have individual product identifiers, so treating a multi-disc audiobook as a multi-item product and enumerating the discs using a single repetition of <ProductPart> may be viewed as unnecessary:

describing a multi-disc audiobook as a multi-item retail product
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SA</ProductForm>
<ProductPackaging>05</ProductPackaging>
<!-- <Measure> omitted for brevity -->
<CountryOfManufacture>AT</CountryOfManufacture>
<ProductPart>
    <!-- discs not individually identified -->
    <ProductForm>AC</ProductForm>
    <ProductFormDetail>A101</ProductFormDetail>
    <NumberOfItemsOfThisForm>5</NumberOfItemsOfThisForm>
</ProductPart>
using Short tags
<x314>10</x314>Multi-item product
<b012>SA</b012>Packaging unspecified
<b225>05</b225>In jewel case
<!-- <measure> omitted -->
<x316>AT</x316>Pressed in Austria
<productpart>
    <!-- discs not individually identified -->
    <b012>AC</b012>CD-Audio
    <b333>A101</b333>‘Red book’
    <x322>5</x322>5 discs
</productpart>
describing a multi-disc audiobook as a single-item product
using Reference names
<ProductComposition>00</ProductComposition>
<ProductForm>AC</ProductForm>
<ProductFormDetail>A101</ProductFormDetail>
<ProductPackaging>05</ProductPackaging>
<ProductFormDescription>5 discs</ProductFormDescription>
<!-- <Measure> omitted for brevity -->
<CountryOfManufacture>AT</CountryOfManufacture>
<!-- no ProductPart composite required -->
using Short tags
<x314>00</x314>Single-item product
<b012>AC</b012>CD-Audio
<b333>A101</b333>‘Red book’
<b225>05</b225>In jewel case
<b014>5 discs</b014>
<!-- <measure> omitted -->
<x316>AT</x316>Pressed in Austria
<!-- no productpart composite -->

These two examples have exactly equivalent meanings, but the latter, simpler version is preferred, unless the multiple discs are (or could plausibly become) available separately or there is some other reason to describe the discs individually.

P.3.2 Product form code

This element is mandatory within the <DescriptiveDetail> composite, and for single-item products specifies the nature of the product.

It is generally not good practice to use any of the ‘generic’ codes from List 150, for example code BA for ‘Book – detail unspecified’ or code PA for ‘Miscellaneous print – detail unspecified’, unless the product is being described many months before publication and the detail is genuinely undecided. In this case, the generic Product form code must be updated in a later ONIX message. Occasionally, an unusual product might need both a <ProductForm> code and a full description of the product in <ProductFormDescription>.

For multi-item products with <ProductComposition> codes 10 and 11, codes SB–SF from List 150 describe the way that the items are packaged together for retail sale (eg slipcased, shrinkwrapped or with smaller items physically enclosed within the largest item). However, the <ProductPackaging> data element is more flexible, and may be used with <ProductForm> code SA. This implies there are two acceptable ways to describe multi-item products, for example the product form and packaging of several related volumes in a slipcase (commonly termed a ‘boxed set’):

describing a boxed set without using <ProductPackaging>
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SC</ProductForm>
using Short tags
<x314>10</x314>Multi-item product
<b012>SC</b012>In slip-case
using <ProductPackaging> to describe a boxed set
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SA</ProductForm>
<!-- intervening elements not shown -->
<ProductPackaging>11</ProductPackaging>
using Short tags
<x314>10</x314>Multi-item product
<b012>SA</b012>Packaging unspecified
<!-- intervening elements not shown -->
<b225>11</b225>Slip-cased set

These two examples have exactly equivalent meanings, but the former method is preferred. However, the latter method may be used whenever codes SB–SF do not adequately describe the packaging method.

For multi-item trade packs with <ProductComposition> code 30, codes XC, XE and XL from List 150 should always be used, rather than using code XA plus <ProductPackaging>. If <ProductPackaging> is included, this should refer to the packaging of each item within the trade pack, not to the trade pack itself. If codes XC, XE or XL do not adequately describe the outer packaging of the trade-only pack, use code XA plus descriptive text in <ProductFormDescription>.
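
The guidance for trade packs can be summarized as a small decision function. This is a sketch under stated assumptions: the function name and the returned messages are hypothetical, and only the four List 150 codes named in the paragraph above are considered.

```python
def trade_pack_form_advice(product_form, has_form_description):
    """Advice on <ProductForm> for a product with <ProductComposition> code 30."""
    if product_form in ("XC", "XE", "XL"):
        return "ok"
    if product_form == "XA":
        # XA is acceptable only when the outer packaging is described in text.
        if has_form_description:
            return "ok"
        return "add descriptive text in <ProductFormDescription>"
    return "not covered by this guidance"
```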

P.3.3 Product form detail

Although this is an optional data element, for certain values of <ProductForm> it provides vital extra detail.

And where such detail is provided, <ProductFormDetail> is also repeatable – there may be several properties of the product that are important. For this and other repeatable simple data elements (eg <MapScale>), it is obviously an error to repeat the same information, and although validation via the DTD or schema will not reject such repetition, extended schema validation may do so.
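
Because the DTD or schema will not catch such repetition, a data supplier may wish to test for it before sending. A minimal sketch follows, assuming a composite supplied as a standalone fragment without a namespace declaration; the function name is hypothetical. Only direct children of the composite are compared, so that the same code appearing inside a nested <ProductPart> is not reported.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def repeated_values(composite_xml, tag):
    """Return values that occur more than once in repeats of a simple element."""
    composite = ET.fromstring(composite_xml)
    counts = Counter((el.text or "").strip() for el in composite.findall(tag))
    return sorted(value for value, n in counts.items() if n > 1)
```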

For audio products:

  • use <ProductForm> code AC only for CD-Audio (in the sense of ‘Red Book’ CD-Audio or SACD discs) products, which are eligible to carry the relevant standards compliance logos;
    • then use <ProductFormDetail> codes A101 and A102 to differentiate CD-Audio and SACD products;
  • use <ProductForm> code AE for data discs which contain audio data in other data formats. Two good examples are ‘MP3 CDs’, which should not carry the Compact Disc Digital Audio logo but which are playable in some CD or DVD player devices, and DAISY books distributed on CD media;
    • then use other <ProductFormDetail> codes to specify the format of the audio data on the disc (WAV, MP3, AAC etc);
    • reserve <ProductForm> code DB for other types of CD-ROM that contain non-audio data;
  • use <ProductForm> code ED in preference to code AJ for downloadable digital audio products;
    • then use other <ProductFormDetail> codes to specify the data format – for example code A109 for proprietary Audible.com format, A103 for a ‘generic’ MP3 format or A107 for an AAC file;
    • in addition, as with e-books, you may include <EpubTechnicalProtection> and <EpubUsageConstraint> information to describe any limitations on usage of the downloaded data, and any DRM or other technical protection measures.

For printed book products:

  • where <ProductForm> is BB or BC, <ProductFormDetail> B1xx codes may be used to indicate common named categories of product which indicate a certain combination of product form, physical size and sometimes also target market and terms of trade. Physical sizes may be approximate: in contrast, exact measurements should be given in the <Measure> composite;
  • note the different meaning of ‘Trade paperback’ in the US and UK, and the very similar ‘Paperback (DE)’ in the German book trade;
  • other <ProductFormDetail> codes should be included to specify any other important feature of the product – for example a feature of a children’s book such as die-cutting or pop-up engineering, or features of fine binding such as the use of real leather or half-binding.

For e-publications such as e-books and mobile apps that include book text, <ProductFormDetail> is used to specify either:

  • a non-proprietary file format such as EPUB (a standard maintained by the IDPF), where any usage constraints and any technical protection measures (DRM) that enforce those usage constraints should be specified separately using the <EpubTechnicalProtection> element and <EpubUsageConstraint> composite; or
  • a proprietary file format, which may include platform-specific technical protection. Any usage constraints and technical protection should be specified using <EpubTechnicalProtection> and the <EpubUsageConstraint> composite.

Further repeats of <ProductFormDetail> can be used to specify whether the text within the e-book reflows or is ‘fixed-format’, what shape of screen a fixed-format e-book is optimized for, whether there are any network-based resources that form a part of the e-book, or whether content present in an equivalent printed publication has been omitted from the e-book (a common occurrence with trade non-fiction books subsequently converted to e-book, where illustration rights cannot be obtained).

Owing to the rapid evolution of e-publication types, this distinction in the treatment of proprietary and non-proprietary file formats is not a strict rule. Additional guidance on the description of digital products in ONIX 3.0 will be found in a separate document ONIX for Books Product Information Message: How to Describe Digital Products in ONIX 3.

There are common-sense affinities between values of <ProductForm> and values of <ProductFormDetail>. As illustration, these values for <ProductFormDetail> are intended for use only in combination with listed values for <ProductForm>.

Description | ProductForm | ProductFormDetail
CD-Audio | AC | A101–A102
Digital audio files | AA, AE, AJ–L, AZ, D*, E* | A103–A112, A201–A212
All audio | A*, D*, E* | A301–A304
Paperback books | BC | B101–B107, B113–B114, B116–B118, B504
Hardback books | BB | B115, B306–B307, B315, B401–B402
Loose leaf | BD | B301–B303
Spiral-bound | BE | B311–B314
Fine binding | BG | B308–B309, B403–B406
All printed books | B* | B108–B112, B119–B130, B201–B215, B304–B305, B310, B409–B415, B501–B503, B505–B510, B601–B602, B701–B707
Digital video files | D*, E*, VA, VZ | D101–D105
Digital on carrier | D* | D201–D207, D301–D316
e-books and ‘apps’ | D*, E* | E1xx, E2xx
Calendars | PC | P101–P113
Calendars or organizers | PC, PS | P114
Stationery | PA, PB, PF, PL, PR, PS, PZ | P201–P204
Analogue video | VA, VI–K, VZ | V201–V203

The associations listed in the table are not enforced by the XSD or RNG schemas, but may be enforced by an extended schema. Other combinations might superficially ‘work’, but do not make sense: for example the combination of Product form AG (Sony’s proprietary MiniDisc) might not be entirely incompatible with Product form detail A103 or A107 (MP3 or AAC format audio) – in that MiniDiscs can be used to store arbitrary data files – but in practice, any commercial MiniDisc products used Sony’s own ATRAC recording format.
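
An extended check of this kind might be sketched as follows. Only a subset of the rows in the table above is encoded, the names are hypothetical, and the sketch is not a substitute for an extended schema. Codes are compared as strings, which works because every code in a range shares the same letter and length.

```python
import fnmatch

# Subset of the table above: (ProductForm patterns, ProductFormDetail ranges).
AFFINITIES = [
    (["AC"], [("A101", "A102")]),
    (["BC"], [("B101", "B107"), ("B113", "B114"), ("B116", "B118"), ("B504", "B504")]),
    (["BB"], [("B115", "B115"), ("B306", "B307"), ("B315", "B315"), ("B401", "B402")]),
    (["BE"], [("B311", "B314")]),
    (["D*", "E*"], [("E100", "E299")]),  # E1xx and E2xx
]


def detail_fits_form(product_form, detail):
    """True or False for listed detail codes; None if the code is not in the subset."""
    for forms, ranges in AFFINITIES:
        if any(low <= detail <= high for low, high in ranges):
            return any(fnmatch.fnmatchcase(product_form, p) for p in forms)
    return None
```

Under these assumptions, detail_fits_form("BC", "B105") is True, detail_fits_form("BB", "B105") is False, and a code outside the encoded subset such as A103 gives None rather than a verdict.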

Spiral binding types: comb-bound (B311), Wire-O (B312), coiled wire (B314)

Product form feature composite

It is best practice to use the <ProductFormFeature> composite wherever necessary to convey other product attributes that may be significant to the eventual purchaser. To illustrate this, <ProductFormFeature> should be used to specify:

  • any hazard warnings associated with the product (eg CPSIA or EU Toy Safety Directive hazard warnings), as there may be a legal requirement for the warning to be displayed in an online store catalog;
  • the binding material used in a fine binding, or the cover color of a hymnal or other religious text (a congregation may wish to purchase matching colors), or where similar products are available in contrasting colorways (eg boy and girl versions);
  • the text typeface and size used in a large print product, where the exact details may determine suitability of a physical book for a particular print-impaired reader;
  • any required operating system or other technical requirements, for example for an educational software product or mobile ‘app’;
  • where an e-publication file format is versioned (eg EPUB 2 and EPUB 3);
  • ‘accessibility’ features provided by an e-publication, such as provision of alternative textual descriptions attached to images within the e-book, or inclusion of mathematical content using MathML rather than embedding equations as images, that may determine the suitability of the product for a print-impaired reader.

In contrast, <ProductFormFeature> is not expected to be used to specify the cover color or typeface name and size of – for example – a typical fiction hardback, where these details are largely immaterial to the purchaser.

Publishers or printers may use <ProductFormFeature> to state the product complies with FSC or PEFC scheme requirements related to the use of paper and board made in an environmentally-responsible way. But they should exercise particular care in claiming FSC or PEFC compliance, where the provenance of raw materials for future manufacturing batches might be uncertain. If a particular impression is for any reason manufactured with non-compliant material, the scheme logo must of course be omitted from those copies. However, all metadata records must also be updated to remove the compliance statement in a timely way. This applies both to metadata held by the publisher or printer and to metadata held by supply chain partners and third parties. Statements that are not (or cannot be) updated and removed when necessary (which might be on an impression-by-impression basis) may affect future auditing and re-certification of the organization.

Within the <ProductFormFeature> composite, <ProductFormFeatureDescription> is repeatable in order that parallel text may be provided in multiple languages. See the notes with <BiographicalNote> for a fuller explanation, and the example above showing EU Toy Safety Directive warnings.

[Diagram: <ProductFormFeature> composite, containing <ProductFormFeatureType>, <ProductFormFeatureValue> and <ProductFormFeatureDescription>; the description is repeatable, to describe a feature in multiple languages]

<ProductFormFeatureDescription> is mandatory with some values of <ProductFormFeatureType> (eg codes 03, 07), in which case <ProductFormFeatureValue> is omitted. Other Product form feature type codes need no additional description, and <ProductFormFeatureDescription> should be omitted unless it provides greater detail beyond that encoded in <ProductFormFeatureValue>.

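Description | ProductForm | ProductFormFeatureType | ProductFormFeatureValue ¹ | ProductFormFeatureDescription ¹
All printed books | B* | 01, 02, 04 | 1 | 0…n
All printed books | B* | 03, 08 | 0 | 1…n
Regionalized video | VI, VO | 05 | 1 | 0
Electronic products | A* (except AB, AC, AF), D*, E* | 06 | 1 | 0…n
Electronic products | A* (except AB, AC, AF), D*, E* | 07 | 0 | 1…n
e-books and ‘apps’ | D*, E* | 09, 10, 15 | 1 | 0…n
Hazardous products | any | 12, 13, 14 | 1 | 0…n
Paper products | B*, C*, P* | 30 | 0 | 0
Paper products | B*, C*, P* | 31–37 | 1 | 0…n
Paper products | B*, C*, P* | 40 | 0 | 0…n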

¹ 0 = must be omitted, 1 = mandatory, 0…n = optional and repeatable, 1…n = mandatory and repeatable
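
The cardinality rules in the table can be expressed as a simple lookup. The sketch below is illustrative only: the names are hypothetical, the rules are as read from the table above, and the association between feature type and <ProductForm> is not checked.

```python
# Per <ProductFormFeatureType>: (value mandatory?, min descriptions, max descriptions).
# A maximum of None means the description is repeatable without limit.
RULES = {}
for types, value_required, min_descr, max_descr in [
    (["01", "02", "04"], True, 0, None),
    (["03", "08"], False, 1, None),
    (["05"], True, 0, 0),
    (["06"], True, 0, None),
    (["07"], False, 1, None),
    (["09", "10", "15"], True, 0, None),
    (["12", "13", "14"], True, 0, None),
    (["30"], False, 0, 0),
    ([str(n) for n in range(31, 38)], True, 0, None),
    (["40"], False, 0, None),
]:
    for feature_type in types:
        RULES[feature_type] = (value_required, min_descr, max_descr)


def feature_problems(feature_type, has_value, description_count):
    """List the ways a <ProductFormFeature> composite departs from the table."""
    if feature_type not in RULES:
        return ["feature type not covered by the table"]
    value_required, min_descr, max_descr = RULES[feature_type]
    problems = []
    if value_required and not has_value:
        problems.append("ProductFormFeatureValue is mandatory")
    if not value_required and has_value:
        problems.append("ProductFormFeatureValue must be omitted")
    if description_count < min_descr:
        problems.append("ProductFormFeatureDescription is mandatory")
    if max_descr is not None and description_count > max_descr:
        problems.append("ProductFormFeatureDescription must be omitted")
    return problems
```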

specifying color of cover binding material
using Reference names – single color
<ProductFormFeature>
    <ProductFormFeatureType>01</ProductFormFeatureType>
    <ProductFormFeatureValue>PNK</ProductFormFeatureValue>
</ProductFormFeature>
using Short tags – multiple color
<productformfeature>
    <b334>01</b334>Color of cover
    <b335>MUL</b335>Multicolored
    <b336>Pale blue with green panel</b336>
</productformfeature>
specifying the typeface of a large print product
using Reference names
<ProductFormFeature>
    <ProductFormFeatureType>03</ProductFormFeatureType>
    <ProductFormFeatureDescription>Tiresias LP, 18pt</ProductFormFeatureDescription>
</ProductFormFeature>
. . . .
<EditionType>LTE</EditionType>
using Short tags
<productformfeature>
    <b334>03</b334>Text font
    <b336>Tiresias LP, 18pt</b336>
</productformfeature>
. . . .
<x419>LTE</x419>Large print edition
book manufactured with certified environmentally-responsible paper
using Reference names
<ProductFormFeature>
    <ProductFormFeatureType>32</ProductFormFeatureType>
    <ProductFormFeatureValue>SGS-COC-002061</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>36</ProductFormFeatureType>
    <ProductFormFeatureValue>35</ProductFormFeatureValue>
</ProductFormFeature>
using Short tags
<productformfeature>
    <b334>32</b334>FSC-certified, mixed sources
    <b335>SGS-COC-002061</b335>Chain of Custody reference (Clays Ltd)
</productformfeature>
<productformfeature>
    <b334>36</b334>FSC-certified pre- and post-consumer waste
    <b335>35</b335>Percentage
</productformfeature>
specifying the accessibility features of a fixed-format e-book
using Reference names
<ProductFormDetail>E101</ProductFormDetail>
<ProductFormDetail>E201</ProductFormDetail>
<ProductFormDetail>E210</ProductFormDetail>
<ProductFormDetail>E222</ProductFormDetail>
<ProductFormFeature>
    <ProductFormFeatureType>15</ProductFormFeatureType>
    <ProductFormFeatureValue>101A</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>09</ProductFormFeatureType>
    <ProductFormFeatureValue>10</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>09</ProductFormFeatureType>
    <ProductFormFeatureValue>11</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>09</ProductFormFeatureType>
    <ProductFormFeatureValue>13</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>09</ProductFormFeatureType>
    <ProductFormFeatureValue>15</ProductFormFeatureValue>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>09</ProductFormFeatureType>
    <ProductFormFeatureValue>97</ProductFormFeatureValue>
    <ProductFormFeatureDescription>Tested with Adobe Digital Editions 1.8 and JAWS screen reader on Windows, and with VoiceOver on Mac</ProductFormFeatureDescription>
</ProductFormFeature>
using Short tags
<b333>E101</b333>EPUB
<b333>E201</b333>Fixed format
<b333>E210</b333>optimized for landscape
<b333>E222</b333>4:3 aspect ratio
<productformfeature>
    <b334>15</b334>e-publication version code
    <b335>101A</b335>EPUB 2.0.1
</productformfeature>
<productformfeature>
    <b334>09</b334>e-publication accessibility detail
    <b335>10</b335>No reading system options disabled
</productformfeature>
<productformfeature>
    <b334>09</b334>
    <b335>11</b335>Navigation via table of contents
</productformfeature>
<productformfeature>
    <b334>09</b334>
    <b335>13</b335>Single logical reading order
</productformfeature>
<productformfeature>
    <b334>09</b334>
    <b335>15</b335>Full alternative text descriptions
</productformfeature>
<productformfeature>
    <b334>09</b334>
    <b335>97</b335>Compatibility testing
    <b336>Tested with Adobe Digital Editions 1.8 and JAWS screen reader on Windows, and with VoiceOver on Mac</b336>
</productformfeature>
specifying operating system and other technical requirements
using Reference names
<ProductForm>DI</ProductForm>
<!-- <ProductFormDetail>D202</ProductFormDetail> -->
<ProductFormFeature>
    <ProductFormFeatureType>06</ProductFormFeatureType>
    <ProductFormFeatureValue>10</ProductFormFeatureValue>
    <ProductFormFeatureDescription>Vista or Windows 7</ProductFormFeatureDescription>
</ProductFormFeature>
<ProductFormFeature>
    <ProductFormFeatureType>07</ProductFormFeatureType>
    <ProductFormFeatureDescription>Requires at least 2GB of memory, DirectX 9-compatible graphics</ProductFormFeatureDescription>
</ProductFormFeature>
using Short tags
<b012>DI</b012>DVD-ROM
<!-- <b333>D202</b333> -->MS Windows
<productformfeature>
    <b334>06</b334>Operating system
    <b335>10</b335>MS Windows
    <b336>Vista or Windows 7</b336>
</productformfeature>
<productformfeature>
    <b334>07</b334>Other system requirements
    <b336>Requires at least 2GB of memory, DirectX 9-compatible graphics</b336>
</productformfeature>
Some operating systems can be specified in either <ProductFormFeature> or <ProductFormDetail> (as shown in the commented line above). <ProductFormFeature> is preferred wherever possible, as more specific version requirements may be included.
CPSIA choking hazard warning on boxed book plus toy
using Reference names
<ProductForm>SB</ProductForm>
<ProductFormFeature>
    <ProductFormFeatureType>12</ProductFormFeatureType>
    <ProductFormFeatureValue>01</ProductFormFeatureValue>
</ProductFormFeature>
using Short tags
<b012>SB</b012>Boxed multiple-item retail product
<productformfeature>
    <b334>12</b334>CPSIA choke hazard warning
    <b335>01</b335>Small parts, not for children under 3
</productformfeature>
If the warning text in List 143 is not suitable, the exact text of a warning can be carried in <ProductFormFeatureDescription> (though <ProductFormFeatureValue> must not be omitted).
P.3.7 Product packaging type code

This should be used with any packaged product to specify the nature of any packaging – for example to describe whether a CD is packaged in a card sleeve, a jewel case, a Digipak, an Amaray-style keep case or some other form of packaging, or to describe a book or a collection of books in a slipcase, or a book plus an audio CD in a blister pack. The data element may be omitted if the product is not contained within any separate packaging.

Common packaging types, and the codes used from List 80:

slip-sleeve (01), clamshell (02), keep case (03), jewel case (05), digipak (06), box (09), slip-cased (10), slip-cased set (11), blister pack (22)

If a product has multiple levels of packaging – for example a slip-cased book that is also shrinkwrapped – use <ProductPackaging> to describe the primary packaging (in this case, the slip-case), and add text in <ProductFormDescription> to describe any secondary packaging.

Note that the height, width and thickness (spine width) dimensions and weights in the <Measure> composite should include any packaging.

See P.3.2 Product form code for notes relating to the packaging of multi-item products and trade packs.

P.3.9 Trade category code

Although not necessarily a recommended best practice, the Trade category is important in some particular circumstances, as it is used, inter alia, to clarify the legal status of particular publications, which may affect their tax status.

specifying a livre scolaire
using Reference names
<TradeCategory>08</TradeCategory>
using Short tags
<b384>08</b384>French livre scolaire, within the context of Decree 2004–922
P.3.10, P.3.11 Primary and Product content type codes

These elements are primarily intended to describe the type of content included within e-publications including e-books, mobile ‘apps’ and other downloadable or online products. For example, a simple e-book may contain just eye-readable text, and this would be specified using <PrimaryContentType> code 10.

‘Enhanced’ e-books or mobile apps that contain both book text and additional content not included in a related ‘unenhanced’ product should be described the same way, but the nature of the enhancements may be specified using a combination of the <PrimaryContentType> and <ProductContentType> elements – <ProductContentType> is repeatable for products that contain several types of additional content or enhancements. ‘Enhanced’ products may also carry the ENH code in <EditionType>. Note that an enhanced edition requires the existence of an ‘unenhanced’ edition – without the latter, the enhanced edition is simply the normal product and should not carry the <EditionType> code ENH. However, the use of additional repeats of <ProductContentType> does not depend on the existence of an ‘unenhanced’ version.

However, the use of these data elements is not limited to e-publications. They may also be used to differentiate between, say, audiobooks containing a single voice reading of a book, and multi-voice ‘dramatized’ audio. They may also be used with multi-item products, where for example a product may contain both a book (with eye-readable text) and an audio version of the text, or with ‘multimedia’ educational software products. In general, if these elements are omitted from the Product record, the contents of the product should be assumed to be primarily textual.

Measure composite

The <Measure> composite should be included for all physical products.

The whole composite is repeatable, and at minimum, the height and width (across the front cover) dimensions should be provided for all physical products. These may be based on specifications provided in advance to the manufacturer (the printer and binder), or measured from the finished book-in-hand. Thickness (spine width) may not be known until production details such as the page extent and paper caliper are finalized, or until finished copies are available, and it may subsequently vary slightly between different impressions of the same product due to variations in the density and water content of the paper used. In a similar way, weight may not be known accurately until physical manufacture and is subject to equivalent variations. However, estimates of weight and spine width made prior to manufacture may still be useful to recipients and should be provided whenever possible: if weight and spine width measurements are provided more than two months prior to publication, recipients should treat them as estimates, and senders should send updated data as soon as accurate measurements are available.

Never provide zero measurements: if a measurement is unknown, omit it.
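The rule above can be sketched in code. This is a minimal illustration only, not part of the ONIX specification: the function name build_measures and the input dictionary are hypothetical, all dimensions are assumed to be in millimetres, and the unit code mm and the measure type codes are assumed to be the relevant List 50 and List 48 values.

```python
import xml.etree.ElementTree as ET

# Assumed List 48 measure type codes for the three primary linear dimensions.
MEASURE_TYPES = {"height": "01", "width": "02", "thickness": "03"}


def build_measures(dimensions_mm):
    """Return <Measure> elements for known, positive dimensions given in mm.

    Unknown (missing or None) and zero values are skipped entirely, so that
    no zero measurement is ever sent.
    """
    measures = []
    for name, code in MEASURE_TYPES.items():
        value = dimensions_mm.get(name)
        if value is None or value <= 0:
            continue  # omit the whole composite rather than send 0
        measure = ET.Element("Measure")
        ET.SubElement(measure, "MeasureType").text = code
        ET.SubElement(measure, "Measurement").text = str(value)
        ET.SubElement(measure, "MeasureUnitCode").text = "mm"  # assumed unit code
        measures.append(measure)
    return measures
```

For a book whose spine width is not yet known, passing a thickness of None or 0 simply yields two <Measure> composites (height and width) instead of three.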

[Diagram: structure of the <Measure> composite, containing <MeasureType>, <Measurement> and <MeasureUnitCode>]

The primary linear measurement dimensions, and the codes used from List 48:

height (01), width (02), thickness or spine width (03), diameter (12)

Product height, width, thickness (spine width) and weight measurements should include any retail packaging – for a retailer, if a book is sold in a slipcase or box, then it is the dimensions of the slipcase or box that determine whether or not it can be conveniently shelved or shipped. This is doubly important for multi-item products. (Packaging expected to be discarded before retail should not be included.)

For broad international use outside USA, metric units should be used for linear dimensions (millimetres) and weight (grams). For products sold only in USA, Imperial units (inches, ounces) should be used instead. If a product is sold in both USA and elsewhere (for example, globally, or simply in both USA and Canada), inclusion of the same dimensions in both metric and Imperial is clearly the best option, but if only one set of dimensions can be provided, metric is preferred over Imperial. Note that if a book is manufactured to a particular nominal size, there is a certain commercially-acceptable tolerance and the exact size of the finished copies may vary a little: typically, linear measurements should not be specified more precisely than the nearest 1mm or ⅛in. Recipients should treat linear measurements as having an expected accuracy within ±2mm or ±⅛in, and weights with an accuracy of ±5gr or ±¼oz. Similarly, whenever unit conversions are necessary, the results should be rounded sensitively to avoid misleadingly precise measurements (eg conversion of 197mm to Imperial units indicates a size of 7.756in, but round to 7¾ (ie to 7.75) inches).
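The 'sensitive rounding' described above might be implemented along the following lines. This is only a sketch: the function name mm_to_inches and the default step of ⅛in are illustrative choices taken from the precision suggested in the paragraph, not a prescribed algorithm.

```python
from fractions import Fraction

MM_PER_INCH = Fraction(254, 10)  # 25.4mm to the inch, held exactly


def mm_to_inches(mm, step=Fraction(1, 8)):
    """Convert millimetres to inches, rounded to the nearest `step` of an inch.

    Returns a Fraction so that results such as 7 3/4 stay exact; wrap the
    result in float() to obtain a decimal value such as 7.75.
    """
    exact = Fraction(mm) / MM_PER_INCH
    return round(exact / step) * step
```

With the example from the text, mm_to_inches(197) gives 31/4, that is 7¾ (7.75) inches, rather than the misleadingly precise 7.756.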

For hardbacks, note that the overall width and height are not the same as the trimmed page size (trimmed leaf size), as the cover boards overhang the book block. Trimmed leaf size may be provided in separate repeats of the <Measure> composite.

For products such as maps, both folded (or rolled) and flat measurements should be included whenever possible, but if only one set of dimensions can be included, the folded (or rolled) sizes used at retail are preferred.

For multi-item products – both those that are retailed as such (<ProductComposition> code 10–11 and <ProductForm> codes SA–SF), and those that are split before retail (<ProductComposition> code 30 and <ProductForm> codes XA, XC, XE, XL) – measurements should relate to the full multi-item product. For those products that are split before retail, the measurements of a single item must be obtained from the Product record for the individual product.

P.3.15 Country of manufacture

This information is required in some countries to meet legal requirements, so it is best practice to include it for all physical products available internationally. For a physical book, the country of manufacture is the country where it is printed and bound, not the country in which the publisher is based (though of course they may be the same).

If the product is a multi-item product, the country of manufacture should instead be specified individually for each of the physical components in the product, in P.4, especially if different items are manufactured in different countries or if the items in a multi-item trade pack are intended to be retailed individually. For a multi-item product or pack, a pack-level country of manufacture could be interpreted as the country in which the items were packaged together, rather than where they were each manufactured.

P.3.16 Digital product technical protection

This element should be used to specify any technical protection measures applied to the product, and should be used for all downloadable and (where appropriate) other digital products – including the case when no technical protection is applied.

<EpubTechnicalProtection> is not relevant solely to e-books in EPUB format. The ‘Epub’ is simply a contraction of ‘e-publication’, and elements so named are relevant to all types of e-publication.

Some types of e-publication are defined by their unique combination of file format (eg .epub or .xps, specified in <ProductFormDetail>) and type of technical protection (eg Apple or Adobe DRM). For these products, specification of the technical protection type is clearly vital.

Note that technical protection need not necessarily be used for enforcement. Customer-specific ‘watermarking’ (which normally identifies the purchase transaction, rather than the actual person) or other ‘social DRM’ may be applied to digital files at the moment of sale, so that individual digital copies of a product may be linked with a purchaser, without this being associated with any enforcement measures embodied in the product itself.

A common question with e-publications is whether the <EpubTechnicalProtection> element and the <EpubUsageConstraint> composite describe the file the publisher supplies to the retailer (or retail platform), or the file the retailer sends to the end customer. It is the latter. So in a common case where the publisher-supplied file does not have DRM, and the retail platform adds the DRM wrapper before delivery to the end customer, the ONIX Product record should indicate the product is DRM-protected.
Usage constraint composite

The <EpubUsageConstraint> composite should be used to specify the license terms which apply to a digital product, whether or not these terms are enforced by any technical protection measures. Multiple repeats of the composite should be included to give a clear picture of what a purchaser may and may not do (legitimately) with the content of the product.

[Diagram: structure of the <EpubUsageConstraint> composite, containing <EpubUsageType>, <EpubUsageStatus> and the repeatable <EpubUsageLimit> composite, which in turn contains <Quantity> and <EpubUsageUnit>]

Note that there are no default usage rights or constraints: if no usage constraints are specified in the Product record, it means only that there is no information, and does not imply an unconstrained usage right. Data providers should aim to specify at least those usages and constraints that vary between products on the ‘retail platform’ the product will be sold through.

One unusual feature of many e-publications is that the range of possible uses or potential constraints may change post-publication, as the technical capabilities of the reading platform may be modified through software upgrades. For example, the addition of text-to-speech or temporary lending to a platform may affect prior purchases on that platform. But there needs to be a clear understanding of whether any capability or constraint is associated with the reading platform or the product. Where these new capabilities are ‘optional’, and controllable at a per-product level, they should be specified in an ONIX (update) message, whether the publisher of the product chooses to opt in or opt out of enabling the new feature. Where these capabilities are added to all products on that platform, without any ‘opt-out’ or per-product control, then no change to ONIX metadata is required: the new capability is a reading platform feature, rather than an attribute of the product.

Similarly, capabilities and constraints may in some cases be somewhat platform-specific, even for a single product usable with multiple reading platforms. Where the same product is usable on multiple reading platforms, a constraint that is wholly related to the platforms should not be listed. For example, for a single product that is usable with two e-book reading platforms, where text-to-speech is enabled for all products without exception on one platform, but is completely unavailable on the other – perhaps because that platform lacks audio output of any kind – text-to-speech should not be included in the list of constraints.

Library-lendable e-book
using Reference names: the book can be loaned out up to 52 times, only one borrower can be loaned the book at any one time, for a maximum of a fortnight. Printing is prohibited, text-to-speech is allowed, and these constraints are protected by DRM
<EpubTechnicalProtection>01</EpubTechnicalProtection>
<EpubUsageConstraint>
    <EpubUsageType>02</EpubUsageType>
    <EpubUsageStatus>03</EpubUsageStatus>
</EpubUsageConstraint>
<EpubUsageConstraint>
    <EpubUsageType>05</EpubUsageType>
    <EpubUsageStatus>01</EpubUsageStatus>
</EpubUsageConstraint>
<EpubUsageConstraint>
    <EpubUsageType>06</EpubUsageType>
    <EpubUsageStatus>02</EpubUsageStatus>
    <EpubUsageLimit>
        <Quantity>52</Quantity>
        <EpubUsageUnit>10</EpubUsageUnit>
    </EpubUsageLimit>
    <EpubUsageLimit>
        <Quantity>14</Quantity>
        <EpubUsageUnit>09</EpubUsageUnit>
    </EpubUsageLimit>
    <EpubUsageLimit>
        <Quantity>1</Quantity>
        <EpubUsageUnit>07</EpubUsageUnit>
    </EpubUsageLimit>
</EpubUsageConstraint>
using Short tags: the book can be loaned out up to 52 times, but with unlimited concurrency (ie 52 sequential loans stretching over two years or more, or 52 ‘parallel’ loans over two weeks). Printing is prohibited, text-to-speech is allowed, and DRM enforces these limitations
<x317>01</x317>DRM-protected
<epubusageconstraint>
    <x318>02</x318>Printing
    <x319>03</x319>Is prohibited
</epubusageconstraint>
<epubusageconstraint>
    <x318>05</x318>Text-to-speech
    <x319>01</x319>Is unrestricted
</epubusageconstraint>
<epubusageconstraint>
    <x318>06</x318>Lending
    <x319>02</x319>Is limited
    <epubusagelimit>
        <x320>52</x320>Up to 52
        <x321>10</x321>Occasions
    </epubusagelimit>
    <epubusagelimit>
        <x320>14</x320>For fourteen
        <x321>09</x321>Days
    </epubusagelimit>
    <epubusagelimit>
        <x320>0</x320>Unlimited
        <x321>07</x321>Concurrent users
    </epubusagelimit>
</epubusageconstraint>
Zero concurrent users is a special case indicating there is no limit on the number of concurrent users. A maximum of 52 concurrent users would have the same meaning in this particular case. If the number of concurrent users is not specified, there can be only one concurrent user.
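A recipient might interpret the concurrent-user limit as in the sketch below. It assumes the constraint has already been parsed from un-namespaced Reference-name XML, and the function name concurrent_user_limit is hypothetical; the handling of zero and of a missing limit simply follows the note above.

```python
import math
import xml.etree.ElementTree as ET

CONCURRENT_USERS = "07"  # <EpubUsageUnit> code used in the examples above


def concurrent_user_limit(constraint):
    """Return the concurrent-user limit carried by one <EpubUsageConstraint>.

    `constraint` is an ElementTree element using Reference names.
    """
    for limit in constraint.findall("EpubUsageLimit"):
        if limit.findtext("EpubUsageUnit") == CONCURRENT_USERS:
            quantity = int(limit.findtext("Quantity"))
            # Zero is the special case: no limit on concurrent users.
            return math.inf if quantity == 0 else quantity
    # Not specified: per the note above, only one concurrent user.
    return 1
```

Returning math.inf for the zero case lets calling code compare limits numerically without a separate 'unlimited' flag.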
e-book purchase
using Reference names: the product is an e-book ‘purchase’ – in reality, the purchase of a perpetual license to the product. The product is watermarked
<EpubTechnicalProtection>02</EpubTechnicalProtection>
<EpubUsageConstraint>
    <EpubUsageType>07</EpubUsageType>
    <EpubUsageStatus>01</EpubUsageStatus>
</EpubUsageConstraint>
using Short tags: perpetual license
<x317>02</x317>Watermarked
<epubusageconstraint>
    <x318>07</x318>Time limit
    <x319>01</x319>Is unrestricted
</epubusageconstraint>
e-book rental – 30-day time-limited license to the product, enforced by DRM
using Reference names
<EpubTechnicalProtection>03</EpubTechnicalProtection>
<EpubUsageConstraint>
    <EpubUsageType>07</EpubUsageType>
    <EpubUsageStatus>02</EpubUsageStatus>
    <EpubUsageLimit>
        <Quantity>30</Quantity>
        <EpubUsageUnit>09</EpubUsageUnit>
    </EpubUsageLimit>
</EpubUsageConstraint>
using Short tags
<x317>03</x317>Adobe Content Server DRM
<epubusageconstraint>
    <x318>07</x318>Time limit
    <x319>02</x319>Is restricted
    <epubusagelimit>
        <x320>30</x320>
        <x321>09</x321>To 30 days
    </epubusagelimit>
</epubusageconstraint>
Differentiating rentals from purchases in this way relies on using separate product identifiers for each. If only a single identifier is used for purchase and one or more rental periods, see <PriceCondition>.
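As an illustration of how a recipient could tell the purchase and rental examples apart, the sketch below inspects the time-limit constraint. It assumes un-namespaced Reference-name XML and one product identifier per license type; the function name license_term and its return values are hypothetical.

```python
import xml.etree.ElementTree as ET

TIME_LIMITED_LICENSE = "07"  # <EpubUsageType>, as in the examples above
UNRESTRICTED = "01"          # <EpubUsageStatus>
LIMITED = "02"               # <EpubUsageStatus>
DAYS = "09"                  # <EpubUsageUnit>


def license_term(record):
    """Return ('perpetual', None), ('rental', days) or ('unknown', None)."""
    for constraint in record.iter("EpubUsageConstraint"):
        if constraint.findtext("EpubUsageType") != TIME_LIMITED_LICENSE:
            continue
        status = constraint.findtext("EpubUsageStatus")
        if status == UNRESTRICTED:
            return ("perpetual", None)
        if status == LIMITED:
            for limit in constraint.findall("EpubUsageLimit"):
                if limit.findtext("EpubUsageUnit") == DAYS:
                    return ("rental", int(limit.findtext("Quantity")))
    # No usable time-limit constraint: this means 'no information',
    # not an unconstrained right.
    return ("unknown", None)
```

The 'unknown' result reflects the principle that there are no default usage rights: an absent constraint should not be read as a perpetual license.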
Digital product license composite

This composite can be used to deliver details of the license terms for a digital product. The license must have a name or title, and if the license is available on the internet, a link to the actual license may be provided (there may be several links to different expressions of the same license – for example a link to a legal document and a separate link to a summary intended for consumers).

Links to machine-readable license expressions – for example using ONIX-PL – are likely to become valuable in the future, particularly in library contexts.

[Diagram: structure of the <EpubLicense> composite, containing EpubLicenseName and the repeatable EpubLicenseExpression composite, which in turn contains EpubLicenseExpressionType, EpubLicenseExpressionName and EpubLicenseExpressionLink]
Open Access e-book available under a CC-By license
using Reference names
<EpubLicense>
    <EpubLicenseName>Creative Commons Attribution 4.0 International Public License</EpubLicenseName>
    <EpubLicenseExpression>
        <EpubLicenseExpressionType>02</EpubLicenseExpressionType>
        <EpubLicenseExpressionLink>http://creativecommons.org/licenses/by/4.0/legalcode</EpubLicenseExpressionLink>
    </EpubLicenseExpression>
</EpubLicense>
. . .
<TextContent>
    <TextType>20</TextType>
    <ContentAudience>00</ContentAudience>
    <Text>Open access under CC-By license</Text>
</TextContent>
. . .
<Publisher>
    <PublishingRole>01</PublishingRole>
    <PublisherName>Manchester University Press</PublisherName>
</Publisher>
<Publisher>
    <PublishingRole>14</PublishingRole>
    <PublisherName>Knowledge Unlatched</PublisherName>
</Publisher>
<Publisher>
    <PublishingRole>15</PublishingRole>
    <PublisherIdentifier>
        <PublisherIDType>32</PublisherIDType>
        <IDValue>10.13039/100004440</IDValue>
    </PublisherIdentifier>
    <PublisherName>Wellcome Trust</PublisherName>
</Publisher>
. . .
<Website>
    <WebsiteRole>29</WebsiteRole>
    <WebsiteDescription>OAPEN Repository</WebsiteDescription>
    <WebsiteLink>http://www.oapen.org/download?type=document&amp;docid=341341</WebsiteLink>
</Website>
. . .
<UnpricedItemType>01</UnpricedItemType>
using Short tags – with links to both human-readable and professional license expressions
<epublicense>
    <x511>Creative Commons Attribution 4.0 International Public License</x511>
    <epublicenseexpression>
        <x508>01</x508>Human readable
        <x510>http://creativecommons.org/licenses/by/4.0/deed</x510>
    </epublicenseexpression>
    <epublicenseexpression>
        <x508>02</x508>‘Professional’ readable
        <x510>http://creativecommons.org/licenses/by/4.0/legalcode</x510>
    </epublicenseexpression>
</epublicense>
. . .
<textcontent>
    <x426>20</x426>Open access statement
    <x427>00</x427>
    <d104>Open access under CC-By license</d104>
</textcontent>
. . .
<publisher>
    <b291>01</b291>Publisher
    <b081>Manchester University Press</b081>
</publisher>
<publisher>
    <b291>14</b291>Publication funder
    <b081>Knowledge Unlatched</b081>
</publisher>
<publisher>
    <b291>15</b291>Research funder
    <publisheridentifier>
        <x447>32</x447>FundRef DOI
        <b244>10.13039/100004440</b244>
    </publisheridentifier>
    <b081>Wellcome Trust</b081>
</publisher>
. . .
<website>
    <b367>29</b367>Full content available from
    <b294>OAPEN Repository</b294>
    <b295>http://www.oapen.org/download?type=document&amp;docid=341341</b295>
</website>
. . .
<j192>01</j192>Free of charge
The <EpubLicense> composite is not limited to Creative Commons and proprietary ‘open access’ licenses – it may also be used for commercial and limited licenses. However, for open access and other free-at-the-point-of-use digital products only, an ‘open access statement’ should always be provided in Group P.14 (use <TextType> code 20). Presence of this statement acts as a ‘flag’ to indicate the product is open access, and the statement text can be displayed as a one-line summary of the license terms. For limited and commercial licenses, there should be no open access statement. Open access materials would normally also name the funder(s) in the <Publisher> composite, and are likely to be free of charge (denoted using <UnpricedItemType>). They may also specify a location from which the digital product can be downloaded. The <Website> composite might link to a repository managed by the author, funder, publisher, supplier or another party, and the composite should be used in the appropriate context.
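Because the open access statement acts as a flag, a recipient could detect open access products with a check like the one sketched here. It assumes un-namespaced Reference-name XML and a plain-text <Text> element; the function name open_access_statement is hypothetical.

```python
import xml.etree.ElementTree as ET

OPEN_ACCESS_STATEMENT = "20"  # <TextType> code for an open access statement


def open_access_statement(record):
    """Return the open access statement text, or None if the flag is absent.

    `record` is an ElementTree element using Reference names; only the
    leading plain text of <Text> is returned (any embedded markup is ignored).
    """
    for text_content in record.iter("TextContent"):
        if text_content.findtext("TextType") == OPEN_ACCESS_STATEMENT:
            return text_content.findtext("Text")
    return None
```

A non-None result both signals that the product is open access and supplies the one-line summary that can be displayed to users.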
P.3.21 Map scale

The main scale of a cartographic product should always be provided. If the map contains multiple panels at different scales, repeats of the <MapScale> element may be used to express the scales used in the significant panels – for example the Product record for a road map with inset city center street maps might include two <MapScale> elements.

When a cartographic product such as an atlas comprises maps at just one or a small number of scales, the same approach should be followed. If the maps in an atlas are drawn at a wide variety of scales, it is not useful to specify them all individually: specify only the two or three scales used most widely in the product.

Product classification composite

The <ProductClassification> composite can be used in international trading to indicate the customs or tax status of the product, most often using the World Customs Organization’s ‘Harmonized system’ of product classification, or one of its regional or national variations or extensions. This can be critical in international trading, and where prices are typically communicated exc-tax, the recipient of the ONIX data can usually decide the local tax rate or any tariffs that are applicable to the product based on the commodity code.

[Diagram: structure of the <ProductClassification> composite, containing <ProductClassificationType>, <ProductClassificationCode> and <Percent>]
Product classification for an atlas
using Reference names
<ProductClassification>
    <ProductClassificationType>10</ProductClassificationType>
    <ProductClassificationCode>49059100</ProductClassificationCode>
</ProductClassification>
using Short tags
<productclassification>
    <b274>10</b274>Mercosul NCM commodity code
    <b275>49059100</b275>for an atlas
</productclassification>
Product classification for a multi-component product
using Reference names
<ProductClassification>
    <ProductClassificationType>01</ProductClassificationType>
    <ProductClassificationCode>490199</ProductClassificationCode>
    <Percent>75</Percent>
</ProductClassification>
<ProductClassification>
    <ProductClassificationType>01</ProductClassificationType>
    <ProductClassificationCode>852432</ProductClassificationCode>
    <Percent>25</Percent>
</ProductClassification>
using Short tags
<productclassification>
    <b274>01</b274>WCO Harmonized system
    <b275>490199</b275>Book / other
    <b337>75</b337>75% of value
</productclassification>
<productclassification>
    <b274>01</b274>WCO Harmonized system
    <b275>852432</b275>Recorded media for sound
    <b337>25</b337>25% of value
</productclassification>
Most product classification types are expressed with periods separating the number into fields (eg 4905.91.00 in the first example above). The period characters are not usually included in the ONIX data.
The value percentage is a percentage of the overall price excluding tax. Tax may then have to be added based on the product classification itself.

Note that for multi-component products using <ProductPart>, the Product classification has to be described in P.3, not within each <ProductPart>. If the product consists of components that have different classifications – for example a children’s picture book bundled with an audio CD – then the <ProductClassification> composite is repeated, and the <Percent> element can be used to carry the percentage of the value of the product represented by each component.
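The arithmetic behind <Percent> can be sketched as follows. The function names are hypothetical, and the check that the percentages total 100 is an assumption made for this illustration (it holds in the example above, but is not stated as a rule here).

```python
from decimal import Decimal


def normalize_commodity_code(code):
    """Strip the period separators that are not usually sent in ONIX data."""
    return code.replace(".", "")


def split_value(price_ex_tax, classifications):
    """Apportion a tax-exclusive price across commodity codes.

    `classifications` is a list of (code, percent) pairs, one per repeat of
    <ProductClassification>. Returns {code: value}; tax is not added.
    """
    total = sum(Decimal(percent) for _, percent in classifications)
    if total != 100:  # assumption: the repeats should account for all value
        raise ValueError(f"percentages total {total}, expected 100")
    return {
        normalize_commodity_code(code): price_ex_tax * Decimal(percent) / 100
        for code, percent in classifications
    }
```

For a product priced at 20.00 excluding tax, a 75/25 split attributes 15.00 to the book and 5.00 to the audio component; any tax would then be worked out separately for each commodity code.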

P.4 Product parts

If the Product record describes a multi-item or multi-component product, then multiple repeats of the <ProductPart> composite are used to identify and describe the form of the various components of the product. As a result, much of the structure of the composite is identical to P.3, and similar best practices apply. Use of <ProductPart> implies nothing about whether each individual part or component within the product is available separately, and it may be used whether the various parts are of similar or different Product form.

If the Product record describes a single-item, single-component product, then <ProductPart> should be omitted entirely.

Product part composite

<ProductPart> specifies a relationship between a product and a component of that product that might otherwise – if the component is also for sale separately – be included in <RelatedProduct> in Block 5 using <ProductRelationCode> 02 (product includes related product). It is good practice to include identifiers for the individual saleable parts within multi-item products in <ProductPart>, and not to repeat them in <RelatedProduct>. ONIX recipients wishing to identify related products should use identifiers from <ProductPart> alongside those listed in Group P.23.
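A recipient gathering the identifiers of the saleable parts of a multi-item product might do something like the sketch below. It assumes un-namespaced Reference-name XML; the function name part_identifiers is hypothetical, and merging the result with identifiers from <RelatedProduct> is left to the caller.

```python
import xml.etree.ElementTree as ET


def part_identifiers(record):
    """Return unique (ProductIDType, IDValue) pairs from every <ProductPart>.

    Parts without any <ProductIdentifier> (components that are not
    separately identified) contribute nothing to the list.
    """
    found = []
    for part in record.iter("ProductPart"):
        for identifier in part.findall("ProductIdentifier"):
            pair = (
                identifier.findtext("ProductIDType"),
                identifier.findtext("IDValue"),
            )
            if pair not in found:
                found.append(pair)
    return found
```

Keeping the ID type alongside the value means a GTIN-13 and an ISBN with the same digits are retained as distinct identifiers.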

[Diagram: structure of the <ProductPart> composite, containing <PrimaryPart>, <ProductIdentifier>, <ProductForm>, <ProductFormDetail>, <ProductFormFeature>, <ProductFormDescription>, <ProductContentType>, <NumberOfItemsOfThisForm>, <NumberOfCopies> and <CountryOfManufacture>; <NumberOfItemsOfThisForm> and <NumberOfCopies> must not both be omitted]
three-volume work, paperback, in box (not a slip-case), individual parts not identified separately
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SA</ProductForm>
<ProductPackaging>09</ProductPackaging>
<!-- <Measure> omitted for brevity -->
<CountryOfManufacture>DE</CountryOfManufacture>
<ProductPart>
    <!-- volumes not individually identified -->
    <ProductForm>BC</ProductForm>
    <ProductFormDetail>B102</ProductFormDetail>
    <NumberOfItemsOfThisForm>3</NumberOfItemsOfThisForm>
</ProductPart>
using Short tags
<x314>10</x314>Multi-item product
<b012>SA</b012>Packaging unspecified
<b225>09</b225>In box (not slipcase)
<!-- <measure> omitted for brevity -->
<x316>DE</x316>Made in Germany
<productpart>
    <!-- volumes not individually identified -->
    <b012>BC</b012>Paperback
    <b333>B102</b333>Trade paperback (US usage)
    <x322>3</x322>Three vols
</productpart>
trade-only pack of 8 copies each of two different hardback books, to be retailed as 16 individual items
using Reference names
<ProductComposition>30</ProductComposition>
<ProductForm>XL</ProductForm>
<!-- <Measure> omitted for brevity -->
<ProductPart>
    <ProductIdentifier>
        <ProductIDType>03</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductForm>BB</ProductForm>
    <NumberOfCopies>8</NumberOfCopies>
    <CountryOfManufacture>IT</CountryOfManufacture>
</ProductPart>
<ProductPart>
    <ProductIdentifier>
        <ProductIDType>03</ProductIDType>
        <IDValue>9780001234584</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780001234584</IDValue>
    </ProductIdentifier>
    <ProductForm>BB</ProductForm>
    <NumberOfCopies>8</NumberOfCopies>
    <CountryOfManufacture>IT</CountryOfManufacture>
</ProductPart>
using Short tags
<x314>30</x314>Multiple-item trade pack
<b012>XL</b012>Shrink-wrapped pack
<!-- <measure> omitted -->
<productpart>
    <productidentifier>
        <b221>03</b221>GTIN-13
        <b244>9780001234567</b244>(of book A)
    </productidentifier>
    <productidentifier>
        <b221>15</b221>ISBN
        <b244>9780001234567</b244>
    </productidentifier>
    <b012>BB</b012>Hardback
    <x323>8</x323>Eight copies
    <x316>IT</x316>
</productpart>
<productpart>
    <productidentifier>
        <b221>03</b221>GTIN-13
        <b244>9780001234584</b244>(of book B)
    </productidentifier>
    <productidentifier>
        <b221>15</b221>ISBN
        <b244>9780001234584</b244>
    </productidentifier>
    <b012>BB</b012>Hardback
    <x323>8</x323>Eight copies
    <x316>IT</x316>Printed in Italy
</productpart>
softback book with two-disc audiobook in sleeve attached to inside back cover (the audio version is not available as a separate product)
using Reference names
<ProductComposition>10</ProductComposition>
<ProductForm>SF</ProductForm>
<!-- <Measure> omitted for brevity -->
<ProductPart>
    <PrimaryPart/>
    <ProductIdentifier>
        <ProductIDType>03</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780001234567</IDValue>
    </ProductIdentifier>
    <ProductForm>BC</ProductForm>
    <NumberOfCopies>1</NumberOfCopies>
    <CountryOfManufacture>ES</CountryOfManufacture>
</ProductPart>
<ProductPart>
    <!-- no separate identifier available -->
    <ProductForm>AC</ProductForm>
    <ProductFormDetail>A101</ProductFormDetail>
    <NumberOfItemsOfThisForm>2</NumberOfItemsOfThisForm>
    <CountryOfManufacture>AT</CountryOfManufacture>
</ProductPart>
using Short tags
<x314>10</x314>Multi-item retail product
<b012>SF</b012>Part(s) enclosed within largest item
<!-- <measure> omitted -->
<productpart>
    <x457/>Book is primary part
    <productidentifier>
        <b221>03</b221>GTIN-13
        <b244>9780001234567</b244>(of book as single product)
    </productidentifier>
    <productidentifier>
        <b221>15</b221>ISBN
        <b244>9780001234567</b244>
    </productidentifier>
    <b012>BC</b012>Paperback
    <x323>1</x323>One book
    <x316>ES</x316>Printed in Spain
</productpart>
<productpart>
    <!-- no separate identifier available -->
    <b012>AC</b012>CD-Audio
    <b333>A101</b333>‘Red Book’ audio format
    <x322>2</x322>Two discs
    <x316>AT</x316>Pressed in Austria
</productpart>
The <Measure> composite cannot be included inside <ProductPart> (although some size information may be included within <ProductFormDetail>). The exact physical size of each item in a multi-item product or (more relevantly) in a trade-only pack must be expressed in separate ONIX records for the individual items.

The use of <ProductPart> is required whenever <ProductComposition> indicates a multi-item product. But for some legacy systems, it may not be possible to provide full details of each component within the product. What is the minimum amount of component information that can be supplied?

Within <ProductPart>, only the <ProductForm> element plus one of <NumberOfItemsOfThisForm> or <NumberOfCopies> are mandatory. So the absolute minimum information required is – in effect – the number of components there are:

using Reference names
<ProductPart>
    <ProductForm>BA</ProductForm>
    <NumberOfItemsOfThisForm>3</NumberOfItemsOfThisForm>
</ProductPart>
using Short tags
<productpart>
    <b012>BA</b012>Book – detail unspecified
    <x322>3</x322>
</productpart>
In this case, the only information about the components is that there are three different components, and that they are books (it is not stated whether they are hardcovers or paperbacks). Three identical components would use <NumberOfCopies> instead.

Providing such minimal information about multi-component products is obviously poor practice, and enhancing the data so it at least clearly specifies the product form of the components should be a priority.
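
As a rough illustration of the minimum described above, a recipient might screen each <ProductPart> before loading it. This is only a sketch using Python's standard library: the function name check_product_part is hypothetical, Reference names are assumed, and the XML is assumed to carry no namespace (a real ONIX 3.0 message often declares one, which would need handling).

```python
import xml.etree.ElementTree as ET


def check_product_part(part):
    """Return a list of problems for one <ProductPart> element (Reference names)."""
    problems = []
    if part.find("ProductForm") is None:
        problems.append("missing ProductForm")
    # At least one of the two count elements is expected in every <ProductPart>.
    has_count = (
        part.find("NumberOfItemsOfThisForm") is not None
        or part.find("NumberOfCopies") is not None
    )
    if not has_count:
        problems.append("missing NumberOfItemsOfThisForm or NumberOfCopies")
    return problems


# The absolute-minimum example from the text, and a fragment with no count at all.
minimal = ET.fromstring(
    "<ProductPart>"
    "<ProductForm>BA</ProductForm>"
    "<NumberOfItemsOfThisForm>3</NumberOfItemsOfThisForm>"
    "</ProductPart>"
)
incomplete = ET.fromstring("<ProductPart><ProductForm>BC</ProductForm></ProductPart>")

print(check_product_part(minimal))
print(check_product_part(incomplete))
```

A check like this only confirms that the technical minimum is present. It says nothing about whether the description is good enough to be useful.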

P.4.1 ‘Primary part’ indicator

This should only be used with multi-item retail products (<ProductComposition> code 10), and when one item within that multi-item product can be considered the primary part. Omission implies that all parts are of similar importance.

Product identifier composite (product part)

It is recommended that the <ProductIdentifier> composite is used for any item within a multiple-item product that carries any kind of separate identifier, even if that identifier cannot be used for separate ordering of a single item from within the multi-item product. For example, the individual volumes within a boxed set may or may not be available individually, but in either case may have individual ISBNs.

See Product identifier composite within P.2 for best practice guidance.

P.4.5 Product form code (product part)

The <ProductForm> element is mandatory within each repeat of the <ProductPart> composite.

P.4.6 Product form detail (product part)

See P.3.3 Product form detail for best practice guidance.

Product form feature composite (product part)

See Product form feature composite within Group P.3 for best practice guidance.

Note that product form features such as safety warnings should normally be described at product level, even if they apply to only one part of a multi-item product. Safety warnings should only be carried within Group P.4 for multi-item trade packs that are intended to be broken into individual products before retail.

P.4.12, P.4.13 Number of items or copies

Either <NumberOfItemsOfThisForm> or <NumberOfCopies> should be included in every <ProductPart> composite:

  • <NumberOfCopies> should be used when a <ProductPart> composite is describing a particular item within a multi-item product (there may be one or several copies of that particular item within the multi-item product). There would normally be a <ProductIdentifier> composite within the <ProductPart> composite;
  • <NumberOfItemsOfThisForm> should be used only when a single <ProductPart> composite is used to describe several different – but undifferentiated – items of identical Product form included within a multi-item product. The most common use case would be in describing a product such as a boxed set comprised of several volumes which are not themselves available individually. Note that if they were available separately, they would each have product identifiers, and so in describing the boxed set, separate <ProductPart> composites for each volume (and each of those with <NumberOfCopies> set to 1) would be more appropriate.
P.4.14 Country of manufacture (product part)

See P.3.15 Country of manufacture for best practice guidance.

P.5 Collection

Group P.5 consists of the <Collection> composite and the <NoCollection/> flag. The <Collection> composite may be used to carry details of any ‘bibliographic collection’ to which a product belongs – a collection might formally be termed a ‘set’ or a ‘series’, or might be a more informal grouping of related products. Alternatively, similar bibliographic collection details may be carried in Group P.6.

[Diagram: structure of <DescriptiveDetail>, Groups P.5 to P.7 (continued from P.4, continued in Group P.8): <Collection>, <NoCollection/>, <TitleDetail>, <ThesisType>, <ThesisPresentedTo>, <ThesisYear>, <Contributor>, <ContributorStatement>, <NoContributor/>]

Collections can be prescribed by the publisher, or ascribed to a range of independent products by a party other than the publisher – for example by a wholesaler or data aggregator. In the former case, details of the collection will usually appear formally on the product’s title page, whereas the latter may be an ad hoc grouping of products created for marketing or merchandising reasons.

Members of a collection share a collective identity, including at least a common collection title, and may share other attributes. For example, each product in a collection might have the same contributors, and collections usually have a consistent physical form, size and design style. Each member of a collection also has a unique identity – the ‘unshared’ title text. A collection may have an identifier for the collection as a whole, and the whole collection may be available as a single product, or as many individual products – ONIX makes no distinction. It is possible – though relatively rare – for a single product to be included in more than one collection, in which case multiple <Collection> composites may be used.

Collection identifiers and the shared identity of ascribed collections are always carried in P.5. A prescribed collection’s shared identity might be carried in either of Groups P.5 or P.6. Choosing between P.5 or P.6 can be tricky:

  • the collective identity – the shared parts of the title – are the ‘collection-level title details’;
    • where one collection is nested inside another, larger collection, these may be arranged hierarchically, in collection, sub-collection, sub-sub-collection fashion;
    • collections sit logically ‘within’ imprints (see Group P.19), rather than spanning imprints;
  • the unique, unshared parts of the title are the ‘product-level title details’;
  • if the product-level title is sufficiently distinct on its own, then the collection-level title is carried in Group P.5, and the product-level title in P.6;
  • if the product-level title always needs to be qualified with the collection-level title to provide context, because it is not distinctive enough on its own, then both collection- and product-level titles should be carried in Group P.6;
  • a collection-level title should never appear in both P.5 and P.6.

As an illustration, if a product-level title is ‘Workbook 6’, then it makes little sense without the collection-level title ‘Focus on Mathematics’ that expresses which student course Workbook 6 is a part of – there may be a ‘Workbook 6’ in ‘Focus on Physics’ too. In this case, both collection- and product-level titles should be in Group P.6. If the title is ‘The Beautiful and Damned’, then that is sufficiently distinctive on its own, and any collection title (such as ‘Penguin Modern Classics’) should be carried in Group P.5:

The Beautiful and Damned (illustrating the structure of collection- and product-level title elements only)
<!-- Group P.5 -->
<Collection>
    <TitleDetail>
        <!-- collection-level title element = Penguin Modern Classics -->
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <!-- product-level title element = The Beautiful and Damned -->
</TitleDetail>
Focus on Mathematics – Workbook 6
<!-- Group P.6 -->
<TitleDetail>
    <!-- collection-level title element = Focus on Mathematics -->
    <!-- product-level title element = Workbook 6 -->
</TitleDetail>
Choosing to use the second method does not necessarily mean there is no <Collection> composite – it may still be present, if for example it contains a collection identifier or collection sequence composite.
A recipient of the data is expected to display details from P.6 as ‘the title’, and this may include collection-level title detail. Other collection-level title information from P.5 should also be displayed, but is logically separate.

Where the products within a collection have a particular enumeration or sequence indicated on the product itself and considered to be part of the title, the <PartNumber> data element (or perhaps the <YearOfAnnual> element) should be used. However, this is not part of the collective identity: it is product-level data. Thus while typically a part or volume number will appear in the same Group as the collection-level title (either P.5 or P.6, depending on whether the product-level title always needs to be qualified with the collection-level information), it appears in a separate repeat of <TitleElement> with a different (ie a product-level) Title element level.

In contrast, where products within a collection have a particular sequence that is not indicated on the product (or occasionally, is indicated but is not considered to be part of the title), the <CollectionSequence> composite should be used.

And in some circumstances, it is useful to include <CollectionSequence> in addition to <PartNumber> – for example if the Part number is in Roman numerals or words. Supplying a numeric collection sequence number makes such collections much simpler to sort into order.

The <NoCollection/> empty element indicates that the product does not belong to any type of collection (ie it indicates that there are no collection details in either P.5 or P.6). <NoCollection/> does not simply indicate there is no <Collection> composite – so it’s entirely possible to have no P.5 content at all in an ONIX Product record. It is also possible that a collection identifier or collection sequence number be included in P.5, even when the collection title is supplied in P.6.
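
The meaning of <NoCollection/> described above can be checked mechanically. The following is a minimal sketch, not a definitive validator: the function name no_collection_conflicts is hypothetical, Reference names without an XML namespace are assumed, and treating Title element level codes 02 and 03 as ‘collection details in P.6’ is an assumption made for the purpose of the example.

```python
import xml.etree.ElementTree as ET

# Assumption: collection (02) and subcollection (03) title elements count as
# collection details when they appear in the product's own <TitleDetail>.
COLLECTION_LEVELS = {"02", "03"}


def no_collection_conflicts(descriptive_detail):
    """List collection details that contradict a <NoCollection/> flag."""
    if descriptive_detail.find("NoCollection") is None:
        return []
    conflicts = []
    if descriptive_detail.find("Collection") is not None:
        conflicts.append("Collection composite present")
    # Only the product's own <TitleDetail> (Group P.6) is inspected here.
    for level in descriptive_detail.findall("TitleDetail/TitleElement/TitleElementLevel"):
        if (level.text or "").strip() in COLLECTION_LEVELS:
            conflicts.append("collection-level title element in P.6")
            break
    return conflicts


record = ET.fromstring(
    "<DescriptiveDetail>"
    "<NoCollection/>"
    "<TitleDetail><TitleType>01</TitleType>"
    "<TitleElement><TitleElementLevel>02</TitleElementLevel>"
    "<TitleText>Focus on Mathematics</TitleText></TitleElement>"
    "<TitleElement><TitleElementLevel>01</TitleElementLevel>"
    "<TitleText>Workbook 6</TitleText></TitleElement>"
    "</TitleDetail>"
    "</DescriptiveDetail>"
)

print(no_collection_conflicts(record))
```

A record with neither <Collection> nor <NoCollection/> produces no conflicts in this sketch, which matches the point that P.5 may legitimately be empty.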

Additional guidance and extensive examples on the description of collections are included in a separate document ONIX for Books: Product Information Format: How to describe sets, series and multiple-item products.

Any product in a collection should also make use of repeated <RelatedProduct> composites in Group P.23 to link to other products in the collection using <ProductRelationCode> code 30. In addition, if the collection is available as a single product (eg a boxed set), the individual product can link to the set product using relation code 02 (‘is part of’) (and in reverse, the set product can link to the individual product using relation code 01).

[Diagram: structure of the <Collection> composite: <CollectionType>, <SourceName>, <CollectionIdentifier>, <CollectionSequence>, <TitleDetail>, <Contributor>, <ContributorStatement>. <CollectionIdentifier> contains <CollectionIDType>, <IDTypeName> (must include if ID is proprietary, otherwise omit) and <IDValue>. <CollectionSequence> contains <CollectionSequenceType>, <CollectionSequenceTypeName> (must include if sequence type is proprietary, otherwise omit) and <CollectionSequenceNumber>.]
product that is part of an unordered collection
using Reference names
<Collection>
    <CollectionType>10</CollectionType>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <NoPrefix/>
            <TitleWithoutPrefix textcase="02">Oxford World’s Classics</TitleWithoutPrefix>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitlePrefix textcase="02">The</TitlePrefix>
        <TitleWithoutPrefix textcase="02">Lost World</TitleWithoutPrefix>
        <Subtitle textcase="01">Being an account of the recent amazing adventures of Professor George E. Challenger, Lord John Roxton, Professor Summerlee, and Mr E. D. Malone of the Daily Gazette</Subtitle>
    </TitleElement>
</TitleDetail>
using Short tags
<collection>
    <x329>10</x329>Publisher’s collection
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>02</x409>Collection level
            <x501/>
            <b031 textcase="02">Oxford World’s Classics</b031>
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <b030 textcase="02">The</b030>
        <b031 textcase="02">Lost World</b031>
        <b029 textcase="01">Being an account of the recent amazing adventures of Professor George E. Challenger, Lord John Roxton, Professor Summerlee, and Mr E. D. Malone of the Daily Gazette</b029>
    </titleelement>
</titledetail>
product that is part of an ordered collection. Le jour du soleil is the first book in the XIII series
using Reference names
<Collection>
    <CollectionType>10</CollectionType>
    <CollectionSequence>
        <CollectionSequenceType>02</CollectionSequenceType>
        <CollectionSequenceNumber>1</CollectionSequenceNumber>
    </CollectionSequence>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <NoPrefix/>
            <TitleWithoutPrefix textcase="01">XIII</TitleWithoutPrefix>
        </TitleElement>
        <TitleElement>
            <TitleElementLevel>01</TitleElementLevel>
            <PartNumber>Tome 1</PartNumber>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitlePrefix textcase="01">Le</TitlePrefix>
        <TitleWithoutPrefix textcase="01">jour du soleil</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<collection>
    <x329>10</x329>
    <collectionsequence>
        <x479>02</x479>Title order (confirmation)
        <x481>1</x481>
    </collectionsequence>
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>02</x409>Collection level
            <x501/>
            <b031 textcase="01">XIII</b031>
        </titleelement>
        <titleelement>
            <x409>01</x409>Product level
            <x410>Tome 1</x410>First item in collection
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <b030 textcase="01">Le</b030>
        <b031 textcase="01">jour du soleil</b031>
    </titleelement>
</titledetail>
Note that the collection title XIII is carried as a title (in <TitleWithoutPrefix>) even though it superficially resembles a part number.
The part number itself (Tome 1) is a product-level attribute, not a part of the collective identity, so it occurs with a <TitleElementLevel> code 01, but it occurs within P.5, because the product level title in P.6 is distinctive enough without it.
product is part of a nested subcollection. A History of Greek Philosophy is in several volumes, of which The Fifth Century Enlightenment is the third. This third volume comes in two Parts, of which this product is the second
using Reference names
<Collection>
    <CollectionType>10</CollectionType>
    <CollectionSequence>
        <CollectionSequenceType>02</CollectionSequenceType>
        <CollectionSequenceNumber>3.2</CollectionSequenceNumber>
    </CollectionSequence>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <TitlePrefix textcase="02">A</TitlePrefix>
            <TitleWithoutPrefix textcase="02">History of Greek Philosophy</TitleWithoutPrefix>
        </TitleElement>
        <TitleElement>
            <TitleElementLevel>03</TitleElementLevel>
            <PartNumber>Volume 3</PartNumber>
            <TitlePrefix textcase="02">The</TitlePrefix>
            <TitleWithoutPrefix textcase="02">Fifth Century Enlightenment</TitleWithoutPrefix>
        </TitleElement>
        <TitleElement>
            <TitleElementLevel>01</TitleElementLevel>
            <PartNumber>Part 2</PartNumber>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix>Socrates</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<collection>
    <x329>10</x329>Publisher collection
    <collectionsequence>
        <x479>02</x479>Title order (confirmation)
        <x481>3.2</x481>Two-level numbering
    </collectionsequence>
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>02</x409>Collection level
            <b030 textcase="02">A</b030>
            <b031 textcase="02">History of Greek Philosophy</b031>
        </titleelement>
        <titleelement>
            <x409>03</x409>Subcollection level
            <x410>Volume 3</x410>
            <b030 textcase="02">The</b030>
            <b031 textcase="02">Fifth Century Enlightenment</b031>
        </titleelement>
        <titleelement>
            <x409>01</x409>Product level
            <x410>Part 2</x410>
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <x501/>
        <b031>Socrates</b031>
    </titleelement>
</titledetail>
It is certainly possible to argue that the collection and subcollection details in the example above should be presented in Group P.6: in cases like this, the choice between P.5 and P.6 is not clear cut.
collection identifier only in Group P.5, collection title supplied in Group P.6
using Reference names
<Collection>
    <CollectionType>10</CollectionType>
    <CollectionIdentifier>
        <CollectionIDType>02</CollectionIDType>
        <IDValue>00173231</IDValue>
    </CollectionIdentifier>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>02</TitleElementLevel>
        <TitleText textcase="02">Granta</TitleText>
        <Subtitle textcase="01">The magazine of new writing</Subtitle>
    </TitleElement>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <PartNumber>109</PartNumber>
        <TitleText textcase="02">Work</TitleText>
    </TitleElement>
</TitleDetail>
using Short tags
<collection>
    <x329>10</x329>Publisher collection
    <collectionidentifier>
        <x344>02</x344>ISSN
        <b244>00173231</b244>
    </collectionidentifier>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>02</x409>Collection level
        <b203 textcase="02">Granta</b203>Title
        <b029 textcase="01">The magazine of new writing</b029>Subtitle
    </titleelement>
    <titleelement>
        <x409>01</x409>Product level
        <x410>109</x410>Number within collection
        <b203 textcase="02">Work</b203>Title
    </titleelement>
</titledetail>
In this case, the sender’s system is unable to distinguish reliably between titles with non-sorting prefixes and those without, so it uses <TitleText> rather than a combination of <NoPrefix/> and <TitleWithoutPrefix> to hold the collection-level and product-level titles (respectively, Granta and Work).
a product that is not a part of any collection
using Reference names
<NoCollection/>
using Short tags
<x411/>
Collection composite

Much of the structure of the <Collection> composite is shared with the <TitleDetail> composite, and similar best practices apply.

Although it is possible to identify contributors to the collection (eg a series editor) within the <Collection> composite – and this may be normal practice in some countries – it is best practice to identify all contributors in Group P.7 in international usage.

P.5.1 Collection type code

The <CollectionType> element is mandatory within a <Collection> composite. It is best practice to use only codes 10 or 20 from List 148, and also to include a source for any ascribed collection in the <SourceName> element. Source names need not be supplied for ‘publisher’ collections, as they are always prescribed by the publisher.

Collection identifier composite

This composite should be used to specify any relevant identifiers for the collection as a whole. It is unlikely that any formal identifiers exist for ascribed collections, but it would be possible to include a proprietary ID for an ascribed collection.

Note that collection identifiers can only be carried in Group P.5, and may be included in P.5 even when the title text for the collection is carried in P.6.

Collection sequence composite

Where the collection is ordered in some way, this composite should be used to provide or confirm that ordering. An order or enumeration may be more or less explicit in the title (‘Volume 7’ is likely to be the seventh in the collection) and this should be included in <TitleDetail> – but it may be repeated here for confirmation, using <CollectionSequenceType> code 02 – this is particularly useful when the ordering is described in Roman numerals or in words (eg The Ersatz Elevator, ‘Book the Sixth’ in A Series of Unfortunate Events), because such collections are difficult to sort. However, the main use for this composite is to provide sequence information that is not shown on the product, for example, a narrative order for a fiction collection, or an original publication order for collected works previously published independently.

It is possible for the products in a collection to have more than one order – for example the narrative order of a collection of fiction products may not be the same as the order the products are published in, and in this case, multiple <CollectionSequence> composites may be supplied. It’s also common for the sequence number of an existing product to be changed as a result of publication of later products.

The numbering provided in <CollectionSequence> is not intended for display by data recipients – what should be displayed is provided in <TitleDetail>. But the numbering provided in <CollectionSequence> should be used to sort the items in the collection.

‘prequel’ – separate narrative and publication orders
using Reference names
<CollectionSequence>
    <CollectionSequenceType>03</CollectionSequenceType>
    <CollectionSequenceNumber>3</CollectionSequenceNumber>
</CollectionSequence>
<CollectionSequence>
    <CollectionSequenceType>04</CollectionSequenceType>
    <CollectionSequenceNumber>1</CollectionSequenceNumber>
</CollectionSequence>
using Short tags
<collectionsequence>
    <x479>03</x479>Publication order
    <x481>3</x481>Third
</collectionsequence>
<collectionsequence>
    <x479>04</x479>Narrative order
    <x481>1</x481>First
</collectionsequence>
In simple collections, the collection sequence number is an integer (1, 2, 3…). However, for sub-collections that are nested within larger collections, a multi-level number can be used – for example sequence number 3.2 may indicate the second item in a sub-collection, where the sub-collection is sequentially third in the larger collection.
Note that the first-published item in a collection need not have the sequence number 1 (except of course when collection sequence type is 03). If a later ‘prequel’ is planned from the outset, the initial publication might be sequence number 2. A prequel that was not planned from the outset may require revision of previously-supplied metadata for already-published products in the collection, as their narrative order will be affected. A multi-level number like 0.5 or 0.1 should not be used to indicate a prequel added before item 1 in the collection.
Multi-level numbers can also be used to indicate intercalations. A collection may have three items in narrative sequence, with matching print and e-publications. But an additional publication might be added as an optional bonus, perhaps as a digital exclusive, and without being critical to the overall narrative. If the additional publication has a suggested sequential position, this can be indicated with a multi-level number (eg 2.1 would indicate the first e-publication intercalated between items 2 and 3 in the collection, and 0.1 could be used to indicate an intercalation before the first main publication). If the numbering is two-level, while the collection information indicates only a simple one-level collection, the sequence number clearly indicates an intercalation.
For products that are in an ordered collection but outside the sequential ordering of that collection (such items are termed « hors série » in French), the <CollectionSequence> composite should be omitted. Such products are normally sorted after the ordered products in the collection.
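
The sorting behaviour described above (integers, multi-level numbers, intercalations, and unsequenced items placed last) can be sketched as a sort key. This is an illustrative sketch only: the function name sequence_key and the sample data are hypothetical, and it assumes that all the numbers being compared come from <CollectionSequence> composites of the same sequence type.

```python
def sequence_key(number):
    """Sort key for a CollectionSequenceNumber string such as '3' or '3.2'.

    None stands for a product with no <CollectionSequence> composite;
    such products sort after every numbered item.
    """
    if number is None:
        return (1, ())
    # Compare level by level as integers, so that '10' sorts after '9'
    # and '2.1' falls between '2' and '3'.
    return (0, tuple(int(level) for level in number.split(".")))


# Hypothetical collection: three main items, one intercalated bonus item,
# and one item outside the sequence.
items = [
    ("Bonus story", "2.1"),
    ("Book 3", "3"),
    ("Companion guide", None),
    ("Book 1", "1"),
    ("Book 2", "2"),
]

ordered = sorted(items, key=lambda item: sequence_key(item[1]))
for title, number in ordered:
    print(number, title)
```

Comparing the levels as integers rather than as text or as decimal fractions avoids misplacing values such as 3.10, which should follow 3.9 rather than precede 3.2.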
Title detail composite

The structure of this composite is identical to <TitleDetail> in Group P.6, but it should be used here to specify a title for the collection (unless it is provided in P.6). The composite must include a <TitleType> – typically, this will be code 01 (Distinctive title), but other types of collection title may be specified. The <TitleDetail> composite must also include at least one repeat of the <TitleElement> composite to carry the actual title.

Title element composite

The <TitleElementLevel> data element may be used to differentiate between a collection title (code 02) and a subcollection title (code 03, used where one collection is nested inside another). Occasionally, most commonly in children’s publishing, a collection also carries a ‘master brand’ (code 05), which might be used across many non-book products. Other values from List 149 may not be used within Group P.5, and code 03 should not be used without an accompanying repeat of <TitleElement> that uses code 02.
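
The rule that code 03 should not appear without an accompanying code 02 is simple to test for. The sketch below assumes Reference names and no XML namespace, and the function name subcollection_without_collection is hypothetical.

```python
import xml.etree.ElementTree as ET


def subcollection_without_collection(title_detail):
    """True if a <TitleDetail> has a subcollection level (03) but no collection level (02)."""
    levels = {
        (element.text or "").strip()
        for element in title_detail.findall("TitleElement/TitleElementLevel")
    }
    return "03" in levels and "02" not in levels


nested = ET.fromstring(
    "<TitleDetail><TitleType>01</TitleType>"
    "<TitleElement><TitleElementLevel>02</TitleElementLevel>"
    "<TitleText>A History of Greek Philosophy</TitleText></TitleElement>"
    "<TitleElement><TitleElementLevel>03</TitleElementLevel>"
    "<TitleText>The Fifth Century Enlightenment</TitleText></TitleElement>"
    "</TitleDetail>"
)
orphaned = ET.fromstring(
    "<TitleDetail><TitleType>01</TitleType>"
    "<TitleElement><TitleElementLevel>03</TitleElementLevel>"
    "<TitleText>The Fifth Century Enlightenment</TitleText></TitleElement>"
    "</TitleDetail>"
)

print(subcollection_without_collection(nested))
print(subcollection_without_collection(orphaned))
```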

Any of P.5.7 Part number through P.5.12 Title text without prefix may carry information about the collection title. In addition, the P.5.13 Subtitle element may be used to carry a subtitle that is shared by all items in the collection. Typically, the title of the collection would be in either <TitleText>, or in a combination of <TitlePrefix> and <TitleWithoutPrefix> – do not use both of these options. See the note about the use of <TitlePrefix> in section Group P.6 <TitleElement> composite.
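
A data supplier could screen outgoing records for title elements that mix the two options. This is a minimal sketch under stated assumptions: the function name title_form and its return labels are hypothetical, and Reference names without an XML namespace are assumed.

```python
import xml.etree.ElementTree as ET


def title_form(title_element):
    """Classify how one <TitleElement> carries its title text."""
    has_text = title_element.find("TitleText") is not None
    has_prefixed = title_element.find("TitleWithoutPrefix") is not None
    if has_text and has_prefixed:
        return "both"      # the two options should not be combined
    if has_text:
        return "text"      # <TitleText> only
    if has_prefixed:
        return "prefixed"  # <TitlePrefix> or <NoPrefix/> with <TitleWithoutPrefix>
    return "none"          # for example, a part number on its own


prefixed = ET.fromstring(
    "<TitleElement><TitleElementLevel>02</TitleElementLevel>"
    "<NoPrefix/><TitleWithoutPrefix>XIII</TitleWithoutPrefix></TitleElement>"
)
mixed = ET.fromstring(
    "<TitleElement><TitleElementLevel>02</TitleElementLevel>"
    "<TitleText>Granta</TitleText>"
    "<NoPrefix/><TitleWithoutPrefix>Granta</TitleWithoutPrefix></TitleElement>"
)

print(title_form(prefixed))
print(title_form(mixed))
```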

See the note about the <PartNumber> and <YearOfAnnual> data elements in Group P.6 <TitleElement> composite.

Spacing and punctuation should follow that used on the book itself. However, see the note about the case of text in section Group P.6 <TitleElement> composite.

Occasionally, the product’s collection- and product-level titles do contain some repetition. Repetition of title elements on the product itself should be replicated in the metadata.

repetition of title elements on the product itself should be followed in the metadata
using Reference names
<Collection>
    <CollectionType>10</CollectionType>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>05</TitleElementLevel>
            <NoPrefix/>
            <TitleWithoutPrefix textcase="02">Barbie™</TitleWithoutPrefix>
        </TitleElement>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <NoPrefix/>
            <TitleWithoutPrefix textcase="02">Easy Reader with Large Print</TitleWithoutPrefix>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Barbie as a Pilot</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<collection>
    <x329>10</x329>Publisher’s collection
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>05</x409>Master brand
            <x501/>
            <b031 textcase="02">Barbie™</b031>
        </titleelement>
        <titleelement>
            <x409>02</x409>Collection level
            <x501/>
            <b031 textcase="02">Easy Reader with Large Print</b031>
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <x501/>
        <b031 textcase="02">Barbie as a Pilot</b031>
    </titleelement>
</titledetail>
The repetition in this case potentially also encompasses the <Edition> information – the publisher should decide whether ‘with Large Print’ is a part of the collection identity, or whether it should be described as a large print edition (code LTE from List 21 in <EditionType>, within Group P.9). This may depend on whether there is a ‘normal print’ version as well.
P.5.64 “No collection” indicator

This is one of the few ONIX for Books data elements that are defined as empty (or ‘void’) XML elements. The ‘self-closing’ or minimized XML syntax should always be used (ie use <NoCollection/>, not <NoCollection> followed by </NoCollection>).

P.6 Product title detail

Group P.6 consists primarily of the <TitleDetail> composite, which carries the title of the product. Every Product record must carry a title.

It is best practice to deliver title text in a structured and granular fashion, rather than simply providing text with punctuation to indicate the difference between collection, main title and subtitle. Providing the title in a structured way ensures that the recipient of the data can store and process it in the most appropriate way in their own system. With products that have both a shared ‘collection’ title and an individual product-level title, these should be delivered in separate repeats of the <TitleElement> composite having different values in the <TitleElementLevel> data element. (Note that collection-level titles might be provided in Group P.5 instead.)

Do not add extra information to the title that properly appears elsewhere in the ONIX record. For example, there is often a temptation to add information about the product form, edition or audience to the title ‘because the title is always displayed prominently in the online store.’
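
As an illustration of keeping such details out of the title, the sketch below contrasts a title with edition information appended against the structured alternative. The appended wording ‘(Large Print Edition)’ is invented for the example. Code LTE from List 21 in <EditionType> (Group P.9) is the large print edition code, and the omission of Groups P.7 and P.8 is purely for brevity.

```xml
<!-- Not recommended: edition information added to the title text (wording invented for illustration) -->
<!-- <TitleText textcase="02">Barbie as a Pilot (Large Print Edition)</TitleText> -->

<!-- Preferred: Group P.6 carries only the title -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Barbie as a Pilot</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
<!-- Groups P.7 and P.8 omitted -->
<!-- Group P.9 carries the edition information -->
<EditionType>LTE</EditionType>
```

A recipient that wishes to display ‘Large print edition’ alongside the title can then derive it from the edition code, rather than relying on free text in the title.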

For products which are known under more than one title – for example the distinctive title that is on the product, plus a former title that the work was previously known by – the entire <TitleDetail> composite is repeatable, with different values of <TitleType>.

[Diagram: structure of the <TitleDetail> composite. <TitleDetail> contains <TitleType>, <TitleElement> and <TitleStatement>. <TitleElement> contains <SequenceNumber>, <TitleElementLevel>, <PartNumber>, <YearOfAnnual>, a ‘Title’ box and <Subtitle>, and must include at least one of <PartNumber>, <YearOfAnnual>, ‘Title’ or <Subtitle>. For the contents of the ‘Title’ box, see the diagram below. There is no associated <Title> tag: <YearOfAnnual> is immediately ‘followed’ by one of <TitleText>, <TitlePrefix> or <NoPrefix/>, though note the entire box is optional.]
[Diagram: contents of the ‘Title’ box. Use <TitleText> when the sending system cannot differentiate between titles with and without prefixes. Use <TitlePrefix> or <NoPrefix/>, followed by <TitleWithoutPrefix>, when the sending system can differentiate titles with and without prefixes.]
simple product not part of a collection (sending system cannot reliably distinguish between titles with prefixes and without)
using Reference names
<!-- Group P.5 -->
<NoCollection/>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText textcase="02">Mordsfreunde</TitleText>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- Group P.5 -->
<x411/>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <b203 textcase="02">Mordsfreunde</b203>
    </titleelement>
</titledetail>
simple product with title and subtitle (sending system can reliably distinguish between titles with prefixes and without)
using Reference names
<!-- Group P.5 -->
<NoCollection/>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix>خريف الغضب</TitleWithoutPrefix>
        <Subtitle>قصة بداية ونهاية السادات</Subtitle>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- Group P.5 -->
<x411/>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <x501/>
        <b031>خريف الغضب</b031>Title
        <b029>قصة بداية ونهاية السادات</b029>and subtitle
    </titleelement>
</titledetail>
simple product with an alternative title
using Reference names
<!-- Group P.5 -->
<NoCollection/>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Slumdog Millionaire</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
<TitleDetail>
    <TitleType>08</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Q &amp; A</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- Group P.5 -->
<x411/>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <x501/>
        <b031 textcase="02">Slumdog Millionaire</b031>
    </titleelement>
</titledetail>
<titledetail>
    <b202>08</b202>Former title
    <titleelement>
        <x409>01</x409>Product level
        <x501/>
        <b031 textcase="02">Q &amp; A</b031>Must use entity for ‘&’
    </titleelement>
</titledetail>
product that is part of a collection specified in Group P.5
using Reference names
<!-- Group P.5 -->
<Collection>
    <CollectionType>10</CollectionType>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <NoPrefix/>
            <TitleWithoutPrefix textcase="02">Larousse Petits Classiques</TitleWithoutPrefix>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitlePrefix textcase="02">Les</TitlePrefix>
        <TitleWithoutPrefix textcase="02">Misérables</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- Group P.5 -->
<collection>
    <x329>10</x329>Publisher collection
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>02</x409>Collection level
            <x501/>
            <b031 textcase="02">Larousse Petits Classiques</b031>
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <b030 textcase="02">Les</b030>
        <b031 textcase="02">Misérables</b031>
    </titleelement>
</titledetail>
omnibus edition of three novels, without a distinctive title of its own
using Reference names
<!-- no Group P.5 -->
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <SequenceNumber>1</SequenceNumber>
        <TitleElementLevel>01</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Sense and Sensibility / Pride and Prejudice / Northanger Abbey</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- no Group P.5 -->
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>
    <titleelement>
        <b034>1</b034>
        <x409>01</x409>
        <x501/>
        <b031 textcase="02">Sense and Sensibility / Pride and Prejudice / Northanger Abbey</b031>
    </titleelement>
</titledetail>
If the omnibus edition has a distinctive title of its own (something like The Jane Austen Omnibus, Part One), that should be used in preference. The constituent titles could then be listed as a subtitle if necessary. Multiple titles are conventionally separated by spaced slash characters. A more sophisticated alternative would make use of <ContentDetail> (see Block 3), but this would be unnecessary in a simple case like this.
manga novel with collection title included in P.6
using Reference names
<!-- no Group P.5 -->
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <SequenceNumber>1</SequenceNumber>
        <TitleElementLevel>02</TitleElementLevel>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Kobato.</TitleWithoutPrefix>
    </TitleElement>
    <TitleElement>
        <SequenceNumber>2</SequenceNumber>
        <TitleElementLevel>01</TitleElementLevel>
        <PartNumber>3</PartNumber>
    </TitleElement>
    <TitleStatement>Kobato. 3</TitleStatement>
</TitleDetail>
using Short tags
<!-- no Group P.5 -->
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <b034>1</b034>
        <x409>02</x409>Collection level
        <x501/>
        <b031 textcase="02">Kobato.</b031>
    </titleelement>
    <titleelement>
        <b034>2</b034>
        <x409>01</x409>Product level
        <x410>3</x410>
    </titleelement>
    <x478>Kobato. 3</x478>
</titledetail>
The part number is the only product-level title information for this product – there is no product level title text. The title statement is somewhat unnecessary here, since it is a simple concatenation of the title elements, but is provided for the avoidance of doubt about the nature of the product-level title.
Title detail composite

At least one <TitleDetail> composite is mandatory, to carry the product title. The composite must include a <TitleType> – typically, this will specify code 01 (Distinctive title) – and at least one repeat of the <TitleElement> composite.

Multiple repeats of the whole <TitleDetail> composite can be used to carry different types of title – for example a former title of a product whose name has been changed, or the original language title of a product published in translation. While a recipient might not display such alternative titles, they can be helpful as search terms.

Contrast the titles of the Shakespeare plays Twelfth Night, or What You Will, where the full title itself contains an ‘alternative’ name for the play, and Much Ado About Nothing which is occasionally known by the genuine alternative title of Benedick and Beatrice. The latter requires two <TitleDetail> composites, the former just one, with the complete title text in a single <TitleText> element.
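
The contrast might be expressed as in the following sketch. It assumes that code 14 from List 15 is the appropriate <TitleType> for an alternative title that does not form part of the distinctive title; check the current issue of the codelist before relying on this value. <TitleText> is used on the assumption that the sender cannot reliably identify prefixes.

```xml
<!-- Product 1: a single title that happens to contain an 'or' -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText textcase="02">Twelfth Night, or What You Will</TitleText>
    </TitleElement>
</TitleDetail>

<!-- Product 2: a distinctive title plus a genuine alternative title -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText textcase="02">Much Ado About Nothing</TitleText>
    </TitleElement>
</TitleDetail>
<TitleDetail>
    <!-- TitleType 14 is an assumption, see the note above -->
    <TitleType>14</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText textcase="02">Benedick and Beatrice</TitleText>
    </TitleElement>
</TitleDetail>
```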

It may also be useful to specify the distributor’s title: this is often abbreviated or truncated in order to fit on a single line on order confirmations, invoices and other documents, or sized to fit a fixed-length database field in a legacy sales order processing system – it is common for distribution systems to limit the length of a title to 30 or so characters. Such titles often also mix collection and product-level elements, or other non-title details such as binding type, edition and so on (eg 12th N sch ed w. class no 30pk). Thus a Distributor’s title can appear somewhat cryptic, and specifying it can be an aid if any matching against other documents (eg invoices) is required.

Title element composite

Any of P.6.3 <PartNumber> through P.6.8 <Subtitle> may carry information about the product title.

Conceptually, each of the elements of a title belongs to either ‘product’ or ‘collection’ level – that is, it is part of the identity of the product itself, or is part of the identity of a group of products.

Multiple repeats of the <TitleElement> composite within one <TitleDetail> are often needed if the product belongs to a collection and the full collection details are not included in Group P.5 (ie there are two or more <TitleElement> composites with different levels), or if the title contains several largely independent parts – for example an annual or yearbook may have both a textual title and a year or date (ie there are two or more <TitleElement> composites with the same level).

Where the presentation order of multiple title elements at different levels is important, then the <SequenceNumber> data element should be included within each repeat of <TitleElement> – thus <SequenceNumber> should be used to guide data recipients on whether a collection title should ideally be displayed before or after the product-level title. However, their display order (eg on a web page) may be dictated by the recipient’s screen design. Despite this, never carry a collection title in the <Subtitle> element ‘to ensure it is displayed after the main title’.

Where a title has multiple data elements at the same level – say it has a <PartNumber>, <TitleText>, plus a <Subtitle>, which are all at ‘product level’ – then these can be expressed in a single <TitleElement> composite, or in two separate repeats of <TitleElement> within a single <TitleDetail> (note that a <TitleElement> containing only a subtitle is not valid). Either is acceptable. Data recipients might concatenate these three elements for display purposes in a common-sense order (part number first, subtitle last), or – if each separate <TitleElement> contains a <SequenceNumber> – should follow the guidance of the <SequenceNumber>.
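
The two acceptable arrangements might look like the sketch below. The title, subtitle and part number are invented for illustration, and the collection title is assumed to be carried in Group P.5.

```xml
<!-- Option 1: all three product-level elements in one title element -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <PartNumber>Volume 2</PartNumber>
        <TitleText textcase="02">Rivers and Lakes</TitleText>
        <Subtitle textcase="01">A field guide</Subtitle>
    </TitleElement>
</TitleDetail>

<!-- Option 2: two title elements at the same level, ordered by sequence number -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <SequenceNumber>1</SequenceNumber>
        <TitleElementLevel>01</TitleElementLevel>
        <PartNumber>Volume 2</PartNumber>
    </TitleElement>
    <TitleElement>
        <SequenceNumber>2</SequenceNumber>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText textcase="02">Rivers and Lakes</TitleText>
        <Subtitle textcase="01">A field guide</Subtitle>
    </TitleElement>
</TitleDetail>
```

In the second option the subtitle stays in the same repeat as the title text, since a title element containing only a subtitle is not valid.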

multiple title elements at the same level: a book entitled Best Season Ever: 1998 in a collection Baseball Year by Year
using Reference names
<!-- collection title carried in Group P.5 -->
<TitleElement>
    <SequenceNumber>1</SequenceNumber>
    <TitleElementLevel>01</TitleElementLevel>
    <TitleText textcase="02">Best Season Ever</TitleText>
</TitleElement>
<TitleElement>
    <SequenceNumber>2</SequenceNumber>
    <TitleElementLevel>01</TitleElementLevel>
    <YearOfAnnual>1998</YearOfAnnual>
</TitleElement>
using Short tags
<!-- collection title carried in Group P.5 -->
<titleelement>
    <b034>1</b034>First part of
    <x409>01</x409>product-level title
    <b203 textcase="02">Best Season Ever</b203>
</titleelement>
<titleelement>
    <b034>2</b034>Second part of
    <x409>01</x409>product-level title
    <b020>1998</b020>
</titleelement>
This approach is better than putting the title text and year of annual in the same <TitleElement> composite, because it ensures the recipient understands that the title is not 1998: Best Season Ever.
The sender’s system is unable to distinguish reliably between titles with and without prefixes, so it uses <TitleText> instead of <NoPrefix/> and <TitleWithoutPrefix>.
multiple title elements at different levels: a book called Descent: Book 3 of The Exo Cycle
using Reference names
<TitleElement>
    <SequenceNumber>1</SequenceNumber>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <TitleWithoutPrefix textcase="02">Descent</TitleWithoutPrefix>
</TitleElement>
<TitleElement>
    <SequenceNumber>2</SequenceNumber>
    <TitleElementLevel>01</TitleElementLevel>
    <PartNumber>Book 3</PartNumber>
</TitleElement>
<TitleElement>
    <SequenceNumber>3</SequenceNumber>
    <TitleElementLevel>02</TitleElementLevel>
    <TitlePrefix textcase="02">The</TitlePrefix>
    <TitleWithoutPrefix textcase="02">Exo Cycle</TitleWithoutPrefix>
</TitleElement>
using Short tags
<titleelement>
    <b034>1</b034>First part of
    <x409>01</x409>product-level title
    <x501/>
    <b031 textcase="02">Descent</b031>
</titleelement>
<titleelement>
    <b034>2</b034>Second part of title
    <x409>01</x409>(at product level)
    <x410>Book 3</x410>
</titleelement>
<titleelement>
    <b034>3</b034>
    <x409>02</x409>
    <b030 textcase="02">The</b030>Third part of title
    <b031 textcase="02">Exo Cycle</b031>(at collection level)
</titleelement>
This makes it clear that the collection title is best placed after the product-level title (ie not The Exo Cycle – Book 3: Descent). Positioning of the collection title within P.6 rather than in P.5 is a complex decision. See the guidance in P.5 Collection.
In this case, the sender’s system can distinguish reliably between titles with and without prefixes that are ignored for sorting purposes. Thus the collection-level title The Exo Cycle uses <TitlePrefix> and the product-level title Descent makes use of <NoPrefix/> and <TitleWithoutPrefix>.

Optionally, if the title is particularly complex, and simple concatenation of the various title elements (in the order specified) is not enough, then a <TitleStatement> can be used (always in addition to the granular title elements). For recipients, if a title statement is provided, this should be used for display purposes wherever possible, while the individual title elements may be used for more advanced search or collation purposes. Note that a title statement should include the subtitle, and any intermediate punctuation, but should not include the text of any alternative titles – alternative titles may have a title statement of their own.
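
For instance, the title described above as Descent: Book 3 of The Exo Cycle cannot be produced by simple concatenation of its three title elements, since the word ‘of’ appears in none of them. A title statement for it might be added as in this sketch, with the granular elements retained unchanged.

```xml
<TitleDetail>
    <TitleType>01</TitleType>
    <!-- the three <TitleElement> repeats from the Descent example above go here -->
    <TitleStatement>Descent: Book 3 of The Exo Cycle</TitleStatement>
</TitleDetail>
```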

Typically, the main text of a title element would be in one of:

  • a combination of <NoPrefix/> and <TitleWithoutPrefix> or
  • a combination of <TitlePrefix> and <TitleWithoutPrefix> or
  • <TitleText>

The last of these options – <TitleText> – is reserved for use when the sending system cannot differentiate between titles that include a prefix and titles that do not. The other two choices should be used when the system can differentiate. <TitlePrefix> is used when the title begins with a prefix that should be ignored for collation purposes (depending on the language of the title text: in English, ‘A’, ‘An’ or ‘The’ are ignored for collation, in Spanish, «El», «La», «Las», «Lo», «Los», «Un», «Una», «Unas», «Unos»). <NoPrefix/> is an empty element that provides positive indication that there is no prefix. Note that a title beginning with ‘A to Z’ or with a place name like ‘El Paso’ would not normally use <TitlePrefix>, since ‘A’ or ‘El’ are not ignored for collation purposes in these circumstances. Titles beginning with quotation marks, an apostrophe or initial punctuation like ¡ or ¿ do not use <TitlePrefix> even though those marks are ignored for collation purposes. In rare cases where there is both initial punctuation and a non-sorting prefix, such as the Spanish title ¿La madre también?, <TitlePrefix> would be used for «¿La».
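
The edge cases above might be encoded as in the following sketch. The two English-language titles are invented for illustration, and textcase code 01 is assumed to denote sentence case.

```xml
<!-- 'A' is part of the phrase 'A to Z', so it is not treated as a prefix -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <TitleWithoutPrefix textcase="02">A to Z of Gardening</TitleWithoutPrefix>
</TitleElement>

<!-- 'El' is part of a place name, so it is not treated as a prefix -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <TitleWithoutPrefix textcase="02">El Paso</TitleWithoutPrefix>
</TitleElement>

<!-- initial punctuation plus a non-sorting prefix: both go in the prefix -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <TitlePrefix textcase="01">¿La</TitlePrefix>
    <TitleWithoutPrefix textcase="01">madre también?</TitleWithoutPrefix>
</TitleElement>
```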

If limitations in internal systems mean that a data supplier genuinely cannot distinguish titles that carry prefixes from those that do not – and thus cannot use <TitlePrefix>, <NoPrefix/> and <TitleWithoutPrefix> correctly – then <TitleText> should be used. This means that a recipient which wishes to take definite account of prefixes when sorting titles needs to inspect all records using <TitleText>.

For languages which do not normally use definite and indefinite articles – for example Russian or Japanese – the <NoPrefix/> element should be used.

When a title is not in the same language as the main text content of the product itself, the language attribute should be included.

product in English, with title in French – compare with the Larousse Petits Classiques version (above), which is in French throughout. Distributor’s title also included
using Reference names
<!-- Group P.5 -->
<Collection>
    <CollectionType>10</CollectionType>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <TitleText textcase="02">Penguin Classics</TitleText>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitlePrefix language="fre" textcase="02">Les</TitlePrefix>
        <TitleWithoutPrefix language="fre" textcase="02">Misérables</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
<TitleDetail>
    <TitleType>10</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <TitleText>LES MISERABLES (PENG CLASS B)</TitleText>
    </TitleElement>
</TitleDetail>
using Short tags
<!-- Group P.5 -->
<collection>
    <x329>10</x329>Publisher collection
    <titledetail>
        <b202>01</b202>Distinctive title
        <titleelement>
            <x409>02</x409>Collection level
            <b203 textcase="02">Penguin Classics</b203>
        </titleelement>
    </titledetail>
</collection>
<!-- Group P.6 -->
<titledetail>
    <b202>01</b202>Distinctive title
    <titleelement>
        <x409>01</x409>Product level
        <b030 language="fre" textcase="02">Les</b030>Title is in French…
        <b031 language="fre" textcase="02">Misérables</b031>not translated into English as ‘The Wretched Ones’ or similar, and is in Title case
    </titleelement>
</titledetail>
<titledetail>
    <b202>10</b202>Distributor’s title
    <titleelement>
        <x409>01</x409>Product level
        <b203>LES MISERABLES (PENG CLASS B)</b203>
    </titleelement>
</titledetail>
The Collection title given is paired with the Distinctive title, not the Distributor’s title, on the basis that they carry the same code in <TitleType>.

It is not normally expected that metadata provided by publishers follows formal library cataloging rules (for example the AACR2 or RDA content rules). With the exception of the case of the text, the title information provided in an ONIX record should follow the title as it is provided on the product title page (or equivalent). Acronyms and initialisms should be capitalized and punctuated as they are on the product’s title page.

In scripts with distinct upper and lower case (including Latin, Cyrillic etc, but not for example Arabic or Japanese Kanji), the textcase attribute is important with textual data elements such as <TitleText>, <TitlePrefix>, <TitleWithoutPrefix> and <Subtitle>. It is best practice to follow the conventions of the language of the title, and to include the textcase attribute. In many languages using a cased script, it is normal style to present titles and similar textual data in ‘sentence case’ (first word and all proper nouns carry an initial capital – note this means that the first character of <TitleWithoutPrefix> is lower case, unless it is a proper noun or a capitalized acronym). In English and some other cased languages, it is normal style to supply titles – but not subtitles – in ‘title case’ (all words carry an initial capital except for articles, most prepositions and most conjunctions that appear mid-sentence). Subtitles should always be presented in sentence case. Titles and similar textual data should never be presented in ALL UPPER CASE, and in instances where this is unavoidable (for example, with legacy data), use the textcase attribute with code 03.

Upper case titles constructed solely from acronyms or initialisms (eg ‘VAX FORTRAN’) are correct in both title and in sentence case, and should not usually be marked with textcase code 03.
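
The conventions above might be applied as in the following sketch. It assumes textcase codes 01 (sentence case), 02 (title case) and 03 (all capitals). The subtitle is invented for illustration, and the choice of code 02 rather than 01 for the acronym-only title is arbitrary, since either would be correct.

```xml
<!-- English: title in title case, subtitle in sentence case -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <TitleWithoutPrefix textcase="02">Best Season Ever</TitleWithoutPrefix>
    <Subtitle textcase="01">The year in review</Subtitle>
</TitleElement>

<!-- sentence case with a prefix: the text after the prefix starts in lower case -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <TitlePrefix textcase="01">The</TitlePrefix>
    <TitleWithoutPrefix textcase="01">old man and the sea</TitleWithoutPrefix>
</TitleElement>

<!-- unavoidable upper case legacy data, flagged so recipients can re-case it -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <TitleText textcase="03">LES MISERABLES</TitleText>
</TitleElement>

<!-- acronyms only: upper case is correct, so code 03 is not used -->
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <TitleText textcase="02">VAX FORTRAN</TitleText>
</TitleElement>
```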

In scripts where the sort order is not essentially alphabetic – for example in Japanese Kanji where names and titles are sorted phonetically, or in Traditional or Simplified Chinese Hanzi where Pinyin or Zhuyin phonetics or (occasionally) the number of brush strokes in individual characters control the collation order – the collationkey attribute can be included. This would typically contain a phonetic transliteration of the data element (in Hiragana, Katakana, Pinyin or Zhuyin etc).
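
A minimal sketch of the collationkey attribute on a Japanese title follows, with the phonetic reading given in Hiragana. The choice of title and the use of <NoPrefix/> are illustrative assumptions.

```xml
<TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <!-- collationkey carries the phonetic reading used for sorting -->
    <TitleWithoutPrefix collationkey="わがはいはねこである">吾輩は猫である</TitleWithoutPrefix>
</TitleElement>
```

A recipient would sort on the collationkey value where it is present, and display the element content.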

When the language of the title is not the expected language of the message, then the language attribute should also be provided.

Note that <PartNumber> indicates a sequential position within a collection, and should not be used unless details of the collection are included (either in Group P.5 or in P.6). A volume or part number is normally product-level information, not part of the shared identity of a collection, and would thus appear with <TitleElementLevel> 01 – and it might appear either in the same repeat of <TitleElement> as the product-level title or as a separate title element. It is a common error to supply a part number or year at collection level, when it belongs at product level. However, <PartNumber> may occasionally be sub-collection level information where one collection is nested inside another.

<PartNumber> might be a simple number, or might convey the ‘type’ of the item within the collection – a volume, a part, a book, a tome and so on. If anything more than a simple number is used, ensure the terminology matches that on the product, and that the text is presented consistently (including consistent spacing and abbreviation) so that items in the collection will sort correctly into order whenever possible. In this circumstance, it is good practice also to provide the ordinal position of the product within the collection in the <CollectionSequence> composite in Group P.5. Where it is provided, recipients should use <CollectionSequence> to sort members of a collection into order, in preference to using <PartNumber>. This composite is particularly useful with sub-collections.

<PartNumber>3</PartNumber> (number 3 in a series)
<PartNumber>Part 3</PartNumber> (a named Part 3)
<PartNumber>Volume 3</PartNumber> (a named Volume 3)
<PartNumber>Book III</PartNumber> (note that neither roman numerals nor a named ‘Book Three’ sort into the correct order alphabetically)
<PartNumber>Parts 3–5</PartNumber> (Parts 3, 4 and 5 of a collection in a single item)
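
Where a part number such as ‘Book III’ will not sort correctly, the ordinal position might be supplied alongside it as in the sketch below. This assumes that code 02 in <CollectionSequenceType> denotes title order, and that <CollectionSequence> precedes <TitleDetail> within <Collection>; both should be checked against the specification and codelists. The collection and product titles reuse The Exo Cycle example from earlier in this section.

```xml
<!-- Group P.5 -->
<Collection>
    <CollectionType>10</CollectionType>
    <CollectionSequence>
        <!-- sequence type code is an assumption, see the note above -->
        <CollectionSequenceType>02</CollectionSequenceType>
        <CollectionSequenceNumber>3</CollectionSequenceNumber>
    </CollectionSequence>
    <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
            <TitleElementLevel>02</TitleElementLevel>
            <TitlePrefix textcase="02">The</TitlePrefix>
            <TitleWithoutPrefix textcase="02">Exo Cycle</TitleWithoutPrefix>
        </TitleElement>
    </TitleDetail>
</Collection>
<!-- Group P.6 -->
<TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
        <TitleElementLevel>01</TitleElementLevel>
        <!-- wording as on the product; roman numerals do not sort alphabetically -->
        <PartNumber>Book III</PartNumber>
        <NoPrefix/>
        <TitleWithoutPrefix textcase="02">Descent</TitleWithoutPrefix>
    </TitleElement>
</TitleDetail>
```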

Similar considerations apply to the use of <YearOfAnnual>, which might contain a simple year (‘2011’), or something qualified such as ‘2009/10 Season’. Consistency is important, or the individual products of a collection will not sort into the correct order (even when simple sorting is possible).

P.7 Authorship

The various contributors to a product are listed in repeats of the <Contributor> composite. It is best practice to include at least all the contributors named on the book’s title page (or those named with equivalent prominence on another type of product). However, it is normal to omit those such as ‘ghost writers’ not named on the product itself.

If there are no repeats of the <Contributor> composite, the <NoContributor/> empty element gives a positive indication that there are no contributors.

Contributor composite

Personal naming schemes are complex, sensitive and vary greatly between cultures, so data suppliers and recipients should take great care that names are presented correctly. The ONIX data elements for structured names attempt to isolate the parts of the name used for cataloging, advanced searching and sorting purposes, rather than trying to specify the ‘genealogical’ or familial structure of a name. Consider for example, three names, Sharon Stanton Russell, Anna Margaret Lindholm and Carmen Conde Abellán. For the (American) name Sharon Stanton Russell, Stanton is her unmarried family name (maiden name) retained after marriage, whereas for (British) Anna Margaret Lindholm, Margaret is a second given name. ONIX does not differentiate between these cases – ‘Russell’ and ‘Lindholm’ are the <KeyNames>, and Sharon Stanton and Anna Margaret would both be treated as <NamesBeforeKey>, even though one contains only given names and the other contains a family name. In the case of the (Spanish) name Carmen Conde Abellán, the <KeyNames> are ‘Conde Abellán’, since both are family names used for sorting purposes.
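
The three names discussed above might be structured as in the following sketch. Contributor role A01 is an assumption made only so that each composite is complete; each would appear in a different Product record.

```xml
<!-- family name retained after marriage is treated as a name before the key name -->
<Contributor>
    <ContributorRole>A01</ContributorRole>
    <NamesBeforeKey>Sharon Stanton</NamesBeforeKey>
    <KeyNames>Russell</KeyNames>
</Contributor>

<!-- second given name is also a name before the key name -->
<Contributor>
    <ContributorRole>A01</ContributorRole>
    <NamesBeforeKey>Anna Margaret</NamesBeforeKey>
    <KeyNames>Lindholm</KeyNames>
</Contributor>

<!-- both family names are used for sorting, so both are key names -->
<Contributor>
    <ContributorRole>A01</ContributorRole>
    <NamesBeforeKey>Carmen</NamesBeforeKey>
    <KeyNames>Conde Abellán</KeyNames>
</Contributor>
```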

[Diagram: structure of the <Contributor> composite, showing <SequenceNumber>, <ContributorRole>, <FromLanguage>, <ToLanguage>, <NameType>, <UnnamedPersons>, a ‘Personal or Corporate name’ box, <AlternativeName>, an ‘Associated attributes’ box and <ContributorPlace>. For the contents of these boxes, see the diagrams below. Note there is no tag associated with either box (<NameType> is followed directly by the optional <NameIdentifier> or <ContributorDate> tags from the other diagrams).]
[Diagram: personal or corporate name. Personal or corporate names should use one of three options: a simple personal name (<NameIdentifier>, <PersonName>, <PersonNameInverted>, marked ‘must include at least one’), a structured personal name (<NameIdentifier>, <PersonName>, <PersonNameInverted>, <TitlesBeforeNames>, <NamesBeforeKey>, <PrefixToKey>, <KeyNames>, <NamesAfterKey>, <SuffixToKey>, <LettersAfterNames>, <TitlesAfterNames>), or a corporate name (<NameIdentifier>, <CorporateName>, <CorporateNameInverted>, marked ‘must include at least one’). There is some value in providing <PersonName> and <PersonNameInverted> in addition to a structured name.]
[Diagram: associated attributes, showing <ContributorDate>, <ProfessionalAffiliation>, <BiographicalNote>, <Website> and <ContributorDescription>, with the note ‘repeatable, to provide the contributor biography in multiple languages’.]
single contributor, contributor identifiers (including an ISNI) and all three forms of name provided
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <NameIdentifier>
        <NameIDType>16</NameIDType>
        <IDValue>0000000020691583</IDValue>
    </NameIdentifier>
    <NameIdentifier>
        <NameIDType>01</NameIDType>
        <IDTypeName>Ullstein Buchverlage Autor ID</IDTypeName>
        <IDValue>01381</IDValue>
    </NameIdentifier>
    <PersonName>Nele Neuhaus</PersonName>
    <PersonNameInverted>Neuhaus, Nele</PersonNameInverted>
    <NamesBeforeKey>Nele</NamesBeforeKey>
    <KeyNames>Neuhaus</KeyNames>
    <BiographicalNote textformat="05"><p><strong>Nele Neuhaus</strong> wurde 1967 als zweites von vier Kindern der Familie Löwenberg in Münster/Westfalen geboren. Sie wuchs in Paderborn auf, bevor sie im Alter von 11 Jahren mit ihrer Familie in den Taunus zog, als ihr Vater Landrat wurde. Schon seit frühester Kindheit schrieb Nele Neuhaus, zuerst handschriftlich in Schulhefte, später mit Reiseschreibmaschine und Computer.</p><p>Durch ihre Leidenschaft für Pferde lernte sie auf einem Reitturnier ihren Mann kennen, mit dem sie heute in Kelkheim/Taunus lebt.</p></BiographicalNote>
</Contributor>
using Short tags
<contributor>
    <b034>1</b034>First contributor
    <b035>A01</b035>Written by
    <nameidentifier>
        <x415>16</x415>ISNI
        <b244>0000000020691583</b244>
    </nameidentifier>
    <nameidentifier>
        <x415>01</x415>Proprietary ID
        <b233>Ullstein Buchverlage Autor ID</b233>
        <b244>01381</b244>
    </nameidentifier>
    <b036>Nele Neuhaus</b036>
    <b037>Neuhaus, Nele</b037>
    <b039>Nele</b039>
    <b040>Neuhaus</b040>
    <b044 textformat="05"><p><strong>Nele Neuhaus</strong> wurde 1967 als zweites von vier Kindern der Familie Löwenberg in Münster/Westfalen geboren. Sie wuchs in Paderborn auf, bevor sie im Alter von 11 Jahren mit ihrer Familie in den Taunus zog, als ihr Vater Landrat wurde. Schon seit frühester Kindheit schrieb Nele Neuhaus, zuerst handschriftlich in Schulhefte, später mit Reiseschreibmaschine und Computer.</p><p>Durch ihre Leidenschaft für Pferde lernte sie auf einem Reitturnier ihren Mann kennen, mit dem sie heute in Kelkheim/Taunus lebt.</p></b044>Text uses XHTML markup, and message uses UTF‑8 encoding so non-ASCII characters such as ‘ü’ need not be expressed as numerical character references (and must not be carried as HTML entities such as ‘&uuml;’)
</contributor>
single corporate contributor, name provided in both normal and inverted form
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A09</ContributorRole>
    <CorporateName>CLAMP</CorporateName>
    <CorporateNameInverted>CLAMP</CorporateNameInverted>
    <BiographicalNote textformat="05"><p><strong>CLAMP</strong> is a renowned all-female mangaka collective. Over more than 20 years, it has created many internationally successful series, including <em>xxxHolic</em>, <em>Tsubasa Reservoir Chronicles</em>, <em>Card Captor Sakura</em>, <em>Magic Knight Rayearth</em>, <em>Chobits</em> and <em>Tokyo Babylon</em>.</p></BiographicalNote>
</Contributor>
using Short tags
<contributor>
    <b034>1</b034>
    <b035>A09</b035>Created by
    <b047>CLAMP</b047>
    <x443>CLAMP</x443>
    <b044 textformat="05"><p><strong>CLAMP</strong> is a renowned all-female mangaka collective. Over more than 20 years, it has created many internationally successful series, including <em>xxxHolic</em>, <em>Tsubasa Reservoir Chronicles</em>, <em>Card Captor Sakura</em>, <em>Magic Knight Rayearth</em>, <em>Chobits</em> and <em>Tokyo Babylon</em>.</p></b044>
</contributor>
single contributor with alternative name (and alternative name identifier)
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <NameIdentifier>
        <NameIDType>01</NameIDType>
        <IDTypeName>HCP Author ID</IDTypeName>
        <IDValue>6108</IDValue>
    </NameIdentifier>
    <PersonNameInverted>Westmacott, Mary</PersonNameInverted>
    <NamesBeforeKey>Mary</NamesBeforeKey>
    <KeyNames>Westmacott</KeyNames>
    <AlternativeName>
        <NameType>04</NameType>
        <NameIdentifier>
            <NameIDType>01</NameIDType>
            <IDTypeName>HCP Author ID</IDTypeName>
            <IDValue>1067</IDValue>
        </NameIdentifier>
        <PersonNameInverted>Christie, Agatha</PersonNameInverted>
        <NamesBeforeKey>Agatha</NamesBeforeKey>
        <KeyNames>Christie</KeyNames>
    </AlternativeName>
    <BiographicalNote textformat="05"><p><strong>Mary Westmacott</strong> was a pseudonym used on six novels by the ‘Queen of Crime’ Agatha Christie.</p><p>Agatha Christie was born in Torquay in 1890 and became, quite simply, the best-selling novelist in history, outsold only by The Bible and Shakespeare.</p></BiographicalNote>
</Contributor>
<ContributorStatement>Agatha Christie, writing as Mary Westmacott</ContributorStatement>
using Short tags
<contributor>
    <b034>1</b034> <!-- First contributor -->
    <b035>A01</b035> <!-- Written by -->
    <nameidentifier>
        <x415>01</x415> <!-- Proprietary -->
        <b233>HCP Author ID</b233>
        <b244>6108</b244>
    </nameidentifier>
    <b037>Westmacott, Mary</b037>
    <b039>Mary</b039> <!-- Name as on product -->
    <b040>Westmacott</b040> <!-- (a pseudonym) -->
    <alternativename>
        <x414>04</x414> <!-- Real name -->
        <nameidentifier>
            <x415>01</x415> <!-- Proprietary -->
            <b233>HCP Author ID</b233>
            <b244>1067</b244>
        </nameidentifier>
        <b037>Christie, Agatha</b037>
        <b039>Agatha</b039> <!-- Real name -->
        <b040>Christie</b040> <!-- of author -->
    </alternativename>
    <b044 textformat="05"><p><strong>Mary Westmacott</strong> was a pseudonym used on six novels by the ‘Queen of Crime’ Agatha Christie.</p><p>Agatha Christie was born in Torquay in 1890 and became, quite simply, the best-selling novelist in history, outsold only by The Bible and Shakespeare.</p></b044>
</contributor>
<b049>Agatha Christie, writing as Mary Westmacott</b049> <!-- Text to be used for display purposes -->
In this case, the proprietary <NameIdentifier> is different for the two names, even though they refer to the same person, and (arguably) to the same public identity.
multiple contributors
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>B01</ContributorRole>
    <!-- <NameIdentifier> omitted for brevity -->
    <PersonNameInverted>Jones, Steve</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>2</SequenceNumber>
    <ContributorRole>B01</ContributorRole>
    <PersonNameInverted>Martin, Robert</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>3</SequenceNumber>
    <ContributorRole>B01</ContributorRole>
    <PersonNameInverted>Pilbeam, David</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>4</SequenceNumber>
    <ContributorRole>B15</ContributorRole>
    <PersonNameInverted>Bunney, Sarah</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>5</SequenceNumber>
    <ContributorRole>A24</ContributorRole>
    <PersonNameInverted>Dawkins, Richard</PersonNameInverted>
</Contributor>
using Short tags
<contributor>
    <b034>1</b034> <!-- Contributor 1 -->
    <b035>B01</b035> <!-- Edited by -->
    <!-- <nameidentifier> omitted for brevity -->
    <b037>Jones, Steve</b037>
</contributor>
<contributor>
    <b034>2</b034> <!-- Contributor 2 -->
    <b035>B01</b035> <!-- Edited by -->
    <b037>Martin, Robert</b037>
</contributor>
<contributor>
    <b034>3</b034> <!-- Contributor 3 -->
    <b035>B01</b035> <!-- Edited by -->
    <b037>Pilbeam, David</b037>
</contributor>
<contributor>
    <b034>4</b034> <!-- Contributor 4 -->
    <b035>B15</b035> <!-- Editorial coordination by -->
    <b037>Bunney, Sarah</b037>
</contributor>
<contributor>
    <b034>5</b034> <!-- Contributor 5 -->
    <b035>A24</b035> <!-- Introduction by -->
    <b037>Dawkins, Richard</b037>
</contributor>
It is normal to provide a combined biographical note for multiple contributors in Group P.14 (<TextType> code 12 from List 153).
an unnamed contributor
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <UnnamedPersons>02</UnnamedPersons>
</Contributor>
using Short tags
<contributor>
    <b034>1</b034> <!-- First contributor -->
    <b035>A01</b035> <!-- Written by -->
    <b249>02</b249> <!-- Anonymous -->
</contributor>
no contributors
using Reference names
<NoContributor/>
using Short tags
<n339/>
P.7.1 Contributor sequence number

<SequenceNumber> contains a simple sequential integer: the first contributor listed on the product’s title page (or its equivalent on a non-book product) should be number 1, and subsequent contributors numbered 2, 3 and so on. The sequence number should always reflect the order used on the product, and not, for example, alphabetical order or an order based on their roles.

This sequence number should always be used by ONIX recipients to collate and display contributors. Do not rely on the order that contributors are listed within the ONIX file, although where possible ONIX senders should arrange files so that the sequence numbers do occur in order in the Product record.

If there are also contributors listed in P.5, their sequence numbers should form part of the same sequence – sequence numbers should be unique within the whole Product record – and this may mean the order contributors occur within the file is not their correct collation order.
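As a minimal sketch of how a recipient might apply this, the Python fragment below collates contributors by <SequenceNumber> rather than by their position in the file. It assumes Reference names and no XML namespace. The sample fragment and the function name contributors_in_display_order are illustrative only and are not part of ONIX.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: contributors deliberately listed out of order.
PRODUCT_FRAGMENT = """
<DescriptiveDetail>
    <Contributor>
        <SequenceNumber>2</SequenceNumber>
        <ContributorRole>B01</ContributorRole>
        <PersonNameInverted>Martin, Robert</PersonNameInverted>
    </Contributor>
    <Contributor>
        <SequenceNumber>1</SequenceNumber>
        <ContributorRole>B01</ContributorRole>
        <PersonNameInverted>Jones, Steve</PersonNameInverted>
    </Contributor>
</DescriptiveDetail>
"""


def contributors_in_display_order(xml_text):
    """Return inverted names sorted by SequenceNumber, not by file order."""
    root = ET.fromstring(xml_text)
    # iter() finds every Contributor below the root, wherever it is nested,
    # so contributors from different groups are collated in one sequence.
    contributors = list(root.iter("Contributor"))
    ordered = sorted(contributors, key=lambda c: int(c.findtext("SequenceNumber")))
    return [c.findtext("PersonNameInverted") for c in ordered]


print(contributors_in_display_order(PRODUCT_FRAGMENT))
```

A production implementation would also need to handle Short tags, and contributors that lack a sequence number.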

P.7.2 Contributor role

The contributor role specifies the function of a contributor in the creation of the product. Note that the <ContributorRole> element is repeatable, and it should be repeated where a single contributor has multiple roles in relation to the product – do not list the same contributor twice.

using Reference names
<ContributorRole>A24</ContributorRole> <!-- Introduced and -->
<ContributorRole>B06</ContributorRole> <!-- translated by… -->
using Short tags
<b035>A01</b035> <!-- Written and -->
<b035>A12</b035> <!-- illustrated by… -->
Repetition of roles can conflict with the arrangement of contributors in their correct sequence – for example, ‘Written by Charles Smith and Jonathan Green, read by Charles Smith’. In this case, Charles Smith is Contributor 1 (with two roles), and Jonathan Green is Contributor 2. There should be no Contributor 3: Charles Smith should not be repeated. This is a case where the <ContributorStatement> data element is valuable.
Name identifier composite

There is as yet no well-established public standard for name identifiers. There is ongoing development of the relatively new ISNI (International Standard Name Identifier), and support for it is growing. There are also many well-developed library name identifiers such as the LCCN or VIAF (to which ISNI is closely linked), but none is widely used within the book trade.

NameIdentifier: NameIDType, IDTypeName (must include if ID is proprietary, otherwise omit), IDValue

Over seven million contributor names have been assigned an ISNI, and this is the preferred name identifier within ONIX.

The ISNI (like some other ‘name identifiers’) is in fact an identifier for a ‘public identity’ or a ‘persona’, rather than for a single name: one ISNI can be associated with many names or name variations that are used by a single persona, and a single public identity may in fact be shared by more than one real person (eg a writing team may use a single pseudonym). A named ONIX contributor is a single persona, and an ISNI identifies that persona, rather than a particular name or name variation used by that contributor – so it is best practice to associate an ISNI with the primary name of a contributor. Note that an ISNI does not directly identify the ‘person’ or ‘party’ that uses the name in the way that an ID like a Social Security number does.

However, since few publishers have yet adopted the ISNI, even internal identifiers used by publishers can be helpful to ONIX data recipients who wish to collocate products by a contributor, or distinguish products by two contributors with identical names. The form of a name may change on successive products by a single contributor (Steve Jones, Dr. Steve Jones, Prof. Steve Jones etc). A name may be very similar or identical to that used by a quite different contributor (one UK publisher publishes two separate authors called Professor Richard Holmes), or a contributor may use one or more pseudonyms without any particular wish for anonymity (eg Ruth Rendell and Barbara Vine). Names change over time (eg through marriage, legal changes, or through the addition of titles, qualifications and honors), and the exact form of a name used on a product may also be influenced by marketing requirements – a popular dieting book and an academic treatment of nutrition may use subtly different forms of name for the same contributor. In all of these cases, collocation is aided even by proprietary identifiers. Of course, a public identifier such as an ISNI is of greater utility for cases where multiple publishers are involved or when a contributor is active in other creative fields too (a specific feature of ISNI is that it bridges between books, music and other fields). When providing a proprietary identifier, always include a consistent name for the identifier in the <IDTypeName> element, so it can be distinguished from proprietary identifiers controlled by other organizations.

P.7.9, P.7.10 Personal names

It is strongly preferred to supply personal contributor names in a fully-structured manner, using the elements P.7.11 to P.7.18, ideally with both <PersonName> and <PersonNameInverted> as well. However, publishers’ systems may not support such granularity. If structured names are not available, then there is great value in providing both <PersonName> and <PersonNameInverted>. Recipients may then use the former for display and the latter for collation (and possibly both for search). And if only one type of unstructured name can be included, the ‘inverted’ form of the name with family name before given name in <PersonNameInverted> is preferred to the use of <PersonName> alone.

The ‘inverted’ order is the normal order for Chinese, Japanese and many other Eastern naming systems, and for Hungarian names – and for these names, <PersonName> and <PersonNameInverted> would be identical.

<PersonName> alone is the ‘worst case’ option, used only when contributor names cannot be supplied in any other form. Note that names provided in this form cannot reliably be collated (ie sorted into alphabetical order).

The primary name supplied should be the name as it is used on the product, and should be the name used by recipients for display purposes. If the publisher wishes, alternative forms of the same name – or alternative names for the same contributor – can be provided in the <AlternativeName> composite. Providing alternative names and name identifiers can help collocation, where different forms of the same name are used on different products. In the rare case where two names for the same contributor are used on the same product, then the primary name should be the name used most prominently, with the less prominent name supplied as an <AlternativeName>.

<PersonName> | <PersonNameInverted> | (key names in bold)
Dame Ngaio Marsh | Marsh, Dame Ngaio |
Henry of Huntingdon | Huntingdon, Henry of |
Henriette d’Angeville | Angeville, Henriette d’ |
Mary O’Connor | O’Connor, Mary |
Máire ó Conchobhair | Conchobhair, Máire ó |
Gabriel Garcia Márquez | Garcia Márquez, Gabriel |
Megan Lloyd George | Lloyd George, Megan |
Luo Guanzhong | Luo Guanzhong | (family name first)
村上 春樹 | 村上 春樹 | (Haruki Murakami, family name first)
Björk Guðmundsdóttir | Björk Guðmundsdóttir | (patronymic is not key name)
प्रेमचंद | प्रेमचंद | (Premchand, only a single name)
שפרה הורן | הורן, שפרה | (Shifra Horn, text runs right to left)
Abu al-Hasan Ali ibn al-Husayn ibn Ali al-Mas'udi | Mas'udi, Abu al-Hasan Ali ibn al-Husayn ibn Ali al- | (أبو الحسن علي بن الحسين بن علي المسعودي)
For names and any other text in a script written right to left (Arabic, Hebrew etc), no special care need be taken over the character ordering in the data file: the first character of an RTL name occurs immediately following the > of <PersonName> in the file, even though when the ONIX data is printed or viewed on screen, the first character of the name appears immediately before the < of </PersonName>. So for שפרה הורן (Shifra Horn), the order of characters in the ONIX message is <PersonName>שפרה הורן</PersonName>. Those characters ש, פ, ר, ה etc are then displayed from right to left as שפרה הורן

Author names (and product titles) often have to be presented in alphabetical order, and <PersonNameInverted> or the various parts of a structured name usually facilitate this. But sorting text in a variety of scripts and languages – names and other text alike – is complex, and the correct ‘alphabetical’ order depends on the locale and the expected user experience when the text is displayed. There is no single, canonical order: Ö sorts after Z in a Swedish context, but immediately follows O words in a German context, and in an English language context, O and Ö are likely to be intermixed. A Swedish reader will expect a name beginning with Ö to appear after Z in a list even if the name is that of a German author. The sort order is not entirely intrinsic to the data itself, is not purely a feature of language or script, and in particular it’s almost never exactly the same as the numerical order of Unicode characters. Sort order is a feature of the user interface. Even converting between lower and upper case prior to sorting can be problematic – in Turkish, the upper case version of i is İ (with a dot above), and the lower case version of I is ı (a dotless i). For more on sorting, see the Unicode Collation Algorithm specification and the Common Locale Data Repository.
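The short Python sketch below illustrates why naive processing is not enough. It shows that a plain code point sort and locale-independent case conversion give results that match only some readers’ expectations. The three names are illustrative only.

```python
# Illustrative names only.
names = ["Zorn", "Öztürk", "Olsen"]

# sorted() on plain strings compares Unicode code points, so Ö (U+00D6)
# lands after Z (U+005A). This resembles Swedish expectations, but not
# German or English ones.
by_code_point = sorted(names)
print(by_code_point)

# Python's case conversion is locale-independent, so Turkish casing rules
# are not applied: a Turkish reader would expect İ and ı here instead.
upper_i = "i".upper()
lower_i = "I".lower()
print(upper_i, lower_i)
```

A recipient that needs locale-appropriate ordering would normally rely on an implementation of the Unicode Collation Algorithm with locale data, rather than on code point order.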

For non-alphabetic writing systems such as Chinese or Japanese, the collationkey attribute is important, and it should be used to provide phonetic or other sort order information for machine processing.

providing a sort order in Japanese
using Reference names
<PersonNameInverted collationkey="むらかみ はるき">村上春樹</PersonNameInverted>
using Short tags
<b037 collationkey="むらかみ はるき">村上春樹</b037>
Japanese personal names are written in Kanji and are conventionally sorted according to their phonetic reading. The name may be spelled out phonetically using Hiragana or Katakana characters. But many Kanji characters have multiple readings, so 彰 can be read as しょう (‘Shō’) or あきら (‘Akira’), and the chosen reading affects the sort order. The collationkey attribute provides the correct phonetic rendering. (It is also the case that one phonetic reading such as あきら (‘Akira’) may have many Kanji renderings including 明 or 晃). For Chinese names, Pinyin or Zhuyin phonetics would be used.
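As a rough sketch of how a recipient might use the attribute, the Python fragment below sorts names on collationkey where it is present and falls back to the element text where it is not. The <names> wrapper and the two additional authors are illustrative assumptions, not ONIX structures. Comparing Hiragana by code point only approximates the conventional order.

```python
import xml.etree.ElementTree as ET

# Illustrative data: <names> is a hypothetical wrapper, not an ONIX element.
NAMES_XML = """
<names>
    <PersonNameInverted collationkey="むらかみ はるき">村上春樹</PersonNameInverted>
    <PersonNameInverted collationkey="あくたがわ りゅうのすけ">芥川龍之介</PersonNameInverted>
    <PersonNameInverted collationkey="なつめ そうせき">夏目漱石</PersonNameInverted>
</names>
"""


def sort_key(element):
    # Prefer the sender's phonetic reading; fall back to the name itself.
    return element.get("collationkey") or element.text


elements = list(ET.fromstring(NAMES_XML))
sorted_names = [e.text for e in sorted(elements, key=sort_key)]
print(sorted_names)
```

Sorting on the Kanji text alone would give an order unrelated to the readings, which is why the attribute matters for these names.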

In cases where the reading of the name needs to be clarified for human readers, rather than for automated sorting, the phonetic information is provided in a ruby gloss. A gloss can be incorporated using either the XHTML <ruby> markup element – in ONIX data elements that allow XHTML – or using Unicode interlinear annotation delimiters in data elements that do not. Where phonetics are needed for both human reading and for sorting purposes, both a collationkey and a ruby text should be included.

<b037 collationkey="むらかみ はるき">&#xfff9;村上春樹&#xfffa;むらかみ はるき&#xfffb;</b037>
This uses Unicode interlinear annotation delimiters (&#xfff9;, &#xfffa; and &#xfffb;) to separate the Kanji from the ruby text Hiragana, as the <PersonNameInverted> element cannot use XHTML <ruby> markup. In other data elements where it is allowed, the equivalent XHTML markup must be used. The name above should be displayed as 村上春樹 with the reading むらかみ はるき shown as a small gloss above or alongside the Kanji.

Unicode interlinear annotation delimiters cannot be rendered on-screen unless an application provides specific support for them. In order to display the name with its gloss in a web browser using XHTML 1.1 (or non-standard HTML4), an application might:

  1. replace &#xfff9; with ‘<ruby><rb>’;
  2. change &#xfffa; to ‘</rb><rp> (</rp><rt>’;
  3. and substitute ‘</rt><rp>)</rp></ruby>’ for ‘&#xfffb;’.

For HTML5, correct practice would be to omit the <rb> and </rb> tags. However, this will prevent validation of the ONIX XML – implementation of <ruby> in ONIX is based on XHTML 1.1, and the <rb> tag is required.
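The substitution steps listed above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name interlinear_to_ruby is hypothetical, the sketch escapes XML special characters before inserting markup, and it closes the <ruby> element so that the result is well-formed.

```python
from xml.sax.saxutils import escape

# Unicode interlinear annotation anchor, separator and terminator.
ANCHOR, SEPARATOR, TERMINATOR = "\ufff9", "\ufffa", "\ufffb"


def interlinear_to_ruby(text):
    """Convert interlinear annotation delimiters to XHTML 1.1 ruby markup."""
    safe = escape(text)  # escape &, < and > before adding any tags
    return (
        safe.replace(ANCHOR, "<ruby><rb>")
        .replace(SEPARATOR, "</rb><rp> (</rp><rt>")
        .replace(TERMINATOR, "</rt><rp>)</rp></ruby>")
    )


name = "\ufff9村上春樹\ufffaむらかみ はるき\ufffb"
ruby_markup = interlinear_to_ruby(name)
print(ruby_markup)
```

Browsers that do not support ruby will show the content of the <rp> elements, so the reading appears in parentheses after the Kanji.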

In summary, there are three ways of incorporating phonetic glosses into ONIX data:

  • if the gloss is intended to be used for automated sorting purposes;
    • method 1 – use the collationkey attribute;
    • the attribute can only be included on elements that are likely to be used for sorting;
  • if the gloss is intended to be used for display purposes, either:
    • method 2 – use <ruby> where XHTML markup is allowed within the ONIX data element; or
    • method 3 – use Unicode interlinear annotation delimiters where XHTML is not allowed;
    • glosses for display can be incorporated into any textual data element.

Occasionally, as in the example above, a single data element may need a gloss provided as an attribute for sorting, and as a display gloss.

P.7.11 Titles before names

Parts of names in bold would be carried in this element.

Pope Benedict XVI
His Holiness the Dalai Lama
HRH Prince Charles
Dame Ngaio Marsh
General Sir Peter de la Billière
P.7.12 Names before key names
Dame Ngaio Marsh
General Sir Peter de la Billière
Henriette d’Angeville
Gabriel Garcia Márquez
Robert Louis Stephenson
Megan Lloyd George
Фёдор Достоевский (Fyodor Dostoyevsky)
Haruki Murakami (when name is Westernized)
علاء الأسواني (Alaa al-Aswany)
שפרה הורן (Shifra Horn)
P.7.13 Prefix to key names
His Holiness the Dalai Lama
General Sir Peter de la Billière
Henriette d’Angeville
Máire ó Conchobhair
علاء ال أسواني (Alaa Al Aswany)
Eric van Lustbader
Melissa de la Cruz
P.7.14 Key names

All names presented in fully-structured style must have a Key name – this is the part of the name that is used first for collation purposes. Note that although names are conventionally sorted using the family or inherited name, there are exceptions for names that include titles, where for example a given name (Charles), a taken name (Benedict) or a title (Dalai Lama) may be used as the key, and for other names that do not have a family component (such as Icelandic names, which consist of given name and patronymic). When constructing a structured name, always start with the key name element and arrange other parts of the name around it.

Pope Benedict XVI
His Holiness the Dalai Lama
HRH Prince Charles
Dame Ngaio Marsh
General Sir Peter de la Billière
Henriette d’Angeville
Mary O’Connor
Máire ó Conchobhair
Gabriel Garcia Márquez
Robert Louis Stephenson
Megan Lloyd George
Фёдор Достоевский (Fyodor Dostoyevsky)
Haruki Murakami (Westernized name order)
村上 春樹 (Haruki Murakami, name order not Westernized)
علاء ال أسواني (Alaa Al Aswany)
שפרה הורן (Shifra Horn)
Eric van Lustbader
Melissa de la Cruz
Madonna
Björk Guðmundsdóttir (given name is key name)
प्रेमचंद (Premchand)
Luo Guanzhong (罗贯中, name order not Westernized)
Petőfi Sándor (Hungarian, name order not Westernized)

Exactly what constitutes the key name varies according to the culture from which the name arises. Prefixes like de, ó or van often get incorporated into the key name as the name passes from its original culture into another, so for example Máire ó Conchobhair becomes Mary O’Connor, the Dutch name ‘de Groot’ becomes ‘De Groot’ or the French name ‘de la Mare’ becomes the English ‘Delamere’. But this is a cultural shift, not a matter of language, and many also choose to retain their name structure (for example, English poet Walter de la Mare).

Except with names such as Madonna or Premchand which have no other parts, if <KeyNames> is used, all other parts of the name should be included in the relevant elements – <NamesBeforeKey>, <PrefixToKey> and so on. <KeyNames> is not intended merely to indicate which part of the name in <PersonName> is to be used for sorting.
P.7.15 Names after key names
村上 春樹 (Haruki Murakami, name order not Westernized)
Björk Guðmundsdóttir (patronymic follows key name)
Luo Guanzhong (罗贯中, name order not Westernized)
Petőfi Sándor (name order not Westernized)
P.7.16 Suffix after key names

This element is only used for simple suffixes such as ‘Jr.’, ‘fils’ or ‘XVI’.

P.7.17 Qualifications and honors after names

This element is used to list qualifications (‘PhD’, ‘MD’ etc), and for honors and memberships (‘OBE’, ‘TD’, ‘FRCS’ etc).

P.7.18 Titles after names

This element is used to list titles and honorifics that follow a person’s names. Note that often, there is some choice about how names that include titles may be presented:

King Philip II of Spain
Philip II, King of Spain

Presentation of primary names should follow the form used on the product. A publisher may provide an alternative name when another form is more familiar.

P.7.19, P.7.20 Corporate contributor names

Corporate names should be carried in <CorporateName>, unless they begin with a prefix that should be ignored for collation purposes – in which case it is best practice to (also) include <CorporateNameInverted>. If there is a prefix, there is some value in providing both forms (as with personal names). Corporate names should not include suffixes such as ‘Inc’, ‘SA’ or ‘Ltd’, unless they are used on the product itself.

no prefix (on the product)
<CorporateName>World Health Organization</CorporateName>
name with a prefix
<CorporateName>The editors of Rolling Stone Magazine</CorporateName>
<CorporateNameInverted>Rolling Stone Magazine, The editors of</CorporateNameInverted>
Alternative name composite

The <AlternativeName> composite provides a way of delivering another name (or an alternative form of the same name) for a contributor. Potential uses might include providing a previous name (where the contributor’s name has changed by marriage, for example, or the corporation has undergone a rebranding), an authority-controlled or otherwise standardized form of a name where the name on the actual product is presented in an unusual form, or where both a real name and a pseudonym are involved. Such alternative names can provide important search term matches, aiding discovery of the product online or within retailer systems.

AlternativeName: NameType, personal or corporate name

While the primary contributor name (in P.7.9 to P.7.20) should always be the name as it appears on the product, the alternative name may be one of several types. It is mandatory to specify a <NameType> within <AlternativeName>. In contrast, the P.7.5 <NameType> element is not required (and is usually omitted) for the primary name.

The remainder of <AlternativeName> is identical to the data elements that carry the primary name, and similar best practice applies. Note that <AlternativeName> can include an alternative <NameIdentifier> composite, as well as any of the data elements holding the fully-structured, inverted or normal contributor name.

In general, each <AlternativeName> composite should contain a name with a different name type. However, it is also possible to provide transliterations of a name in two repeats with the same name type, or a single <AlternativeName> may contain a transliteration of the primary name. Transliterations are often useful where the recipient of the data may not be able to cope with the characters in the native script – for example selling a book in Russian, with a Cyrillic author name, in a country where many data recipients might only be able to deal with Latin characters. It is obviously preferable to supply the primary name in Cyrillic, for those recipients that can make use of it, and a Latin transliteration for those that cannot.

transliterated version of the primary name
using Reference names
<Contributor>
    <ContributorRole>A01</ContributorRole>
    <PersonName>다니엘 돔샤이트-베르크</PersonName>
    <PersonNameInverted>돔샤이트-베르크, 다니엘</PersonNameInverted>
    <AlternativeName>
        <NameType>05</NameType>
        <PersonName textscript="Latn">Daniel Domscheit-Berg</PersonName>
        <PersonNameInverted textscript="Latn">Domscheit-Berg, Daniel</PersonNameInverted>
    </AlternativeName>
</Contributor>
using Short tags
<contributor>
    <b035>A01</b035> <!-- Written by -->
    <b036>다니엘 돔샤이트-베르크</b036> <!-- Hangul script used on the book -->
    <b037>돔샤이트-베르크, 다니엘</b037>
    <alternativename>
        <x414>05</x414> <!-- Transliteration of primary name -->
        <b036 textscript="Latn">Daniel Domscheit-Berg</b036> <!-- Latin script -->
        <b037 textscript="Latn">Domscheit-Berg, Daniel</b037>
    </alternativename>
</contributor>

ONIX takes the pragmatic view that while most textual metadata can be expressed in different languages, organizational or contributor names are not ‘in’ a particular language. However, names can be expressed in different scripts – فيودور دوستويفسكي and Фёдор Достоевский and Fyodor Dostoyevsky. So while <BiographicalNote> may carry a language attribute (see below), <PersonName> carries a textscript attribute instead.

Although in this particular case the author (Daniel Domscheit-Berg) is German by nationality, the book is in the Korean language, the majority of the metadata is in Korean, and the author name is listed on the book in Hangul (Korean script). Thus the primary name is in Hangul, and the name is also provided transliterated into Latin script in the <AlternativeName> composite. By contrast, a German or English language copy of the book would list the primary author’s name in Latin script, and if that copy were for sale in Korea, the transliteration into Hangul might be given: for this book, the Hangul and Latin versions of the name would be switched around.

Provision of transliterated names is quite different from provision of phonetic information for sorting (which would use the collationkey attribute), or provision of ruby glosses to disambiguate the reading of a name (which would use Unicode interlinear annotation delimiters): see the notes on Personal names in Group P.7.

Contributor date composite

Birth and death dates are often used in libraries to distinguish between authors of the same name. They are not typically made explicit in book trade metadata, but equivalent information may be incorporated into any biographical notes. In general, providing a name identifier such as an ISNI is better for disambiguation purposes.

ContributorDate: ContributorDateRole, DateFormat (use dateformat attribute on <Date> element instead), Date
Jane Austen (1775–1817)
using Reference names
<ContributorDate>
    <ContributorDateRole>50</ContributorDateRole>
    <Date dateformat="05">1775</Date>
</ContributorDate>
<ContributorDate>
    <ContributorDateRole>51</ContributorDateRole>
    <Date>18170717</Date>
</ContributorDate>
using Short tags
<contributordate>
    <x417>50</x417> <!-- Born… -->
    <b306 dateformat="05">1775</b306> <!-- Format is YYYY -->
</contributordate>
<contributordate>
    <x417>51</x417> <!-- Died… -->
    <b306>18170717</b306> <!-- Default format is YYYYMMDD -->
</contributordate>
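A recipient has to interpret the content of <Date> according to the dateformat attribute. The Python sketch below covers only the two cases shown in the example above (no attribute, meaning YYYYMMDD, and dateformat 05, meaning YYYY). The function name parse_contributor_date is hypothetical, and any other format is rejected rather than guessed.

```python
def parse_contributor_date(value, dateformat=None):
    """Return (year, month, day); month and day are None when not supplied."""
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"unexpected date content: {value!r}")
    if dateformat is None:
        # No dateformat attribute: the default format is YYYYMMDD.
        if len(value) != 8:
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
        return int(value[:4]), int(value[4:6]), int(value[6:8])
    if dateformat == "05":
        # dateformat 05: year only (YYYY).
        if len(value) != 4:
            raise ValueError(f"expected YYYY, got {value!r}")
        return int(value), None, None
    raise ValueError(f"dateformat {dateformat!r} is not handled in this sketch")


born = parse_contributor_date("1775", dateformat="05")
died = parse_contributor_date("18170717")
print(born, died)
```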
Professional affiliation composite

Affiliations – effectively a job title and organization name – are used when the professional position of the author is important to the market positioning of the product – for example to emphasize the credentials of an academic author, or the credibility of any other expert.

ProfessionalAffiliation: ProfessionalPosition, Affiliation (must include at least one)
multiple affiliations for an academic author
using Reference names
<ProfessionalAffiliation>
    <ProfessionalPosition>Emeritus Professor of Social Geography</ProfessionalPosition>
    <Affiliation>School of Geography and the Environment, University of Oxford</Affiliation>
</ProfessionalAffiliation>
<ProfessionalAffiliation>
    <ProfessionalPosition>Professor of Social Geography</ProfessionalPosition>
    <Affiliation>Institute for Social Change, Manchester University</Affiliation>
</ProfessionalAffiliation>
using Short tags
<professionalaffiliation>
    <b045>Emeritus Professor of Social Geography</b045>
    <b046>School of Geography and the Environment, University of Oxford</b046>
</professionalaffiliation>
<professionalaffiliation>
    <b045>Professor of Social Geography</b045>
    <b046>Institute for Social Change, Manchester University</b046>
</professionalaffiliation>
P.7.42 Biographical note

Any named contributor – personal or corporate – may be associated with a short biography. Typically these are relatively short, perhaps no more than 200 to 400 words – around 1500–3000 characters. Although there is no specified maximum, a practical upper limit of perhaps 5000 characters is reasonable for senders. The <BiographicalNote> element is best used when there are a small number of contributors (say three or fewer). A single combined biography for multiple contributors is best provided in the <TextContent> composite in Group P.14.

Simple biographical notes can be provided as plain text. However, because of the differing XML treatment of white space characters (including tabs, line feeds and carriage returns) among different XML parsers and in different databases, multi-paragraph text cannot reliably be delivered into a recipient’s system this way. For this and other reasons, <BiographicalNote> and other ONIX data elements can accept not only plain text but also text with embedded XHTML markup. Embedded markup is the only reliable way to deliver multi-line or multi-paragraph text.

By default, XML data is processed by ‘normalizing’ whitespace characters – spaces, tabs, line breaks and so on – so they are all treated as single spaces. Leading and trailing whitespace is also often ‘trimmed’. ONIX data should be processed this way too (ie with the xml:space attribute set to its default value). These:

  • <BiographicalNote>    Umberto Eco
    is a    novelist.</BiographicalNote>
  • <BiographicalNote> Umberto Eco is a novelist.</BiographicalNote>
  • <BiographicalNote>Umberto Eco is a novelist.</BiographicalNote>

(the first of which contains an explicit new line in the data) are equivalent. Data senders should aim to send only the third option, and recipients should normalize whitespace characters they receive.

This behaviour applies to all ONIX tags, not just to <BiographicalNote>. But where a tag is intended to allow multiple lines or paragraphs of text, embedded markup provides the solution.
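A recipient-side normalization step might look like the following Python sketch. The function name normalize_whitespace is illustrative. It collapses runs of the four XML whitespace characters (space, tab, carriage return and line feed) into single spaces and trims both ends.

```python
import re


def normalize_whitespace(text):
    """Collapse runs of XML whitespace to single spaces and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


# The three equivalent forms discussed above, as they might arrive in the data.
variants = [
    "    Umberto Eco\n    is a    novelist.",
    " Umberto Eco is a novelist.",
    "Umberto Eco is a novelist.",
]
normalized = [normalize_whitespace(v) for v in variants]
print(normalized)
```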

If you have only plain text, and want to include multi-paragraph biographies, then you must include some markup within the data element. The simplest process would be to:

  • prefix the text of the biographical note with ‘<p>’ and suffix it with ‘</p>’;
  • replace any paragraph breaks with ‘</p><p>’;
  • add the textformat attribute with value 05.
adding minimal XHTML markup
using Reference names – original plain text
<BiographicalNote>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.
    As well as novels, he also writes children’s books and academic works.</BiographicalNote>
preferred alternative including XHTML markup
<BiographicalNote textformat="05"><p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.</p><p>As well as novels, he also writes children’s books and academic works.</p></BiographicalNote>
using Short tags – original plain text
<b044>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.
    As well as novels, he also writes children’s books and academic works.</b044>
paragraph breaks are likely to be lost
preferred alternative using XHTML markup
<b044 textformat="05"><p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.</p><p>As well as novels, he also writes children’s books and academic works.</p></b044>
<!-- XHTML markup should ensure paragraphs remain separate as data is processed by the recipient -->
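The three-step process for adding minimal markup to plain text can be sketched in Python as below. This is one possible approach: it assumes that each line break in the source text marks a paragraph break, and the function name add_minimal_markup is hypothetical. Characters such as & and < are escaped so the result remains well-formed XML.

```python
import re
from xml.sax.saxutils import escape


def add_minimal_markup(plain_text, tag="BiographicalNote"):
    """Wrap each paragraph of plain text in <p>…</p> and set textformat 05."""
    # Assumption: every line break in the plain text is a paragraph break.
    paragraphs = [" ".join(p.split()) for p in re.split(r"[\r\n]+", plain_text)]
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs if p)
    return f'<{tag} textformat="05">{body}</{tag}>'


plain = (
    "Umberto Eco is one of the world’s bestselling novelists.\n"
    "    As well as novels, he also writes children’s books and academic works."
)
marked_up = add_minimal_markup(plain)
print(marked_up)
```

Passing tag="b044" would produce the Short tag equivalent.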

XHTML markup is strongly preferred to HTML, as it can be properly validated using the ONIX for Books schemas.

But not all XHTML markup tags are usable. First, there are limitations on what XHTML tags can be used within ONIX. Second, recipients will often strip out some tags, or might even ignore the supplied text altogether because they are reluctant to include the supplied tags on their website (even though they might be technically valid). In practice, the following should be usable without problems:

  • <p> and <br /> for paragraphs and newlines;
  • <em>, <strong>, <i>, <b>, <sub> and <sup>;
  • <ul>, <ol> and <li> for unordered and ordered lists;
  • <dl>, <dt> and <dd> for description lists;
  • <ruby>, <rb>, <rt> and <rp> for glosses;
  • <address>, <abbr>, <small>, <q>, <cite>, <code>, <samp>, <var> and <dfn> for limited ‘semantic’ markup of text, plus <rbc> and <rtc> for complex glosses, should be okay, but it is good practice to avoid them;
    • this list is not exhaustive. A complete list is given in the Appendix.

Headings, tables, <div> and <span> containers, image tags, links and imagemaps using the <a> or <map> tags, and attributes like style are also technically valid within the ONIX schema, but will often cause problems with recipients. Avoid them.

Forms, embedded objects, scripts, any tags that can only exist in the <head> section (eg <title>), any tags with event attributes like onclick and – critically – named character entities like ‘&ouml;’ are valid in XHTML itself, but cannot be used in XHTML embedded within ONIX.

The ‘top level’ XHTML elements must be block-level elements:

not valid – no outer block-level element
<BiographicalNote textformat="05">some text</BiographicalNote>
<BiographicalNote textformat="05"><strong>some text</strong></BiographicalNote>
 
valid – <p> and <ul> are both block-level
<BiographicalNote textformat="05"><p>some text</p></BiographicalNote>
<BiographicalNote textformat="05"><p>some text</p><ul><li>a list</li></ul></BiographicalNote>

To be ultra-cautious, stick to the very simplest tags:

  • <p>, <br />
  • <b> and <i>, <strong> and <em>
  • <ul>, <ol> and <li>
  • plus possibly <dl>, <dt>, <dd>
  • <sub> and <sup>
  • and <ruby>, <rb>, <rt> and <rp> if required

…of which <p>, <ul>, <ol> and <dl> are block-level.
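
A sender could check the ‘top level must be block-level’ rule before a record is exported. The sketch below is one possible approach, not an official validator: the function name and the set of accepted tags are assumptions (only the block-level tags from the ultra-cautious list are accepted here, although the ONIX schema allows others), and validation against the ONIX schema remains the authoritative test.

```python
import xml.etree.ElementTree as ET

# Assumption: accept only the block-level tags from the 'ultra-cautious' list.
CAUTIOUS_BLOCK_TAGS = {"p", "ul", "ol", "dl"}


def top_level_is_block(xhtml_fragment):
    """Return True if every top-level node of the fragment is a block element."""
    # Wrap the fragment in a dummy element so it parses as a single document.
    wrapper = ET.fromstring("<wrapper>" + xhtml_fragment + "</wrapper>")
    # Text directly inside the wrapper lies outside any block-level element.
    if (wrapper.text or "").strip():
        return False
    children = list(wrapper)
    if not children:
        return False
    for child in children:
        if child.tag not in CAUTIOUS_BLOCK_TAGS:
            return False
        # 'tail' is any text that follows the child's closing tag.
        if (child.tail or "").strip():
            return False
    return True
```

Applied to the content of the four <BiographicalNote> examples above, this returns False for the first two and True for the last two.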

Where markup is used in an element with a length limit, it is included in any character count.

While embedding XHTML is the preferred option, it is not the only method of embedding markup in ONIX data elements. If it is impossible to use XHTML for any reason, then ordinary HTML can be included in one of two ways:

embedding HTML markup
using Reference names – HTML in CDATA option
<BiographicalNote textformat="02"><![CDATA[<p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.<p>As well as novels, he also writes children’s books and academic works.</p>]]></BiographicalNote>
escaped HTML option
<BiographicalNote textformat="02">&lt;p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.&lt;p>As well as novels, he also writes children’s books and academic works.&lt;/p></BiographicalNote>
using Short tags – HTML in CDATA option
<b044 textformat="02"><![CDATA[<p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.<p>As well as novels, he also writes children’s books and academic works.</p>]]></b044>
Enclose HTML within CDATA
escaped HTML option
<b044 textformat="02">&lt;p>Umberto Eco, professor of semiotics at the University of Bologna, and author of ‘The Name Of The Rose’ and ‘Foucault’s Pendulum’, is one of the world’s bestselling novelists.&lt;p>As well as novels, he also writes children’s books and academic works.&lt;/p></b044>
Replace < character with &lt; (optionally, also replace > with &gt;)
The CDATA method is strongly preferred over escaped HTML, though neither is a best practice and XHTML (without either CDATA or escaping of the markup) is preferred to both. HTML markup should be limited to a similar set of simple tags as XHTML – and also limited to those data elements where XHTML is an option. CDATA should never be used except to embed HTML markup.
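
For senders that cannot produce XHTML, the two HTML embedding options might be generated along the following lines. This is a sketch under stated assumptions, not a recommended implementation: all three function names are hypothetical, and the escaped option deliberately refuses any HTML containing an ampersand, because correct handling of entities in escaped HTML is error-prone.

```python
import xml.etree.ElementTree as ET


def embed_html_cdata(html, tag="BiographicalNote"):
    """Wrap HTML in a CDATA section (the preferred of the two HTML options)."""
    # A CDATA section ends at the first "]]>", so that sequence cannot be carried.
    if "]]>" in html:
        raise ValueError("HTML contains ']]>' and cannot be wrapped in CDATA as-is")
    return '<{0} textformat="02"><![CDATA[{1}]]></{0}>'.format(tag, html)


def embed_html_escaped(html, tag="BiographicalNote"):
    """Escape HTML by replacing each '<' with '&lt;'."""
    # Assumption: this sketch does not attempt to handle '&' or entities.
    if "&" in html:
        raise ValueError("this sketch does not handle '&' in escaped HTML")
    return '<{0} textformat="02">{1}</{0}>'.format(tag, html.replace("<", "&lt;"))


def recovered_html(onix_fragment):
    """Return the HTML string that an XML parser recovers from the element."""
    return ET.fromstring(onix_fragment).text
```

With either option, recovered_html should return exactly the HTML that was supplied, which makes a simple round-trip check possible before data is sent.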

Although they are not recommended, named character entities like &hellip; or &ouml; should work within HTML text inside CDATA. They are not valid within ‘ordinary’ ONIX 3.0 data, but may be used within HTML (and of course, numerical character references like &#x2026; or &#xf6; will also work).

For the CDATA method, use the named entity or numerical reference as normal.

For the escaped HTML option, named entities are not valid – use numerical references instead.

For the escaped HTML option, it might appear that a named entity such as &hellip; or numerical reference like &#x2026; might require the ampersand to be escaped – just as the < character in the HTML markup is escaped to &lt;, the ampersand character in a named character entity or numerical reference could also be escaped, so a single ellipsis becomes &amp;hellip; or &amp;#x2026;. This is sometimes termed ‘double-escaping’, and it is strongly discouraged. Only the < character in the HTML markup needs to be modified (and you might also replace < that is not part of the HTML markup by <).

If using named character entities or numerical references in this manner (ie escaped within embedded HTML markup), ensure proper end-to-end testing with each ONIX recipient. The likelihood of confusion and error with escaped characters and double-escaped named character entities and numerical references is one reason why the CDATA method is preferred in ONIX 3.0. (In earlier versions of ONIX, the CDATA and escaped HTML methods were equally acceptable.)

Within HTML (but not within XHTML), the unescaped & character is valid. However, it is strongly recommended that the named entity &amp; is used instead, for both the CDATA and escaped HTML options. Using an unescaped & character in escaped HTML (ie without CDATA) will prevent the ONIX message validating.

Like many textual data elements, the <BiographicalNote> data element is repeatable in order that the same text can be carried in multiple languages. Of course, in most cases, textual metadata is provided in the same language used in the book. Parallel metadata in multiple languages might be required in territories where multiple languages are in everyday use (for example, Switzerland), in language learning and gift giving scenarios where a book might be purchased by someone who cannot read the language used in the book itself, and in cases where the product itself contains parallel text. If parallel textual metadata is provided, then each repeat of <BiographicalNote> must carry a language attribute. This allows a recipient to either accept and process the multiple languages, or to select the single language that is most appropriate for their needs.

providing text in parallel languages
using Reference names
<BiographicalNote language="eng" textformat="05"><p><strong>Umberto Eco</strong>, professor of semiotics at the University of Bologna, and author of <em>The Name Of The Rose</em> and <em>Foucault’s Pendulum</em>, is one of the world’s bestselling novelists.</p><p>As well as novels, he also writes children’s books and academic works.</p></BiographicalNote>
<BiographicalNote language="ita" textformat="05"><p><strong>Umberto Eco</strong>, professore di semiotica all’Università di Bologna e autore di <em>Il nome della rosa</em> e <em>Il pendolo di Foucault</em>, è uno dei romanzieri più venduti al mondo.</p><p>Così come romanzi, lui scrive anche libri per bambini e opere accademiche.</p></BiographicalNote>
using Short tags
<b044 language="eng" textformat="05"><p><strong>Umberto Eco</strong>, professor of semiotics at the University of Bologna, and author of <em>The Name Of The Rose</em> and <em>Foucault’s Pendulum</em>, is one of the world’s bestselling novelists.</p><p>As well as novels, he also writes children’s books and academic works.</p></b044>
<b044 language="ita" textformat="05"><p><strong>Umberto Eco</strong>, professore di semiotica all’Università di Bologna e autore di <em>Il nome della rosa</em> e <em>Il pendolo di Foucault</em>, è uno dei romanzieri più venduti al mondo.</p><p>Così come romanzi, lui scrive anche libri per bambini e opere accademiche.</p></b044>
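
A recipient that displays only one language might select the appropriate repeat of <BiographicalNote> using the language attribute. The sketch below is illustrative only: the function name is hypothetical, it handles Reference names only, and falling back to the first repeat when the preferred language is absent is an assumption rather than documented best practice.

```python
import xml.etree.ElementTree as ET


def select_note(contributor_xml, preferred_language):
    """Pick the <BiographicalNote> repeat matching the preferred language code."""
    contributor = ET.fromstring(contributor_xml)
    notes = contributor.findall("BiographicalNote")
    for note in notes:
        if note.get("language") == preferred_language:
            return note
    # Assumption: if no repeat matches, fall back to the first one supplied.
    return notes[0] if notes else None
```

The function returns the XML element, so the recipient can then either keep the embedded XHTML or extract the plain text for display.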
P.7.47 Unnamed person(s)

A product that has no specific named contributors may still have unnamed contributors – they may be unknown, anonymous, or indeed artificial – for example, a book may be written by ‘Anonymous’, and an audiobook or DAISY title using synthesized but prerecorded speech should carry an <UnnamedPersons> element with a contributor role code indicating ‘read by’.

Unnamed contributors can also be combined with named contributors, and this is relatively common when the number of named contributors is limited by editorial policy – for example, if only three names are allowed, the fourth and fifth may be unnamed, and <UnnamedPersons> would carry the code value 03 (‘et al’). Note that in this case, a single <UnnamedPersons> composite stands for multiple contributors. In general, an unnamed et al contributor should be the last contributor as defined by the <SequenceNumber>, or at least the last contributor with a particular role (where other, named contributors with different roles follow). In very rare cases, a product might have more than one <UnnamedPersons> element, in different repeats of <Contributor> with different contributor roles.

audiobook read by a synthesized voice
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <PersonNameInverted>Angeville, Henriette d’</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>2</SequenceNumber>
    <ContributorRole>E07</ContributorRole>
    <UnnamedPersons>06</UnnamedPersons>
</Contributor>
using Short tags
<contributor>
    <b034>1</b034>First contributor
    <b035>A01</b035>Written by
    <b037>Angeville, Henriette d’</b037>
</contributor>
<contributor>
    <b034>2</b034>Second contributor
    <b035>E07</b035>Read by
    <b249>06</b249>Female synthesized voice
</contributor>
multiple contributors, with et al
using Reference names
<Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>B01</ContributorRole>
    <PersonNameInverted>Jones, Steve</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>2</SequenceNumber>
    <ContributorRole>B01</ContributorRole>
    <PersonNameInverted>Martin, Robert</PersonNameInverted>
</Contributor>
<Contributor>
    <SequenceNumber>3</SequenceNumber>
    <ContributorRole>Z98</ContributorRole>
    <UnnamedPersons>03</UnnamedPersons>
</Contributor>
<ContributorStatement textformat="05"><p>Edited by Steve Jones, Robert Martin <em>et al</em></p></ContributorStatement>
using Short tags
<contributor>
    <b034>1</b034>First contributor
    <b035>B01</b035>Edited by
    <b037>Jones, Steve</b037>
</contributor>
<contributor>
    <b034>2</b034>Second contributor
    <b035>B01</b035>Edited by
    <b037>Martin, Robert</b037>
</contributor>
<contributor>
    <b034>3</b034>Remaining contributors
    <b035>Z98</b035>(Various roles)
    <b249>03</b249>et al
</contributor>
<b049 textformat="05"><p>Edited by Steve Jones, Robert Martin <em>et al</em></p></b049>
Compare this example with the previous multiple contributor example.

This data element may also be used with products that are compilations or anthologies, where the contributors may be described as ‘Various’ – and in this case, repeats of the <ContentItem> composite in Group P.18 may be used to provide more information on the authorship of each section.

Contributor place composite

Although not a core part of global best practice, the <ContributorPlace> composite is a key part of some national schemes to promote local authors. It should be included, for example, on products sold in Canada where a contributor is a Canadian citizen. Retailers value it – and particularly <LocationName> – as a way of supporting ‘local author’ promotions.

[Diagram: the <ContributorPlace> composite, containing <ContributorPlaceRelator>, <CountryCode>, <RegionCode> and <LocationName>. The composite must include either <CountryCode> or <RegionCode>, and ideally not both.]
Although the ONIX schema allows use of both Country and Region codes together, it is strongly recommended that only one or the other is used in a particular <ContributorPlace> composite. Region codes already include the relevant country designation. A location name cannot be used without also including either a country or region code.
designating a contributor as Canadian
using Reference names
<ContributorPlace>
    <ContributorPlaceRelator>08</ContributorPlaceRelator>
    <CountryCode>CA</CountryCode>
</ContributorPlace>
using Short tags, also indicating residency
<contributorplace>
    <x418>08</x418>Citizen of
    <b251>CA</b251>Canada
</contributorplace>
<contributorplace>
    <x418>04</x418>Currently lives in
    <b398>CA-NL</b398>Newfoundland and Labrador
    <j349>Stephenville</j349>and specifically, in Stephenville
</contributorplace>

More detailed information about where a contributor lives may also be incorporated into <BiographicalNote>, but the structured information in <ContributorPlace> can be particularly useful.
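
The rules for combining country codes, region codes and location names within <ContributorPlace> lend themselves to an automated check. The following is a minimal sketch, assuming Reference names and a hypothetical function name. It reports problems as plain strings rather than enforcing anything, since using both codes together is discouraged but still valid against the schema.

```python
import xml.etree.ElementTree as ET


def check_contributor_place(place_xml):
    """Return a list of problems found in one <ContributorPlace> composite."""
    place = ET.fromstring(place_xml)
    has_country = place.find("CountryCode") is not None
    has_region = place.find("RegionCode") is not None
    problems = []
    # A country or region code is needed, with or without a location name.
    if not has_country and not has_region:
        problems.append("missing: CountryCode or RegionCode is required")
    # Both together is schema-valid, but only one or the other is recommended.
    if has_country and has_region:
        problems.append("discouraged: CountryCode and RegionCode used together")
    return problems
```

An empty list means the composite follows the recommendations described above.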

P.7.51 Contributor statement

This data element should be used to carry readable text showing how the contributors’ names and roles should be displayed. This is particularly critical when a simple concatenation of the contributors’ roles and names would not give satisfactory results – for example, when one contributor has multiple roles, or when a familial relationship between two contributors alters how their names are presented.

The contributor statement should not include any collection-level contributors listed separately in Group P.5 (although best practice is to avoid collection-level contributors altogether, and to include all contributors in Group P.7).

<ContributorStatement>Written and illustrated by Colin and Jacqui Hawkins. Series edited by Cliff Moon</ContributorStatement>
<ContributorStatement>By Richard and Alastair Fitter, illustrated by Marjorie Blamey</ContributorStatement>

Recipients of ONIX records that contain this element should use the supplied contributor statement for display, while using the names supplied in the <Contributor> composite for search, collation, collocation and other processing. The same applies to the equivalent <ContributorStatement> within Group P.5, though that should only be used where specific national practice requires contributors such as series editors to be identified at collection level.
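
The division of labour between the contributor statement (for display) and the structured names (for search) could look like the sketch below on the recipient side. Both function names are hypothetical, only Reference names and <PersonNameInverted> are handled, any XHTML markup in the statement is flattened to plain text, and the fallback used when no statement is supplied is an assumption.

```python
import xml.etree.ElementTree as ET


def search_names(detail_xml):
    """Collect the structured names used for search and collation."""
    detail = ET.fromstring(detail_xml)
    names = []
    for contributor in detail.findall("Contributor"):
        name = contributor.findtext("PersonNameInverted")
        if name:
            names.append(name)
    return names


def display_credit(detail_xml):
    """Prefer the supplied <ContributorStatement> for display."""
    detail = ET.fromstring(detail_xml)
    statement = detail.find("ContributorStatement")
    if statement is not None:
        # itertext() flattens any embedded XHTML markup to plain text.
        return "".join(statement.itertext()).strip()
    # Assumption: with no statement, fall back to a plain list of names.
    return "; ".join(search_names(detail_xml))
```

A production system would also need to handle unnamed, corporate and uninverted names, and would normally build a fallback display string from contributor roles as well as names.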

In common with many other textual data elements, <ContributorStatement> is repeatable if supplying parallel text in multiple languages, and may also include XHTML markup.

P.7.52 “No authorship” indicator

This element should be used to give a positive indication that there are no contributors, either named or unnamed, in either P.7 or in P.5 where contributors to a collection can be listed.

This is one of the few ONIX for Books data elements that are defined as empty elements. The ‘self-closing’ XML syntax should always be used (ie <NoContributor/>, not <NoContributor></NoContributor>).

P.8 Conference

If the product is not linked to a particular conference, this Group can be ignored entirely. But if the ONIX record describes a product such as the proceedings of a conference, then the Group P.8 should be used to specify details about the conference – the name, theme, date and so on. These details may be important for certain publishers, particularly in the academic and technical sectors, and for library cataloging.

In many cases, the conference details may repeat information that is also included in the title of the product.

Conference composite
[Diagram: the <Conference> composite, containing <ConferenceRole>, <ConferenceName>, <ConferenceAcronym>, <ConferenceNumber>, <ConferenceTheme>, <ConferenceDate>, <ConferencePlace>, <ConferenceSponsor> and <Website>.]
Proceedings of the Frontiers in Science Education Research Conference
using Reference names
<Conference>
    <ConferenceName>Frontiers in Science Education Research Conference</ConferenceName>
    <ConferenceAcronym>FISER ’09</ConferenceAcronym>
    <ConferenceDate dateformat="06">2009032220090324</ConferenceDate>
</Conference>
using Short tags
<conference>
    <b052>Frontiers in Science Education Research Conference</b052>
    <b341>FISER ’09</b341>
    <b054 dateformat="06">2009032220090324</b054>22–24 March 2009
</conference>
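
As the example shows, a <ConferenceDate> with dateformat 06 carries two exact dates run together as YYYYMMDDYYYYMMDD. A recipient might split and validate such a value as sketched below. The function name is hypothetical, and rejecting a spread whose end date precedes its start date is an added sanity check, not a rule stated in this guide.

```python
from datetime import datetime


def parse_date_spread(value):
    """Split a YYYYMMDDYYYYMMDD spread into (start, end) date objects."""
    if len(value) != 16 or not value.isdigit():
        raise ValueError("expected sixteen digits, YYYYMMDDYYYYMMDD")
    # strptime also rejects impossible dates such as 20090231.
    start = datetime.strptime(value[:8], "%Y%m%d").date()
    end = datetime.strptime(value[8:], "%Y%m%d").date()
    # Assumption: a spread that ends before it starts is treated as an error.
    if end < start:
        raise ValueError("end date precedes start date")
    return start, end
```

For the value in the example above, this yields 22 March 2009 and 24 March 2009.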

P.9 Edition

An ‘edition’ generally means all copies of a book that contain the same content, usually published by the same publisher. More loosely, an ‘edition’ may be a product produced for a specific sales channel or market segment – a book club edition, library edition, large print edition and so on. So different editions may be distinguished by their content (by the addition, revision or removal of material), or more occasionally by some other aspect of their product form or nature.

Group P.9 is used to specify three different sorts of edition information:

  • various edition types (of the same work);
    • facsimile, special or prebound editions, where editions generally imply little or no change of content, but emphasise some other aspect of the product that differs from a previous or parallel version of the same work;
    • Braille, large print, digital original, limited (numbered) or media tie-in editions, where the edition type is simply a feature or attribute somewhat related to the product form, and does not necessarily imply the existence of any previous or parallel version – though where such other versions do exist, there is little or no change of content;
    • other distinctions that are merely differences of product form are not distinct ‘editions’: metadata that differentiates a ‘hardback edition’ from a ‘paperback edition’ lies within Group P.3 Product form;
  • various edition types (which imply a different but closely-related work);
    • abridged, illustrated, annotated, enlarged, revised, enhanced, student and teacher’s editions, where editions generally do imply some material difference in content between this and some previous or parallel product;
    • omnibus (combined) editions are works in their own right, derived from multiple ‘parent works’;
    • reprints or reissues that incorporate only minor corrections are not new editions;
  • ordinal numbered editions;
    • in <EditionNumber>, second, third edition etc – where editions imply significant revisions to the content of a previous product (and are always different works);
    • in <EditionVersionNumber>, minor revisions (of a digital product) that denote mostly technical fixes or minor updates that do not significantly revise the content.

The historical meaning of ‘edition’ (which is synonymous with ‘impression’ or ‘print run’, and which is still sometimes used in book collecting – as in ‘a valuable first edition’) is generally not used in ONIX, as ONIX is not normally concerned with individual impressions or manufacturing batches (the exceptions: ‘limited edition’, and sometimes also minor updates in <EditionVersionNumber>).

Group P.9 is also used to convey detailed information about religious texts.

[Diagram: <DescriptiveDetail> Groups P.8 to P.10 (continued from P.7, continued in Group P.11), showing <Conference>, <EditionType>, <EditionNumber>, <EditionVersionNumber>, <EditionStatement>, <ReligiousText>, <NoEdition> and <Language>. Diagram annotation: ‘should not omit both’.]
Abridged edition
using Reference names
<EditionType>ABR</EditionType>
using Short tags
<x419>ABR</x419>
IVth edition
using Reference names
<EditionNumber>4</EditionNumber>
<EditionStatement>IVth edition</EditionStatement>
using Short tags
<b057>4</b057>
<b058>IVth edition</b058>
Centenary edition
using Reference names
<EditionType>SPE</EditionType>
<EditionStatement language="eng">Centenary edition</EditionStatement>
<EditionStatement language="fre">Édition du centenaire</EditionStatement>
using Short tags
<x419>SPE</x419>
<b058 language="eng">Centenary edition</b058>Parallel text in English
<b058 language="fre">Édition du centenaire</b058>and French
Facsimile first edition
using Reference names
<EditionType>FAC</EditionType>
<EditionNumber>1</EditionNumber>
using Short tags
<x419>FAC</x419>
<b057>1</b057>
Limited and numbered edition
using Reference names
<EditionType>NUM</EditionType>
<EditionStatement>Limited edition of 100 copies, individually numbered and signed by the author</EditionStatement>
using Short tags
<x419>NUM</x419>
<b058>Limited edition of 100 copies, individually numbered and signed by the author</b058>
Use code SPE for a limited and/or signed, but unnumbered, edition.
P.9.1 Edition type code

Most edition type codes from List 21 are self-explanatory.

Codes STU (Student) and TCH (Teacher) should be used for two editions of a school or college textbook, where only the teacher’s edition contains model answers to questions and/or other notes for the teacher. Code SCH should be used where there is a special school or college edition of a book that is different from the ‘normal’ trade edition – for example, a school version of a work of classic literature that is separate from the normal version might have extra footnotes or annotations, or it may be exactly the same as the main edition but have a lower price and some limitations on how it can be sold (eg may only be sold direct to schools or through education sales channels).

Use codes UBR (Unabridged), ILL (Illustrated) or ENH (Enhanced, for digital products) only to avoid doubt, when an abridged, non-illustrated or ‘un-enhanced’ edition also exists – or when it might be expected to exist. For example, because many audiobooks on CD are abridged, an unabridged product may carry the UBR Edition type code, even when no abridged version is available. Note that an abridged version must carry the ABR code. Conversely, DGO (digital original) should be used to indicate that no physical counterpart of a digital product exists (because the normal expectation is that a physical version does exist), or where a physical counterpart does exist, that the digital version was published a significant time before the physical version (because the normal expectation is that the physical and digital versions were published together or that the physical version came first).

It is a common error to use DGO to indicate ‘digital exclusive’. ‘Digital original’ is a permanent feature of a product – if the digital version was published before a paper version, it remains the original, but is not digital exclusive. For products that are true digital exclusives, see Group P.14 and <TextType> code 21.

Use code SPE (Special edition) when no more specific code applies, and use <EditionStatement> to describe the nature of the edition.

Codes NED (New edition) and REV (Revised) should be used with care, and ideally only when editions are not numbered. If combined with an <EditionNumber>, then the meaning is somewhat unclear: a ‘third revised edition’ is not the same as a ‘revised third edition’. Where such a usage is unavoidable – that is, when the third edition has been revised, but not revised enough to justify naming it the fourth edition, and a distinction needs to be drawn with the unrevised third edition – always include an <EditionStatement> that makes the meaning clear – for example ‘Updated version of 3rd Edition’. When the ‘third revised edition’ simply means the second edition has been revised to create the third, do not use an unnecessary Edition type code like NED or REV – use only <EditionNumber> 3.

Note that <EditionType> is repeatable, to describe an edition that is, as an example, both Abridged (code ABR) and Large type (LTE).

P.9.2 Edition number

This data element should be included (as an integer, 2, 3, 4…) even when editions are numbered using Roman numerals (II, III, IV…) on the product itself. If Roman numerals are used on the product, use a normal integer in <EditionNumber>, and include the Roman edition number in <EditionStatement>.
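
Where a source system stores only the Roman numeral printed on the product, the integer for <EditionNumber> can be derived mechanically. The following sketch is illustrative: both function names are hypothetical, only Reference names are produced, and malformed numerals are not rejected (a stricter implementation would validate the input first).

```python
from xml.sax.saxutils import escape

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'IV' to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value placed before a larger one is subtracted (IV = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def edition_elements(roman, statement):
    """Integer in <EditionNumber>, printed form kept in <EditionStatement>."""
    return ("<EditionNumber>{}</EditionNumber>\n"
            "<EditionStatement>{}</EditionStatement>").format(
                roman_to_int(roman), escape(statement))
```

Calling edition_elements("IV", "IVth edition") reproduces the ‘IVth edition’ example shown earlier in Group P.9.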

It is best practice not to include the Edition number for ‘first’ editions, unless this is a particular feature of a product produced after a subsequent edition has been released, for example with a facsimile edition, when the first edition and a later edition are both orderable or available at the same time, or when using <EditionVersionNumber>. Thus occasionally, it may be necessary to populate <EditionNumber> with 1.

Edition number may also be used to carry a version number for a software product, where for example numbered major revisions of the software are released and the number is not considered part of the title. Minor revisions of the software version can also be carried in <EditionVersionNumber> if required.

using Reference names
<EditionNumber>3</EditionNumber>
<EditionVersionNumber>2</EditionVersionNumber>
using Short tags
<b057>3</b057>v3.2 of a software product, or
<b217>2</b217>second minor update of a 3rd edition of a book
An e-book publisher may choose to maintain a two- or three-level version numbering scheme, such as is common with software products. A change of edition number would indicate a major revision of the content – and a change of the product and work identifiers – as with the first edition followed by the second edition of a conventional book. Within a given edition, smaller updates can be designated using the edition version number, so a version 3.0.2 is the second minor update of the Third edition. Typically, minor updates would contain only fixes to typos in the text or small technical corrections in the e-book files (eg to fix layout errors). For a version 3.0.2, the <EditionNumber> would be 3 and the <EditionVersionNumber> would be 0.2. A version 3.1 is a more significant update of the Third edition, with some new content or added functionality – more significant than just corrections, though not significant enough to trigger a change of the Edition number and of the product identifiers. Some e-book vendors may choose to push these smaller updates out to previous purchasers automatically, whereas other vendors make them available to previous purchasers if they specifically choose to re-download.
Occasionally – in the case of a minor update to a first or unnumbered edition – the use of <EditionVersionNumber> requires the use of <EditionNumber> 1, where normally edition 1 would be omitted. This serves to emphasize that what has been updated does not constitute a new edition.
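
The mapping from a dotted version string to the two edition elements can be sketched as follows. The function name is hypothetical and the sketch assumes that versions are written as digits separated by full stops. Everything before the first full stop becomes the <EditionNumber>, and anything after it becomes the <EditionVersionNumber>.

```python
def split_version(version):
    """Split '3.0.2' into ('3', '0.2'): edition number and edition version."""
    edition, _, minor = version.partition(".")
    if not edition.isdigit():
        raise ValueError("version must begin with an integer edition number")
    # With no minor part, return None so <EditionVersionNumber> can be omitted.
    return edition, (minor or None)
```

So ‘3.0.2’ gives 3 and 0.2, ‘3.2’ gives 3 and 2, and ‘1.1’ gives 1 and 1, which is a case where <EditionNumber> 1 has to be included explicitly.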
P.9.4 Edition statement

An <EditionStatement> element should be included whenever a simple concatenation of the edition type and number is not sufficient for display purposes: it should be complete in itself and incorporate the edition type and number information, and should not occur if both <EditionType> and <EditionNumber> are omitted (it augments rather than replaces them). When provided, recipients should use the <EditionStatement> for display, while using any edition type or number for search and collation purposes.

The Edition statement may be repeated in multiple, parallel languages if required. If it is repeated, the language attribute must be included with each repeat.

P.9.5 “No edition” indicator

This element should be included whenever there is no other edition information. It is one of the few ONIX data elements that is defined as an empty element.

P.10 Language

Group P.10 consists solely of the <Language> composite.

Language composite

For books in a single language, this is entirely straightforward – the <Language> composite specifies the language, and its ‘role’ (language of the text, code 01 from List 22), and for translated works, a second repeat of the composite specifies the original language from which the text was translated.

If a book contains only a few individual phrases, small passages or quotations in a second language, then that second language should not be listed in a <Language> composite.

But if a product contains significant content in more than one language (eg a bilingual dictionary that can be used for translation ‘both ways’, or a book with parallel texts in two or more languages), then separate repeats of the <Language> composite with <LanguageRole> code 01 (Language of text) are appropriate. For example, a full bilingual dictionary with Spanish-to-English and English-to-Spanish sections would have two <Language> composites, specifying <LanguageRole> code 01 for both English and Spanish languages. It would be equally useful for native Spanish readers and native English readers.

Products like phrasebooks or textbooks for foreign language learning need different treatment. While it might contain significant content in both Spanish and English, a book designed for use by Spanish students learning English should be coded using only one <Language> composite, with <LanguageRole> code 01 and <LanguageCode> code spa (Spanish). Such a book would not be useful to an English student trying to learn Spanish!

One way of clarifying this is to ask, ‘Is the book in a particular language, or about a particular language?’ In the former case, the language should be listed in a <Language> composite. If it’s about a language, it should not – although the secondary language may be indicated using some subject classification schemes. A second useful criterion – for such language teaching products and phrasebooks – is the language of the primary readership the product is intended for.

Using this second criterion, it’s also possible to classify a translation dictionary containing, for example, only Spanish-to-English (ie Spanish headwords with their English equivalents), although fine judgement is required about the target market or likely audience for the product. It contains significant text in both languages, and might be useful (in slightly different circumstances) for both English and Spanish readers. But if it’s for casual use (ie it’s akin to a phrasebook, or is a beginner-level dictionary for the teaching of English to Spanish readers), then it’s ‘in’ Spanish. If it’s for professional or scholarly use, then it’s more likely to be used by English translators translating from Spanish, so is ‘in’ English.

Two final uses for the <Language> composite are to define a regional variant of a language using <CountryCode>, and the script that the product uses in <ScriptCode>. Some languages – a well-known example being Serbian – can be expressed in two entirely separate scripts, in this case both Latin and Cyrillic. Other languages have distinct regional variation, for example there are well-known differences between British English and US English, Iberian Spanish and Mexican Spanish, Québécois (ie Canadian) French and Metropolitan French, or Brazilian and Iberian Portuguese. Where such distinctions are an important product attribute, they should be specified.

However, Braille can also be treated as a script in the relevant ISO code list. It is not recommended that Braille books be described or discovered only or primarily via a script code. For Braille, see <EditionType> in Group P.9 and <ProductFormDetail> in Group P.3 for the preferred methods of description.

[Diagram: the <Language> composite, containing <LanguageRole>, <LanguageCode>, <CountryCode> and <ScriptCode>.]
work written and published in Portuguese
using Reference names
<Language>
    <LanguageRole>01</LanguageRole>
    <LanguageCode>por</LanguageCode>
</Language>
using Short tags, additionally specifying Brazilian Portuguese
<language>
    <b253>01</b253>Language of text
    <b252>por</b252>Portuguese
    <b251>BR</b251>Brazilian variant
</language>
work written in Portuguese, translated and published in English
using Reference names
<Language>
    <LanguageRole>01</LanguageRole>
    <LanguageCode>eng</LanguageCode>
</Language>
<Language>
    <LanguageRole>02</LanguageRole>
    <LanguageCode>por</LanguageCode>
</Language>
using Short tags
<language>
    <b253>01</b253> <!-- Language of text -->
    <b252>eng</b252>
</language>
<language>
    <b253>02</b253> <!-- Original language -->
    <b252>por</b252>
</language>
tourist phrasebook for German tourists visiting Japan
using Reference names
<Language>
    <LanguageRole>01</LanguageRole>
    <LanguageCode>ger</LanguageCode>
</Language>
using Short tags
<language>
    <b253>01</b253> <!-- Language of text -->
    <b252>ger</b252>
</language>
Use <Subject> to specify Japan and/or Japanese.
Braille novel in French
using Reference names
<EditionType>BRL</EditionType>
. . .
<Language>
    <LanguageRole>01</LanguageRole>
    <LanguageCode>fre</LanguageCode>
</Language>
using Short tags
<x419>BRL</x419> <!-- Braille edition -->
. . .
<language>
    <b253>01</b253> <!-- Language of text -->
    <b252>fre</b252>
</language>
An English-language Braille edition should additionally use <ProductFormDetail> to indicate the use of Contracted or Uncontracted Braille. Similar codes indicating the grade of Braille for other languages are not yet defined.

P.11 Extents and other content

Group P.11 is primarily used to specify the extent of a product – the number of pages, or the running time of an audiobook. It also covers ancillary content such as illustrations.

[Diagram: <DescriptiveDetail> Groups P.11 to P.13 (continued from P.10): <Extent>, <Illustrated>, <NumberOfIllustrations>, <IllustrationsNote>, <AncillaryContent>, <Subject>, <NameAsSubject>, <AudienceCode>, <Audience>, <AudienceRange>, <AudienceDescription>, <Complexity>]
page count of a simple book with no significant front or back matter
using Reference names
<Extent>
    <ExtentType>11</ExtentType>
    <ExtentValue>245</ExtentValue>
    <ExtentUnit>03</ExtentUnit>
</Extent>
using Short tags
<extent>
    <b218>11</b218> <!-- Content page count -->
    <b219>245</b219>
    <b220>03</b220> <!-- Pages -->
</extent>
running time of an audiobook
using Reference names
<Extent>
    <ExtentType>09</ExtentType>
    <ExtentValue>285</ExtentValue>
    <ExtentUnit>05</ExtentUnit>
</Extent>
using Short tags
<extent>
    <b218>09</b218> <!-- Running time -->
    <b219>285</b219> <!-- 4 hours 45 mins -->
    <b220>05</b220> <!-- In minutes -->
</extent>
page count of a book with extensive front and back matter plus a plate section
using Reference names, ancillary content listed in simple <IllustrationsNote>
<Extent>
    <ExtentType>00</ExtentType>
    <ExtentValue>245</ExtentValue>
    <ExtentUnit>03</ExtentUnit>
</Extent>
<Extent>
    <ExtentType>03</ExtentType>
    <ExtentValue>12</ExtentValue>
    <ExtentValueRoman>xii</ExtentValueRoman>
    <ExtentUnit>03</ExtentUnit>
</Extent>
<Extent>
    <ExtentType>04</ExtentType>
    <ExtentValue>14</ExtentValue>
    <ExtentValueRoman>xiv</ExtentValueRoman>
    <ExtentUnit>03</ExtentUnit>
</Extent>
<Extent>
    <ExtentType>12</ExtentType>
    <ExtentValue>16</ExtentValue>
    <ExtentUnit>03</ExtentUnit>
</Extent>
<Extent>
    <ExtentType>11</ExtentType>
    <ExtentValue>287</ExtentValue>
    <ExtentUnit>03</ExtentUnit>
</Extent>
<IllustrationsNote>8 color plates, 15 mono plates, 2 maps. Includes index</IllustrationsNote>
using Short tags, using <AncillaryContent>
<extent>
    <b218>00</b218> <!-- Main content page count -->
    <b219>245</b219>
    <b220>03</b220> <!-- Pages -->
</extent>
<extent>
    <b218>03</b218> <!-- Front matter page count -->
    <b219>12</b219>
    <x421>xii</x421>
    <b220>03</b220> <!-- Pages -->
</extent>
<extent>
    <b218>04</b218> <!-- Back matter page count -->
    <b219>14</b219>
    <x421>xiv</x421>
    <b220>03</b220> <!-- Pages -->
</extent>
<extent>
    <b218>12</b218> <!-- Unnumbered insert page count -->
    <b219>16</b219>
    <b220>03</b220> <!-- Pages -->
</extent>
<extent>
    <b218>11</b218> <!-- Content page count -->
    <b219>287</b219>
    <b220>03</b220> <!-- Pages -->
</extent>
<ancillarycontent>
    <x423>24</x423> <!-- Color plates -->
    <b257>8</b257>
</ancillarycontent>
<ancillarycontent>
    <x423>23</x423> <!-- Mono plates -->
    <b257>15</b257>
</ancillarycontent>
<ancillarycontent>
    <x423>14</x423> <!-- Maps -->
    <b257>2</b257>
</ancillarycontent>
<ancillarycontent>
    <x423>25</x423> <!-- Index -->
    <!-- number of index pages unspecified -->
</ancillarycontent>
The 8 color and 15 mono plates make up the 16-page unnumbered insert (plate section). The maps are counted within the main content page count, and the index within the back matter page count.
filesize of a downloadable e-publication, with extent of print counterpart
using Reference names
<Extent>
    <ExtentType>22</ExtentType>
    <ExtentValue>1.4</ExtentValue>
    <ExtentUnit>19</ExtentUnit>
</Extent>
<Extent>
    <ExtentType>08</ExtentType>
    <ExtentValue>320</ExtentValue>
    <ExtentUnit>03</ExtentUnit>
</Extent>
using Short tags
<extent>
    <b218>22</b218> <!-- Filesize -->
    <b219>1.4</b219>
    <b220>19</b220> <!-- In megabytes -->
</extent>
<extent>
    <b218>08</b218> <!-- Extent of print counterpart -->
    <b219>320</b219>
    <b220>03</b220> <!-- Pages -->
</extent>
Extent composite

The <Extent> composite carries the page count of a book or e-publication, or a similar measure such as running time for an audio product (including downloadable audio) or filesize – and possibly the word count – for a downloadable product. The composite must contain an <ExtentType>, should give a number for the extent in <ExtentValue>, and must specify the units the value is measured in (pages, minutes, megabytes etc) in the <ExtentUnit> element.

For books where front and back matter are relatively insignificant and there are no inserts – for example, most novels – it is best practice to provide only a single extent, and the preferred measure is the content page count (code 11 from List 23). In very simple cases, this extent might be equal to the highest page number.

Where a book contains any significant front or back matter, or an insert / plate section, it is best practice to specify the page count for the key parts of a book individually:

  • front matter – for example, the title page, introduction and preface, often termed the ‘prelims’ (code 03 from List 23);
  • the main content (code 00 from List 23);
    • this might include a few blank pages, where for example blanks are included to ensure all chapters start on a recto page, or where pages are intended to be filled in by the consumer, but it does not include any unnumbered pages in an insert or plate section;
  • unnumbered pages in any insert(s) such as plate sections (code 12 from List 23);
  • back matter – notes and appendices, index etc (code 04 from List 23);
    • this excludes any blank pages or advertising pages that are included to ensure a convenient signature size during manufacture;
  • if the front matter, main content and back matter page counts cannot be provided separately, then a total count of numbered pages may be provided (code 05), together with a count of any unnumbered pages in inserts or plate sections (code 12);
  • the content page count (code 11) should always be provided when the more granular combinations of extents are not available, and could be provided in addition to the granular extents for recipients who want just a simple, single number.
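
As an illustration of the fallback described in the list above, the book in the earlier example (12 pages of front matter, 245 of main content and 14 of back matter, plus a 16-page plate section) might instead be described with a total numbered page count and an insert count. This is a sketch only: the figures are carried over from that example, and only Reference names are shown.

```xml
<Extent>
    <ExtentType>05</ExtentType> <!-- Total numbered pages: 12 + 245 + 14 -->
    <ExtentValue>271</ExtentValue>
    <ExtentUnit>03</ExtentUnit> <!-- Pages -->
</Extent>
<Extent>
    <ExtentType>12</ExtentType> <!-- Unnumbered insert page count -->
    <ExtentValue>16</ExtentValue>
    <ExtentUnit>03</ExtentUnit> <!-- Pages -->
</Extent>
```

A recipient wanting a single figure can add the two values to obtain the content page count of 287 (code 11).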

Where for any reason a publisher can only provide a single repeat of the <Extent> composite, the preferred measure is the content page count (which includes any unnumbered inserts or plate sections) or – if this is genuinely unavailable – the total numbered page count (which excludes unnumbered inserts or plate sections). Note that neither of these necessarily matches the highest page number in the book (which in most cases will be the total numbered page count minus any Roman-numbered front matter). Nor do they match the production page count.

For front matter (and possibly back matter) that is numbered with Roman numerals, the extent should be stated as an Arabic integer as normal, in <ExtentValue>, but may additionally be stated in Roman numerals in <ExtentValueRoman>. It is not best practice to supply only Roman numerals. Recipients should accept Roman numerals in either upper or lower case, though lower case is more usual when numbering pages.

For audio running times, values in minutes are preferred to hours and minutes or just hours (ie 195 minutes is preferred to 3 hours and 15 minutes or simply three hours). The unit of measure should be included in the <ExtentUnit> element.

For e-publications that have a fixed pagination (ie they are not ‘reflowable’), the extent should be provided exactly as for printed books, or by specifying the absolute page count (code 07). For reflowable e-publications without a fixed pagination, the extent should be specified either by providing the extent of the printed counterpart (code 08) or by providing a ‘notional’ extent (ie what would be the extent of the print version, if such a counterpart existed).
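
A notional extent for a reflowable e-publication with no print counterpart might be expressed as in the sketch below. The text above does not name a code for the notional extent, so the use of code 10 from List 23 is an assumption that should be checked against the current code list, and the page count is a hypothetical estimate.

```xml
<Extent>
    <ExtentType>10</ExtentType> <!-- Notional number of pages (assumed List 23 code) -->
    <ExtentValue>180</ExtentValue> <!-- hypothetical estimate -->
    <ExtentUnit>03</ExtentUnit> <!-- Pages -->
</Extent>
```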

For downloadable e-publications such as e-books or downloadable audio, the size of the downloadable file (or total size of some downloadable package of files) should be included, measured in kilobytes for filesizes below 1MB (code 18 from List 24 in the <ExtentUnit> element), and in megabytes (code 19) for larger files. There is no real benefit in a precision greater than two significant digits (ie 220 kilobytes or 1.3 megabytes, not 223.75 kilobytes and 1.325 megabytes).
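
For a file smaller than 1MB, the same pattern applies with kilobytes as the unit. The sketch below takes the 223.75 kilobyte file mentioned above and rounds it to two significant digits; only Reference names are shown.

```xml
<Extent>
    <ExtentType>22</ExtentType> <!-- Filesize -->
    <ExtentValue>220</ExtentValue> <!-- rounded from 223.75 -->
    <ExtentUnit>18</ExtentUnit> <!-- In kilobytes -->
</Extent>
```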

Extents can also be specified by the number of words. However, while the weightiness of a 200,000 word manuscript is familiar to some within the publishing industry, word counts are not always simple for consumers to interpret. They are not recommended as the only indication of extent – though they may become more useful in future as reflowable publications become the norm.
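
Where a word count is supplied, it might accompany a page-based extent rather than replace it, as in the sketch below for a reflowable e-book. Code 02 for the word count in List 23 and code 02 for words in List 24 are assumptions to be verified against the current code lists, and the page count of the print counterpart is a hypothetical figure.

```xml
<Extent>
    <ExtentType>02</ExtentType> <!-- Number of words (assumed List 23 code) -->
    <ExtentValue>200000</ExtentValue>
    <ExtentUnit>02</ExtentUnit> <!-- Words (assumed List 24 code) -->
</Extent>
<Extent>
    <ExtentType>08</ExtentType> <!-- Extent of print counterpart -->
    <ExtentValue>560</ExtentValue> <!-- hypothetical page count -->
    <ExtentUnit>03</ExtentUnit> <!-- Pages -->
</Extent>
```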

[Diagram: structure of the <Extent> composite, comprising <ExtentType>, <ExtentValue>, <ExtentValueRoman> and <ExtentUnit>; <ExtentValue> and <ExtentValueRoman> must not both be omitted]

List 23 allows the number of pages in a book (or an e-publication) to be specified in various ways: the diagram illustrates the relationship between the parts of the book in spine order, and the various page counts. For example, the total page count (List 23 code 05) counts all numbered pages in the front matter, the main content and back matter, but excludes unnumbered pages in any insert or plate section. In contrast, the content page count (code 11) includes those unnumbered pages.

Best practice is to provide one of the combinations of extents A or B in repeats of <Extent> – codes 03, 00, 04 plus 12 (if relevant) make up combination A, and codes 05 plus 12 (if relevant) make up combination B. The diagram is in rough order of preference, so combination A is preferred to combination B, and B is preferred to C (code 11 alone), because in each case the former is more granular. Recipients who receive granular data but wish to use only a single figure can easily sum the various extents provided in a combination.

[Diagram: parts of a book with their List 23 codes: unnumbered insert / plate section (12), front cover, front matter (03), main content (00), back matter (04)]