CERL File Procedures (FILPROC)

The CERL Heritage of the Printed Book Database: Guidelines concerning its date range and scope
File submitting, vetting, conversion etc.
Maximum length of records and of variable fields and subfields
File updates
Identification of derived records in the HPB database (under revision)
Non-sorting characters and initial articles

The CERL Heritage of the Printed Book Database: Guidelines concerning its date range and scope

CERL File Procedures 1 (revision 1)

This document is a major revision of File Procedure 1 dated 1993-03-10, and is offered as a guide for CERL members and others submitting files for inclusion in the Heritage of the Printed Book Database. The most important changes relate to (a) the cut-off date for files containing records for items published after 1830, (b) the inclusion of materials printed by lithography, engraving, etc., (c) the optional inclusion of non-textual materials (cartographic materials, music and graphics) and (d) the conditional inclusion of records for microforms and other surrogates. Approved by the Executive Committee, June 2005.

Definition
Dates of coverage of files submitted by CERL members and others
Geographical coverage
- Books
- Other materials

Definition

The Database consists primarily of records of materials printed in Europe in the period c. 1450 to c. 1830.

The choice of 1830 as the cut-off point for the Database is necessarily arbitrary. It was chosen as being an approximate date when machine printing began to supersede hand-printing. Many hand-printed items were published after 1830, and many files containing records of old printed materials continue beyond that date.

The core of the Database therefore remains first and foremost the record of the European printed heritage of the hand-press period, but it is not rigidly defined by date.

The following notes offer guidelines for CERL members and others submitting files. In all cases of doubt, please consult the Secretariat who will be glad to give advice and assistance.

Dates of coverage of files submitted by CERL members and others

If the file comprises records within the period c. 1450 to c. 1830: Submit the whole file.
If the file has a later date as cut-off point:
- (a) If it is technically feasible for the file provider to separate records for hand-press from mechanically-produced items: Submit all qualifying records up to the cut-off date;
  
  The Secretariat will take into consideration the fact that hand-press printing is known to have continued to a later date in some places (this should be noted in the file description);
- (b) If it is not technically feasible for the file provider to separate records for hand-press from mechanically-produced items: Consult the Secretariat.
  
  In specific instances CERL may be willing to arrange for the Data Conversion Group to undertake to make the extraction on behalf of the file provider and CERL.
Multi-volume monographs with publication dates starting before and continuing after 1830:
- Include all the volumes in the set, including those published after 1830.
- If at all possible, indicate which volumes in the set, if any, were not hand-printed.
Continuing resources (monographic series and serials) with publication dates starting before and continuing after 1830:
- Include records for the sets considered as a whole and also records (if available) for the individual volumes up to 1830 or the agreed cut-off date for the file.
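
The date rules above can be sketched as a simple pre-selection step that a file provider might run before submission. This is a minimal illustration only, not a CERL or Data Conversion Group tool: the Record structure, its field names and the fixed cut-off year are all hypothetical, and a date test cannot by itself separate hand-press from mechanically produced items.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical, simplified record; real files carry MARC/UNIMARC data."""
    record_id: str
    first_year: int                  # earliest publication year in the record
    last_year: Optional[int] = None  # final year of a multi-volume set, if known


def qualifies(record: Record, cutoff: int = 1830) -> bool:
    """Keep a record if publication began on or before the cut-off year.

    Multi-volume sets that start before and continue after the cut-off
    are kept as a whole, as the guidelines ask.
    """
    return record.first_year <= cutoff


def select_for_submission(records: List[Record], cutoff: int = 1830) -> List[Record]:
    """Return only the records that pass the date test."""
    return [r for r in records if qualifies(r, cutoff)]


sample = [
    Record("a1", 1755),
    Record("a2", 1825, 1836),  # set begun before 1830, completed after
    Record("a3", 1851),
]
selected = select_for_submission(sample)
```

Because the 1830 boundary is approximate, the cutoff parameter would be set to whatever date has been agreed with the Secretariat for the file in question.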

Geographical coverage

The primary focus is on materials of European origin, that is, printed and published in Europe. However, materials originating in other parts of the world (for example, the Americas) where Europeans had settled and established printing and publishing houses are also sought for the Database.
CERL does not collect in other fields (for example, Oriental printing and publishing).

Types of material

Books

The Heritage of the Printed Book Database shall consist primarily of records of books; that is, monographic and serial publications printed by hand. (This definition includes broadsheets). Usually these are solely or mainly textual materials. In order to record the full range of European printing of the hand press period, the database should also include books and other publications printed from lithography and engraving.

Other materials

¹⁾ Non-textual materials (cartographic materials, printed music and graphics) shall also be admissible – their inclusion may be encouraged on the grounds that they form part of the printed heritage – but file suppliers must decide initially whether or not to submit records for them. It may be considered that they are still better served by separate, specialist databases.

If the file contains a substantial number of records for other materials:

Before submitting the file, inform the Secretariat, and state, if possible:
- (a) how many records for each type of material are present in the file;
- (b) whether the records are coded so that some categories can, if necessary, be removed from the file automatically before transmission; and
- (c) whether categories that might otherwise be excluded are regarded as an integral part of the file (for example, if they form part of the records of a special collection) and should not be separated from the records for the books.

In all cases of doubt, please consult the Secretariat.

Printed facsimiles, microforms and digital reproductions
While the database consists essentially of records of original printed materials, records of reproductions may be included, provided the records for the surrogates contain bibliographic descriptions of the originals from which they are derived. The records for the originals may also contain details of or links to the surrogates.
Cartographic materials
Cartographic materials in book form (atlases), and all other books containing maps, charts, etc., in addition to the text, should be included. Other cartographic materials (separately published sheet maps, etc.) may also be included.
Printed music
Books containing music in addition to the text (e.g., hymnals and other liturgical works) should be included. Separately published music (sheet music), whether or not published as bound volumes, may also be included.
Graphics: engravings, lithographs, etc.
Books printed by these processes should be included. Separately published sheets of engravings, etc., may also be included.
Manuscripts
These should be excluded.
Sound recordings, motion pictures of plays, etc.
These should all be excluded.

File submitting, vetting, conversion etc.

CERL File Procedures 2

Summary of revisions: This document is new. There was no corresponding document in the former UNAD series.

Supply format
Delivery of file
Documentation
- Letter of Agreement with CERL
- File description
Vetting process
- Analysis Report
- Pre-processing

Supply format

Pica+ is the HPB's internal format at VZG.

Conversion tables for MARC21 and UNIMARC are available.

CERL will accept data in MARC/UTF-8. The Data Conversion Group (DCG) in Göttingen processes the data for CERL and has created UNIMARC and MARC21 conversions that aim to retain as much of the data as possible, in particular elements concerning antiquarian materials.

The level of cataloguing can vary.

Delivery of file

CERL prefers files to be delivered by FTP. Full details will be provided by the Executive Manager upon request (contact the CERL Secretariat).

Documentation

Letter of Agreement with CERL

CERL sends all file providers a Letter of Agreement to be signed and returned. The text of the Letter of Agreement may be found here.

File description

The file provider is also required to write a short description of the file it contributes to the HPB. For examples of these file descriptions, see http://www.cerl.org/web/en/resources/hpb/content. Clicking on any of the libraries listed will take you to their file description. Images are welcome!

The description of the file is included on the CERL website to help users of the HPB database determine the contents of the database.

Vetting process

The process described here is intended to ensure that the data submitted for inclusion in the HPB is in the best possible condition. This prevents the actual file loading process from becoming long-drawn-out and cumbersome.

Analysis Report

Once the file has been submitted it is passed on to the Data Conversion Group in Göttingen, where it will be analysed. Both content and format will be validated. The findings will be compiled in an Analysis Report and passed on to the file provider.

Pre-processing

The Analysis Report can include recommendations for changes to be carried out by the file provider in time for the next HPB update. It also documents changes to the file that will be applied by the Data Conversion Group in order to optimise the records for the HPB environment and increase the data quality as much as possible (pre-processing). This can also include corrections that can be applied algorithmically to compensate for cataloguing deficiencies. This does not, of course, prevent the file provider from performing such corrections in the local system as well.

The file provider will be asked to agree to the suggested pre-processing. Once the file provider, the Data Conversion Group and CERL are agreed that no matters are outstanding with regard to the file vetting, the complete file will be processed by VZG.

Maximum length of records and of variable fields and subfields

CERL recommendation

If a retroconverted source file does contain excessively long fields, the file supplier must seek the advice of the CERL Secretariat before submitting the file for evaluation.

File updates

CERL is happy to receive updates, i.e. new additions or amended records. It is important that your Record IDs are the same from one file version to the next.

If your file format (e.g. use of fields and subfields) has not changed and your Record IDs have remained stable, the updates will be processed automatically. In that case, you may determine the frequency of delivering updates to CERL.
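
The reason stable Record IDs matter can be illustrated with a small comparison of two file versions. This is a sketch under stated assumptions, not a description of how updates are actually processed: each file version is modelled as a hypothetical mapping from Record ID to record content, and the function name classify_update is invented for the example.

```python
from typing import Dict, List


def classify_update(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two versions of a file keyed by Record ID.

    With stable IDs, each record in the new version is either new,
    amended or unchanged. If IDs change between versions, every
    record looks new and nothing can be matched.
    """
    added = sorted(rid for rid in current if rid not in previous)
    amended = sorted(
        rid for rid in current if rid in previous and current[rid] != previous[rid]
    )
    unchanged = sorted(
        rid for rid in current if rid in previous and current[rid] == previous[rid]
    )
    missing = sorted(rid for rid in previous if rid not in current)
    return {"added": added, "amended": amended, "unchanged": unchanged, "missing": missing}


old_version = {"r1": "Title A", "r2": "Title B", "r3": "Title C"}
new_version = {"r1": "Title A", "r2": "Title B (corrected)", "r4": "Title D"}
report = classify_update(old_version, new_version)
```

In this example r2 is recognised as an amended record only because its ID is unchanged; had it been renumbered, it would appear as one missing record and one new record.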

Identification of derived records in the HPB database

[ Under construction ]

Non-sorting characters and initial articles

CERL File Procedures 11

UNIMARC
- UNIMARC manual
- Use in UNIMARC records
Other Formats
UNIMARC and OCLC-MARC
- Character sets
- Non-sorting articles in titles
Multilingual List of Initial Articles
Examples

UNIMARC

UNIMARC manual

The Manual (section 4.6 and Appendix J) authorises the use of two codes to delimit strings of one or more non-sorting characters. These codes form part of ISO 6630: Bibliographic Control Set (UNIMARC C1 set):

8/8 Non-sorting character(s), Beginning, represented here as *NSB*

8/9 Non-sorting character(s), End, represented here as *NSE*

In the Manual and other UNIMARC documentation these codes are conventionally represented by NSB and NSE, both placed between an equal sign that has been crossed through.

Use in UNIMARC records

The codes are most commonly used to mark non-sorting articles at the beginning of title fields, but can also be used to enclose much longer strings and even strings occurring in the middle or at the end of subfields. The Manual places no restrictions on their use (they can be used in any non-coded data field) and offers no guidance: usage is governed by the cataloguing rules employed by the various bibliographic agencies.
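
As an illustration of how such delimiters can be interpreted, the following sketch derives a display form and a sort form from a string containing NSB and NSE. It assumes the two codes are held as the single characters with values 0x88 and 0x89 (the positions 8/8 and 8/9 read as column/row); data converted to UTF-8 may represent them differently, so both markers are parameters. The function name is hypothetical.

```python
from typing import Tuple

NSB = "\x88"  # assumption: position 8/8 held as code point 0x88
NSE = "\x89"  # assumption: position 8/9 held as code point 0x89


def split_non_sorting(text: str, nsb: str = NSB, nse: str = NSE) -> Tuple[str, str]:
    """Return (display_form, sort_form) for a delimited string.

    The markers themselves are dropped from both forms; characters
    between NSB and NSE are kept for display but skipped for sorting.
    """
    display = []
    sort = []
    inside = False
    for ch in text:
        if ch == nsb:
            inside = True
        elif ch == nse:
            inside = False
        else:
            display.append(ch)
            if not inside:
                sort.append(ch)
    return "".join(display), "".join(sort)


leading = split_non_sorting(f"{NSB}The {NSE}History of printing")
medial = split_non_sorting(f"Works of {NSB}the late {NSE}J. Smith")
```

The second call shows the case mentioned above, where the non-sorting string occurs in the middle of a subfield rather than at the beginning.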

Other formats

MARC21 uses indicator values to denote the number of non-filing characters at the beginning of title fields (up to 9 characters, inclusive of spaces), but has no other provisions for longer strings or non-filing characters in other positions.
Other formats (for example INTERMARC, MAB) do not use indicators for this purpose and indicate non-filing characters by other means.
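
The indicator approach can be sketched as follows: the indicator value is simply the number of leading characters, spaces included, to skip when filing. The function name is hypothetical and the sketch covers only the leading-position case described above.

```python
def filing_form(title: str, nonfiling: int) -> str:
    """Drop the first `nonfiling` characters of a title for filing.

    The count includes the space after the article, and the indicator
    is a single digit, so only values 0 to 9 are accepted.
    """
    if not 0 <= nonfiling <= 9:
        raise ValueError("non-filing indicator must be between 0 and 9")
    return title[nonfiling:]


english = filing_form("The Heritage of the printed book", 4)
german = filing_form("Der Freischütz", 4)
plain = filing_form("Heritage of the printed book", 0)
```

Unlike the NSB/NSE delimiters, this mechanism cannot mark a non-filing string longer than nine characters or one that occurs later in the field.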

Multilingual List of Initial Articles

'n	De	Eener	Eit	I	O	Une
's	Dei	Eens	El	Il	Os	Uno
't	Dem	Egy	En	L'	The	Unos
A	Den	Ei	Ene	La	Um	Y
Ain	Der	Ein	Et	Las	Uma	Yr
An	Des	Eine	Ett	Le	Umas
As	Det	Einem	Eyn	Les	Un
Az	Die	Einen	Gl'	Li	Un'
D'	Een	Einer	Gli	Lo	Una
Das	Eene	Eines	Het	Los	Unas
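
A list of this kind could be used to suggest a non-filing character count for a title, as in the sketch below. This is an illustrative assumption about one possible use, not a documented CERL or DCG procedure. The matching ignores case and language, so words such as "Die" or "I" will also match where they are not articles; any suggestion would need checking against the language of the title.

```python
# Articles transcribed from the list above, read column by column.
ARTICLES = [
    "'n", "'s", "'t", "A", "Ain", "An", "As", "Az", "D'", "Das",
    "De", "Dei", "Dem", "Den", "Der", "Des", "Det", "Die", "Een", "Eene",
    "Eener", "Eens", "Egy", "Ei", "Ein", "Eine", "Einem", "Einen", "Einer", "Eines",
    "Eit", "El", "En", "Ene", "Et", "Ett", "Eyn", "Gl'", "Gli", "Het",
    "I", "Il", "L'", "La", "Las", "Le", "Les", "Li", "Lo", "Los",
    "O", "Os", "The", "Um", "Uma", "Umas", "Un", "Un'", "Una", "Unas",
    "Une", "Uno", "Unos", "Y", "Yr",
]


def suggest_nonfiling_count(title: str) -> int:
    """Suggest how many leading characters to treat as non-filing.

    Elided articles ending in an apostrophe (L', D', ...) attach directly
    to the next word; all others must be followed by a space, which is
    included in the count. Returns 0 when no listed article matches.
    """
    lowered = title.casefold()
    # Try longer articles first so the most specific form is tested first.
    for article in sorted(ARTICLES, key=len, reverse=True):
        prefix = article.casefold()
        if not prefix.endswith("'"):
            prefix += " "
        if lowered.startswith(prefix) and len(title) > len(prefix):
            return len(prefix)
    return 0


counts = {
    t: suggest_nonfiling_count(t)
    for t in ["The Heritage", "L'Encyclopédie", "Eine kleine Nachtmusik", "History of printing"]
}
```

The returned value has the same meaning as the MARC21 non-filing indicator, and the same span could equally be wrapped in NSB and NSE for UNIMARC output.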

Examples

under construction

¹⁾ Historical note to Other materials: The specifications agreed by CERL and RLG in 1995 were based on the understanding that the HPB file was basically a clone of the RLIN Books (BKS) file. Some records for non-book materials were included in the HPB Database but were marked as “error” records although still retrievable. Following harmonization and integration of the former USMARC formats for different materials into a single MARC21 Bibliographic Format, these distinctions ceased to exist.