5.3.1.1 Content providers should provide metadata formatted as tab-delimited values.

This is a generic format that minimizes the effort involved in receiving and loading the data, and reduces the likelihood of errors being introduced during exchange. Tab-delimited formats are preferable to comma-separated formats because commas appear regularly within the distributed data; although such commas can be escaped or quoted, doing so leaves more opportunity for error than using a tab-delimited format. Tab-delimited formats can be easily exported from all commonly used spreadsheet programs.
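
As an illustration of why tabs are convenient, the following minimal sketch uses Python's standard csv module to serialize rows with a tab delimiter. The column names, the journal title, and the ISSN are hypothetical sample values, not taken from this recommendation. A title containing commas passes through without any quoting.

```python
import csv
import io

def to_tab_delimited(header, rows):
    """Serialize a header row and data rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical sample data: the title contains commas but needs no quoting,
# because the comma is not the field delimiter.
sample = to_tab_delimited(
    ["publication_title", "print_identifier"],
    [["Example Journal of Law, Policy, and Ethics", "1234-5679"]],
)
```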

5.3.1.2 The file should be entitled "[ProviderName]_AllTitles_[YYYY-MM-DD].txt".
For example, JSTOR_AllTitles_2008-12-01.txt.

5.3.1.3 The provider name should be the web domain at which your data is hosted (but without the punctuation).
For example, jstor or ebscohost. This ensures that your data is clearly distinguished from data provided by others with similar package names. Also, the file name should be consistent for each metadata file deposited.

5.3.1.4 Separate files should be produced for each package of content that the provider offers.
Files should be named as customers would expect to see them labelled in the knowledge base, using the syntax "[ProviderName]_[CollectionName]_[YYYY-MM-DD].txt". For example, JSTOR_Arts&SciencesV_2008-12-01.txt. Providers and recipients can agree in advance how best to present complex collection names.
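
The naming rules in 5.3.1.2 to 5.3.1.4 can be sketched as a small helper. The function name kbart_file_name is hypothetical, and stripping everything except ASCII letters and digits from the provider name is one possible reading of "without the punctuation"; the collection name is passed through unchanged, since its presentation is left to agreement between provider and recipient.

```python
import re
from datetime import date

def kbart_file_name(provider, file_date, collection="AllTitles"):
    """Build a file name of the form Provider_Collection_YYYY-MM-DD.txt."""
    # Assumption: "punctuation" means anything that is not an ASCII letter or digit.
    provider_clean = re.sub(r"[^A-Za-z0-9]", "", provider)
    return f"{provider_clean}_{collection}_{file_date.isoformat()}.txt"

all_titles = kbart_file_name("JSTOR", date(2008, 12, 1))
package = kbart_file_name("JSTOR", date(2008, 12, 1), collection="Arts&SciencesV")
```

With these inputs, all_titles and package reproduce the two example file names given above.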

5.3.1.5 All metadata should be provided as plain text.
If metadata is provided in a format that supports additional style or formatting, it should be presented without those enhancements. Data should not include colors, typefaces, italics, or other markup.

5.3.1.6 Text should be encoded as UTF-8.
The UTF-8 character set is well supported and encompasses the writing systems of many languages. This is also a common output option for programs such as Microsoft Excel.
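
A recipient may wish to confirm that a delivered file really is UTF-8 before loading it. The sketch below is one simple way to do so, assuming the file contents have already been read as bytes; the function name is_valid_utf8 and the sample title are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

title = "Revue d'études exemplaires"  # hypothetical title with an accented character
utf8_ok = is_valid_utf8(title.encode("utf-8"))
latin1_ok = is_valid_utf8(title.encode("latin-1"))  # same text in a different encoding
```

Here the Latin-1 bytes fail the check, which is the kind of mismatch that otherwise surfaces later as garbled titles.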

5.3.1.7 One publication should be given in each line of the file, with a column for each field given in Section 5.3.2, Data Fields.

5.3.1.8 Data should be provided with column headers (see Section 5.3.2) and without a blank row between the column header and the first row of content.
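
The layout rules in 5.3.1.7 and 5.3.1.8 lend themselves to a simple automated check. The following is a minimal sketch, not a complete validator: the function name check_layout and the wording of the messages are hypothetical, and it only verifies that the header is followed directly by content and that every row has as many columns as the header.

```python
def check_layout(text):
    """Return a list of layout problems found in tab-delimited text."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # ignore a single trailing newline
    if not lines:
        return ["file is empty"]
    problems = []
    header = lines[0].split("\t")
    if len(lines) > 1 and lines[1].strip() == "":
        problems.append("blank row after the column header")
    for number, line in enumerate(lines[1:], start=2):
        if line.strip() == "":
            continue  # blank rows are reported only when they follow the header
        if len(line.split("\t")) != len(header):
            problems.append(f"line {number}: expected {len(header)} columns")
    return problems
```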

5.3.1.9 A title should be listed twice if there is a coverage gap of greater than or equal to 12 months, with only the coverage field changing.
Greater granularity in reporting data coverage gaps is desirable, and should be agreed with the link resolver supplier if it can be supported.
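
The 12-month rule can be sketched as follows, under the assumption that coverage is tracked at month granularity as (year, month) pairs and that a gap is counted in whole uncovered months. The function name split_on_gaps is hypothetical. Periods separated by a shorter gap are merged into one row; a gap of 12 months or more yields a separate row for the title.

```python
def split_on_gaps(periods, min_gap_months=12):
    """Merge coverage periods unless separated by min_gap_months or more.

    periods: list of ((start_year, start_month), (end_year, end_month)).
    Returns one (start, end) pair per row the title should occupy.
    """
    def index(year_month):
        return year_month[0] * 12 + year_month[1]

    rows = []
    for start, end in sorted(periods):
        if rows:
            # Whole months with no coverage between the previous end and this start.
            gap = index(start) - index(rows[-1][1]) - 1
            if gap < min_gap_months:
                prev_start, prev_end = rows[-1]
                rows[-1] = (prev_start, max(prev_end, end, key=index))
                continue
        rows.append((start, end))
    return rows

# All of 2002 is missing (12 months), so the title is listed twice.
two_rows = split_on_gaps([((1995, 1), (2001, 12)), ((2003, 1), (2008, 12))])
# Only five months are missing, so a single row suffices.
one_row = split_on_gaps([((1995, 1), (2001, 12)), ((2002, 6), (2008, 12))])
```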

5.3.1.10 All rows should be consistent in terms of format.
For example, ISSN should always be expressed as nine characters with a hyphen separator, and date fields should always be in the format described in Section 5.3.2.
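
A format check for the ISSN example might look like the sketch below. It tests only the nine-character shape (four digits, a hyphen, three digits, and a final digit or X) and does not verify the check digit; the function name is hypothetical.

```python
import re

ISSN_PATTERN = re.compile(r"\d{4}-\d{3}[\dXx]")

def is_issn_format(value):
    """Return True if value has the nine-character hyphenated ISSN shape."""
    return ISSN_PATTERN.fullmatch(value) is not None
```

For instance, is_issn_format("1234-5679") is accepted, while the unhyphenated "12345679" is rejected.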

5.3.1.11 The metadata file should be supplied in alphabetical order by title to ensure ease of checking and import by knowledge base developers.
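
Sorting rows before export is straightforward. The sketch below assumes the title is in the first column and that ordering should ignore letter case; the recommendation does not specify collation rules, so both are assumptions, and the function name is hypothetical.

```python
def sort_rows_by_title(rows, title_column=0):
    """Return rows ordered alphabetically by title, ignoring case."""
    return sorted(rows, key=lambda row: row[title_column].casefold())

# Hypothetical sample rows: title followed by one further field.
ordered = sort_rows_by_title([
    ["zoology Today", "x"],
    ["Annals of Examples", "y"],
    ["botany Letters", "z"],
])
```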