Capture paper's metadata

The iReceptor Team captures repertoire metadata in a standard repertoire metadata spreadsheet. The file format used in the iReceptor curation process is a UTF-8 encoded, comma-delimited text file (a Metadata CSV file) consisting of a single header line followed by one line of metadata for each repertoire. The header line should consist of keys that map to AIRR fields, as defined in the "AIRR Formats WG field name" column of the MiAIRR specification. Please see the example iReceptor Repertoire Metadata spreadsheet for more information.
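As a rough illustration of the layout described above (one header line, then one line per repertoire), the following sketch parses Metadata CSV text with Python's standard csv module. The function name read_metadata_csv and the sample values are hypothetical, and the three column names are used only as examples; the keys to use are the ones given in the MiAIRR specification.

```python
import csv
import io


def read_metadata_csv(text):
    """Parse Metadata CSV text into (header keys, list of row dicts).

    Hypothetical helper for illustration; not part of any iReceptor tool.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)  # one dict per repertoire line
    return reader.fieldnames, rows


# Made-up sample: a header line followed by two repertoire lines.
sample = (
    "study_id,subject_id,sample_id\n"
    "STUDY-1,S1,A\n"
    "STUDY-1,S2,B\n"
)

header, rows = read_metadata_csv(sample)
print(header)
print(len(rows), "repertoires")
```

For a file on disk, the same reader can be given a file object opened with encoding="utf-8" and newline="", which matches the UTF-8 requirement and lets the csv module handle line endings itself.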

To create a metadata sheet:

  • For each repertoire in the paper, fill in values for each of the metadata fields.
    • Refer to the paper, any supplementary data included with it, and information from the repository where the data is deposited (e.g., the Sequence Read Archive).
  • Use controlled vocabularies and/or ontologies for the values of fields that require them.
    • Refer to the MiAIRR standards for the most accurate description of each field and of the entries permitted for it.
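The checks implied by the list above (every field filled in, and controlled fields limited to permitted values) can be sketched as a small script. This is only a sketch under stated assumptions: find_problems is a hypothetical helper, and the required field and the allowed values for "sex" are placeholders chosen for illustration, not the MiAIRR vocabulary, which should be taken from the MiAIRR standard itself.

```python
def find_problems(rows, required_fields, allowed_values):
    """Return (line number, field, message) tuples for suspect entries.

    rows: list of dicts, one per repertoire, keyed by header name.
    required_fields: field names that must not be empty (assumed list).
    allowed_values: field name -> set of permitted values (assumed sets).
    """
    problems = []
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(rows, start=2):
        for field in required_fields:
            if not (row.get(field) or "").strip():
                problems.append((line_no, field, "missing value"))
        for field, allowed in allowed_values.items():
            value = (row.get(field) or "").strip()
            if value and value not in allowed:
                problems.append((line_no, field, f"unexpected value {value!r}"))
    return problems


# Made-up rows; the second one has an empty ID and an uncontrolled value.
rows = [
    {"subject_id": "S1", "sex": "female"},
    {"subject_id": "", "sex": "F"},
]

# Placeholder vocabulary for illustration only.
problems = find_problems(rows, ["subject_id"], {"sex": {"male", "female"}})
for problem in problems:
    print(problem)
```

A check like this only flags entries that are empty or outside the supplied vocabulary; it cannot tell whether a value that passes was transcribed correctly from the paper.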

Extracting this data from a paper is error-prone and time-consuming. Take care to minimize errors at this step, because errors introduced here can reduce the value of the data and lead to scientific errors during data reuse. To help catch such errors, the iReceptor curation process includes a data provenance/fidelity validation step (see Step 9).