Glossary of tags

- A -

  • ac :

    Accession number of the ecoPCR record used as template. (obiconvert)

  • aho_corasick :

    Total number of occurrences of the provided motifs detected by the Aho–Corasick search. (obiannotate)

  • aho_corasick_Fwd :

    Number of motif matches on the forward strand when running the Aho–Corasick scanner. (obiannotate)

  • aho_corasick_Rev :

    Number of motif matches on the reverse complement strand produced by the Aho–Corasick scanner. (obiannotate)

  • ali_dir :

    Orientation of the alignment window when overlapping reads are aligned ("left" or "right"). (obikmersim, obipairing)

  • ali_length :

    Length of the aligned region considered when assembling paired reads. (obikmersim, obipairing)

  • amplicon_length :

    Expected size of the in silico amplicon described in an ecoPCR record. (obiconvert)

- C -

  • chimera :

    Flag produced by obiclean identifying sequences detected as chimeric. (obiclean)

  • count :

    Number of occurrences of the dereplicated sequence in the dataset.
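
The count tag above results from dereplication: identical sequences are collapsed into one record that remembers how many times it was seen. A minimal sketch of that idea, assuming dereplication groups exact, case-insensitive duplicates; the function name and the record layout are hypothetical and are not taken from the OBITools code.

```python
from collections import Counter

def dereplicate(sequences):
    """Collapse identical sequences and record how often each one occurs."""
    # Normalize case so that "acgt" and "ACGT" count as the same sequence.
    counts = Counter(seq.upper() for seq in sequences)
    # One record per distinct sequence, most abundant first,
    # carrying its abundance under the "count" key.
    return [{"sequence": seq, "count": n} for seq, n in counts.most_common()]

reads = ["ACGT", "acgt", "TTGA", "ACGT"]
records = dereplicate(reads)
# records[0] is {"sequence": "ACGT", "count": 3}
```

The counts of all records add up to the number of input reads, which is a convenient sanity check after dereplication.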

- D -

  • definition :

    Text description kept from the original FASTA/FASTQ definition line.

  • demultiplex_error :

    Human-readable reason explaining why a read could not be assigned to a PCR sample. (obimultiplex, obipcr)

  • direction :

    Orientation of the read with respect to the primer set (forward or reverse). (obimultiplex, obipcr)

- E -

  • Entropies :

    Per-base minimal entropy measured while masking low-complexity regions. (obilowmask)

  • experiment :

    Experiment identifier assigned to each multiplexed sample in the NGS filter. (obimultiplex, obitagpcr)

- F -

  • family_name :

    Scientific name of the family containing the ecoPCR reference sequence. (obiconvert)

  • family_taxid :

    NCBI taxid of the family containing the ecoPCR reference sequence. (obiconvert)

  • forward_error :

    Number of mismatches tolerated between the read and the forward primer. (obimultiplex, obipcr)

  • forward_match :

    Sequence of the forward primer match extracted from the read. (obimultiplex, obipcr)

  • forward_mismatch :

    Mismatch count reported in ecoPCR reference entries for the forward primer. (obiconvert)

  • forward_primer :

    Forward primer sequence used while demultiplexing or during in silico PCR. (obiconvert, obimultiplex, obipcr)

  • forward_tag :

    Sample tag (index) identified on the forward side of a read. (obimultiplex, obipcr)

  • forward_tm :

    Melting temperature of the forward primer reported by ecoPCR. (obiconvert)

- G -

  • genus_name :

    Scientific name of the genus containing the ecoPCR reference sequence. (obiconvert)

  • genus_taxid :

    NCBI taxid of the genus containing the ecoPCR reference sequence. (obiconvert)

- L -

  • landmark_coord :

    Coordinate of a reference sequence in the landmark space produced by obilandmark. (obilandmark)

  • landmark_id :

    Identifier of the landmark sequence used as axis when building the landmark space. (obilandmark)

- M -

  • mask :

    Scale (window size) that triggered masking of each nucleotide during entropy masking. (obilowmask)

  • merged_sample :

    Map storing the abundance of each sample contributing to a merged record. (obimatrix, obiuniq)

  • microsat :

    Sequence of the detected microsatellite region. (obimicrosat)

  • microsat_from :

    1-based start position of the detected microsatellite. (obimicrosat)

  • microsat_left :

    Left flanking region kept around the detected microsatellite. (obimicrosat)

  • microsat_right :

    Right flanking region kept around the detected microsatellite. (obimicrosat)

  • microsat_to :

    End position (1-based, inclusive) of the detected microsatellite. (obimicrosat)

  • microsat_unit :

    Minimal repeat unit sequence describing the microsatellite. (obimicrosat)

  • microsat_unit_count :

    Number of times the minimal unit is repeated inside the microsatellite. (obimicrosat)

  • microsat_unit_length :

    Length (in bases) of the minimal repeat unit in the microsatellite. (obimicrosat)

  • microsat_unit_normalized :

    Canonical representation of the repeat unit, normalized for strand orientation. (obimicrosat)

  • microsat_unit_orientation :

    Orientation ("direct" or "reverse") of the repeat unit with respect to the original read. (obimicrosat)

  • mode :

    Whether a paired-read assembly comes from an actual alignment or from simply joining reads. (obipairing)
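
The microsat_* tags in this section describe one repeat region from several angles. A minimal sketch of how they fit together, using the 1-based inclusive coordinates stated above. The normalization rule shown here (smallest rotation of the unit or of its reverse complement) is only an assumption about what "normalized for strand orientation" could mean, and all function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def canonical_unit(unit):
    """Assumed rule: smallest rotation of the unit or its reverse complement."""
    candidates = []
    for u in (unit, reverse_complement(unit)):
        candidates.extend(u[i:] + u[:i] for i in range(len(u)))
    return min(candidates)

def describe_microsat(sequence, start, end, unit):
    """Build microsat-style annotations from 1-based inclusive coordinates."""
    region = sequence[start - 1:end]  # 1-based inclusive -> Python slice
    return {
        "microsat": region,
        "microsat_from": start,
        "microsat_to": end,
        "microsat_unit": unit,
        "microsat_unit_length": len(unit),
        "microsat_unit_count": len(region) // len(unit),
        "microsat_unit_normalized": canonical_unit(unit),
    }

read = "TTG" + "CA" * 5 + "GGT"
tags = describe_microsat(read, 4, 13, "CA")
# tags["microsat"] is "CACACACACA" and tags["microsat_unit_count"] is 5
```

Under the assumed rule, the units "CA", "AC", "TG", and "GT" all normalize to "AC", so the same repeat found on either strand ends up with the same label.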

- O -

  • obiclean_cluster :

    Identifier of the obiclean cluster containing the sequence. (obiclean)

  • obiclean_head :

    Boolean flag indicating that the sequence is the most abundant variant in its cluster. (obiclean)

  • obiclean_headcount :

    Abundance of the cluster head sequence. (obiclean)

  • obiclean_internalcount :

    Cumulative abundance of internal (non-head, non-singleton) variants in the obiclean cluster. (obiclean)

  • obiclean_mutation :

    Edit description used by obiclean to label the relationship between a sequence and its head. (obiclean)

  • obiclean_samplecount :

    Total abundance (head + internal + singleton) of the obiclean cluster. (obiclean)

  • obiclean_singletoncount :

    Number of singleton variants linked to the head in the obiclean graph. (obiclean)

  • obiclean_status :

    Final obiclean status (head/internal/singleton/bad) attributed to the sequence. (obiclean)

  • obiclean_weight :

    Relative weight computed for each sequence during obiclean consensus building. (obiclean)

  • obicleandb_dist :

    Distance distribution between a reference and its neighbors during database cleaning. (obicleandb)

  • obicleandb_level :

    Taxonomic rank (none/genus/family) reached during the obicleandb validation workflow. (obicleandb)

  • obicleandb_median :

    Median similarity score within the accepted cluster around a reference sequence. (obicleandb)

  • obicleandb_scores :

    Raw similarity scores observed for the reference during database validation. (obicleandb)

  • obicleandb_trusted :

    Probability assigned by obicleandb that the reference sequence is trustworthy. (obicleandb)

  • obicleandb_trusted_on :

    Number of supporting sequences used to compute the obicleandb trust score. (obicleandb)

  • obiconsensus_consensus :

    Boolean flag indicating whether a sequence is a consensus sequence produced by obiconsensus. (obiconsensus)

  • obiconsensus_filtered_graph_size :

    Size of the kmer graph after filtering low-coverage nodes. (obiconsensus)

  • obiconsensus_full_graph_size :

    Size of the full kmer graph before filtering. (obiconsensus)

  • obiconsensus_kmer_max_occur :

    Maximum abundance observed for any kmer in the graph used to build the consensus. (obiconsensus)

  • obiconsensus_kmer_size :

    Kmer size actually used to build the consensus for the sequence. (obiconsensus)

  • obiconsensus_seq_length :

    Length of the consensus sequence returned by obiconsensus. (obiconsensus)

  • obiconsensus_weight :

    Cumulative weight (number of supporting reads) of the consensus sequence. (obiconsensus)

  • obikmer_ali_length :

    Alignment length between the query and the best kmer-matched reference. (obikmersim)

  • obikmer_fast_count :

    Number of shared kmers discovered during the fast filtering phase. (obikmersim)

  • obikmer_fast_overlap :

    Estimated overlap length inferred from the kmer matching step. (obikmersim)

  • obikmer_fast_score :

    Fast alignment score computed before the full alignment stage. (obikmersim)

  • obikmer_identity :

    Proportion of identical bases in the alignment computed after kmer matching. (obikmersim)

  • obikmer_kmer_size :

    Size of the kmers that were indexed to perform the similarity search. (obikmersim)

  • obikmer_match_count :

    Number of references sharing at least the requested number of kmers with the query. (obikmersim)

  • obikmer_match_id :

    Identifier of the reference sequence selected as best match for a query ("-rev" when aligning the reverse complement). (obikmersim)

  • obikmer_orientation :

    Orientation (forward/reverse) used when aligning the query to the best kmer match. (obikmersim)

  • obikmer_residual_id :

    Identity computed after removing the minimal number of shared kmers from the alignment. (obikmersim)

  • obikmer_score :

    Alignment score resulting from the Smith–Waterman extension after kmer matching. (obikmersim)

  • obikmer_score_norm :

    Alignment score normalized by alignment length for kmer-based matches. (obikmersim)

  • obikmer_sparse_kmer :

    Indicates whether sparse kmers were used while building the index. (obikmersim)

  • obilowmask_error :

    Reason why the sequence could not be processed during low-complexity masking. (obilowmask)

  • obimultiplex_amplicon_rank :

    Rank of the barcode match among all amplicons compatible with the read. (obimultiplex)

  • obimultiplex_direction :

    Orientation of the read determined during barcode extraction (forward or reverse). (obimultiplex, obitagpcr)

  • obimultiplex_error :

    Explanation of the failure encountered during barcode extraction. (obimultiplex, obitagpcr)

  • obimultiplex_forward_error :

    Mismatch count observed between the read and the forward primer in the NGS filter. (obimultiplex, obitagpcr)

  • obimultiplex_forward_match :

    Portion of the read matching the forward primer. (obimultiplex, obitagpcr)

  • obimultiplex_forward_matching :

    Matching model (strict/hamming/indel) used to interpret the forward primer. (obimultiplex)

  • obimultiplex_forward_mismatches :

    Forward primer mismatch count propagated to paired reads during PCR tagging. (obitagpcr)

  • obimultiplex_forward_primer :

    Forward primer sequence fetched from the NGS filter entry assigned to the read. (obimultiplex)

  • obimultiplex_forward_proposed_tag :

    Forward tag proposed by the error-tolerant search before validation. (obimultiplex)

  • obimultiplex_forward_tag :

    Forward sample tag assigned to the read. (obimultiplex, obitagpcr)

  • obimultiplex_forward_tag_dist :

    Edit distance between the observed tag and the expected forward tag. (obimultiplex)

  • obimultiplex_reverse_error :

    Mismatch count observed between the read and the reverse primer in the NGS filter. (obimultiplex, obitagpcr)

  • obimultiplex_reverse_match :

    Portion of the read matching the reverse primer. (obimultiplex, obitagpcr)

  • obimultiplex_reverse_matching :

    Matching model (strict/hamming/indel) used to interpret the reverse primer. (obimultiplex)

  • obimultiplex_reverse_mismatches :

    Reverse primer mismatch count propagated to paired reads during PCR tagging. (obitagpcr)

  • obimultiplex_reverse_primer :

    Reverse primer sequence fetched from the NGS filter entry assigned to the read. (obimultiplex)

  • obimultiplex_reverse_proposed_tag :

    Reverse tag proposed by the error-tolerant search before validation. (obimultiplex)

  • obimultiplex_reverse_tag :

    Reverse sample tag assigned to the read. (obimultiplex, obitagpcr)

  • obimultiplex_reverse_tag_dist :

    Edit distance between the observed tag and the expected reverse tag. (obimultiplex)

  • obisplit_frg :

    Index of the fragment produced by obisplit from the original read. (obisplit)

  • obisplit_group :

    Name of the splitting group (pool or pattern pair) to which the fragment belongs. (obisplit)

  • obisplit_left_error :

    Mismatch count tolerated on the left delimiting pattern. (obisplit)

  • obisplit_left_match :

    Sequence matched by the left delimiting pattern. (obisplit)

  • obisplit_left_pattern :

    Name of the pattern that delimited the left boundary. (obisplit)

  • obisplit_location :

    Coordinates of the fragment extracted by obisplit (start..end, 1-based). (obisplit)

  • obisplit_nfrg :

    Total number of fragments produced from the source read. (obisplit)

  • obisplit_right_error :

    Mismatch count tolerated on the right delimiting pattern. (obisplit)

  • obisplit_right_match :

    Sequence matched by the right delimiting pattern. (obisplit)

  • obisplit_right_pattern :

    Name of the pattern that delimited the right boundary. (obisplit)

  • obisplit_set :

    Label of the pattern set (pool) used to extract a fragment. (obisplit)

  • obitag_bestid :

    Best identity score obtained when tagging a sequence with reference databases. (obitag)

  • obitag_bestmatch :

    Identifier of the best matching reference sequence used for taxonomic assignment. (obitag)

  • obitag_coord :

    Coordinates of the query in the landmark space when using geometric tagging. (obitag)

  • obitag_geomref_index :

    Distance index (difference → taxon) stored for landmark-based geometric tagging. (obirefidx, obitag)

  • obitag_match_count :

    Number of reference sequences sharing the best score with the query. (obitag)

  • obitag_min_dist :

    Minimal geometric distance observed between the query and landmark references. (obitag)

  • obitag_rank :

    Taxonomic rank associated with the assigned reference taxon. (obitag)

  • obitag_ref_index :

    Precomputed index linking mismatch counts to reference taxa for LCS-based tagging. (obirefidx, obitag)

  • obitag_similarity_method :

    Identification method used by obitag ("lcs" or "geometric"). (obitag)
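
Among the tags in this section, obisplit_location stores fragment coordinates as a 1-based "start..end" string. A minimal sketch of turning such a value into a Python slice, assuming the end coordinate is inclusive (the glossary only states that coordinates are 1-based); the helper name is hypothetical.

```python
def parse_location(location):
    """Convert a "start..end" string (1-based) into a Python slice."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid location: {location!r}")
    # Assumption: the end coordinate is inclusive, so it maps directly
    # to the exclusive stop of a 0-based Python slice.
    return slice(start - 1, end)

read = "AAAACCCCGGGGTTTT"
fragment = read[parse_location("5..8")]
# fragment is "CCCC"
```

If the end coordinate turns out to be exclusive in your files, the returned slice would be `slice(start - 1, end - 1)` instead.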

- P -

  • pair :

  • pairing_fast_count :

    Number of kmers supporting the initial overlap detection between paired reads. (obipairing)

  • pairing_fast_overlap :

    Expected overlap length estimated from the fast pre-alignment step. (obipairing)

  • pairing_fast_score :

    Score computed during the fast overlap detection stage prior to full alignment. (obipairing)

  • pairing_mismatches :

    Map describing positions and base substitutions observed when merging paired reads. (obipairing)

  • pattern :

    Pattern sequence tested with obiannotate --pattern; the tag name is prefixed when --pattern-name is provided. (obiannotate)

  • pattern_error :

    Number of mismatches or indels between the queried pattern and the matched subsequence (prefix follows pattern name). (obiannotate)

  • pattern_location :

    Genomic coordinates (forward or complement) of the matched pattern (prefix follows pattern name). (obiannotate)

  • pattern_match :

    Sequence fragment that matched the requested pattern (prefix follows pattern name). (obiannotate)

- R -

  • rank :

    Taxonomic rank assigned to the ecoPCR reference sequence. (obiconvert)

  • reffamidx_cluster_n :

    Number of clusters created when indexing a family-level reference database. (obirefidx)

  • reffamidx_clusterhead :

    Boolean flag indicating whether the reference sequence is the head of its similarity cluster. (obirefidx)

  • reffamidx_clusterid :

    Identifier of the cluster head that represents a reference sequence. (obirefidx)

  • reffamidx_clusteridentity :

    Pairwise identity between a reference sequence and the head of its family cluster. (obirefidx)

  • reffamidx_id :

    Unique identifier assigned to each reference sequence while building the family index. (obirefidx)

  • reffamidx_in :

    Index structure created for a given reference sequence to accelerate taxonomic assignment. (obirefidx)

  • reverse_error :

    Number of mismatches tolerated between the read and the reverse primer. (obimultiplex, obipcr)

  • reverse_match :

    Sequence of the reverse primer match extracted from the read. (obimultiplex, obipcr)

  • reverse_mismatch :

    Mismatch count reported in ecoPCR reference entries for the reverse primer. (obiconvert)

  • reverse_primer :

    Reverse primer sequence used while demultiplexing or during in silico PCR. (obiconvert, obimultiplex, obipcr)

  • reverse_tag :

    Sample tag identified on the reverse side of a read. (obimultiplex, obipcr)

  • reverse_tm :

    Melting temperature of the reverse primer reported by ecoPCR. (obiconvert)

- S -

  • sample :

    Sample identifier assigned by the NGS filter and propagated to annotated reads. (obimultiplex, obitagpcr)

  • scientific_name :

    Scientific name for the current taxonomic annotation. (obiannotate, obiconvert, obitag)

  • score :

    Alignment score computed while merging paired reads. (obipairing)

  • score_norm :

    Alignment score normalized by the overlap length when assembling paired reads. (obikmersim, obipairing)

  • seq_a_single :

    Number of nucleotides trimmed from sequence A because they are single-stranded in the overlap. (obikmersim, obipairing)

  • seq_ab_match :

    Number of matching positions observed in the aligned overlap between two sequences. (obikmersim, obipairing)

  • seq_b_single :

    Number of nucleotides trimmed from sequence B because they are single-stranded in the overlap. (obikmersim, obipairing)

  • seq_length :

    Length of the sequence in nucleotides. (obiannotate, obikmersim, obimicrosat)

  • seq_number :

    Ordinal number assigned to sequences when numbering is requested or needed for taxonomy extraction. (obiannotate, obiconvert)

  • species_name :

    Scientific name of the species associated with the ecoPCR reference sequence. (obiconvert)

  • species_taxid :

    NCBI taxid of the species associated with the ecoPCR reference sequence. (obiconvert)

  • strand :

    Strand orientation stored in the ecoPCR record. (obiconvert)

- T -

  • taxid :

    Taxonomic identifier of the sequence within the current taxonomy. (obiconvert, obipcr, obitag)

  • taxonomic_path :

    Ordered list of taxids describing the lineage of the annotated sequence. (obiannotate)

  • taxonomic_rank :

    Rank name corresponding to the assigned taxon. (obiannotate)
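
The tags listed in this glossary travel with each sequence record as key/value annotations. A minimal sketch of reading them back, assuming the annotations are serialized as a single JSON object placed between the identifier and the definition on a FASTA title line. That layout, the sample header, and the function name are illustrative assumptions, so check them against the files you actually have.

```python
import json

def parse_title_line(line):
    """Split a FASTA title line into (identifier, annotations, definition)."""
    identifier, _, rest = line.lstrip(">").strip().partition(" ")
    rest = rest.lstrip()
    annotations = {}
    if rest.startswith("{"):
        # raw_decode parses one JSON value and reports where it stops,
        # so any trailing definition text is left untouched.
        annotations, stop = json.JSONDecoder().raw_decode(rest)
        rest = rest[stop:]
    return identifier, annotations, rest.strip()

# Made-up header used only for illustration.
header = '>seq0001 {"count":3,"merged_sample":{"s1":2,"s2":1}} example record'
seq_id, tags, definition = parse_title_line(header)
# tags["count"] is 3 and tags["merged_sample"] is {"s1": 2, "s2": 1}
```

Map-valued tags such as merged_sample come back as ordinary dictionaries, so per-sample abundances can be read with `tags["merged_sample"]["s1"]`.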