NAME

App::Sandy::Command::Variation::Add - variation subcommand class. Add structural variation to the database.

VERSION

version 0.25

SYNOPSIS

 sandy variation add -a <entry name> [-s <source>] FILE

 Arguments:
  a file (vcf or a genomic-variation file)

 Mandatory options:
  -a, --genomic-variation       genomic-variation entries

 Options:
  -h, --help                    brief help message
  -H, --man                     full documentation
  -v, --verbose                 print log messages
  -s, --source                  genomic-variation source detail for database
  -n, --sample-name             name of the optional vcf sample column from
                                which the genotype will be extracted

DESCRIPTION

Add a genomic-variation to the database. A genomic-variation is represented by a genomic position (seqid, position), the reference sequence at that position, an alternate sequence and a genotype (homozygous or heterozygous).

INPUT

The input file may be a vcf or a custom genomic-variation file. For vcf files, the user can name a sample present in the vcf header with the option --sample-name, and that sample's column is then used to extract the genotype. If the user does not pass --sample-name, the first sample is used.

 ===> my_variations.vcf
 ##fileformat=VCFv4.3
 ...
 #CHROM POS     ID    REF ALT   QUAL FILTER INFO        FORMAT NA001 NA002
 chr20  14370   rs81  G   A     29   PASS   NS=3;DP=14  GT     0/1   0/0
 chr20  17330   rs82  T   AAA   3    PASS   NS=3;DP=20  GT     1/1   0/0
 chr20  110696  rs83  A   GTCT  10   PASS   NS=2;DP=11  GT     0/1   1/1
 ...

In the my_variations.vcf file, the sample NA001 is used by default. To use the sample NA002 instead, pass the option --sample-name=NA002.
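
The sample selection described above can be sketched in Python. This is only an illustration, not Sandy's actual implementation: the function names sample_column and genotype_of are hypothetical, fields are split on whitespace to match the example layout, and returning None for a genotype that carries no alternate allele (such as 0/0) is an assumption of this sketch.

```python
def sample_column(header_line, sample_name=None):
    """Return the index of the vcf column to read genotypes from."""
    columns = header_line.lstrip("#").split()
    samples = columns[9:]  # 8 fixed columns plus FORMAT come before the samples
    if not samples:
        raise ValueError("vcf header has no sample columns")
    if sample_name is None:
        return 9  # no --sample-name given: fall back to the first sample
    if sample_name not in samples:
        raise ValueError(f"sample {sample_name!r} not found in vcf header")
    return 9 + samples.index(sample_name)


def genotype_of(record_line, column):
    """Classify the GT value of one record as 'HO', 'HE' or None."""
    fields = record_line.split()
    keys = fields[8].split(":")         # FORMAT column, e.g. GT:DP
    values = fields[column].split(":")  # the chosen sample's values
    alleles = values[keys.index("GT")].replace("|", "/").split("/")
    # Assumption of this sketch: no alternate allele means nothing to report.
    if all(allele in ("0", ".") for allele in alleles):
        return None
    return "HO" if len(set(alleles)) == 1 else "HE"


header = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA001 NA002"
record = "chr20 14370 rs81 G A 29 PASS NS=3;DP=14 GT 0/1 0/0"

print(genotype_of(record, sample_column(header)))           # default sample NA001
print(genotype_of(record, sample_column(header, "NA002")))  # explicit sample NA002
```

With the first record of my_variations.vcf, the default sample NA001 (0/1) is classified as HE, while NA002 (0/0) carries no alternate allele.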

A genomic-variation file is a reduced VCF, that is, one without the columns QUAL, FILTER, INFO and FORMAT. It has a single SAMPLE column holding the genotype for the entry, written as HO for homozygous or HE for heterozygous. See the example below:

 ===> my_variations.txt
 #seqid  position  id    reference  alternate  genotype
 chr20   14370     rs81  G          A          HE
 chr20   17330     rs82  T          AAA        HO
 chr20   110696    rs83  A          GTCT       HE
 ...
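
As a rough illustration of the six-column layout, the following Python sketch reads such a file into records. It is not Sandy's parser: the names Variation and read_variations are hypothetical, and splitting on whitespace, skipping lines that start with '#', and rejecting any genotype other than HO or HE are assumptions of this sketch.

```python
from typing import Iterable, Iterator, NamedTuple


class Variation(NamedTuple):
    seqid: str
    position: int
    variant_id: str
    reference: str
    alternate: str
    genotype: str  # 'HO' (homozygous) or 'HE' (heterozygous)


def read_variations(lines: Iterable[str]) -> Iterator[Variation]:
    """Yield one Variation per data line of a genomic-variation file."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 columns, got {len(fields)}")
        seqid, position, variant_id, reference, alternate, genotype = fields
        if genotype not in ("HO", "HE"):
            raise ValueError(f"line {number}: unknown genotype {genotype!r}")
        yield Variation(seqid, int(position), variant_id, reference, alternate, genotype)


example = [
    "#seqid position id reference alternate genotype",
    "chr20 14370 rs81 G A HE",
    "chr20 17330 rs82 T AAA HO",
]
for variation in read_variations(example):
    print(variation)
```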

AUTHORS

  • Thiago L. A. Miller <tmiller@mochsl.org.br>

  • J. Leonel Buzzo <lbuzzo@mochsl.org.br>

  • Felipe R. C. dos Santos <fsantos@mochsl.org.br>

  • Helena B. Conceição <hconceicao@mochsl.org.br>

  • Rodrigo Barreiro <rbarreiro@mochsl.org.br>

  • Gabriela Guardia <gguardia@mochsl.org.br>

  • Fernanda Orpinelli <forpinelli@mochsl.org.br>

  • Rafael Mercuri <rmercuri@mochsl.org.br>

  • Pedro A. F. Galante <pgalante@mochsl.org.br>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2023 by Teaching and Research Institute from Sírio-Libanês Hospital.

This is free software, licensed under:

  The GNU General Public License, Version 3, June 2007