Obtains the pseudo haplotype phase of the RNA-seq data for a given gene and aligns the major alleles across individuals.
phasing(dat, phased = FALSE, n_condition = "one")
Arguments
dat: |
Bulk RNA-seq dataset for a given gene. Must contain the following variables:
One-condition analysis:
- `id`: character, individual identifier;
- `ref`: numeric, the SNP-level read counts for the reference allele if the haplotype phase of the data is unknown, or the SNP-level read counts for the allele aligned on the paternal/maternal haplotype if the haplotype phase is known;
- `total`: numeric, SNP-level total read counts for both alleles;
Two-condition analysis:
- `id`: character, individual identifier;
- `snp`: character, the name or chromosome location of the heterozygous genetic variant;
- `ref`: numeric, the SNP-level read counts for the reference allele if the haplotype phase of the data is unknown, or the SNP-level read counts for the allele aligned on the same paternal/maternal haplotype in both conditions if the haplotype phase is known;
- `total`: numeric, SNP-level total read counts for both alleles;
- `group`: character, the condition each RNA-seq sample was obtained from (e.g., pre- vs post-treatment);
- `ref_condition`: character, the condition used as the reference for pseudo haplotype phasing;
|
phased: |
A logical value indicating whether the haplotype phase of the data is known. Default is FALSE. |
n_condition: |
A character string indicating whether the RNA-seq data come from only one condition or from two conditions (e.g., normal vs diseased). Possible values are "one" or "two". Default is "one". |
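The column requirements above can be sketched as two small data frames, one per analysis type. All counts, individual identifiers, SNP names (`rs1`, `rs2`) and condition labels (`pre`, `post`) below are made up for illustration, and the call to `phasing()` is guarded so the snippet also runs when the function is not loaded.

```r
# Toy one-condition input: two individuals, two heterozygous SNPs each.
# All values are invented for illustration.
dat_one <- data.frame(
  id    = c("ind1", "ind1", "ind2", "ind2"),
  ref   = c(12, 30, 8, 21),
  total = c(20, 45, 25, 40),
  stringsAsFactors = FALSE
)

# Toy two-condition input: the same individuals and SNPs, measured in a
# hypothetical "pre" and "post" condition, with "pre" as the reference.
dat_two <- data.frame(
  id    = rep(c("ind1", "ind1", "ind2", "ind2"), times = 2),
  snp   = rep(c("rs1", "rs2"), times = 4),
  ref   = c(12, 30, 8, 21, 15, 22, 10, 18),
  total = c(20, 45, 25, 40, 28, 41, 22, 39),
  group = rep(c("pre", "post"), each = 4),
  ref_condition = "pre",
  stringsAsFactors = FALSE
)

# Sanity check implied by the column descriptions:
# reference-allele counts cannot exceed total counts.
stopifnot(all(dat_one$ref <= dat_one$total),
          all(dat_two$ref <= dat_two$total))

# Call phasing() with the documented arguments only if it is available.
if (exists("phasing", mode = "function")) {
  phased_one <- phasing(dat_one, phased = FALSE, n_condition = "one")
  phased_two <- phasing(dat_two, phased = FALSE, n_condition = "two")
}
```

Storing `id`, `snp`, `group` and `ref_condition` as character rather than factor matches the types listed above.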
Value
The pseudo-phased RNA-seq data, with an additional column `major` giving the read counts for the major alleles aligned across individuals.
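To make the shape of the returned data concrete, the sketch below adds a `major` column to a toy one-condition, unphased table. It uses a deliberately simple rule: at each SNP, take the larger of the two allele counts. Both this rule and the helper name `toy_major` are assumptions made for illustration. This page does not describe how `phasing()` computes `major`, so the real function may behave differently.

```r
# Hypothetical helper, NOT the package's implementation: at each SNP,
# treat the allele with the larger read count as the major allele.
toy_major <- function(dat) {
  alt <- dat$total - dat$ref       # read counts for the non-reference allele
  dat$major <- pmax(dat$ref, alt)  # element-wise maximum of the two alleles
  dat
}

# Invented one-condition input with the documented columns.
dat <- data.frame(
  id    = c("ind1", "ind1", "ind2", "ind2"),
  ref   = c(12, 30, 8, 21),
  total = c(20, 45, 25, 40),
  stringsAsFactors = FALSE
)

out <- toy_major(dat)
# 'out' keeps id, ref and total, and gains one extra column, 'major'.
```

The input columns are left unchanged and only `major` is appended, which mirrors the "one more column" behaviour described for the return value.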