iss.error_models package
Submodules
iss.error_models.basic module
class iss.error_models.basic.BasicErrorModel
Bases: iss.error_models.ErrorModel
Basic Error Model class
Basic error model. The phred scores are based on a normal distribution. Only substitution errors occur. The substitution rate is assumed to be equal between all nucleotides.
gen_phred_scores(mean_quality, orientation)
Generate a normal distribution, transform to phred scores
Generate a list of phred scores according to a normal distribution centered around the ErrorModel quality
Parameters: mean_quality (int) – mean phred score
Returns: list of phred scores following a normal distribution
Return type: list
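As a rough illustration of the idea described for gen_phred_scores, the sketch below draws one phred score per read position from a normal distribution centered on mean_quality. It is not the InSilicoSeq implementation: the function name, the read_length and std_dev parameters, the rounding and the clipping range are all assumptions made for this example.

```python
import random


def sketch_normal_phred_scores(mean_quality, read_length, std_dev=2.0, seed=None):
    """Draw one phred score per position from a normal distribution.

    Hypothetical helper: std_dev, the rounding and the clipping to
    [0, 41] are illustrative choices, not values taken from iss.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(read_length):
        score = round(rng.gauss(mean_quality, std_dev))
        # keep the score inside a plausible phred range
        scores.append(min(max(score, 0), 41))
    return scores


scores = sketch_normal_phred_scores(mean_quality=30, read_length=10, seed=1)
print(scores)
```

The returned list has one integer per position, so it can be attached to a read of the same length as its per-base quality annotation.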
iss.error_models.cdf module
iss.error_models.kde module
class iss.error_models.kde.KDErrorModel(npz_path)
Bases: iss.error_models.ErrorModel
KDErrorModel class.
Error model based on an .npz file derived from read alignments. The .npz file must contain:
- the length of the reads
- the mean insert size
- the size of mean sequence quality bins (for R1 and R2)
- a cumulative distribution function of quality scores for each position (for R1 and R2)
- the substitution rates for each nucleotide at each position (for R1 and R2)
- the insertion and deletion rates for each position (for R1 and R2)
gen_phred_scores(cdfs, orientation)
Generate a list of phred scores based on cdfs and mean bins
For each position, draw a phred score from the cdf and append to the phred score list
Parameters: - cdfs (ndarray) – array containing the cdfs
- orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a list of phred scores
Return type: list
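The per-position draw described above can be sketched with inverse transform sampling. This is a minimal illustration, not the library's code: it assumes that cdfs[i][q] holds the cumulative probability that the phred score at position i is at most q, and the function name is hypothetical. The actual layout of the arrays in the .npz file may differ.

```python
import bisect
import random


def sketch_phred_from_cdfs(cdfs, seed=None):
    """For each position, draw a phred score from that position's cdf.

    Assumption: cdfs[i][q] is the cumulative probability that the phred
    score at position i is <= q.
    """
    rng = random.Random(seed)
    phred_list = []
    for cdf in cdfs:
        u = rng.random()  # uniform draw in [0, 1)
        # inverse transform sampling: first score whose cdf value exceeds u
        score = bisect.bisect_right(cdf, u)
        # guard against a cdf that does not quite reach 1.0 (rounding)
        phred_list.append(min(score, len(cdf) - 1))
    return phred_list


# Two positions with degenerate distributions: always 2, then always 1
example_cdfs = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
print(sketch_phred_from_cdfs(example_cdfs))
```

With the degenerate example cdfs, all probability mass sits on a single score per position, which makes the behaviour easy to check by hand.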
Module contents
class iss.error_models.ErrorModel
Bases: object
Main ErrorModel Class
This class is the base class for the other error models and contains all the functions shared by the ErrorModel classes.
adjust_seq_length(mut_seq, orientation, full_sequence, bounds)
Truncate or extend reads to make them fit the read length
When insertions or deletions are introduced to the reads, their length will change. This function takes a (mutable) read and a reference sequence, and extends or truncates the read if it has had an insertion or a deletion.
Parameters: - mut_seq (MutableSeq) – a mutable sequence
- orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- full_sequence (Seq) – the reference sequence from which mut_seq comes
- bounds (tuple) – the position of the read in the full_sequence
Returns: a sequence fitting the ErrorModel
Return type: Seq
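The truncate-or-extend behaviour can be sketched on plain strings. This is an illustration under stated assumptions rather than the method's actual code: the function name and the read_length argument are hypothetical, bounds is taken to be a (start, end) pair of 0-based positions in the reference, and a reverse read is assumed to be the reverse complement of reference[start:end].

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rev_comp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sketch_fit_to_length(read, orientation, reference, bounds, read_length):
    """Truncate a read that is too long, or extend one that is too short."""
    start, end = bounds
    if len(read) >= read_length:
        # an insertion made the read too long: drop the extra bases
        return read[:read_length]
    missing = read_length - len(read)
    if orientation == "forward":
        # a deletion made the read too short: take bases after the read
        extra = reference[end:end + missing]
    else:
        # reverse reads continue towards the start of the reference
        extra = rev_comp(reference[max(0, start - missing):start])
    return read + extra


reference = "AACCGGTTAACC"
print(sketch_fit_to_length("CCGTT", "forward", reference, (2, 8), 6))
print(sketch_fit_to_length("CCGGGTT", "forward", reference, (2, 8), 6))
print(sketch_fit_to_length("AACGG", "reverse", reference, (2, 8), 6))
```

Near the ends of the reference there may not be enough bases to extend with, in which case this sketch returns a read shorter than read_length.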
introduce_error_scores(record, orientation)
Add phred scores to a SeqRecord according to the error_model
Parameters: - record (SeqRecord) – a read record
- orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a read record with error scores
Return type: SeqRecord
introduce_indels(record, orientation, full_seq, bounds)
Introduce insertions or deletions in a sequence
Introduce insertion and deletion errors according to the probabilities present in the indel choices list
Parameters: - record (SeqRecord) – a sequence record
- orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- full_seq (Seq) – the reference sequence from which the record comes
- bounds (tuple) – the position of the read in full_seq
Returns: a sequence with (possibly) indels
Return type: Seq
load_npz(npz_path, model)
Load the error profile .npz file
Parameters: - npz_path (string) – path to the npz file
- model (string) – type of model. Could be ‘cdf’ or ‘kde’. ‘cdf’ has been deprecated and is no longer available
Returns: numpy object containing the variables necessary for error model construction
Return type: ndarray
logger
mut_sequence(record, orientation)
Introduce substitution errors to a sequence
If a random probability is higher than the probability of the basecall being correct, introduce a substitution error
Parameters: - record (SeqRecord) – a read record with error scores
- orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a sequence
Return type: Seq
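The substitution rule can be illustrated with the standard phred relation, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below is an assumption-laden example, not the method itself: it works on plain strings and a list of scores instead of a SeqRecord, the function name is hypothetical, and the replacement base is picked uniformly among the other three nucleotides.

```python
import random


def sketch_mutate(sequence, phred_scores, seed=None):
    """Introduce substitutions guided by per-base phred scores."""
    rng = random.Random(seed)
    mutated = []
    for base, quality in zip(sequence, phred_scores):
        # probability that the basecall is correct for this phred score
        p_correct = 1 - 10 ** (-quality / 10)
        if rng.random() > p_correct:
            # substitution: pick one of the other three nucleotides
            base = rng.choice([n for n in "ACGT" if n != base])
        mutated.append(base)
    return "".join(mutated)


print(sketch_mutate("ACGTACGT", [40] * 8, seed=0))
print(sketch_mutate("AAAA", [0, 0, 0, 0], seed=0))
```

A phred score of 0 gives a correct-call probability of 0, so every base is substituted, while very high scores leave the sequence effectively unchanged.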