iss.error_models package
Submodules
iss.error_models.basic module

class iss.error_models.basic.BasicErrorModel
Bases: iss.error_models.ErrorModel
Basic Error Model class
Basic error model. The phred scores are drawn from a normal distribution. Only substitution errors occur, and the substitution rate is assumed to be equal for all nucleotides.

gen_phred_scores(mean_quality, orientation)
Generate a normal distribution and transform it to phred scores.
Generate a list of phred scores according to a normal distribution centered around the ErrorModel quality.
Parameters:  mean_quality (int) – mean phred score
Returns: list of phred scores following a normal distribution
Return type: list
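
The idea of drawing phred scores from a normal distribution can be illustrated with a minimal sketch. This is not the library's implementation: the function name, the read_length, sd and seed parameters, and the clamping range of 1 to 40 are all assumptions made for illustration.

```python
import random


def sketch_normal_phred(mean_quality, read_length, sd=2.0, seed=None):
    """Draw one integer phred score per read position.

    read_length, sd and seed are hypothetical parameters for this sketch.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(read_length):
        # Sample around the requested mean, then round to an integer score.
        q = int(round(rng.gauss(mean_quality, sd)))
        # Clamp to an assumed valid range of 1..40.
        scores.append(min(max(q, 1), 40))
    return scores


scores = sketch_normal_phred(mean_quality=30, read_length=10, seed=1)
```

Passing a seed makes the draw reproducible, which is convenient when comparing simulated reads across runs.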

iss.error_models.cdf module
iss.error_models.kde module

class iss.error_models.kde.KDErrorModel(npz_path)
Bases: iss.error_models.ErrorModel
KDErrorModel class.
Error model based on an .npz file derived from read alignments. The .npz file must contain:
 the length of the reads
 the mean insert size
 the size of mean sequence quality bins (for R1 and R2)
 a cumulative distribution function of quality scores for each position (for R1 and R2)
 the substitution rates for each nucleotide at each position (for R1 and R2)
 the insertion and deletion rates for each position (for R1 and R2)

gen_phred_scores(cdfs, orientation)
Generate a list of phred scores based on cdfs and mean bins.
For each position, draw a phred score from the cdf and append it to the phred score list.
Parameters:  cdfs (ndarray) – array containing the cdfs
 orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a list of phred scores
Return type: list
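
Drawing a score from a per-position cdf is commonly done with inverse transform sampling, sketched below. The array layout (one row per read position, column j holding the probability of a score less than or equal to j) is an assumption, as is the function name. The orientation argument and the mean quality bins are left out for brevity.

```python
import numpy as np


def sketch_scores_from_cdfs(cdfs, seed=None):
    """Draw one phred score per position by inverse transform sampling.

    cdfs is assumed to be a 2-D array: one row per read position,
    column j holding P(score <= j).
    """
    rng = np.random.default_rng(seed)
    scores = []
    for cdf in cdfs:
        u = rng.random()  # uniform draw in [0, 1)
        # First index whose cumulative probability exceeds u.
        q = int(np.searchsorted(cdf, u, side="right"))
        # Guard against floating point round-off at the upper end.
        scores.append(min(q, len(cdf) - 1))
    return scores


# Two positions with degenerate cdfs: all mass on score 2, then on score 0.
toy_cdfs = np.array([[0.0, 0.0, 1.0],
                     [1.0, 1.0, 1.0]])
toy_scores = sketch_scores_from_cdfs(toy_cdfs, seed=0)
```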
Module contents

class iss.error_models.ErrorModel
Bases: object
Main ErrorModel Class
This class serves as the base for the other ErrorModel classes and contains all the functions shared by them.

adjust_seq_length(mut_seq, orientation, full_sequence, bounds)
Truncate or extend reads to make them fit the read length.
When insertions or deletions are introduced into a read, its length changes. This function takes a (mutable) read and a reference sequence, and extends or truncates the read if it has had an insertion or a deletion.
Parameters:  mut_seq (MutableSeq) – a mutable sequence
 orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
 full_sequence (Seq) – the reference sequence from which mut_seq comes
 bounds (tuple) – the position of the read in the full_sequence
Returns: a sequence fitting the ErrorModel
Return type: Seq
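
The truncate-or-extend logic can be sketched on plain strings. This simplified version makes several assumptions that are not confirmed by the documentation above: it handles forward reads only, takes the target read_length as an explicit (hypothetical) argument, and treats bounds as a (start, end) pair with an exclusive end.

```python
def sketch_adjust_length(read, read_length, full_sequence, bounds):
    """Truncate or extend a forward read (plain str) to read_length.

    bounds is assumed to be (start, end), end exclusive, locating the
    original, error-free read in full_sequence.
    """
    _, end = bounds
    if len(read) >= read_length:
        # An insertion made the read too long (or it already fits): trim.
        return read[:read_length]
    # A deletion made the read too short: pad with the reference bases
    # that follow the read's original position.
    missing = read_length - len(read)
    return read + full_sequence[end:end + missing]


reference = "ACGTACGTAC"
after_deletion = sketch_adjust_length("ACGAC", 6, reference, (0, 6))
after_insertion = sketch_adjust_length("ACGTTAC", 6, reference, (0, 6))
```

If the read sits at the very end of the reference, there may not be enough bases left to pad with, and this sketch then returns a read shorter than read_length.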

introduce_error_scores(record, orientation)
Add phred scores to a SeqRecord according to the error_model.
Parameters:  record (SeqRecord) – a read record
 orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a read record with error scores
Return type: SeqRecord

introduce_indels(record, orientation, full_seq, bounds)
Introduce insertions or deletions in a sequence.
Introduce insertion and deletion errors according to the probabilities present in the indel choices list.
Parameters:  record (SeqRecord) – a sequence record
 orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
 full_seq (Seq) – the reference sequence from which the record comes
 bounds (tuple) – the position of the read in full_seq
Returns: a sequence, possibly with indels
Return type: Seq
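
As a rough illustration of probabilistic indel introduction, the sketch below walks over a plain string and either drops a base, inserts a random base after it, or leaves it alone. It assumes a single insertion rate and a single deletion rate for the whole read, whereas the model described above uses per-position probabilities. The function and parameter names are hypothetical.

```python
import random


def sketch_introduce_indels(seq, ins_rate, del_rate, seed=None):
    """Return seq with random insertions and deletions (plain str).

    ins_rate and del_rate are assumed to be constant across positions.
    """
    rng = random.Random(seed)
    out = []
    for base in seq:
        r = rng.random()
        if r < del_rate:
            # Deletion: skip this base entirely.
            continue
        out.append(base)
        if r < del_rate + ins_rate:
            # Insertion: add a random nucleotide after the current base.
            out.append(rng.choice("ACGT"))
    return "".join(out)


unchanged = sketch_introduce_indels("ACGTACGT", ins_rate=0.0, del_rate=0.0)
```

Because the result can be longer or shorter than the input, a length adjustment step such as adjust_seq_length is needed afterwards.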

load_npz(npz_path, model)
Load the error profile .npz file.
Parameters:  npz_path (string) – path to the npz file
 model (string) – type of model. Either ‘cdf’ or ‘kde’; ‘cdf’ has been deprecated and is no longer available
Returns: numpy object containing the variables necessary for error model construction
Return type: ndarray
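
For readers unfamiliar with the .npz format, the sketch below writes a toy archive and reads it back with NumPy directly. The key names read_length and insert_size are hypothetical and are not necessarily the ones load_npz expects. Note that numpy.load on an .npz archive returns a dictionary-like NpzFile whose arrays are accessed by key.

```python
import os
import tempfile

import numpy as np

# Write a toy archive; the key names here are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "toy_profile.npz")
np.savez(path, read_length=np.array(125), insert_size=np.array(300))

# Read it back: arrays are loaded lazily and looked up by key.
with np.load(path) as profile:
    keys = sorted(profile.files)
    read_length = int(profile["read_length"])
```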

logger

mut_sequence(record, orientation)
Introduce substitution errors to a sequence.
If a randomly drawn number is higher than the probability of the base call being correct, introduce a substitution error.
Parameters:  record (SeqRecord) – a read record with error scores
 orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
Returns: a sequence
Return type: Seq
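
The substitution rule above rests on the standard phred definition, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below applies that rule to a plain string. It picks the replacement base uniformly among the three alternatives, which is an assumption of this sketch and matches only the equal-rate behaviour described for the basic model. The function names are hypothetical.

```python
import random


def phred_to_error_prob(q):
    """Standard phred definition: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def sketch_mutate(seq, quals, seed=None):
    """Substitute bases in seq (plain str) according to their phred scores."""
    rng = random.Random(seed)
    out = []
    for base, q in zip(seq, quals):
        p_correct = 1 - phred_to_error_prob(q)
        if rng.random() > p_correct:
            # Assumption: every alternative nucleotide is equally likely.
            base = rng.choice([b for b in "ACGT" if b != base])
        out.append(base)
    return "".join(out)


mutated = sketch_mutate("ACGTACGT", [30] * 8, seed=42)
```

With a score of 30, each base has a 1 in 1000 chance of being substituted, so most short reads come back unchanged.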
