iss.error_models package
Submodules
iss.error_models.basic module
- class iss.error_models.basic.BasicErrorModel(fragment_length=None, fragment_sd=None, store_mutations=False)
Bases:
ErrorModel
Basic Error Model class
Basic error model. The phred scores are based on a normal distribution. Only substitution errors occur. The substitution rate is assumed to be equal between all nucleotides.
- gen_phred_scores(mean_quality, orientation)
Generate a normal distribution, transform to phred scores
Generate a list of phred scores following a normal distribution centered on the ErrorModel quality
- Parameters:
mean_quality (int) – mean phred score
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- Returns:
list of phred scores following a normal distribution
- Return type:
list
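The idea described above can be sketched as follows. This is a minimal illustration and not the package's implementation: it assumes the normal distribution is drawn in probability space and then converted to phred scores, and the helper names, the `sd` default and the clipping threshold are all hypothetical.

```python
import math

import numpy as np


def phred_to_prob(q):
    """Probability that a basecall with phred score q is correct."""
    return 1 - 10 ** (-q / 10)


def prob_to_phred(p):
    """Rounded phred score for a probability p of a correct basecall."""
    return round(-10 * math.log10(1 - p))


def sketch_gen_phred_scores(mean_quality, read_length, sd=0.01, rng=None):
    """Hypothetical sketch: one phred score per read position."""
    if rng is None:
        rng = np.random.default_rng()
    # Draw per-position probabilities around the mean quality.
    probs = rng.normal(phred_to_prob(mean_quality), sd, read_length)
    # Clip so that 1 - p stays positive and the logarithm is defined.
    probs = np.clip(probs, 0.0, 0.9999)
    return [prob_to_phred(float(p)) for p in probs]


scores = sketch_gen_phred_scores(30, 125, rng=np.random.default_rng(0))
```

With the clipping used here, the scores cannot exceed 40; a different threshold would give a different ceiling.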
iss.error_models.cdf module
iss.error_models.kde module
- class iss.error_models.kde.KDErrorModel(npz_path, fragment_length=None, fragment_sd=None, store_mutations=False)
Bases:
ErrorModel
KDErrorModel class.
Error model based on an .npz file derived from read alignments. The .npz file must contain:
the length of the reads
the mean insert size
the size of the mean sequence quality bins (for R1 and R2)
a cumulative distribution function of quality scores for each position (for R1 and R2)
the substitution rates for each nucleotide at each position (for R1 and R2)
the insertion and deletion rates for each position (for R1 and R2)
- gen_phred_scores(cdfs, orientation)
Generate a list of phred scores based on cdfs and mean bins
For each position, draw a phred score from the cdf and append to the phred score list
- Parameters:
cdfs (ndarray) – array containing the cdfs
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- Returns:
a list of phred scores
- Return type:
list
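Drawing one phred score per position from a cdf is a form of inverse-transform sampling. The sketch below assumes `cdfs` is a 2-D array with one row per read position and one column per phred score, each row non-decreasing and ending at 1.0. That layout and the function name are assumptions made for illustration, and the sketch ignores the mean quality bins and the read orientation.

```python
import numpy as np


def sketch_draw_phred_scores(cdfs, rng=None):
    """Hypothetical sketch: draw one phred score per row of `cdfs`."""
    if rng is None:
        rng = np.random.default_rng()
    scores = []
    for cdf in cdfs:
        u = rng.random()  # uniform draw in [0, 1)
        # First column whose cumulative probability exceeds u.
        q = int(np.searchsorted(cdf, u, side="right"))
        # Guard against rounding in a cdf that stops just short of 1.0.
        scores.append(min(q, len(cdf) - 1))
    return scores


# Two positions with degenerate cdfs: always phred 2, then always phred 3.
demo_cdfs = np.array([[0.0, 0.0, 1.0, 1.0],
                      [0.0, 0.0, 0.0, 1.0]])
demo_scores = sketch_draw_phred_scores(demo_cdfs)
```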
Module contents
- class iss.error_models.ErrorModel
Bases:
object
Main ErrorModel Class
This class is used to create inheriting classes and contains all the functions that are shared by all ErrorModel classes
- adjust_seq_length(mut_seq, orientation, full_sequence, bounds)
Truncate or extend reads to make them fit the read length
When insertions or deletions are introduced into a read, its length changes. This function takes a (mutable) read and a reference sequence, and extends or truncates the read if it has had an insertion or a deletion
- Parameters:
mut_seq (MutableSeq) – a mutable sequence
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
full_sequence (Seq) – the reference sequence from which mut_seq comes
bounds (tuple) – the position of the read in the full_sequence
- Returns:
a sequence fitting the ErrorModel
- Return type:
Seq
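As a rough illustration of the truncate-or-extend logic, the sketch below works on plain strings and handles forward reads only. It assumes `bounds` is a `(start, end)` pair of 0-based positions in the reference, and that a read shortened by a deletion is extended with the reference bases that follow `end`. The function name and these conventions are assumptions, not the method's actual behavior.

```python
def sketch_adjust_length(read, read_length, reference, bounds):
    """Hypothetical sketch: make a forward read exactly `read_length` long."""
    start, end = bounds
    if len(read) >= read_length:
        # An insertion made the read too long: drop the extra bases.
        return read[:read_length]
    # A deletion made the read too short: borrow bases downstream of it.
    missing = read_length - len(read)
    return read + reference[end:end + missing]


reference = "ACGTACGTAC"
after_deletion = sketch_adjust_length("AGT", 4, reference, (0, 4))
after_insertion = sketch_adjust_length("ACGGT", 4, reference, (0, 4))
```

A reverse read would need the extension taken from the other side of the fragment and reverse-complemented, which this sketch leaves out.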
- introduce_error_scores(record, orientation)
Add phred scores to a SeqRecord according to the error_model
- Parameters:
record (SeqRecord) – a read record
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- Returns:
a read record with error scores
- Return type:
SeqRecord
- introduce_indels(record, orientation, full_seq, bounds)
Introduce insertions or deletions in a sequence
Introduce insertion and deletion errors according to the probabilities present in the indel choices list
- Parameters:
record (SeqRecord) – a sequence record
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
full_seq (Seq) – the reference sequence from which the record comes
bounds (tuple) – the position of the read in full_seq
- Returns:
a sequence record with indel errors
- Return type:
SeqRecord
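One way to picture per-position indel errors is shown below. The sketch takes a plain string and two hypothetical lists of per-position rates, `ins_rates` and `del_rates`. The real method works on a SeqRecord and on the model's own indel choices, whose structure is not described here.

```python
import random


def sketch_introduce_indels(seq, ins_rates, del_rates, rng=None):
    """Hypothetical sketch: apply at most one indel per position."""
    if rng is None:
        rng = random.Random()
    out = []
    for base, p_ins, p_del in zip(seq, ins_rates, del_rates):
        r = rng.random()  # uniform draw in [0, 1)
        if r < p_del:
            continue  # deletion: skip this base
        out.append(base)
        if r < p_del + p_ins:
            # insertion: add a random base after the current one
            out.append(rng.choice("ACGT"))
    return "".join(out)


unchanged = sketch_introduce_indels("ACGT", [0.0] * 4, [0.0] * 4)
```

The result usually no longer has the original length, which is why a length adjustment step is needed afterwards.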
- load_npz(npz_path, model)
Load the error profile .npz file
- Parameters:
npz_path (string) – path to the npz file
model (string) – type of model. Can be ‘cdf’ or ‘kde’. ‘cdf’ has been deprecated and is no longer available
- Returns:
numpy object containing the variables necessary for error model construction
- Return type:
ndarray
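For orientation, the snippet below shows how an .npz archive is written and read with NumPy itself. The key names `read_length` and `insert_size` are made up for the example and are not necessarily the keys the package expects. Note that `numpy.load` on an .npz archive returns a dict-like `NpzFile` object whose entries are arrays.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "profile.npz")
    # Hypothetical keys, chosen only for this illustration.
    np.savez(path, read_length=np.array(125), insert_size=np.array(300))

    # The archive is opened lazily; each entry is read on access.
    with np.load(path) as profile:
        keys = sorted(profile.files)
        read_length = int(profile["read_length"])
```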
- property logger
- mut_sequence(record, orientation)
Introduce substitution errors to a sequence
If a random probability is higher than the probability of the basecall being correct, introduce a substitution error
- Parameters:
record (SeqRecord) – a read record with error scores
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
- Returns:
the read record with substitution errors
- Return type:
SeqRecord
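The substitution rule described for mut_sequence can be sketched on a plain string as follows. The sketch assumes the probability of a correct basecall is derived from the phred score as 1 - 10^(-Q/10), and that the replacement base is chosen uniformly among the three other nucleotides, as in the basic model. A model with per-nucleotide substitution rates would weight that choice instead. The function name is hypothetical.

```python
import random


def sketch_mut_sequence(seq, phred_scores, rng=None):
    """Hypothetical sketch: substitute bases according to their phred scores."""
    if rng is None:
        rng = random.Random()
    out = []
    for base, q in zip(seq, phred_scores):
        p_correct = 1 - 10 ** (-q / 10)
        if rng.random() > p_correct:
            # Basecall treated as wrong: pick one of the other three bases.
            base = rng.choice([b for b in "ACGT" if b != base])
        out.append(base)
    return "".join(out)


high_quality = sketch_mut_sequence("ACGT", [1000] * 4)
```

With a phred score of 1000 the probability of a correct basecall is 1.0 in floating point, so the example read comes back unchanged.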