iss.error_models package

Submodules

iss.error_models.basic module

class iss.error_models.basic.BasicErrorModel

Bases: iss.error_models.ErrorModel

Basic Error Model class

Basic error model. The phred scores follow a normal distribution. Only substitution errors occur. The substitution rate is assumed to be equal between all nucleotides.

gen_phred_scores(mean_quality, orientation)

Generate a normal distribution, transform to phred scores

Generate a list of phred scores according to a normal distribution centered around the ErrorModel quality

Parameters:

mean_quality (int) – mean phred score
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’

Returns:

list of phred scores following a normal distribution

Return type:

list
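The idea described above (draw from a normal distribution, then convert to integer phred scores) can be sketched as follows. This is a minimal illustration and not the InSilicoSeq implementation: the function name, the read_length and std_dev parameters, and the clipping range of 1 to 41 are all assumptions made for the example.

```python
import numpy as np


def sketch_normal_phred_scores(mean_quality, read_length, std_dev=0.5, seed=None):
    """Draw one phred score per read position from a normal distribution.

    Hypothetical helper: read_length, std_dev and the 1..41 clipping
    range are assumptions, not values taken from InSilicoSeq.
    """
    rng = np.random.default_rng(seed)
    # One real-valued draw per position, centered on the mean quality.
    raw = rng.normal(loc=mean_quality, scale=std_dev, size=read_length)
    # Round to whole numbers and keep them inside the assumed phred range.
    clipped = np.clip(np.rint(raw), 1, 41)
    return [int(q) for q in clipped]


scores = sketch_normal_phred_scores(mean_quality=30, read_length=125, seed=42)
```

The result is a plain Python list with one integer score per position, which matches the documented return type.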

random_insert_size()

Fake random function returning the default insert size of the basic error model

Returns:	insert size
Return type:	int

iss.error_models.cdf module

iss.error_models.kde module

class iss.error_models.kde.KDErrorModel(npz_path)

Bases: iss.error_models.ErrorModel

KDErrorModel class.

Error model based on an .npz file derived from read alignments. The .npz file must contain:

the length of the reads
the mean insert size
the size of mean sequence quality bins (for R1 and R2)
a cumulative distribution function of quality scores for each position (for R1 and R2)
the substitution rates for each nucleotide at each position (for R1 and R2)
the insertion and deletion rates for each position (for R1 and R2)

gen_phred_scores(cdfs, orientation)

Generate a list of phred scores based on cdfs and mean bins

For each position, draw a phred score from the cdf and append to the phred score list

Parameters:

cdfs (ndarray) – array containing the cdfs
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’

Returns:

a list of phred scores

Return type:

list
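Drawing a value from a cdf is commonly done with inverse transform sampling, and the per-position draw described above could look roughly like the sketch below. The layout of the array is an assumption: here each row is one read position and column i holds the cumulative probability of a phred score less than or equal to i. The function name is hypothetical and the real .npz layout may differ.

```python
import numpy as np


def sketch_phred_from_cdfs(cdfs, seed=None):
    """Draw one phred score per position by inverse transform sampling.

    Assumed layout (not confirmed by the documentation): cdfs[pos][i] is
    the cumulative probability of observing a score <= i at that position.
    """
    rng = np.random.default_rng(seed)
    phred_scores = []
    for cdf in cdfs:
        u = rng.random()  # uniform draw in [0, 1)
        # First index whose cumulative probability is strictly above u.
        score = int(np.searchsorted(cdf, u, side="right"))
        # Guard against a cdf that ends slightly below 1 due to rounding.
        phred_scores.append(min(score, len(cdf) - 1))
    return phred_scores


# Degenerate cdfs make the outcome deterministic: each row puts all of
# its probability mass on a single score (3, 0 and 4 respectively).
toy_cdfs = np.array([
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
drawn = sketch_phred_from_cdfs(toy_cdfs, seed=0)  # [3, 0, 4]
```

Using side="right" means a score with zero probability mass is never selected, even when the uniform draw is exactly 0.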

random_insert_size()

Draw a random insert size from the insert size cdf

Parameters:	i_size_cdf – cumulative distribution function of the insert size
Returns:	an insert size
Return type:	int

Module contents

class iss.error_models.ErrorModel

Bases: object

Main ErrorModel Class

This class is used to create inheriting classes and contains all the functions that are shared by all ErrorModel classes

adjust_seq_length(mut_seq, orientation, full_sequence, bounds)

Truncate or extend reads to make them fit the read length

When insertions or deletions are introduced to the reads, their length will change. This function takes a (mutable) read and a reference sequence, and extends or truncates the read if it has had an insertion or a deletion

Parameters:

mut_seq (MutableSeq) – a mutable sequence
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
full_sequence (Seq) – the reference sequence from which mut_seq comes
bounds (tuple) – the position of the read in the full_sequence

Returns:

a sequence fitting the ErrorModel

Return type:

Seq
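The truncate-or-extend logic can be illustrated with plain strings. The sketch below covers the forward orientation only and is not the library's code: the function name, the explicit read_length argument, the interpretation of bounds as a (start, end) pair, and the "N" padding used when the reference runs out are all assumptions.

```python
def sketch_adjust_length(read, full_sequence, bounds, read_length, pad="N"):
    """Make a forward read exactly read_length bases long.

    Hypothetical helper working on str objects rather than Biopython
    sequences. bounds is assumed to be (start, end) in full_sequence.
    """
    _, read_end = bounds
    if len(read) >= read_length:
        # An insertion made the read too long (or it already fits): truncate.
        return read[:read_length]
    # A deletion made the read too short: borrow the bases that follow
    # the read in the reference sequence.
    missing = read_length - len(read)
    extension = full_sequence[read_end:read_end + missing]
    # If the reference ends first, fill the remainder with the pad symbol.
    extension += pad * (missing - len(extension))
    return read + extension


reference = "ACGTACGTAC"
after_deletion = sketch_adjust_length("ACTA", reference, (0, 5), read_length=5)
after_insertion = sketch_adjust_length("ACGGTA", reference, (0, 5), read_length=5)
at_reference_end = sketch_adjust_length("TC", reference, (7, 10), read_length=3)
```

A reverse read would need the extension taken from the other strand, which this sketch deliberately leaves out.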

introduce_error_scores(record, orientation)

Add phred scores to a SeqRecord according to the error_model

Parameters:

record (SeqRecord) – a read record
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’

Returns:

a read record with error scores

Return type:

SeqRecord

introduce_indels(record, orientation, full_seq, bounds)

Introduce insertions or deletions in a sequence

Introduce insertion and deletion errors according to the probabilities present in the indel choices list

Parameters:

record (SeqRecord) – a sequence record
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’
full_seq (Seq) – the reference sequence from which the record comes
bounds (tuple) – the position of the read in full_seq

Returns:

a sequence that may contain indels

Return type:

Seq
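To make the idea of position-specific indel probabilities concrete, here is a deliberately simplified sketch. It does not reproduce the structure of the indel choices list mentioned above: it assumes one insertion rate and one deletion rate per position, picks inserted bases uniformly, and uses a hypothetical function name.

```python
import random


def sketch_introduce_indels(read, insertion_rates, deletion_rates, seed=None):
    """Apply per-position insertions and deletions to a read string.

    Simplifying assumptions: a single rate per position for each event
    type, and inserted bases chosen uniformly from A, C, G and T.
    """
    rng = random.Random(seed)
    out = []
    for base, p_ins, p_del in zip(read, insertion_rates, deletion_rates):
        if rng.random() < p_del:
            continue  # deletion: drop this base
        out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice("ACGT"))  # insertion after this base
    return "".join(out)


read = "ACGTACGT"
no_indels = sketch_introduce_indels(read, [0.0] * 8, [0.0] * 8)
mutated = sketch_introduce_indels(read, [0.05] * 8, [0.05] * 8, seed=7)
```

Because the output length can differ from the input length, a step such as adjust_seq_length is needed afterwards to restore the expected read length.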

load_npz(npz_path, model)

Load the error profile .npz file

Parameters:

npz_path (string) – path to the npz file
model (string) – type of model. Could be ‘cdf’ or ‘kde’. ‘cdf’ has been deprecated and is no longer available

Returns:

NumPy object containing the variables necessary for error model construction

Return type:

ndarray
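For readers unfamiliar with the .npz format, the sketch below writes a toy archive with numpy.savez and reads it back with numpy.load. The key names read_length and mean_insert_size are hypothetical placeholders chosen for the example; the names actually stored in an InSilicoSeq error profile are not given in this page.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    npz_path = os.path.join(tmp_dir, "toy_profile.npz")
    # Write a toy archive. The key names are placeholders, not the real
    # names used by InSilicoSeq error profiles.
    np.savez(npz_path, read_length=np.array(125), mean_insert_size=np.array(300))

    # numpy.load returns a dictionary-like object for .npz archives;
    # arrays are read from disk when a key is accessed.
    with np.load(npz_path) as profile:
        keys = sorted(profile.files)
        read_length = int(profile["read_length"])
        mean_insert_size = int(profile["mean_insert_size"])
```

Reading the values inside the with block matters because the archive is closed on exit.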

logger

mut_sequence(record, orientation)

Introduce substitution errors to a sequence

If a random probability is higher than the probability of the basecall being correct, introduce a substitution error

Parameters:

record (SeqRecord) – a read record with error scores
orientation (string) – orientation of the read. Can be ‘forward’ or ‘reverse’

Returns:

a sequence

Return type:

Seq
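The rule described above relies on the standard phred relation, where a score Q corresponds to an error probability of 10^(-Q/10). The following sketch applies that rule to a plain string. It is an illustration only: the function name is hypothetical, and the replacement base is chosen uniformly among the three other nucleotides, which corresponds to the equal-rate assumption of the basic model rather than to position-specific substitution rates.

```python
import random


def sketch_substitute(read, phred_scores, seed=None):
    """Introduce substitution errors according to per-base phred scores.

    Hypothetical helper operating on str. The replacement base is drawn
    uniformly, an assumption matching the basic model only.
    """
    rng = random.Random(seed)
    out = []
    for base, q in zip(read, phred_scores):
        # Standard phred definition: P(error) = 10 ** (-Q / 10).
        p_correct = 1 - 10 ** (-q / 10)
        if rng.random() > p_correct:
            # The draw exceeds the probability of a correct call:
            # replace the base with one of the three others.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


read = "ACGTACGT"
noisy = sketch_substitute(read, [30, 30, 30, 20, 20, 10, 10, 2], seed=11)
```

With a score of 30 a base is changed about once in a thousand draws, while a score of 2 changes it roughly 63% of the time.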
