bioino package
Submodules
bioino.cli module
Command-line interface to bioino.
bioino.fasta module
Input and output functions and classes for FASTA files.
- class bioino.fasta.FastaCollection(sequences: Iterable[FastaSequence] = <factory>)
Bases: object
Collection of FASTA sequences for reading and writing.
- sequences
Iterable of FastaSequence
- Type:
Iterable[FastaSequence]
Examples
>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> print(fasta_stream)
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
- classmethod from_file(file: str | TextIOWrapper)
Read sequences from a FASTA file.
Takes a file handle or filename and creates a new FastaCollection containing one FastaSequence per sequence in the file.
- Parameters:
file (TextIO or str) – Filename, or a file handle such as one generated by open(f, mode='r').
- Return type:
FastaCollection
Examples
>>> from io import StringIO
>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> fasta_file = StringIO()
>>> fasta_stream.write(fasta_file)
>>> fasta_file.seek(0)  # rewind file
0
>>> fasta_stream2 = FastaCollection.from_file(fasta_file)
>>> print(fasta_stream2)
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
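To make the reading step concrete, the following is a minimal standard-library sketch of splitting FASTA text into (name, description, sequence) records. The function name parse_fasta is hypothetical and this is not bioino's parser; it assumes the header convention shown in the examples above, where the first word after ">" is the name and the rest of the line is the description.

```python
from io import StringIO
from typing import Iterator, TextIO, Tuple


def parse_fasta(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, description, sequence) for each record in a FASTA handle.

    Hypothetical helper for illustration; not part of bioino.
    """
    name, description, chunks = None, "", []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                # A new header closes the previous record.
                yield name, description, "".join(chunks)
            # First word of the header is the name, the rest the description.
            name, _, description = line[1:].partition(" ")
            chunks = []
        elif name is not None:
            chunks.append(line)  # a sequence may span several lines
    if name is not None:
        yield name, description, "".join(chunks)


records = list(parse_fasta(StringIO(
    ">example This is a description\nATCG\n"
    ">example2 This is another sequence\nGGG\nAAAA\n"
)))
print(records)
```

Each tuple could then be passed to FastaSequence(name, description, sequence), matching the constructor signature documented on this page.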
- classmethod from_pandas(data: DataFrame, sequence: str, names: str | Iterable[str], descriptions: str | Iterable[str] | None = None, name_sep: str = '_', desc_sep: str = ';')
Create a FastaCollection from a Pandas DataFrame.
The FASTA sequence is taken from the sequence column. The name is built from the values of the names columns, joined by name_sep. If provided, the values of the descriptions columns are added to the description field as 'key=value' pairs, separated by desc_sep.
- Parameters:
data (pd.DataFrame) – Input data. Must contain the columns given in sequence, names, and (optionally) descriptions.
sequence (str) – Name of column containing sequences.
names (str or iterable of str) – Names of columns to use as sequence names in FASTA.
descriptions (str or iterable of str, optional) – Names of columns to add as metadata to the description in FASTA.
name_sep (str, optional) – Separator between name values. Default: ‘_’.
desc_sep (str, optional) – Separator between description values. Default: ‘;’.
- Returns:
FastaCollection – Collection of FastaSequence objects, each representing a single FASTA sequence built from the input data.
Examples
>>> import pandas as pd
>>> df = pd.DataFrame(dict(seq=['atcg', 'aaaa'],
...                        title=['seq1', 'seq2'],
...                        info=['SeqA', 'SeqB'],
...                        score=[1, 2]))
>>> df
    seq title  info  score
0  atcg  seq1  SeqA      1
1  aaaa  seq2  SeqB      2
>>> FastaCollection.from_pandas(df, sequence='seq',
...                             names=['title'],
...                             descriptions=['info', 'score']).write()
>seq1 info=SeqA;score=1
atcg
>seq2 info=SeqB;score=2
aaaa
>>> FastaCollection.from_pandas(df, sequence='seq',
...                             names=['title', 'info'],
...                             descriptions=['score']).write()
>seq1_SeqA score=1
atcg
>seq2_SeqB score=2
aaaa
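The way name_sep and desc_sep combine column values can be sketched without pandas or bioino. The helper fasta_header below is hypothetical and only mirrors the header format shown in the examples above; a plain dict stands in for one DataFrame row, and the real implementation may differ in details such as how missing values are handled.

```python
def fasta_header(row, names, descriptions=(), name_sep="_", desc_sep=";"):
    """Build a FASTA header line from one row (a mapping of column -> value).

    Hypothetical helper for illustration; not part of bioino.
    """
    # Name: values of the chosen columns joined by name_sep.
    name = name_sep.join(str(row[col]) for col in names)
    # Description: 'key=value' pairs joined by desc_sep.
    desc = desc_sep.join(f"{col}={row[col]}" for col in descriptions)
    # Drop the trailing space when there is no description.
    return f">{name} {desc}".rstrip()


row = {"seq": "atcg", "title": "seq1", "info": "SeqA", "score": 1}
print(fasta_header(row, names=["title"], descriptions=["info", "score"]))
print(fasta_header(row, names=["title", "info"], descriptions=["score"]))
```

The two printed headers correspond to the first record of each from_pandas call in the examples above.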
- sequences: Iterable[FastaSequence]
- write(file: TextIOWrapper | None = None)
Stream sequences to a FASTA file.
Writes each FastaSequence in the collection to the given file.
- Parameters:
file (TextIO, optional) – File handle such as one generated by open(f, mode='w'). In the examples on this page, calling write() without a file prints the sequences to standard output.
- Return type:
None
Examples
>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> fasta_stream.write()
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
- class bioino.fasta.FastaSequence(name: str, description: str, sequence: str)
Bases: object
Object which gives a FASTA-formatted sequence when printed.
- name
Name of the sequence.
- Type:
str
- description
Sequence description string.
- Type:
str
- sequence
The actual sequence.
- Type:
str
Examples
>>> s = FastaSequence("example", "This is a description", "ATCG")
>>> print(s)
>example This is a description
ATCG
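As a rough illustration of an object that "gives a FASTA-formatted sequence when printed", here is a small stand-in dataclass with the same three fields. SequenceRecord is a hypothetical name and not bioino's class; the sketch assumes the printed form is a ">name description" header followed by the sequence on the next line, as in the example above.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """Hypothetical stand-in for illustration; not bioino.fasta.FastaSequence."""

    name: str
    description: str
    sequence: str

    def __str__(self) -> str:
        # Header line, then the sequence on its own line.
        return f">{self.name} {self.description}\n{self.sequence}"


record = SequenceRecord("example", "This is a description", "ATCG")
print(record)
```

Keeping the formatting in __str__ means print() and string interpolation both produce FASTA text, while the fields stay available as ordinary attributes.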
- description: str
- name: str
- sequence: str
bioino.gff module
bioino.tables module
Utilities for working with tables.