bioino package

Submodules

bioino.cli module

Command-line interface to bioino.

bioino.cli.main() → None[source]

bioino.fasta module

Input and output functions and classes for FASTA files.

class bioino.fasta.FastaCollection(sequences: ~typing.Iterable[~bioino.fasta.FastaSequence] = <factory>)[source]

Bases: object

Collection of FASTA sequences for reading and writing.

sequences

Sequences held by the collection.

Type:

Iterable[FastaSequence], optional

from_file()[source]

Instantiate by reading a FASTA file.

write()[source]

Write sequences to FASTA file.

Examples

>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> print(fasta_stream)
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
classmethod from_file(file: str | TextIOWrapper)[source]

Read sequences from a FASTA file.

Takes a file handle or filename and creates a new FastaCollection containing one FastaSequence per record in the file.

Parameters:

file (TextIO or str) – Filename or file handle, such as one generated by open(f, mode='r').

Return type:

FastaCollection

Examples

>>> from io import StringIO
>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> fasta_file = StringIO()
>>> fasta_stream.write(fasta_file)
>>> fasta_file.seek(0)  # rewind file
0
>>> fasta_stream2 = FastaCollection.from_file(fasta_file)
>>> print(fasta_stream2)
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
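
The reading step described above can be illustrated without bioino. The following is a minimal sketch, not the library's implementation: parse_fasta is a hypothetical helper, and it assumes that the header line is split at the first run of whitespace into a name and a description, as the printed records above suggest.

```python
from io import StringIO
from typing import Iterator, TextIO, Tuple


def parse_fasta(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, description, sequence) for each record in a FASTA handle.

    Hypothetical helper for illustration; not part of bioino.
    """
    name, description, chunks = None, "", []
    for raw_line in handle:
        line = raw_line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, description, "".join(chunks)
            # Assumption: name and description are separated by whitespace.
            header = line[1:].split(maxsplit=1)
            name = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif line and name is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if name is not None:
        yield name, description, "".join(chunks)


text = (
    ">example This is a description\n"
    "ATCG\n"
    ">example2 This is another sequence\n"
    "GGGAAAA\n"
)
records = list(parse_fasta(StringIO(text)))
```

Here records holds one (name, description, sequence) tuple per FASTA entry, mirroring the one-FastaSequence-per-record behaviour that from_file documents.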
classmethod from_pandas(data: DataFrame, sequence: str, names: str | Iterable[str], descriptions: str | Iterable[str] | None = None, name_sep: str = '_', desc_sep: str = ';')[source]

Create a FastaCollection from a Pandas DataFrame.

Each sequence is taken from the sequence column. Each name is built from the values of the names columns, joined by name_sep. If description columns are provided, their values are added to the description field as 'key=value' pairs, separated by desc_sep.

Parameters:
  • data (pd.DataFrame) – Input data. Must contain the columns given by sequence, names, and (optionally) descriptions.

  • sequence (str) – Name of the column containing sequences.

  • names (str or list) – Name(s) of the column(s) to use as sequence names in FASTA.

  • descriptions (str or list, optional) – Name(s) of the column(s) to add as metadata to the description in FASTA.

  • name_sep (str, optional) – Separator between name values. Default: '_'.

  • desc_sep (str, optional) – Separator between description values. Default: ';'.

Returns:

FastaCollection – Collection with one FastaSequence per row of data.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame(dict(seq=['atcg', 'aaaa'],
...                  title=['seq1', 'seq2'],
...                  info=['SeqA', 'SeqB'],
...                  score=[1, 2]))
>>> df  
    seq title  info  score
0  atcg  seq1  SeqA      1
1  aaaa  seq2  SeqB      2
>>> FastaCollection.from_pandas(df, sequence='seq',
...                             names=['title'],
...                             descriptions=['info', 'score']).write()  
>seq1 info=SeqA;score=1
atcg
>seq2 info=SeqB;score=2
aaaa
>>> FastaCollection.from_pandas(df, sequence='seq',
...                             names=['title', 'info'],
...                             descriptions=['score']).write()  
>seq1_SeqA score=1
atcg
>seq2_SeqB score=2
aaaa
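
The header rules described for from_pandas can be sketched on a single row. This is only an illustration under stated assumptions: fasta_header is a hypothetical helper that is not part of bioino, and a plain dict stands in for a DataFrame row so that the snippet needs no third-party imports.

```python
from typing import Mapping, Sequence


def fasta_header(
    row: Mapping[str, object],
    names: Sequence[str],
    descriptions: Sequence[str] = (),
    name_sep: str = "_",
    desc_sep: str = ";",
) -> str:
    """Build a FASTA header line from one table row (hypothetical helper)."""
    # Name: values of the name columns, joined by name_sep.
    name = name_sep.join(str(row[col]) for col in names)
    # Description: 'key=value' pairs for each description column.
    description = desc_sep.join(f"{col}={row[col]}" for col in descriptions)
    # Drop the trailing space when there is no description.
    return f">{name} {description}".rstrip()


# Same values as the first row of the DataFrame in the examples above.
row = {"seq": "atcg", "title": "seq1", "info": "SeqA", "score": 1}
header_a = fasta_header(row, names=["title"], descriptions=["info", "score"])
header_b = fasta_header(row, names=["title", "info"], descriptions=["score"])
```

With these inputs, header_a and header_b reproduce the first header line of each from_pandas example above.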
sequences: Iterable[FastaSequence]
write(file: TextIOWrapper | None = None)[source]

Stream sequences to a FASTA file.

Writes each FastaSequence in sequences to the given file.

Parameters:
  • file (TextIO, optional) – File handle, such as one generated by open(f, mode='w'). Default: None; when omitted, the records are printed, as in the example below.

Return type:

None

Examples

>>> seq1 = FastaSequence("example", "This is a description", "ATCG")
>>> seq2 = FastaSequence("example2", "This is another sequence", "GGGAAAA")
>>> fasta_stream = FastaCollection([seq1, seq2])
>>> fasta_stream.write()
>example This is a description
ATCG
>example2 This is another sequence
GGGAAAA
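
The optional file argument can be pictured with a small stand-in. This is a sketch, not bioino's code: write_fasta is a hypothetical function operating on plain tuples, and it assumes that passing no handle means writing to standard output, which is what the example above appears to show.

```python
import sys
from io import StringIO
from typing import Iterable, Optional, TextIO, Tuple


def write_fasta(
    records: Iterable[Tuple[str, str, str]],
    file: Optional[TextIO] = None,
) -> None:
    """Write (name, description, sequence) tuples as FASTA (hypothetical helper)."""
    # Assumption: no handle means standard output.
    out = sys.stdout if file is None else file
    for name, description, sequence in records:
        out.write(f">{name} {description}\n{sequence}\n")


# Writing to an in-memory buffer instead of a file on disk.
buffer = StringIO()
write_fasta([("example", "This is a description", "ATCG")], file=buffer)
```

Any object with a write method, such as a handle from open(f, mode='w') or a StringIO buffer, can be passed as file in this sketch.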
class bioino.fasta.FastaSequence(name: str, description: str, sequence: str)[source]

Bases: object

Object that produces a FASTA-formatted record when printed.

name

Name of the sequence.

Type:

str

description

Sequence description string.

Type:

str

sequence

The actual sequence.

Type:

str

__str__()[source]

Show the FASTA-formatted sequence.

Examples

>>> s = FastaSequence("example", "This is a description", "ATCG")
>>> print(s)
>example This is a description
ATCG
description: str
name: str
sequence: str
write(file: TextIOWrapper | None = None) → None[source]

bioino.gff module

bioino.tables module

Utilities for working with tables.

Module contents