create_db() does not parse directives from GFF files starting in v0.11.0 #213

@dtdoering

I am trying to read in a GFF that doesn't adhere to spec so I can use gffutils to fix it and then write out a corrected file. I've gotten everything working, except that the GFF header directive(s) don't appear in the output file. It appears they are not being parsed upon creation of the database.

Interestingly, this only happens when a FeatureDB is created from a file, not from e.g. a dedent()ed string as in the existing parser_test.py.

This behavior is exhibited in v0.11.1 and v0.11.0, but not v0.10.1 (thus, a workaround is to downgrade to v0.10.1).
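For anyone who can't downgrade, another possible stopgap is to collect the directive lines straight from the input file and write them out ahead of the corrected features. Below is a minimal sketch using only the standard library; `read_directives` is a hypothetical helper name, not part of gffutils, and it assumes a directive is any line starting with `##`.

```python
def read_directives(path):
    """Return the text of every '##' line in a GFF file, without the leading '##'.

    Hypothetical helper, not part of gffutils. It assumes a directive is
    any line that starts with '##' and does not interpret its contents.
    """
    directives = []
    with open(path) as handle:
        for line in handle:
            if line.startswith("##"):
                # Drop the '##' prefix and surrounding whitespace/newline.
                directives.append(line[2:].strip())
    return directives


def write_directives(directives, out_handle):
    """Write each directive back out with its '##' prefix, one per line."""
    for directive in directives:
        out_handle.write("##" + directive + "\n")
```

With the `test.gff` from the steps below, `read_directives('test.gff')` should return `['gff-version 3']`; calling `write_directives(...)` on the output handle before writing the features puts the header back in the corrected file.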

To reproduce:

Use a gffutils version newer than v0.10.1 (v0.11.0 or v0.11.1).

Create a test file test.gff with the following:

##gff-version 3
.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.

Example code:

import gffutils
db = gffutils.create_db('test.gff', dbfn='test.db', force=True)
print(len(db.directives))

Expected output:

1

Observed output:

0

System/environment info:

OS: GNU/Linux

Python: 3.8.5 (still happens with 3.8.13)

Conda environment:

name: gffutils-py3_8_5
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_gnu
  - argcomplete=3.0.5=pyhd8ed1ab_0
  - argh=0.27.2=pyhd8ed1ab_0
  - biopython=1.81=py38h1de0b5d_0
  - ca-certificates=2022.12.7=ha878542_0
  - gffutils=0.11.1=pyh7cba7a3_0
  - ld_impl_linux-64=2.40=h41732ed_0
  - libblas=3.9.0=16_linux64_openblas
  - libcblas=3.9.0=16_linux64_openblas
  - libffi=3.2.1=he1b5a44_1007
  - libgcc-ng=12.2.0=h65d4601_19
  - libgfortran-ng=12.2.0=h69a702a_19
  - libgfortran5=12.2.0=h337968e_19
  - libgomp=12.2.0=h65d4601_19
  - liblapack=3.9.0=16_linux64_openblas
  - libopenblas=0.3.21=pthreads_h78a6416_3
  - libsqlite=3.40.0=h753d276_0
  - libstdcxx-ng=12.2.0=h46fd767_19
  - libzlib=1.2.13=h166bdaf_4
  - ncurses=6.3=h27087fc_1
  - numpy=1.24.2=py38h10c12cc_0
  - openssl=1.1.1t=h0b41bf4_0
  - packaging=23.0=pyhd8ed1ab_0
  - pip=23.0.1=pyhd8ed1ab_0
  - pyfaidx=0.7.2.1=pyh7cba7a3_1
  - python=3.8.5=h1103e12_9_cpython
  - python_abi=3.8=3_cp38
  - pyvcf3=1.0.3=pyhdfd78af_0
  - readline=8.2=h8228510_1
  - setuptools=67.6.1=pyhd8ed1ab_0
  - simplejson=3.18.4=py38h1de0b5d_0
  - six=1.16.0=pyh6c4a22f_0
  - sqlite=3.40.0=h4ff8645_0
  - tk=8.6.12=h27826a3_0
  - wheel=0.40.0=pyhd8ed1ab_0
  - xz=5.2.6=h166bdaf_0
  - zlib=1.2.13=h166bdaf_4
prefix: /home/dtdoering__lbl.gov/.conda/envs/gffutils-py3_8_5
