Oil Reservoir Gene Database
Microbial Enhanced Oil Recovery (MEOR)
Introduction

Oil Reservoir Gene database (ORGdb)™ version 1.0 beta is a curated collection of alkane hydroxylase (AH) genes, focused mainly on alkB, ladA, P450 (cyp153) and almA. The AH genes in this database were retrieved from genomic data using a hidden Markov model (HMM) search. Because the genes are identified directly in genome sequences, associated information can be linked to each AH gene, including taxonomic lineage, flanking regions, habitat (isolation source) and geographic location. Genome sequences of 343,166 species were retrieved from NCBI, of which 28,192 were complete, 117,342 were at scaffold level and 197,632 were at contig level.
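As a quick consistency check on the assembly-level figures quoted above, the following minimal Python sketch sums the three counts and reports each level's share of the total. The variable and function names are illustrative only and are not part of ORGdb.

```python
# Assembly-level counts as quoted in the introduction.
assembly_counts = {
    "complete": 28_192,
    "scaffold": 117_342,
    "contig": 197_632,
}


def level_shares(counts):
    """Return each assembly level's share of the total, in percent."""
    n = sum(counts.values())
    return {level: round(100 * c / n, 2) for level, c in counts.items()}


total = sum(assembly_counts.values())
shares = level_shares(assembly_counts)

print(f"total genomes: {total:,}")
for level, pct in shares.items():
    print(f"{level:>8}: {pct}%")
```

The three levels add up to the stated total of 343,166, so the breakdown is internally consistent.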


LadA (EC 1.14.14.28) is a long-chain alkane monooxygenase that converts long-chain alkanes (up to C36) to the corresponding primary alcohols. Initially, nucleotide sequences of 1,521 ladA genes were downloaded from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database under KEGG Orthology K20938 and used for the nhmmer search. A total of 48,645 genomes contained at least one ladA gene copy, and the 87,788 corresponding ladA gene sequences are maintained in ORGdb. Of these genomes, 50.76% (24,694 out of 48,645) contained multiple copies of the ladA gene. Mycolicibacterium septicum DSM 44393 (high GC Gram+) and Agrobacterium rhizogenes (α-proteobacteria) each had up to 13 copies of the ladA gene.
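The copy-number statistics reported for each gene (genomes with at least one copy, total sequences, share of multi-copy genomes, maximum copies) can be derived from a simple list of hits. The sketch below assumes one genome identifier per retained hit; the function name and the toy identifiers are hypothetical, and this is not necessarily how ORGdb computes its figures.

```python
from collections import Counter


def copy_number_summary(hit_genomes):
    """Summarise gene copy numbers from a list holding one genome ID per hit."""
    copies = Counter(hit_genomes)
    n_genomes = len(copies)
    n_multi = sum(1 for c in copies.values() if c > 1)
    percent = round(100 * n_multi / n_genomes, 2) if n_genomes else 0.0
    return {
        "genomes_with_gene": n_genomes,
        "total_copies": sum(copies.values()),
        "multi_copy_genomes": n_multi,
        "multi_copy_percent": percent,
        "max_copies": max(copies.values(), default=0),
    }


# Hypothetical toy input: the genome IDs are made up for illustration.
toy_hits = ["G1", "G1", "G2", "G3", "G3", "G3"]
summary = copy_number_summary(toy_hits)
print(summary)

# The ladA percentage quoted above follows from its two counts.
lada_percent = round(100 * 24_694 / 48_645, 2)
print(lada_percent)
```

The same calculation applies to the alkB, cyp153 and almA figures given in the following paragraphs.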


AlkB (EC 1.14.15.3) is an integral-membrane alkane monooxygenase that catalyzes the hydroxylation of n-alkanes and fatty acids in the presence of an NADH-rubredoxin reductase and rubredoxin. It is found in both Gram-negative and Gram-positive bacteria. Initially, nucleotide sequences of 890 alkB genes were downloaded from the KEGG database under KEGG Orthology K00496 and used for the nhmmer search. A total of 29,818 genomes contained at least one alkB gene copy, and the 37,092 corresponding alkB gene sequences are maintained in ORGdb. Of these genomes, 22.28% (6,643 out of 29,818) contained multiple copies of the alkB gene. Rhodococcus sp. YL-1 (high GC Gram+) had up to 7 copies of the alkB gene.


P450 cyp153 (UniProtKB:cyp153) degrades short- and medium-chain-length n-alkanes. Initially, 130 protein sequences of the P450 cyp153 family, described by Nie Y. et al. (2014), were downloaded from the NCBI database. The corresponding nucleotide sequences were recovered using blastx against prokaryotic genome sequences, requiring both identity and coverage to be >99%. In total, 2,851 nucleotide sequences of cyp153 family genes were retained, and all of them were used for the nhmmer search. A total of 7,412 genomes contained at least one cyp153 gene copy, and the 12,588 corresponding cyp153 gene sequences are maintained in ORGdb. Of these genomes, 46.83% (3,471 out of 7,412) contained multiple copies of the cyp153 gene. Gammaproteobacteria bacterium UWMA-0282, derived from a Guaymas Basin hydrothermal plume metagenome, had up to 13 copies of the cyp153 gene.
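The >99% identity and coverage filter described above can be illustrated with a short Python sketch. It assumes the BLAST results were exported as a tab-separated table with four columns (query ID, subject ID, percent identity, percent coverage), for example via a custom BLAST+ tabular format; that column layout, the function name and the toy rows are assumptions for illustration, not details given by ORGdb.

```python
import csv
import io


def filter_hits(table_text, min_identity=99.0, min_coverage=99.0):
    """Keep rows whose identity AND coverage both exceed the thresholds.

    Assumed columns: query ID, subject ID, percent identity, percent coverage.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    kept = []
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        identity, coverage = float(row[2]), float(row[3])
        if identity > min_identity and coverage > min_coverage:
            kept.append((query, subject, identity, coverage))
    return kept


# Hypothetical toy table: IDs and values are made up for illustration.
toy_table = (
    "protA\tgenome1\t99.8\t100\n"
    "protA\tgenome2\t97.5\t100\n"
    "protB\tgenome3\t99.5\t98\n"
)
hits = filter_hits(toy_table)
print(hits)
```

Only the first toy row passes, because the second fails the identity threshold and the third fails the coverage threshold.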


AlmA, a flavin-binding monooxygenase, is involved in the bacterial degradation of long-chain n-alkanes of 32 carbons (C32) and longer. Initially, nucleotide sequences of the almA gene were downloaded from the NCBI database using a keyword search strategy. Briefly, the keyword "flavin-binding monooxygenase AND almA" was used for a preliminary search. The function description of each gene in the preliminary results was then manually filtered for "flavin-binding monooxygenase (almA) gene". Finally, 123 nucleotide sequences of the almA gene were retained for the nhmmer search. A total of 30,862 genomes contained at least one almA gene copy, and the 61,563 corresponding almA gene sequences are maintained in ORGdb. Of these genomes, 62.97% (19,435 out of 30,862) contained multiple copies of the almA gene. Smaragdicoccus niigatensis DSM 44881, isolated from soil from an oil spring, and Gordonia asplenii TBRC 11910, isolated from organic soil, each had up to 7 copies of the almA gene.
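The two-stage keyword selection described above was performed manually for the second stage, but its logic can be approximated in code. The sketch below is an assumption-laden illustration: the function names and the example records are hypothetical, and simple substring matching stands in for both the NCBI keyword search and the manual review.

```python
def preliminary_match(description):
    """Stage 1: both search terms occur somewhere in the description."""
    text = description.lower()
    return "flavin-binding monooxygenase" in text and "alma" in text


def strict_match(description):
    """Stage 2: the exact phrase used for filtering occurs in the description."""
    # Collapse runs of whitespace so line wrapping does not break the match.
    text = " ".join(description.lower().split())
    return "flavin-binding monooxygenase (alma) gene" in text


# Hypothetical function descriptions, made up for illustration.
records = {
    "rec1": "Acinetobacter sp. flavin-binding monooxygenase (almA) gene, complete cds",
    "rec2": "putative flavin-binding monooxygenase, almA-like protein",
    "rec3": "alkane 1-monooxygenase (alkB) gene, partial cds",
}

preliminary = [rid for rid, desc in records.items() if preliminary_match(desc)]
retained = [rid for rid in preliminary if strict_match(records[rid])]
print(preliminary)
print(retained)
```

A substring filter of this kind is stricter and less flexible than human review, so it should be read as a rough stand-in rather than a reproduction of the curation step.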