[InterMine Dev] kegg-pathway load: genespathways table is empty

Julie Sullivan julie at flymine.org
Mon Aug 3 09:14:24 BST 2015


Here are the docs on the kegg source:

http://intermine.readthedocs.org/en/latest/database/data-sources/library/pathways/kegg/

KEGG identifies each organism by its own prefix, which InterMine cannot 
map to a taxon ID by itself. You have to configure this mapping in the 
source's config file (kegg_config.properties).

e.g. KEGG uses "dme" for Drosophila melanogaster and the data file is 
named "dme_gene_map.tab".

The reason malariamine worked is that its organism is already configured:

https://github.com/intermine/intermine/blob/master/bio/sources/kegg-pathway/main/resources/kegg_config.properties#L37

You have two options:

1. Remove the taxon ID from your project XML file; all genes will then be loaded.

2. Configure the taxon ID in kegg_config.properties.
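
For option 2, a quick check like the one below may help you spot a missing entry before re-running the load. It is only a sketch, not part of InterMine. It assumes that kegg_config.properties holds one "<prefix>.taxonId = <taxon ID>" entry per organism and that the data files are named "<prefix>_gene_map.tab". The function names and the "xyz" prefix are made up for illustration.

```python
import re


def parse_kegg_config(text):
    """Return {kegg_prefix: taxon_id} from properties-style text.

    Assumes entries look like 'dme.taxonId = 7227'; other lines are ignored.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"^(\w+)\.taxonId\s*=\s*(\d+)$", line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping


def unconfigured_prefixes(file_names, mapping):
    """Return prefixes of *_gene_map.tab files that have no taxon ID configured."""
    missing = []
    for name in file_names:
        match = re.match(r"^(\w+?)_gene_map\.tab$", name)
        if match and match.group(1) not in mapping:
            missing.append(match.group(1))
    return missing


# Illustrative entries only; the key format is an assumption (see above).
example_config = """
# KEGG prefix -> taxon ID
dme.taxonId = 7227
pfa.taxonId = 36329
"""

mapping = parse_kegg_config(example_config)

# "xyz" is a hypothetical prefix standing in for an organism that is not configured.
data_files = ["dme_gene_map.tab", "xyz_gene_map.tab", "map_title.tab"]
print(unconfigured_prefixes(data_files, mapping))
```

Any prefix this reports would need its own entry in kegg_config.properties, with the same taxon ID that you use in the project XML file.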



On 03/08/15 08:55, Pengcheng Yang wrote:
> Hi Julie Sullivan,
>
> Thank you for your reply.
>
> I have listed the kegg-pathway part of the project.xml file for the two
> mines below. They seem to differ only in the path and the organism.
>
> [1] The project.xml of my mine:
> ----------------------
>      <source name="kegg-pathway" type="kegg-pathway">
>        <property name="kegg.organisms" value="1111"/>
>        <property name="src.data.dir" location="/path/to/mymine/kegg/"/>
>      </source>
>
> [2] The project.xml of malariamine
>      <source name="kegg-pathway" type="kegg-pathway">
>        <property name="kegg.organisms" value="36329"/>
>        <property name="src.data.dir" location="/path/to/malaria/kegg/"/>
>      </source>
>
> I have checked the org_gene_map.tab file; its format is indeed:
> GeneID<tab>mapid<space>mapid<space>mapid
>
> Best,
> Pengcheng Yang
>
> On 2015/8/3 15:32, Julie Sullivan wrote:
>> Sorry you are having problems with the kegg source!
>>
>> Can you clarify what is different about the two project XML files?
>>
>> On 02/08/15 10:01, Pengcheng Yang wrote:
>>> Hi InterMiner developers,
>>>
>>> Thank you to all who answered my questions. Here is another problem
>>> that is blocking the deployment of my InterMine.
>>>
>>> To load kegg-pathway data, I set the project.xml as that in malariamine
>>> and prepared the two files map_title.tab and org_gene_map.tab. When I
>>> load the data using "ant -Dsource=kegg-pathway -v 1> kegg-pathway.log1
>>> 2> kegg-pathway.log2", the kegg-pathway.log1 said at the end [1].
>>> However, when I query the postgres database with the SQL statement
>>> "select * from genespathways", nothing is returned.
>>>
>>> But when I do the same thing for malariamine after loading kegg-pathway
>>> data, I get the pathway-to-gene information listed in [2]. So I
>>> compared the logs of my mine and malariamine, and found that my mine
>>> has not built several of the indexes listed in [3].
>>>
>>> Since I used the same kegg-pathway source as malariamine, what is
>>> the problem here?
>>>
>>> Any suggestions and comments are welcome! Thanks a lot!
>>>
>>> Best,
>>> Pengcheng Yang
>>>
>>>
>>> ---------------------------------
>>> [1] "build successful" log output from my mine after loading
>>> kegg-pathway
>>> BUILD SUCCESSFUL
>>> Total time: 21 seconds
>>> [Thread-16] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool
>>> db.common-tgt-items is being shutdown.
>>> [Thread-8] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool
>>> db.common-tgt-items is being shutdown.
>>> [Thread-16] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool
>>> db.production is being shutdown.
>>>
>>> [2] genespathways table from malariamine database.
>>>
>>>   pathways |  genes
>>> ----------+---------
>>>    2000002 | 1002796
>>>    2000002 | 1003874
>>>    2000002 | 1004075
>>>
>>> [3] log lines that appear for malariamine but not for my mine.
>>>   [integrate] Creating index: CREATE INDEX Gene__key_secondaryidentifier
>>> ON Gene (secondaryIdentifier, organismid)
>>>   [integrate] Creating index: CREATE INDEX Gene__key_symbol_org ON Gene
>>> (symbol, organismid)
>>>   [integrate] Creating index: CREATE INDEX Gene__key_primaryidentifier
>>> ON Gene (primaryIdentifier)
>>>   [integrate] Creating index: CREATE INDEX Organism__key_taxonid ON
>>> Organism (taxonId)
>>>   [integrate] Creating index: CREATE INDEX SOTerm__key ON SOTerm (name,
>>> ontologyid)
>>>
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at intermine.org
>>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
>>>
>>
>
>
>


