[InterMine Dev] kegg-pathway load: genespathways table is empty

Pengcheng Yang yangpc at biols.ac.cn
Mon Aug 3 10:06:36 BST 2015


Hi Chen Yian and Julie Sullivan,

Thank you for your reply and the information.

I have tried both of the following methods, but the table "genespathways"
remains empty.
1) Adding org.taxonId=1111 to the
bio/sources/kegg-pathway/resources/kegg-pathway_keys.properties file.
The taxonId and organism name here are made up for confidentiality reasons.
2) Removing the "kegg.organisms" property from the project.xml file.
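
Both attempts depend on how the KEGG organism prefix is mapped to a taxon ID. The check can be sketched in Python as below. This is only a sketch under assumptions: the data file is taken to be named "<prefix>_gene_map.tab" (as in "dme_gene_map.tab"), the mapping is taken to be written as "<prefix>.taxonId=<id>" lines (the form used in attempt 1), and all function names are hypothetical. It is not InterMine's own code.

```python
import re


def prefix_from_filename(filename):
    """Return the organism prefix of a '<prefix>_gene_map.tab' name, else None."""
    match = re.fullmatch(r"(\w+?)_gene_map\.tab", filename)
    return match.group(1) if match else None


def parse_taxon_ids(properties_text):
    """Map prefix -> taxon ID from '<prefix>.taxonId = <id>' lines.

    Assumed line format; blank lines and '#' comments are skipped.
    """
    mapping = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.endswith(".taxonId"):
            mapping[key[: -len(".taxonId")]] = value.strip()
    return mapping


def taxon_for_file(filename, properties_text):
    """Return the taxon ID configured for a gene-map file, or None if unmapped."""
    prefix = prefix_from_filename(filename)
    return parse_taxon_ids(properties_text).get(prefix)
```

If taxon_for_file returns None for the gene-map file, or a value different from the one given in "kegg.organisms", the prefix is probably not configured in the file the loader actually reads.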

I have also checked the following, which may be useful:
1) "select * from pathway" returns the expected information.
2) I compared the kegg-pathway.log1 files from mymine and malariamine and
found the following lines, which are specific to mymine and do not appear
in malariamine's kegg-pathway.log1:

<     [javac] org/intermine/model/bio/ProteinDomainRegion.java added as org/intermine/model/bio/ProteinDomainRegion.class is outdated.
<     [javac] org/intermine/model/bio/ProteinDomainRegionShadow.java added as org/intermine/model/bio/ProteinDomainRegionShadow.class is outdated.
3172,3173d3169
<     [javac] org/intermine/model/bio/ProteinRegion.java added as org/intermine/model/bio/ProteinRegion.class is outdated.
<     [javac] org/intermine/model/bio/ProteinRegionShadow.java added as org/intermine/model/bio/ProteinRegionShadow.class is outdated.
3217c3213
<     [javac] /home/pengchy/Soft/05.SystemBiology/intermine/mymine/dbmodel/build/gen/src/org/intermine/model/bio/ProteinDomainRegion.java
<     [javac] /home/pengchy/Soft/05.SystemBiology/intermine/mymine/dbmodel/build/gen/src/org/intermine/model/bio/ProteinDomainRegionShadow.java
3352,3353d3345
<     [javac] /home/pengchy/Soft/05.SystemBiology/intermine/mymine/dbmodel/build/gen/src/org/intermine/model/bio/ProteinRegion.java
<     [javac] /home/pengchy/Soft/05.SystemBiology/intermine/mymine/dbmodel/build/gen/src/org/intermine/model/bio/ProteinRegionShadow.java
3540,3541d3531
<   [lib:jar] org/intermine/model/bio/ProteinDomainRegion.class added as org/intermine/model/bio/ProteinDomainRegion.class is outdated.
<   [lib:jar] org/intermine/model/bio/ProteinDomainRegionShadow.class added as org/intermine/model/bio/ProteinDomainRegionShadow.class is outdated.
3543,3544d3532
<   [lib:jar] org/intermine/model/bio/ProteinRegion.class added as org/intermine/model/bio/ProteinRegion.class is outdated.
<   [lib:jar] org/intermine/model/bio/ProteinRegionShadow.class added as org/intermine/model/bio/ProteinRegionShadow.class is outdated.
<   [lib:jar] adding entry org/intermine/model/bio/ProteinDomainRegion.class
<   [lib:jar] adding entry org/intermine/model/bio/ProteinDomainRegionShadow.class
3720,3721d3705
<   [lib:jar] adding entry org/intermine/model/bio/ProteinRegion.class
<   [lib:jar] adding entry org/intermine/model/bio/ProteinRegionShadow.class
<     [javac] org/intermine/model/bio/ProteinDomainRegion.java added as org/intermine/model/bio/ProteinDomainRegion.class is outdated.
<     [javac] org/intermine/model/bio/ProteinDomainRegionShadow.java added as org/intermine/model/bio/ProteinDomainRegionShadow.class is outdated.
7230,7231d7211
<     [javac] org/intermine/model/bio/ProteinRegion.java added as org/intermine/model/bio/ProteinRegion.class is outdated.
<     [javac] org/intermine/model/bio/ProteinRegionShadow.java added as org/intermine/model/bio/ProteinRegionShadow.class is outdated.
7275c7255
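
The table check in item 1 can be extended to compare row counts across the related tables in one go. The following is a minimal Python sketch, assuming a DB-API connection (for PostgreSQL this would come from a driver such as psycopg2, not shown here). The helper names and the in-memory SQLite stand-in are illustrative only and are not part of InterMine.

```python
import sqlite3

# Tables of interest; "gene" is assumed to be the table backing the Gene class.
TABLES = ("pathway", "gene", "genespathways")


def row_counts(conn, tables=TABLES):
    """Return {table: row count} for the given table names."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Names are expected to come from the fixed tuple above, not user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    cur.close()
    return counts


def demo_connection():
    """Build an in-memory SQLite stand-in: one pathway row, empty join table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pathway (id INTEGER)")
    conn.execute("CREATE TABLE gene (id INTEGER)")
    conn.execute("CREATE TABLE genespathways (pathways INTEGER, genes INTEGER)")
    conn.execute("INSERT INTO pathway VALUES (2000002)")
    return conn
```

A populated pathway table next to an empty genespathways table would suggest that the pathway titles were loaded but no gene-to-pathway links were stored; whether that is the cause here is a guess, not a confirmed diagnosis.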

Thanks a lot!

Best,
Pengcheng Yang



On 2015/8/3 16:14, Julie Sullivan wrote:
> Here are the docs on the kegg source:
>
> http://intermine.readthedocs.org/en/latest/database/data-sources/library/pathways/kegg/ 
>
>
> KEGG uses its own prefix, which InterMine does not know. You have to 
> configure this in the config file.
>
> e.g. KEGG uses "dme" for Drosophila melanogaster and the data file is 
> named "dme_gene_map.tab".
>
> The reason malaria worked is that it is already configured:
>
> https://github.com/intermine/intermine/blob/master/bio/sources/kegg-pathway/main/resources/kegg_config.properties#L37 
>
>
> You have two options:
>
> 1. remove the taxon ID from your project XML file, all genes will be 
> loaded
>
> 2. configure the taxon ID in the kegg_config.properties
>
>
>
> On 03/08/15 08:55, Pengcheng Yang wrote:
>> Hi Julie Sullivan,
>>
>> Thank you for your reply.
>>
>> I have listed the kegg-pathway part of the project.xml file for the two
>> mines. They seem to differ only in the data path and the organism.
>>
>> [1] The project.xml of my mine:
>> ----------------------
>>      <source name="kegg-pathway" type="kegg-pathway">
>>        <property name="kegg.organisms" value="1111"/>
>>        <property name="src.data.dir" location="/path/to/mymine/kegg/"/>
>>      </source>
>>
>> [2] The project.xml of malariamine
>>      <source name="kegg-pathway" type="kegg-pathway">
>>        <property name="kegg.organisms" value="36329"/>
>>        <property name="src.data.dir" location="/path/to/malaria/kegg/"/>
>>      </source>
>>
>> I have checked the org_gene_map.tab file; its format is indeed:
>> GeneID<tab>mapid<space>mapid<space>mapid
>>
>> Best,
>> Pengcheng Yang
>>
>> On 2015/8/3 15:32, Julie Sullivan wrote:
>>> Sorry you are having problems with the kegg source!
>>>
>>> Can you clarify what is different about the two project XML files?
>>>
>>> On 02/08/15 10:01, Pengcheng Yang wrote:
>>>> Hi InterMiner developers,
>>>>
>>>> Thank you to everyone who answered my earlier questions. Here is another
>>>> problem that is blocking the deployment of my InterMine instance.
>>>>
>>>> To load the kegg-pathway data, I set up project.xml as in malariamine
>>>> and prepared the two files map_title.tab and org_gene_map.tab. When I
>>>> load the data using "ant -Dsource=kegg-pathway -v 1> kegg-pathway.log1
>>>> 2> kegg-pathway.log2", kegg-pathway.log1 ends with the output in [1].
>>>> However, when I query the postgres database with the SQL statement
>>>> "select * from genespathways", nothing is returned.
>>>>
>>>> When I do the same for malariamine after loading the kegg-pathway
>>>> data, I get the pathway-to-gene information listed in [2]. So I
>>>> compared the logs of my mine and malariamine, and found that my mine
>>>> has not built several of the indexes listed in [3].
>>>>
>>>> Since I have used the same kegg-pathway source as malariamine, what
>>>> is the problem here?
>>>>
>>>> Any suggestions and comments are welcome! Thanks a lot!
>>>>
>>>> Best,
>>>> Pengcheng Yang
>>>>
>>>>
>>>> ---------------------------------
>>>> [1] Build log from my mine after loading kegg-pathway:
>>>> BUILD SUCCESSFUL
>>>> Total time: 21 seconds
>>>> [Thread-16] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool db.common-tgt-items is being shutdown.
>>>> [Thread-8] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool db.common-tgt-items is being shutdown.
>>>> [Thread-16] INFO com.zaxxer.hikari.pool.HikariPool - HikariCP pool db.production is being shutdown.
>>>>
>>>> [2] genespathways table from malariamine database.
>>>>
>>>>   pathways |  genes
>>>> ----------+---------
>>>>    2000002 | 1002796
>>>>    2000002 | 1003874
>>>>    2000002 | 1004075
>>>>
>>>> [3] Log lines that appear for malariamine but not for my mine:
>>>>   [integrate] Creating index: CREATE INDEX Gene__key_secondaryidentifier ON Gene (secondaryIdentifier, organismid)
>>>>   [integrate] Creating index: CREATE INDEX Gene__key_symbol_org ON Gene (symbol, organismid)
>>>>   [integrate] Creating index: CREATE INDEX Gene__key_primaryidentifier ON Gene (primaryIdentifier)
>>>>   [integrate] Creating index: CREATE INDEX Organism__key_taxonid ON Organism (taxonId)
>>>>   [integrate] Creating index: CREATE INDEX SOTerm__key ON SOTerm (name, ontologyid)
>>>>
>>>>
>>>
>>
>>
>>
>




