[InterMine Dev] failed data integration for malariamine example

Pengcheng Yang pengchy at gmail.com
Wed Jul 15 02:22:14 BST 2015


Hi,

I have successfully loaded the UniProt and GFF3 data by following the
tutorial in the InterMine documentation. However, when I run the commands
to check data integration, the results show that the two data sets were
not integrated through primaryIdentifier.

The psql commands and the contents of project.xml are included below.

Best,
Pengcheng Yang

[1] the psql commands:
malariamine=# select id, primaryidentifier, secondaryidentifier, symbol, length, chromosomeid, chromosomelocationid, organismid from gene where primaryIdentifier = 'PFL1385c';
   id    | primaryidentifier | secondaryidentifier | symbol | length | chromosomeid | chromosomelocationid | organismid
---------+-------------------+---------------------+--------+--------+--------------+----------------------+------------
 1000581 | PFL1385c          |                     | ABRA   |        |              |                      |    1000026
(1 row)



malariamine=# select * from gene where primaryIdentifier = 'PFL1385c';
 briefdescription | score | description | scoretype |   id    | symbol | length | name | primaryidentifier | secondaryidentifier | upstreamintergenicregionid | downstreamintergenicregionid | sequenceontologytermid | organismid | chromosomelocationid | sequenceid | chromosomeid |            class
------------------+-------+-------------+-----------+---------+--------+--------+------+-------------------+---------------------+----------------------------+------------------------------+------------------------+------------+----------------------+------------+--------------+------------------------------
                  |       |             |           | 1000581 | ABRA   |        |      | PFL1385c          |                     |                            |                              |                1000081 |    1000026 |                      |            |              | org.intermine.model.bio.Gene
(1 row)

[2] the project.xml file content:
<project type="bio">
   <property name="target.model" value="genomic"/>
   <property name="source.location" location="../bio/sources/"/>
   <property name="common.os.prefix" value="common"/>
   <property name="intermine.properties.file" value="malariamine.properties"/>
   <property name="default.intermine.properties.file" location="../default.intermine.integrate.properties"/>
   <sources>
      <source name="uniprot-malaria" type="uniprot">
         <property name="uniprot.organisms" value="36329"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/uniprot/"/>
      </source>
      <source name="go-malaria" type="go">
         <property name="go.organisms" value="36329"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/go/"/>
      </source>
      <source name="go-annotation-malaria" type="go-annotation">
         <property name="go-annotation.organisms" value="36329"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/go-annotation/"/>
      </source>
      <source name="malaria-chromosome-fasta" type="fasta">
         <property name="fasta.taxonId" value="36329"/>
         <property name="fasta.dataSourceName" value="PlasmoDB"/>
         <property name="fasta.dataSetTitle" value="PlasmoDB chromosome sequence"/>
         <property name="fasta.className" value="org.intermine.model.bio.Chromosome"/>
         <property name="fasta.sequenceType" value="dna"/>
         <property name="fasta.includes" value="MAL*fasta"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/genome/fasta/"/>
      </source>
      <source name="gff-malaria" type="gff">
         <property name="gff3.taxonId" value="36329"/>
         <property name="gff3.seqClsName" value="Chromosome"/>
         <property name="gff3.dataSourceName" value="PlasmoDB"/>
         <property name="gff3.seqDataSourceName" value="PlasmoDB"/>
         <property name="gff3.dataSetTitle" value="PlasmoDB P.falciparum genome"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/genome/gff/"/>
      </source>
      <source name="interpro-malaria" type="interpro">
         <property name="interpro.organisms" value="36329"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/interpro/"/>
      </source>
      <source name="kegg-pathway-malaria" type="kegg-pathway">
         <property name="kegg-pathway.organisms" value="36329"/>
         <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/kegg/"/>
      </source>
   </sources>

   <post-processing>
   </post-processing>
</project>


