[InterMine Dev] failed data integration for malariamine example

Julie Sullivan julie at flymine.org
Wed Jul 15 11:01:33 BST 2015


Your project XML differs from the one in the tutorial. Try using the
tutorial's version and see whether that works for you.

Specifically, the tutorial says to copy over the malariamine project XML 
like so:

	$ cp ../bio/tutorial/project.xml .
	$ less project.xml

Try that. Or edit your project XML file so it matches the one in the
tutorial: your GFF source (name="gff-malaria", type="gff") has a different
name from the one the tutorial uses.
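
If it helps, here is a rough sketch for spotting the difference without
reading both files by eye. It lists the name/type pair of every <source>
element in each file and diffs the two lists. The function names
list_sources and compare_sources are made up for this example, and it
assumes each <source> tag has its name and type attributes on one line,
in that order, as in your file.

```shell
# Print "name type" for every <source> element in a project XML file.
# Assumption: name and type sit on the same line, in that order.
list_sources() {
    grep -o '<source name="[^"]*" type="[^"]*"' "$1" |
        sed 's/<source name="\([^"]*\)" type="\([^"]*\)"/\1 \2/' |
        sort
}

# Diff the source lists of two project XML files.
# Returns 0 when they match, non-zero otherwise (diff's exit status).
compare_sources() {
    cs_mine=$(mktemp)
    cs_other=$(mktemp)
    list_sources "$1" > "$cs_mine"
    list_sources "$2" > "$cs_other"
    diff "$cs_mine" "$cs_other"
    cs_status=$?
    rm -f "$cs_mine" "$cs_other"
    return $cs_status
}
```

Running compare_sources project.xml ../bio/tutorial/project.xml from your
mine directory should print nothing when the sources match, and the
differing name/type lines otherwise.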

That should fix it!
Julie

On 15/07/15 02:22, Pengcheng Yang wrote:
> Hi,
>
> I have successfully loaded the UniProt and GFF3 data following the tutorial
> in the InterMine documentation. However, when I run the commands to check
> data integration, the results show that the two data sets were not
> integrated through primaryidentifier.
>
> Attached are the psql commands and the project.xml file.
>
> Best,
> Pengcheng Yang
>
> [1] the psql commands:
> malariamine=# select id, primaryidentifier, secondaryidentifier, symbol, length, chromosomeid, chromosomelocationid, organismid from gene where primaryIdentifier = 'PFL1385c';
>    id    | primaryidentifier | secondaryidentifier | symbol | length | chromosomeid | chromosomelocationid | organismid
> ---------+-------------------+---------------------+--------+--------+--------------+----------------------+------------
>  1000581 | PFL1385c          |                     | ABRA   |        |              |                      |    1000026
> (1 row)
>
>
>
> malariamine=# select * from gene where primaryIdentifier = 'PFL1385c';
>  briefdescription | score | description | scoretype |   id    | symbol | length | name | primaryidentifier | secondaryidentifier | upstreamintergenicregionid | downstreamintergenicregionid | sequenceontologytermid | organismid | chromosomelocationid | sequenceid | chromosomeid |            class
> ------------------+-------+-------------+-----------+---------+--------+--------+------+-------------------+---------------------+----------------------------+------------------------------+------------------------+------------+----------------------+------------+--------------+------------------------------
>                   |       |             |           | 1000581 | ABRA   |        |      | PFL1385c          |                     |                            |                              |                1000081 |    1000026 |                      |            |              | org.intermine.model.bio.Gene
> (1 row)
>
> [2] the project.xml file content:
> <project type="bio">
>   <property name="target.model" value="genomic"/>
>   <property name="source.location" location="../bio/sources/"/>
>   <property name="common.os.prefix" value="common"/>
>   <property name="intermine.properties.file" value="malariamine.properties"/>
>   <property name="default.intermine.properties.file" location="../default.intermine.integrate.properties"/>
>   <sources>
>     <source name="uniprot-malaria" type="uniprot">
>       <property name="uniprot.organisms" value="36329"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/uniprot/"/>
>     </source>
>     <source name="go-malaria" type="go">
>       <property name="go.organisms" value="36329"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/go/"/>
>     </source>
>     <source name="go-annotation-malaria" type="go-annotation">
>       <property name="go-annotation.organisms" value="36329"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/go-annotation/"/>
>     </source>
>     <source name="malaria-chromosome-fasta" type="fasta">
>       <property name="fasta.taxonId" value="36329"/>
>       <property name="fasta.dataSourceName" value="PlasmoDB"/>
>       <property name="fasta.dataSetTitle" value="PlasmoDB chromosome sequence"/>
>       <property name="fasta.className" value="org.intermine.model.bio.Chromosome"/>
>       <property name="fasta.sequenceType" value="dna"/>
>       <property name="fasta.includes" value="MAL*fasta"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/genome/fasta/"/>
>     </source>
>     <source name="gff-malaria" type="gff">
>       <property name="gff3.taxonId" value="36329"/>
>       <property name="gff3.seqClsName" value="Chromosome"/>
>       <property name="gff3.dataSourceName" value="PlasmoDB"/>
>       <property name="gff3.seqDataSourceName" value="PlasmoDB"/>
>       <property name="gff3.dataSetTitle" value="PlasmoDB P.falciparum genome"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/genome/gff/"/>
>     </source>
>     <source name="interpro-malaria" type="interpro">
>       <property name="interpro.organisms" value="36329"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/interpro/"/>
>     </source>
>     <source name="kegg-pathway-malaria" type="kegg-pathway">
>       <property name="kegg-pathway.organisms" value="36329"/>
>       <property name="src.data.dir" location="/home/pengchy/Soft/05.SystemBiology/malaria/kegg/"/>
>     </source>
>   </sources>
>
>   <post-processing>
>   </post-processing>
>
> </project>