[InterMine Dev] referencing items-xml

Julie Sullivan julie at flymine.org
Wed Mar 20 10:46:21 GMT 2013



On 19/03/13 20:43, David Rhee wrote:
> Hi all,
>
> I was wondering if there is a way to reference data that has already been added (via project.xml) while custom loading items-xml. For instance, I will be adding an item of class 'Submission' that needs to reference the 'Organism' class, as shown below.
>
> <item id="0_2" class="Submission">
>                  <attribute name="identifier" value="feature2" />
>                  <attribute name="confidence" value="0.37" />
>                  <reference name="organism" ref_id="?" />
> </item>
>
> The tutorial shows that when adding items, I can reference an organism by its unique item id. However, since I am adding the organism data via project.xml (see below), I do not have the ref_id available. How do I reference the organism in this case? Do I simply add the organism to the items-xml and hope that they will sync up later?

Yes, exactly. You'll need to add an organism to your Items XML and reference 
that ID in your Submission item. So something like this:


    <item id="0_2" class="Submission">
        <attribute name="identifier" value="feature2" />
        <attribute name="confidence" value="0.37" />
        <reference name="organism" ref_id="1_1" />
    </item>

    <item id="1_1" class="Organism">
        <attribute name="taxonId" value="9606" />
    </item>
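
If you generate your Items XML from a script, the same pairing can be built
programmatically. This is only a minimal sketch in Python, assuming the file's
root element is <items>; the helper name build_items_xml and the id values are
illustrative and not part of InterMine:

```python
import xml.etree.ElementTree as ET


def build_items_xml(taxon_id, identifier, confidence):
    """Build an Items XML tree with one Organism and one Submission.

    Hypothetical helper for illustration; the <items> root element is an
    assumption about the file layout.
    """
    items = ET.Element("items")

    # The organism carries only its key field (taxonId). The id "1_1" is
    # local to this file and is what other items point at.
    organism = ET.SubElement(items, "item", {"id": "1_1", "class": "Organism"})
    ET.SubElement(organism, "attribute", {"name": "taxonId", "value": str(taxon_id)})

    submission = ET.SubElement(items, "item", {"id": "0_2", "class": "Submission"})
    ET.SubElement(submission, "attribute", {"name": "identifier", "value": identifier})
    ET.SubElement(submission, "attribute", {"name": "confidence", "value": str(confidence)})
    # ref_id must match the id of the Organism item above.
    ET.SubElement(submission, "reference", {"name": "organism", "ref_id": organism.get("id")})

    return items


if __name__ == "__main__":
    root = build_items_xml(9606, "feature2", 0.37)
    print(ET.tostring(root, encoding="unicode"))
```

Taking the ref_id from the Organism element itself, rather than typing it
twice, keeps the reference and the item id from drifting apart.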

The "syncing up" part of the build is described here:

http://intermine.readthedocs.org/en/latest/database/database-building/data-integration/

If your keys are configured correctly, your data will merge correctly. For
example, the taxon ID is usually used as the primary key for Organism, so if
the database already holds an organism with the taxon ID you specify, the two
records will merge into one.
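
To make the merge rule concrete, here is a toy sketch in Python. It is not
InterMine's integration code: the function merge_by_key and the dict-based
records are hypothetical, and it ignores source priorities. It only
illustrates the idea that records sharing a key value collapse into one:

```python
def merge_by_key(existing, incoming, key):
    """Merge incoming records into existing ones that share the same key value.

    Toy model only: records are plain dicts, and fields already set on the
    existing record are kept rather than overwritten (no priority rules).
    """
    by_key = {rec[key]: dict(rec) for rec in existing}
    for rec in incoming:
        target = by_key.setdefault(rec[key], {})
        for field, value in rec.items():
            target.setdefault(field, value)
    return list(by_key.values())


if __name__ == "__main__":
    # Organism loaded from the Items XML: only the taxon ID is known.
    from_items_xml = [{"taxonId": "9606"}]
    # Organism details arriving from another source (values illustrative).
    from_other_source = [{"taxonId": "9606", "name": "Homo sapiens"}]
    print(merge_by_key(from_items_xml, from_other_source, "taxonId"))
```

A record whose key value matches nothing already stored is simply kept as a
new entry in this sketch.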

Does that help?

Also, the entrez organism source is described here:

http://intermine.readthedocs.org/en/latest/database/data-sources/library/organism/

That source only updates organisms already in the database; it doesn't add any
entries. This means you can load only the taxon ID, and that source will fill
in the name, species, etc.


>      <source name="entrez-organism" type="entrez-organism">
>        <property name="src.data.file" location="build/organisms.xml"/>
>      </source>
>
> Thanks,
> David