[InterMine Dev] Fetch data from DB while parsing in custom source (II)

Julie Sullivan julie at flymine.org
Mon Mar 18 13:09:54 GMT 2013


Hi Norbert

Your code for creating the reference between Cellbank and Fermentation looks 
correct to me. I think your problem could be here:

 >              Item cb = createItem("Cellbank");

It looks like you are creating a new Cellbank object for every line in your 
file. Is it possible that the same cell bank appears on more than one line? If 
so, you should keep a map of the objects you've already created and only create 
a new object when you see a new `key_identifier` value.

Does that make sense? You should only store unique Cellbank objects.
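
The map approach can be sketched roughly as follows. This is a self-contained 
stand-in, not the InterMine API: the class `DedupSketch`, the `stored` list 
(standing in for `store()`), and the identifier format are all hypothetical, 
and the column layout (cbId in `line[3]`) is taken from the converter quoted 
below.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the converter; it does not use InterMine classes. */
class DedupSketch {
    // cbId -> identifier of the Cellbank item already created during this run
    private final Map<String, String> cellbanks = new HashMap<>();
    // Stand-in for store(): records what would have been written
    final List<String> stored = new ArrayList<>();
    private int nextId = 1;

    /** Returns the identifier for cbId, creating and storing the item only once. */
    String getCellbank(String cbId) {
        String refId = cellbanks.get(cbId);
        if (refId == null) {
            refId = "0_" + nextId++;             // hypothetical identifier format
            stored.add("Cellbank cbId=" + cbId); // stored once per distinct cbId
            cellbanks.put(cbId, refId);
        }
        return refId;
    }

    /** Mimics the loop in process(): one Fermentation per line, shared Cellbanks. */
    void process(List<String[]> lines) {
        for (String[] line : lines) {
            String cbRef = getCellbank(line[3]);
            stored.add("Fermentation fermId=" + line[0] + " cellbank=" + cbRef);
        }
    }
}
```

In the real converter the same shape would presumably be a `Map<String, String>` 
field plus a helper that calls `createItem("Cellbank")` and `store(cb)` only on 
a cache miss, with the Fermentation item referencing the cached identifier. 
Treat those details as assumptions to check against your InterMine version.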

Here is what I think is happening:

1. Load Cellbank data source

  - Source processed successfully
  - Duplicate Cellbank objects loaded that have the same `key_identifier` value

2. Load Fermentation data source

  - error on merging!
  - "Duplicate objects found for pk org.intermine.model.bio.Cellbank.key_identifier"
  - can't merge the Cellbank objects: there are already 2 (or more) in the 
    database with the same key, so which object should the new one merge with?

Can you try to update your code and let me know if that fixes your problem?

Julie


On 13/03/13 16:52, Norbert Auer wrote:
> Dear dev team,
>
> I have been working with InterMine for approximately a week.
> I have a problem similar to the one described in http://mail.intermine.org/pipermail/dev/2011-August/001040.html .
>
> I have created two custom sources, Cellbank and Fermentation. Fermentation has a reference field linked to Cellbank. Loading the Cellbank data source works fine, but loading the Fermentation source gives me an error:
>
> org.intermine.objectstore.ObjectStoreException: Duplicate objects found for pk org.intermine.model.bio.Cellbank.key_identifier
>
> These are the contents of the two additional files:
>
>
> And this is the Converter file:
>
> public void process(Reader reader) throws Exception
>      {
>          Iterator lineIter = FormattedTextParser.parseDelimitedReader(reader,',');
>
>          // Loop through csv lines
>          while (lineIter.hasNext())
>          {
>              String[] line = (String[]) lineIter.next();
>
>              // Create a Fermentation item for this line
>              Item ferm = createItem("Fermentation");
>              ferm.setAttribute("fermId",line[0]);
>              ferm.setAttribute("name", line[1]);
>              ferm.setAttribute("description", line[2]);
>
>              Item cb = createItem("Cellbank");
>
>              // primaryIdentifier already in the database
>              cb.setAttribute("cbId",line[3]);
>              store(cb);
>
>              ferm.setReference("cellbank",cb);
>              store(ferm);
>          }
>      }
>
> For both sources the keys.properties file includes:
>
> Cellbank.key_identifier=cbId
>
> I still do not understand how to get the id of an object that is already in the database. If I create the referenced object anew and store it, as in the code above, then I get the duplicate objects error even though I have set the integration key for cbId.
>
> Thanks for any help!
>
> Norbert
>
>
>
> DI (FH) Norbert Auer
>
> Department of Biotechnology
> University of Natural Resources and Applied Life Sciences,
> Muthgasse 18, A-1190 Vienna, Austria
>
> Tel: +43 1 47654 6837
> Fax: +43 1 36006 1249
