[InterMine Dev] source_keys.properties file not read

Julie Sullivan julie at flymine.org
Thu Sep 8 15:52:04 BST 2011


Hi Thomas

> If I may suggest, I think it would be beneficial to enforce the primary
> keys, even if one source only is used, as this is a little bit confusing.

I agree, that would be useful. I don't know why it doesn't work this way.

> And it probably happens quite often that a source is composed of several
> files that can contain redundant information. Since *process()* is called
> independently for each file (as far as I can tell),

Yes, that is correct.

> there is no way to detect this in the converter (especially since the
> converter cannot access items in the production db).

You can create a map of unique items as a field of the converter class
instead of inside the process() method.  For instance, the UniProt and
BioGRID sources both process several files that can contain duplicate data,
e.g. organisms and publications can be mentioned in each of the different
files.  So at the top of each converter we keep a collection of the unique
values already seen, and it persists across the process() calls.
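
Roughly, the idea looks like the sketch below.  This is only an illustration,
not code from the UniProt or BioGRID converters: the class
DedupConverterSketch, the getOrganism() helper, the "0_N" reference ids and
the simplified process(String[]) signature are all made up for the example
and are not the real InterMine API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a converter; it does not extend any InterMine class.
class DedupConverterSketch {
    // Declared on the class, not inside process(), so it survives across files.
    private final Map<String, String> organismRefIds = new HashMap<>();
    private int nextId = 1;
    private int stored = 0;

    // Assumed to be called once per input file; each entry is a taxon id
    // mentioned somewhere in that file.
    void process(String[] taxonIds) {
        for (String taxonId : taxonIds) {
            getOrganism(taxonId);
        }
    }

    // Returns the reference id for a taxon id, creating it only the first time.
    String getOrganism(String taxonId) {
        String refId = organismRefIds.get(taxonId);
        if (refId == null) {
            refId = "0_" + nextId++;
            organismRefIds.put(taxonId, refId);
            stored++; // a real converter would create and store the item here
        }
        return refId;
    }

    // Number of distinct organisms created so far, over all files.
    int storedCount() {
        return stored;
    }
}
```

If process() is called for two files that both mention the same organism,
the second call finds it in the map and reuses the existing reference
instead of creating a duplicate item.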

Does that help you at all?

Cheers
Julie