[InterMine Dev] How long SHOULD TransferSequence take?

Julie Sullivan julie at flymine.org
Wed Jul 2 10:19:51 BST 2014

Hi Joe

1 million / day is too slow; there is optimisation we can do. For 
comparison, my SNP source (for human InterMine) gets speeds of 1 million 
/ minute. The average is 200,000 objects / minute.
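
To put those rates side by side, here is a rough sketch of the arithmetic 
for the ~12 million regions mentioned below. It assumes the quoted loading 
rates are comparable to a postprocessing rate, which may not hold exactly; 
the function name minutes_to_load is made up for illustration.

```python
# Back-of-the-envelope comparison using only the figures quoted in this thread.
MINUTES_PER_DAY = 24 * 60


def minutes_to_load(n_objects, objects_per_minute):
    """Minutes needed to process n_objects at a constant rate (hypothetical helper)."""
    return n_objects / objects_per_minute


regions = 12_000_000  # "close to 12 million regions"

# All rates expressed in objects per minute.
rates = {
    "observed (1 million / day)": 1_000_000 / MINUTES_PER_DAY,
    "average (200,000 / minute)": 200_000,
    "SNP source (1 million / minute)": 1_000_000,
}

for label, rate in rates.items():
    minutes = minutes_to_load(regions, rate)
    print(f"{label}: {minutes:,.0f} minutes ({minutes / MINUTES_PER_DAY:.2f} days)")
```

At the observed rate that is about 12 days, against roughly an hour at the 
average rate.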

To diagnose where the problem is (hardware, Postgres, etc.), can you 
provide us with a little more detail?

1. InterMine log files from intermine/integrate
2. the final output printed to the console giving the speeds of each source
3. the results of this query:

  select class, count(*) from intermineobject group by class;

That will tell us what sort of data you are loading and how long each 
source is taking, allowing us to pinpoint areas of your build that can 
be optimised.


On 30/06/14 18:37, Joe Carlson wrote:
> Hi Julie (and others)
> What is your experience with the postprocessing step TransferSequences?
> I'm using it here to add the sequence to the gene flanking region
> objects and am wondering why our performance is not-too-impressive.
> I cut down on the sizes of the different regions; so we have only 500bp
> and 5kb regions (but still up- and down-stream, with and not-with the
> gene). But even with the cutback we still have close to 12 million regions.
> Adding the sequence to these features has been slower than I expected.
> After running a week I didn't notice any sequence actually appearing
> (but I had 100K temporary tables in the db!). Now things are getting
> filled in at a rate of ~ 1 million a day. This is slower than when I had
> run things before from what I can recall.
> I recently re-synced with the master branch 12 days ago. Were there
> changes after 1.2 that may affect this postprocessing step?
> I'm just curious. And was wondering what sorts of experience you have
> with transferring sequences. Other than reducing the number of
> regions (will do), or precomputing and storing in our chado backend
> (considering it), do you have any suggestions?
> Thanks,
> Joe
