[InterMine Dev] reducing source loading times

Richard Smith richard at flymine.org
Mon Feb 13 16:30:59 GMT 2012


On 10/02/2012 17:37, Benjamin Hitz wrote:
>
> Not that I have ever loaded an intermine, but... it sort of sounds like
> you guys are not all using the same GO files.
> There are a few versions of the .obo file (at least one of which is
> "reasoning enabled", which might not be what you want).

Good point. It looks like we're still fetching the 1.0 format file:

http://geneontology.org/ontology/obo_format_1_0/gene_ontology.1_0.obo

I guess we need to update the parser and test it with 1.2, but in the
meantime could JD and Thomas try the 1.0 version to see what the load
time is like?
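
To check which format a local copy of the ontology is in before loading
it, you can look at the header: OBO files normally declare their format
in a "format-version:" line near the top. Here is a minimal Python
sketch of that check. It is not part of InterMine, and the function name
is made up for illustration.

```python
from typing import Iterable, Optional


def obo_format_version(lines: Iterable[str]) -> Optional[str]:
    """Return the value of the 'format-version' header tag, or None.

    Hypothetical helper, not an InterMine API. It assumes the tag
    appears in the header, i.e. before the first stanza such as [Term].
    """
    for raw in lines:
        line = raw.strip()
        # The header ends where the first stanza (e.g. "[Term]") begins.
        if line.startswith("["):
            break
        if line.startswith("format-version:"):
            # Keep everything after the first colon, minus whitespace.
            return line.split(":", 1)[1].strip()
    return None


def obo_format_version_of_file(path: str) -> Optional[str]:
    """Convenience wrapper that reads the header from a file on disk."""
    with open(path, encoding="utf-8") as handle:
        return obo_format_version(handle)
```

Calling obo_format_version_of_file("gene_ontology.1_0.obo") on the file
above should report its declared format; anything other than 1.0 means
the loader is reading a format we have not tested the parser with.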

Thanks,
Richard.


> There is one HUGE gene_association file (gene_association.uniprot) which
> is something like 12M lines. It takes some time to chew through that, so
> be sure you want it.
>
> Ben
>
>
> On Feb 10, 2012, at 8:48 AM, Thomas TRIPLET wrote:
>
>> I have the same issue: loading GO is extremely slow (on v0.97), and I
>> haven't found a solution yet =/
>> If you find one, please let us know.
>> Thanks
>> Thomas
>>
>>
>> Thomas Triplet, Ph.D.
>> http://www.thomastriplet.net
>>
>> Centre for Structural and Functional Genomics
>> Concordia University
>> 7141 West Sherbrooke St
>> Montreal QC H4B 1R6
>>
>>
>>
>>
>>
>> On Fri, Feb 10, 2012 at 10:55 AM, JD Wong <jdmswong at gmail.com> wrote:
>>
>>     I'll update this thread when a solution is found
>>
>>
>>     On Fri, Feb 10, 2012 at 10:54 AM, JD Wong <jdmswong at gmail.com> wrote:
>>
>>         In other words, I set ANT_OPTS="... -Xmx20000m" in my .bashrc
>>         file. 20G should be a good amount, and since these values
>>         carry over to the java calls that ant makes, this is a strange
>>         problem indeed...
>>
>>         -JD
>>
>>
>>         On Wed, Feb 8, 2012 at 4:04 PM, JD Wong <jdmswong at gmail.com> wrote:
>>
>>             ANT_OPTS has 20GB allocated to it
>>
>>
>>             On Wed, Feb 8, 2012 at 1:07 PM, Richard Smith
>>             <richard at flymine.org> wrote:
>>
>>                 Hi JD,
>>                 It looks like it's the OBO-Edit reasoner that is
>>                 taking all the time:
>>
>>                 2012-02-07 11:06:32 INFO
>>                 org.obo.reasoner.impl.LinkPileReasoner - Total
>>                 reasoner time = 2130574.717897 ms
>>
>>                 Which is about 35 minutes. On the latest FlyMine build it
>>                 took 30 seconds.
>>                 I guess this is a RAM thing. For the FlyMine build we
>>                 had 32GB heap allocated to the Java process. How much
>>                 did you have?
>>
>>                 The rest of the build looks like it ran fast: about 1
>>                 million objects loaded in five minutes, which is good.
>>
>>
>>                 Cheers,
>>                 Richard.
>>
>>
>>
>>
>>                 On 07/02/2012 16:30, JD Wong wrote:
>>
>>                     Sure
>>
>>                     On Tue, Feb 7, 2012 at 8:34 AM, Richard Smith
>>                     <richard at flymine.org> wrote:
>>
>>                     JD,
>>                     Could you send us the intermine.log from your integrate
>>                     directory after running a build? This is the most helpful
>>                     thing for us when investigating performance.
>>
>>                     Thanks,
>>                     Richard.
>>
>>
>>
>>
>>
>>                     On 06/02/2012 19:01, JD Wong wrote:
>>
>>                     I was wondering how the other mods speed up their
>>                     builds. I have configured ant, java, and postgres
>>                     accordingly without effect. I was hoping to get the
>>                     community's advice on this aspect.
>>
>>                     Cheers,
>>                     -JD
>>
>>                     On Thu, Feb 2, 2012 at 10:11 AM, JD Wong
>>                     <jdmswong at gmail.com> wrote:
>>
>>                     I haven't given ant and postgres enough to consume all
>>                     the memory when running simultaneously. Also, there is
>>                     plenty of free memory during loading.
>>
>>                     -JD
>>
>>
>>                     On Wed, Feb 1, 2012 at 10:35 PM, Josh Goodman
>>                     <jogoodma at indiana.edu> wrote:
>>
>>                     You need to be careful with the various memory settings
>>                     here. If you set ant really high (>25% of total memory)
>>                     and you are also setting Pg high, you could be suffering
>>                     from the two of them fighting over system resources and
>>                     causing the swap to get thrashed. I would run the Unix
>>                     "free" command while you are running a load to see what
>>                     is going on with memory.
>>
>>                     e.g.
>>
>>                     free -m -s 5
>>
>>                     If you have other processes running on this machine
>>                     (tomcat instances), you also need to adjust ant and Pg
>>                     to take that into account.
>>
>>                     Josh
>>
>>                     On Wed, Feb 1, 2012 at 5:20 PM, JD Wong
>>                     <jdmswong at gmail.com> wrote:
>>                     > Hi all,
>>                     > Loading my GO source takes on average 2500 seconds. I
>>                     > have tuned the postgres configuration parameters to the
>>                     > desired values and gave ant high heap memory, to no
>>                     > avail. Is there a way to speed this up?
>>                     >
>>                     > -JD
>>                     >
>>                     > _______________________________________________
>>                     > dev mailing list
>>                     > dev at intermine.org
>>                     > http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
>>
>>                     >
>>
>
> --
> Ben Hitz
> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO
> Consortium
> Stanford University ** hitz at stanford.edu <mailto:hitz at stanford.edu>
>



