[InterMine Dev] reducing source loading times

Benjamin Hitz hitz at stanford.edu
Fri Feb 10 17:37:24 GMT 2012


Not that I have ever loaded an InterMine, but it sounds like you are not all using the same GO files.
There are a few versions of the .obo file (at least one of which is "reasoning enabled", which might not be what you want).
There is also one HUGE gene_association file (gene_association.uniprot), which is something like 12M lines.  It takes some time to chew through, so be sure you actually want it.
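
If only a few organisms are needed, one option is to cut the big file down before loading it. A minimal sketch, assuming the standard GAF layout (tab-separated, taxon in column 13, header lines starting with "!"). The function name and the taxon ID used below are only illustrations, and whether a given mine's GO source is happy with a pre-filtered file is something to check first:

```shell
# filter_gaf_by_taxon TAXON_ID
# Reads a GAF file on stdin; keeps header/comment lines plus the annotation
# rows whose first taxon entry in column 13 is "taxon:TAXON_ID".
# Assumption: column 13 looks like "taxon:7227" or "taxon:7227|taxon:NNNN".
filter_gaf_by_taxon() {
    awk -F '\t' -v want="taxon:$1" '
        /^!/ { print; next }          # keep header/comment lines as they are
        {
            split($13, ids, "|")      # first entry is the annotated organism
            if (ids[1] == want) print
        }
    '
}
```

Something like `filter_gaf_by_taxon 7227 < gene_association.uniprot > gene_association.filtered` would then keep only the fly rows (the output file name is made up), and `wc -l` on both files shows how much was dropped.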

Ben


On Feb 10, 2012, at 8:48 AM, Thomas TRIPLET wrote:

> I have the same issue, loading GO is extremely slow (on v0.97), and haven't found any solution yet =/
> If you find any, please let us know.
> Thanks
> Thomas
> 
> 
> Thomas Triplet, Ph.D.
> http://www.thomastriplet.net
> 
> Centre for Structural and Functional Genomics
> Concordia University
> 7141 West Sherbrooke St
> Montreal QC H4B 1R6
> 
> 
> 
> 
> 
> On Fri, Feb 10, 2012 at 10:55 AM, JD Wong <jdmswong at gmail.com> wrote:
> I'll update this thread when a solution is found
> 
> 
> On Fri, Feb 10, 2012 at 10:54 AM, JD Wong <jdmswong at gmail.com> wrote:
> In other words, I set ANT_OPTS="... -Xmx20000m" in my .bashrc file.  20G should be a good amount, and since these values transfer to the java calls that ant makes, this is a strange problem indeed...
> 
> -JD
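
One thing worth ruling out here is a malformed flag: the JVM takes the size attached directly to -Xmx (for example -Xmx20000m), with no space in between. The helper below is a rough sketch for checking what a given options string actually asks for. The name xmx_mb is made up for illustration, and it only inspects the string, not what the forked java process ends up using:

```shell
# xmx_mb OPTS
# Prints the -Xmx heap size found in OPTS, converted to megabytes.
# Returns non-zero when no well-formed -Xmx<number>[kKmMgG] flag is present
# (for example when a space separates -Xmx from the number).
xmx_mb() {
    flag=$(printf '%s\n' "$1" | grep -oE -- '-Xmx[0-9]+[kKmMgG]?' | tail -n 1) || flag=""
    [ -n "$flag" ] || return 1
    num=$(printf '%s' "$flag" | tr -dc '0-9')
    case "$flag" in
        *[kK]) echo $(( num / 1024 )) ;;
        *[mM]) echo "$num" ;;
        *[gG]) echo $(( num * 1024 )) ;;
        *)     echo $(( num / 1024 / 1024 )) ;;   # no suffix means bytes
    esac
}
```

Running `xmx_mb "$ANT_OPTS" || echo "no usable -Xmx found"` in the shell that launches the build shows whether the setting from .bashrc is really in the environment.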
> 
> 
> On Wed, Feb 8, 2012 at 4:04 PM, JD Wong <jdmswong at gmail.com> wrote:
> ANT_OPTS has 20GB allocated to it
> 
> 
> On Wed, Feb 8, 2012 at 1:07 PM, Richard Smith <richard at flymine.org> wrote:
> Hi JD,
> It looks like it's the OBO edit reasoner that is taking all the time:
> 
> 2012-02-07 11:06:32 INFO  org.obo.reasoner.impl.LinkPileReasoner     -   Total reasoner time = 2130574.717897 ms
> 
> Which is 35 minutes.  On the latest FlyMine build it took 30 seconds.
> I guess this is a RAM thing.  For the FlyMine build we had 32GB heap
> allocated to the Java process.  How much did you have?
> 
> The rest of the build looks like it ran fast, about 1 million objects
> loaded in five minutes which is good.
> 
> 
> Cheers,
> Richard.
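
To compare reasoner times between builds, the figure can be pulled straight out of the log. A small sketch, assuming the log line has the same shape as the one quoted above ("Total reasoner time = <number> ms"); the function name is hypothetical:

```shell
# reasoner_minutes
# Reads log text on stdin and prints every "Total reasoner time" value,
# converted from milliseconds to minutes with one decimal place.
reasoner_minutes() {
    awk '/Total reasoner time = / {
        for (i = 1; i <= NF; i++)
            if ($i == "=" && $(i + 2) == "ms")
                printf "%.1f\n", $(i + 1) / 60000
    }'
}
```

Run from the integrate directory, `reasoner_minutes < intermine.log` prints one number per reasoner run.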
> 
> 
> 
> 
> On 07/02/2012 16:30, JD Wong wrote:
> Sure
> 
> On Tue, Feb 7, 2012 at 8:34 AM, Richard Smith <richard at flymine.org> wrote:
> 
>    JD,
>    Could you send us the intermine.log from your integrate directory after
>    running a build.  This is the most helpful thing for us to investigate
>    performance.
> 
>    Thanks,
>    Richard.
> 
> 
> 
> 
> 
>    On 06/02/2012 19:01, JD Wong wrote:
> 
>        I was wondering how the other mods speed up their builds.  I have
>        configured ant, java, and postgres accordingly without effect.  I was
>        hoping to get the community's advice on this aspect.
> 
>        Cheers,
>        -JD
> 
>        On Thu, Feb 2, 2012 at 10:11 AM, JD Wong <jdmswong at gmail.com> wrote:
> 
>            I haven't given ant and postgres enough to consume all the memory
>            when running simultaneously.  Also there is plenty of free memory
>            during loading.
> 
>            -JD
> 
> 
>            On Wed, Feb 1, 2012 at 10:35 PM, Josh Goodman <jogoodma at indiana.edu> wrote:
> 
>                You need to be careful of the various memory settings here.  If you
>                set ant really high (>25% of total memory) and you are also setting Pg
>                high, you could be suffering from the two of them fighting over system
>                resources and causing the swap to get thrashed.  I would run the unix
>                "free" command while you are running a load to see what is going on
>                with memory.
> 
>                e.g.
> 
>                free -m -s 5
> 
>                If you have other processes running on this machine (tomcat
>                instances), you also need to adjust ant and Pg to take that into
>                account.
> 
>                Josh
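
The 25% rule of thumb above is easy to check with a little arithmetic. A minimal sketch: both function names are invented for illustration, and total_ram_mb assumes a Linux box with /proc/meminfo:

```shell
# heap_percent HEAP_MB TOTAL_MB
# Prints the heap size as a whole-number percentage of total RAM.
heap_percent() {
    echo $(( $1 * 100 / $2 ))
}

# total_ram_mb
# Prints total physical memory in MB, read from /proc/meminfo (Linux only).
total_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}
```

For a 20000 MB heap, `heap_percent 20000 "$(total_ram_mb)"` gives the share of RAM that ant alone may claim; whatever Pg is configured to use (shared_buffers and so on) comes on top of that.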
> 
>                On Wed, Feb 1, 2012 at 5:20 PM, JD Wong <jdmswong at gmail.com> wrote:
>         > Hi all,
>         > Loading my GO source takes on average 2500 seconds.  I have tuned the
>         > postgres configuration parameters to the desired values and gave ant
>         > high heap memory, to no avail.  Is there a way to speed this up?
>         >
>         > -JD
>         >
>         > _______________________________________________
>         > dev mailing list
>         > dev at intermine.org
>         > http://mail.intermine.org/cgi-bin/mailman/listinfo/dev

--
Ben Hitz 
Senior Scientific Programmer ** Saccharomyces Genome Database ** GO Consortium
Stanford University ** hitz at stanford.edu



