[InterMine Dev] reducing source loading times

JD Wong jdmswong at gmail.com
Fri Feb 10 18:21:00 GMT 2012


Great point, Ben. Unfortunately my test only loads the go and go-annotation
sources.  The GAF file I feed it is only 17 MB, and the GO OBO file I'm
using is 23 MB.
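
For anyone who wants to compare inputs, here is a rough sketch of how the
files could be checked.  The function name is made up for illustration, and
it assumes that GAF header/comment lines start with '!':

```shell
#!/bin/sh
# Hypothetical helper: report the size and the number of non-comment lines
# of a GO input file, so that different sites can compare what they load.
summarize_input() {
    file="$1"
    # Size in bytes, read via stdin so wc prints only the number.
    bytes=$(wc -c < "$file")
    # Assumption: GAF header/comment lines start with '!'; count the rest.
    records=$(grep -vc '^!' "$file")
    echo "${file}: $((bytes / 1024 / 1024)) MB, ${records} non-comment lines"
}

# Usage (the path is a placeholder for whatever file your build loads):
#   summarize_input /path/to/your/gaf/file
```

A file in the range of millions of lines would point to the large uniprot
association file rather than a single-organism one.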

-JD

On Fri, Feb 10, 2012 at 12:37 PM, Benjamin Hitz <hitz at stanford.edu> wrote:

>
> Not that I have ever loaded an InterMine, but... it sort of sounds like
> you guys are not all using the same GO files.
> There are a few versions of the .obo file (at least one of which is
> "reasoning enabled", which might not be what you want).
> There is one HUGE gene_association file (gene_association.uniprot), which
> is something like 12M lines.  It takes some time to chew through that, so be
> sure you want it.
>
> Ben
>
>
> On Feb 10, 2012, at 8:48 AM, Thomas TRIPLET wrote:
>
> I have the same issue: loading GO is extremely slow (on v0.97), and I
> haven't found any solution yet =/
> If you find any, please let us know.
> Thanks
> Thomas
>
>
> Thomas Triplet, Ph.D.
> http://www.thomastriplet.net
>
> Centre for Structural and Functional Genomics
> Concordia University
> 7141 West Sherbrooke St
> Montreal QC H4B 1R6
>
>
>
>
>
> On Fri, Feb 10, 2012 at 10:55 AM, JD Wong <jdmswong at gmail.com> wrote:
>
>> I'll update this thread when a solution is found
>>
>>
>> On Fri, Feb 10, 2012 at 10:54 AM, JD Wong <jdmswong at gmail.com> wrote:
>>
>>> In other words, I set ANT_OPTS="... -Xmx20000m" in my .bashrc file.  20G
>>> should be a good amount, and since these values transfer to the java calls
>>> that ant makes, this is a strange problem indeed...
>>>
>>> -JD
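
A minimal sketch for sanity-checking the heap flag, since the JVM expects the
size attached directly to -Xmx with no space in between.  The function name
is hypothetical.  Note also that ANT_OPTS configures the JVM that runs Ant
itself; whether a forked java task inherits it depends on the build file,
which I have not verified for InterMine:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given options string contains a
# well-formed -Xmx flag (e.g. -Xmx20000m or -Xmx20g), fail otherwise.
has_valid_xmx() {
    printf '%s\n' "$1" |
        grep -Eq '(^|[[:space:]])-Xmx[0-9]+[kKmMgG]?([[:space:]]|$)'
}

# Example value only; any other options you already pass would go alongside.
ANT_OPTS="-Xmx20000m"
export ANT_OPTS

if has_valid_xmx "$ANT_OPTS"; then
    echo "heap flag looks well-formed: $ANT_OPTS"
else
    echo "no well-formed -Xmx flag in: $ANT_OPTS" >&2
fi
```

If the loader does run in a forked JVM, the heap for that process would have
to be set wherever the build configures the fork, not only in ANT_OPTS.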
>>>
>>>
>>> On Wed, Feb 8, 2012 at 4:04 PM, JD Wong <jdmswong at gmail.com> wrote:
>>>
>>>> ANT_OPTS has 20GB allocated to it
>>>>
>>>>
>>>> On Wed, Feb 8, 2012 at 1:07 PM, Richard Smith <richard at flymine.org> wrote:
>>>>
>>>>> Hi JD,
>>>>> It looks like it's the OBO edit reasoner that is taking all the time:
>>>>>
>>>>> 2012-02-07 11:06:32 INFO  org.obo.reasoner.impl.LinkPileReasoner
>>>>>   -   Total reasoner time = 2130574.717897 ms
>>>>>
>>>>> That is about 35 minutes.  On the latest FlyMine build it took 30 seconds.
>>>>> I guess this is a RAM thing.  For the FlyMine build we had 32GB heap
>>>>> allocated to the Java process.  How much did you have?
>>>>>
>>>>> The rest of the build looks like it ran fast, about 1 million objects
>>>>> loaded in five minutes which is good.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Richard.
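
To pull the same number out of a log without doing the arithmetic by hand,
something like the following might work.  It is only a sketch: the function
name is made up, and it assumes each "Total reasoner time = ... ms" entry
sits on a single line in the actual log file:

```shell
#!/bin/sh
# Hypothetical helper: print every "Total reasoner time" entry found in a
# log file, converted from milliseconds to minutes (1 min = 60000 ms).
reasoner_minutes() {
    awk '/Total reasoner time = / {
        for (i = 1; i <= NF; i++)
            if ($i == "=") printf "%.1f min\n", $(i + 1) / 60000
    }' "$1"
}

# Usage (path is a placeholder for the log in your integrate directory):
#   reasoner_minutes intermine.log
```

Comparing this figure before and after a change to the heap settings gives a
direct measure of whether the reasoner step is still the bottleneck.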
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 07/02/2012 16:30, JD Wong wrote:
>>>>>
>>>>>> Sure
>>>>>>
>>>>>> On Tue, Feb 7, 2012 at 8:34 AM, Richard Smith <richard at flymine.org> wrote:
>>>>>>
>>>>>>    JD,
>>>>>>    Could you send us the intermine.log from your integrate directory
>>>>>>    after running a build?  This is the most helpful thing for us to
>>>>>>    investigate performance.
>>>>>>
>>>>>>    Thanks,
>>>>>>    Richard.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>    On 06/02/2012 19:01, JD Wong wrote:
>>>>>>
>>>>>>        I was wondering how the other mods speed up their builds.  I have
>>>>>>        configured ant, java, and postgres accordingly without effect.  I was
>>>>>>        hoping to get the community's advice on this aspect.
>>>>>>
>>>>>>        Cheers,
>>>>>>        -JD
>>>>>>
>>>>>>        On Thu, Feb 2, 2012 at 10:11 AM, JD Wong <jdmswong at gmail.com> wrote:
>>>>>>
>>>>>>            I haven't given ant and postgres enough to consume all the memory
>>>>>>            when running simultaneously.  Also there is plenty of free memory
>>>>>>            during loading.
>>>>>>
>>>>>>            -JD
>>>>>>
>>>>>>
>>>>>>            On Wed, Feb 1, 2012 at 10:35 PM, Josh Goodman <jogoodma at indiana.edu> wrote:
>>>>>>
>>>>>>                You need to be careful of the various memory settings here.  If you
>>>>>>                set ant really high (>25% of total memory) and you are also setting Pg
>>>>>>                high you could be suffering from the two of them fighting over system
>>>>>>                resources and causing the swap to get thrashed.  I would run the unix
>>>>>>                "free" command while you are running a load to see what is going on
>>>>>>                with memory.
>>>>>>
>>>>>>                e.g.
>>>>>>
>>>>>>                free -m -s 5
>>>>>>
>>>>>>                If you have other processes running on this machine (tomcat
>>>>>>                instances) you also need to adjust ant and Pg to take that into
>>>>>>                account.
>>>>>>
>>>>>>                Josh
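
As a sketch of how the "free" output could be watched for swap use during a
load: the helper below reads "free -m" style output on stdin and prints the
used-swap column.  The function name is hypothetical, and it assumes the
Linux layout where the line starts with "Swap:" followed by total, used and
free:

```shell
#!/bin/sh
# Hypothetical helper: read "free -m" output on stdin and print the amount
# of swap in use, in MB (third field of the line starting with "Swap:").
swap_used_mb() {
    awk '/^Swap:/ { print $3 }'
}

# Usage on a Linux machine while a load is running:
#   free -m | swap_used_mb
```

A value that keeps growing while the load runs would support the idea that
ant and Pg are competing for memory.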
>>>>>>
>>>>>>                On Wed, Feb 1, 2012 at 5:20 PM, JD Wong <jdmswong at gmail.com> wrote:
>>>>>>         > Hi all,
>>>>>>         > Loading my GO source takes on average 2500 seconds.  I have tuned the
>>>>>>         > postgres configuration parameters to the desired values and gave ant high
>>>>>>         > heap memory to no avail.  Is there a way to speed this up?
>>>>>>         >
>>>>>>         > -JD
>>>>>>         >
>>>>>>         > _______________________________________________
>>>>>>         > dev mailing list
>>>>>>         > dev at intermine.org
>>>>>>         > http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
>>>>>>         >
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
>
>  --
> Ben Hitz
> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO
> Consortium
> Stanford University ** hitz at stanford.edu
>
>
>
>
>