[InterMine Dev] Fwd: web app restarts

Joshua Heimbach jkh46 at cam.ac.uk
Fri May 8 18:31:19 BST 2015

Hi Joe,

I see your point about /service/regions/fasta. When we discovered that 
/service/regions/sequence had been removed rather than deprecated, I 
pointed the old URI to what I thought was a comparable solution, but I 
didn't choose the right one. /service/regions/fasta is for finding 
features in a given range, and that's not what you're looking for.

Can you take a look at 
? Given a query, start, and end parameter, it returns sequence data.

For example, sequence between 10 and 10,000 on chromosome 4:
model="genomic" view="Chromosome.sequence.residues"><constraint 
path="Chromosome" op="LOOKUP" value="4" extraValue="D. 

The result is a JSON object rather than fasta, but if it's close to what 
you need then we can debug from there as it's up-to-date in our 
codebase. There's still a chance that switching URIs could ease the 
memory issues, particularly since GenomicRegionSequenceExportServlet has 
been off our radar since 1.3.1.
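
For reference, the old-style request quoted below could be assembled 
along these lines. This is only a sketch: the base URL, the organism 
short name, and the helper name build_region_sequence_url are 
placeholders invented for illustration, and the JSON layout is taken 
from the request quoted in this thread rather than from current 
documentation.

```python
import json
from urllib.parse import urlencode


def build_region_sequence_url(base_url, chromosome, start, end, organism):
    """Build a /service/regions/sequence URL in the form quoted in this thread.

    The helper name and every argument value used below are illustrative.
    """
    # One region is a single tab-separated string: chromosome, start, end.
    region = f"{chromosome}\t{start}\t{end}"
    # The whole request travels as one JSON document in the "query" parameter.
    query = json.dumps({"regions": [region], "organism": organism})
    # urlencode percent-encodes the JSON so braces, quotes and tabs are URL-safe.
    return f"{base_url.rstrip('/')}/service/regions/sequence?" + urlencode({"query": query})


# Hypothetical host and organism short name; the coordinates follow the
# "between 10 and 10,000 on chromosome 4" example above.
url = build_region_sequence_url(
    "http://mine.example.org/mymine", "4", 10, 10000, "A. thaliana"
)
print(url)
```

Encoding the JSON with urlencode, rather than pasting it into the URL 
by hand, avoids the quoting problems that show up when a mail client 
or shell rewrites the quote characters.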


On 08/05/15 18:01, Joe Carlson wrote:
> Hi Josh,
> Thanks for the quick response.
> What I’d like to get is the chromosome sequence between two specified 
> positions. In earlier code, this was provided by the URI
> /service/regions/sequence?query={"regions":["chromosome\tstart\tend"],"organism":"short 
> name"}
> (we needed to tweak this a bit for our purposes, but this was the old 
> endpoint in your code.) And it returned chromosome sequence in a fasta 
> format.
> In 1.4, I noticed that you changed this so that it returned the 
> sequence of features - specified in the URI as an extra parameter such 
> as genes, exons, introns, … - contained within these coordinates. This 
> is the same as /service/regions/fasta. This isn’t quite what we want. 
> I had tried to specify ‘chromosome’ as the feature type but that was 
> rejected. I could not find another suitable endpoint.
> There is a routine GenomicRegionFastaService. I don’t know if this is 
> currently enabled in any service call, or whether it would give me 
> what I want.
> The old code works for us but the caching is causing us heap out of 
> memory errors. We’ve just recently determined that this was a cause of 
> our restarts and are about to turn off caching. But if you know 
> another way to get this information, let me know.
> Thanks,
> Joe
> On May 8, 2015, at 7:57 AM, Josh Heimbach <josh at intermine.org 
> <mailto:josh at intermine.org>> wrote:
>> Hi Joe,
>> While adding web service documentation in intermine 1.3.1, the 
>> endpoint /service/regions/sequence was retired for the reason that it 
>> was duplicating information found elsewhere. Much of the codebase has 
>> been refactored and improved since 1.3, so perhaps using a different 
>> servlet might solve the memory issue.
>> Could you send me an example request that you would make to 
>> /service/regions/sequence along with its parameters? I'll look for a 
>> suitable alternative web service that returns the same information.
>> Thanks,
>> Josh
>>> -------- Forwarded Message --------
>>> Subject: [InterMine Dev] web app restarts
>>> Date: Thu, 07 May 2015 17:28:06 -0700
>>> From: Joe Carlson <jwcarlson at lbl.gov <mailto:jwcarlson at lbl.gov>>
>>> To: dev at intermine.org <mailto:dev at intermine.org> <dev at intermine.org 
>>> <mailto:dev at intermine.org>>, David M. Goodstein <dmgoodstein at lbl.gov 
>>> <mailto:dmgoodstein at lbl.gov>>
>>> Hi Julie and gang
>>> We have just deployed our latest phytozome build based on intermine 1.4.
>>> This is our first public release using Hikari.
>>> Our hopes were that going to hikari would solve some of the problems
>>> we've been seeing with tomcat restarts. We've traded emails about this
>>> in the past where we see that we have to restart every couple of hours
>>> when under load. (We run an 'are you alive' cron job every 3 minutes and
>>> force a restart if we don't get a response.)
>>> At the time I think we had attributed it to the postgres connection and
>>> we were looking forward to the hikari pooling. It behaved well in internal
>>> use, but now that we're public we continue to see the restarts.
>>> I'm trying to do a little forensics to see what might be causing them.
>>> I'm seeing "OutOfMemoryError: Java heap space", typically after a call
>>> to retrieve the genomic sequence of a region
>>> (service/regions/sequence). I had noticed that you had removed this
>>> service in 1.4. I restored it since we're making use of it to deliver
>>> sequence to our main web portal. Did you remove this because you had
>>> seen it as being problematic?
>>> At this point, I'm not absolutely sure this is the source of the
>>> restarts but I'm very suspicious of
>>> org.intermine.bio.web.export.GenomicRegionSequenceExporter. There is a
>>> static map of entire chromosomes that is being stored. The substring is
>>> retrieved by calling substring on elements of this map. This may work
>>> for smaller mines but we have enough sequence in our database that I
>>> suspect this is part of our problem.
>>> Was this web service removed deliberately? Is there something to replace
>>> it? As I recall, the other sequence retrieval services I found only
>>> retrieved the sequence for specific features and not chromosome slices.
>>> Thanks,
>>> Joe Carlson
>>> _______________________________________________
>>> dev mailing list
>>> dev at intermine.org <mailto:dev at intermine.org>
>>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
