[InterMine Dev] Fwd: web app restarts

Joe Carlson jwcarlson at lbl.gov
Fri May 8 18:48:21 BST 2015


Hi Josh,

This will keep us happy. It’s not in 1.4, right? (I forget which version I last merged with; 1.4.something.)

Is it too late for some enhancements? If start > end, then can you return reverse complement sequence from end to start?
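
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It assumes 1-based, inclusive coordinates and plain DNA residues; slice_region is a made-up helper for illustration, not anything in the InterMine codebase.

```python
# Hypothetical helper, not part of InterMine: illustrates the requested
# start > end behaviour on an in-memory residue string.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def slice_region(residues, start, end):
    """Return the residues of a region given 1-based, inclusive coordinates.

    Assumption: when start > end, the caller wants the reverse complement
    of the forward-strand region end..start.
    """
    lo, hi = min(start, end), max(start, end)
    seq = residues[lo - 1:hi]  # convert to Python's 0-based, half-open slice
    if start > end:
        # Complement each base, then reverse the string.
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

So slice_region(residues, 2, 4) would give the forward-strand bases 2 to 4, and slice_region(residues, 4, 2) the reverse complement of the same bases.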

Also, we often get people who want to see sequence upstream of the CDS. Is there a way to provide a web service that returns sequence from a specific sequence, with a prescribed upstream or downstream offset?
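
Roughly, the coordinate arithmetic I am picturing looks like the sketch below. It assumes 1-based inclusive coordinates with feature_start <= feature_end and a strand given as +1 or -1; flank_window and its parameters are hypothetical names, not an existing service.

```python
# Hypothetical helper, not part of InterMine: computes the chromosome window
# for a feature extended by upstream/downstream offsets.
def flank_window(feature_start, feature_end, strand,
                 upstream=0, downstream=0, chrom_length=None):
    """Return (start, end), 1-based inclusive, with start <= end.

    Assumption: "upstream" means toward lower coordinates on the + strand
    and toward higher coordinates on the - strand.
    """
    if strand >= 0:
        start = feature_start - upstream
        end = feature_end + downstream
    else:
        start = feature_start - downstream
        end = feature_end + upstream
    # Clamp to the chromosome so the window never runs off either end.
    start = max(1, start)
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end
```

For a minus-strand feature the returned window would still need to be reverse complemented before it reads 5' to 3'.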

Thanks,

Joe


On May 8, 2015, at 10:31 AM, Joshua Heimbach <jkh46 at cam.ac.uk> wrote:

> Hi Joe,
> 
> I see your point about /service/regions/fasta. When we discovered that /service/regions/sequence had been removed rather than deprecated, I pointed the old URI to what I thought was a comparable solution, but I didn't choose the right one. /service/regions/fasta is for finding features in a given range, and that's not what you're looking for.
> 
> Can you take a look at http://iodocs.labs.intermine.org/flymine/docs#/ws-sequence/GET/sequence ? Given a query, start, and end parameter, it returns sequence data.
> 
> For example, sequence between 10 and 1,000 on chromosome 4:
> www.flymine.org/query/service/sequence?start=10&end=1000&query=<query model="genomic" view="Chromosome.sequence.residues"><constraint path="Chromosome" op="LOOKUP" value="4" extraValue="D. melanogaster"/></query>
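
Since the query parameter is raw XML, a client would presumably need to URL-encode it before sending. A minimal sketch using only the Python standard library follows; the http:// scheme and the sequence_url helper are assumptions for illustration, while the path, parameters and query XML are taken from the example above.

```python
from urllib.parse import urlencode

# Assumption: the scheme is http://; the example above gives only the host.
BASE = "http://www.flymine.org/query/service/sequence"

# Query XML copied from the example above.
QUERY_XML = (
    '<query model="genomic" view="Chromosome.sequence.residues">'
    '<constraint path="Chromosome" op="LOOKUP" value="4" '
    'extraValue="D. melanogaster"/></query>'
)


def sequence_url(start, end, query_xml=QUERY_XML, base=BASE):
    """Build the request URL, percent-encoding the XML query string."""
    params = {"start": start, "end": end, "query": query_xml}
    return base + "?" + urlencode(params)


print(sequence_url(10, 1000))
```
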
> 
> The result is a JSON object rather than FASTA, but if it's close to what you need then we can debug from there, as it's up-to-date in our codebase. There's still a chance that switching URIs could ease the memory issues, particularly since GenomicRegionSequenceExportServlet has been off our radar since 1.3.1.
> 
> Thanks,
> Josh
> 
> 
> On 08/05/15 18:01, Joe Carlson wrote:
>> Hi Josh,
>> 
>> Thanks for the quick response.
>> 
>> What I’d like to get is the chromosome sequence between two specified positions. In earlier code, this was provided by the URI
>> 
>> /service/regions/sequence?query={"regions":["chromosome\tstart\tend"],"organism":"short name"}
>> 
>> (we needed to tweak this a bit for our purposes, but this was the old endpoint in your code.) And it returned chromosome sequence in a fasta format.
>> 
>> In 1.4, I noticed that you changed this so that it returned the sequence of features - specified in the URI as an extra parameter such as genes, exons, introns, … - contained within these coordinates. This is the same as /service/regions/fasta. This isn’t quite what we want. I had tried to specify ‘chromosome’ as the feature type but that was rejected. I could not find another suitable endpoint.
>> 
>> There is a routine GenomicRegionFastaService. I don’t know whether this is currently enabled in any service call, or whether it would give me what I want.
>> 
>> The old code works for us, but the caching is causing heap out-of-memory errors. We’ve just recently determined that this was a cause of our restarts and are about to turn off caching. But if you know another way to get this information, let me know.
>> 
>> Thanks,
>> 
>> 
>> Joe
>> 
>> 
>> On May 8, 2015, at 7:57 AM, Josh Heimbach <josh at intermine.org> wrote:
>> 
>>> Hi Joe,
>>> 
>>> While adding web service documentation in intermine 1.3.1, the endpoint /service/regions/sequence was retired because it duplicated information found elsewhere. Much of the codebase has been refactored and improved since 1.3, so perhaps using a different servlet might solve the memory issue.
>>> 
>>> Could you send me an example request that you would make to /service/regions/sequence along with its parameters? I'll look for a suitable alternative web service that returns the same information.
>>> 
>>> Thanks,
>>> Josh
>>> 
>>> 
>>>> 
>>>> -------- Forwarded Message --------
>>>> Subject: [InterMine Dev] web app restarts
>>>> Date: Thu, 07 May 2015 17:28:06 -0700
>>>> From: Joe Carlson <jwcarlson at lbl.gov>
>>>> To: dev at intermine.org <dev at intermine.org>, David M. Goodstein <dmgoodstein at lbl.gov>
>>>> 
>>>> Hi Julie and gang
>>>> 
>>>> We have just deployed our latest phytozome build based on intermine 1.4.
>>>> This is our first public release using Hikari.
>>>> 
>>>> Our hopes were that going to hikari would solve some of the problems
>>>> we've been seeing with tomcat restarts. We've traded emails about this
>>>> in the past, where we see that we have to restart every couple of hours
>>>> when under load. (We run an 'are you alive' cron job every 3 minutes and
>>>> force a restart if we don't get a response.)
>>>> 
>>>> At the time I think we had attributed it to the postgres connection and
>>>> we were looking forward to the hikari pooling. It behaved well in internal
>>>> use, but now that we're public we continue to see the restarts.
>>>> 
>>>> I'm trying to do a little forensics to see what might be causing them.
>>>> I'm seeing "OutOfMemoryError: Java heap space", typically after a call
>>>> to retrieve the genomic sequence of a region
>>>> (service/regions/sequence). I had noticed that you had removed this
>>>> service in 1.4. I restored it since we're making use of it to deliver
>>>> sequence to our main web portal. Did you remove this because you had
>>>> seen it as being problematic?
>>>> 
>>>> At this point, I'm not absolutely sure this is the source of the
>>>> restarts but I'm very suspicious of
>>>> org.intermine.bio.web.export.GenomicRegionSequenceExporter. There is a
>>>> static map of entire chromosomes that is being stored. The substring is
>>>> retrieved by calling substring on elements of this map. This may work
>>>> for smaller mines but we have enough sequence in our database that I
>>>> suspect this is part of our problem.
>>>> 
>>>> Was this web service removed deliberately? Is there something to replace
>>>> it? As I recall, the other sequence retrieval services I found only
>>>> retrieved the sequence for specific features and not chromosome slices.
>>>> 
>>>> Thanks,
>>>> 
>>>> Joe Carlson
>>>> 
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev at intermine.org
>>>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
