[InterMine Dev] PSQL exception

Julie Sullivan julie at flymine.org
Tue Dec 2 09:55:49 GMT 2014


Hi Aditi,

InterPro loads protein domains - their names and descriptions. We load 
the proteins and their associated protein domains from UniProt.

Here is where the description is set:

https://github.com/intermine/intermine/blob/beta/bio/sources/interpro/main/src/org/intermine/bio/dataconversion/InterProConverter.java#L187
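
As a rough illustration of the "chop off the end of the description" workaround mentioned further down in this thread, here is a minimal sketch. It is not the actual InterMine code: the class name, method name and the 2700-byte budget are hypothetical, and the only figure taken from the error message is the 2712-byte maximum. It trims a string so that its UTF-8 encoding fits in a byte budget without splitting a character:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of InterMine: trims text so that its UTF-8
// encoding fits within a byte budget without cutting a character in half.
final class DescriptionTruncator {

    // Assumed safety margin below the 2712-byte maximum in the error message.
    static final int ASSUMED_MAX_BYTES = 2700;

    static String truncateToBytes(String text, int maxBytes) {
        if (text == null
                || text.getBytes(StandardCharsets.UTF_8).length <= maxBytes) {
            return text; // already short enough, nothing to do
        }
        StringBuilder kept = new StringBuilder();
        int used = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            // Number of bytes this code point occupies in UTF-8.
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (used + size > maxBytes) {
                break; // the next character would exceed the budget
            }
            kept.appendCodePoint(cp);
            used += size;
            i += Character.charCount(cp);
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // Build a 3000-character ASCII description as sample input.
        StringBuilder longDescription = new StringBuilder();
        for (int i = 0; i < 3000; i++) {
            longDescription.append('x');
        }
        String shortened =
                truncateToBytes(longDescription.toString(), ASSUMED_MAX_BYTES);
        // Prints the byte length after truncation.
        System.out.println(shortened.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

In a converter, a call such as truncateToBytes(description, ASSUMED_MAX_BYTES) would sit just before the description attribute is set. The index row size reported by PostgreSQL is not simply the length of the text, so the right budget is a guess and should be checked against your own database.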

Julie

On 01/12/14 17:45, Tayal, Aditi wrote:
> Hello Julie,
>
> I am trying to load InterPro. The code mentioned below is already
> present in UniProt. Are UniProt and InterPro related? If the code
> mentioned below has to be added to the InterProConverter.java file, then
> where exactly does it fit in?
>
> Aditi
> On Nov 26, 2014, at 4:06 AM, Julie Sullivan <julie at flymine.org> wrote:
>
>> Which data source are you loading? InterPro? Which data file are you
>> loading?
>>
>> The protein domain description is longer than the index size for postgres:
>>
>> > Caused by: org.postgresql.util.PSQLException: ERROR: index row size 2864
>> > exceeds maximum 2712 for index "proteindomain__description_equals"
>>
>> I've seen this before; in the past we've just chopped off the end of
>> the description. See here:
>>
>> https://github.com/intermine/intermine/blob/beta/bio/sources/uniprot/main/src/org/intermine/bio/dataconversion/UniprotConverter.java#L486-L494
>>
>> Try that?
>>
>> I've made a ticket here:
>>
>> https://github.com/intermine/intermine/issues/846
>>
>> On 25/11/14 23:44, Tayal, Aditi wrote:
>>> Hi Julie,
>>>
>>> Are the PostgreSQL database parameters provided in the manual enough to
>>> support only one instance of InterMine? I have 3 instances running now
>>> and they seem to be giving the following error:
>>>
>>>
>>>
>>>         at org.intermine.dataloader.ObjectStoreDataLoader.process(ObjectStoreDataLoader.java:201)
>>>         at org.intermine.dataloader.ObjectStoreDataLoader.process(ObjectStoreDataLoader.java:60)
>>>         at org.intermine.dataloader.ObjectStoreDataLoaderTask.execute(ObjectStoreDataLoaderTask.java:128)
>>>         at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
>>>         at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:616)
>>>         at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
>>>         at org.apache.tools.ant.Task.perform(Task.java:348)
>>>         at org.apache.tools.ant.Target.execute(Target.java:435)
>>>         at org.apache.tools.ant.Target.performTasks(Target.java:456)
>>>         at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
>>>         at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
>>>         at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
>>>         at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:441)
>>>         at org.intermine.task.Integrate.performAction(Integrate.java:223)
>>>         at org.intermine.task.Integrate.performAction(Integrate.java:136)
>>>         at org.intermine.task.Integrate.execute(Integrate.java:127)
>>>         at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
>>>         at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:616)
>>>         at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
>>>         at org.apache.tools.ant.Task.perform(Task.java:348)
>>>         at org.apache.tools.ant.Target.execute(Target.java:435)
>>>         at org.apache.tools.ant.Target.performTasks(Target.java:456)
>>>         at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
>>>         at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
>>>         at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
>>>         at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
>>>         at org.apache.tools.ant.Main.runBuild(Main.java:851)
>>>         at org.apache.tools.ant.Main.startAnt(Main.java:235)
>>>         at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
>>>         at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
>>> Caused by: java.sql.SQLException: Error writing to database, running statement COPY ProteinDomain (class, secondaryIdentifier, symbol, shortName, type, primaryIdentifier, id, name, description, organismId) FROM STDIN BINARY, data size = 23584616
>>>         at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
>>>         at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:450)
>>>         at java.lang.Thread.run(Thread.java:679)
>>> Caused by: org.postgresql.util.PSQLException: ERROR: index row size 2864 exceeds maximum 2712 for index "proteindomain__description_equals"
>>>   Hint: Values larger than 1/3 of a buffer page cannot be indexed. Consider a function index of an MD5 hash of the value, or use full text indexing.
>>>   Where: COPY proteindomain, line 413
>>>         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2079)
>>>         at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:941)
>>>         at org.postgresql.core.v3.QueryExecutorImpl.writeToCopy(QueryExecutorImpl.java:848)
>>>         at org.postgresql.core.v3.CopyInImpl.writeToCopy(CopyInImpl.java:53)
>>>         at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
>>>         at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:163)
>>>         at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)
>>>
>>>
>>> Any advice will be appreciated.
>>>
>>> Thank you,
>>>
>>>
>>>
>>>
>>> On Nov 24, 2014, at 9:40 AM, Tayal, Aditi <tayala at missouri.edu> wrote:
>>>
>>>> Thank you so much!
>>>>
>>>> On Nov 21, 2014, at 11:30 PM, Tayal, Aditi <tayala at missouri.edu> wrote:
>>>>
>>>>> Hi again,
>>>>>
>>>>> Does the Ensembl Compara parser read multiple files for the same
>>>>> organism? I ask because I am trying to load multiple (>7) orthologues
>>>>> for each species.
>>>>> Does Ensembl Compara allow more than 2 or 3 sets of orthologues at a
>>>>> time (i.e. in the same file)?
>>>>>
>>>>> Thank you
>>>>>
>>>>>
>>>>> On Nov 21, 2014, at 11:27 AM, Julie Sullivan <julie at flymine.org> wrote:
>>>>>
>>>>>> I can't think of anything! Maybe just check for silly mistakes; the
>>>>>> build system is case-sensitive and fragile when it comes to
>>>>>> whitespace, punctuation, etc.
>>>>>>
>>>>>> On 21/11/14 17:12, Tayal, Aditi wrote:
>>>>>>> Dear Julie,
>>>>>>> I am using the same file as you are. I am loading Pogonomyrmex
>>>>>>> Solenopsis with Harpegnathos saltator.
>>>>>>> Pogonomyrmex Solenopsis loads but Harpegnathos saltator does not! I
>>>>>>> tried using ant clean. I tested it with all the versions of the
>>>>>>> files from OrthoDB.
>>>>>>>
>>>>>>> Is there anything else that can be done?
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Aditi Tayal
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Begin forwarded message:
>>>>>>>
>>>>>>>> From: Julie Sullivan <julie at flymine.org>
>>>>>>>> Subject: Re: Fwd: Orthodb configuration
>>>>>>>> Date: November 21, 2014 at 10:35:21 AM CST
>>>>>>>> To: "Tayal, Aditi" <tayala at missouri.edu>
>>>>>>>>
>>>>>>>> Oh yes! It's possible they've changed their file format. I tested
>>>>>>>> using this file:
>>>>>>>>
>>>>>>>> OrthoDB7_ALL_METAZOA_tabtext
>>>>>>>>
>>>>>>>> Which version are you using?
>>>>>>>>
>>>>>>>> On 21/11/14 16:31, Tayal, Aditi wrote:
>>>>>>>>> Hi Julie,
>>>>>>>>>
>>>>>>>>> I still cannot get it to work. Do you think it is dependent on the
>>>>>>>>> version of the file from orthodb? Orthodb has D5 D6 and D7.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>>
>>>>>>>>> Begin forwarded message:
>>>>>>>>>
>>>>>>>>>> From: Julie Sullivan <julie at flymine.org>
>>>>>>>>>> Subject: Re: Orthodb configuration
>>>>>>>>>> Date: November 21, 2014 at 5:39:42 AM CST
>>>>>>>>>> To: "Tayal, Aditi" <tayala at missouri.edu>
>>>>>>>>>> Cc: "dev at intermine.org" <dev at intermine.org>
>>>>>>>>>>
>>>>>>>>>> Hi Aditi
>>>>>>>>>>
>>>>>>>>>> Again, that worked for me. I added those lines of config and
>>>>>>>>>> successfully created 15464 genes (and associated homologues).
>>>>>>>>>>
>>>>>>>>>> Maybe do what you did for the harvester ant: run "ant clean" and
>>>>>>>>>> try again?
>>>>>>>>>>
>>>>>>>>>> Julie
>>>>>>>>>>
>>>>>>>>>> On 20/11/14 21:45, Tayal, Aditi wrote:
>>>>>>>>>>> Hi Julie,
>>>>>>>>>>>
>>>>>>>>>>> I have been trying to configure OrthoDB for the following species,
>>>>>>>>>>> but my database is empty. I did the same thing you advised for
>>>>>>>>>>> Pogonomyrmex barbatus. Please refer to the email below.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> taxon.610380.genus= Harpegnathos
>>>>>>>>>>> taxon.610380.species= saltator
>>>>>>>>>>> Thank you
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Nov 18, 2014, at 11:01 AM, julie at flymine.org wrote:
>>>>>>>>>>>
>>>>>>>>>>>> That worked for me. I added those lines, and successfully created
>>>>>>>>>>>> genes.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe run ant clean first? You may still be using the old compiled
>>>>>>>>>>>> files.
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Julie,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I added the code you suggested. My database is still empty for
>>>>>>>>>>>>> Pogonomyrmex barbatus. Are there any other modifications I might
>>>>>>>>>>>>> have to make?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Nov 18, 2014, at 3:54 AM, Julie Sullivan <julie at flymine.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does the parser for OrthoDB data automatically update its
>>>>>>>>>>>>>>> lookup table depending on the organism?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, we use a file we generated to look up organism names. If you
>>>>>>>>>>>>>> add your organism to it, it should work fine. Here is what worked
>>>>>>>>>>>>>> for me:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/intermine/intermine/commit/330309caac713ceb950f5e6efa6a4fdd53ced1ac
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've made a ticket to do this automatically:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/intermine/intermine/issues/814
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 17/11/14 20:04, Tayal, Aditi wrote:
>>>>>>>>>>>>>>> Hi Julie,
>>>>>>>>>>>>>>> I am trying to load OrthoDB data for Pogonomyrmex barbatus
>>>>>>>>>>>>>>> (Taxonomy ID: 144034) in InterMine. The gene ID for this organism
>>>>>>>>>>>>>>> is present in the GFF. However, the data from the OrthoDB file is
>>>>>>>>>>>>>>> not loading and the database is empty. I tried doing this with
>>>>>>>>>>>>>>> and without the ID resolver.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does the parser for OrthoDB data automatically update its lookup
>>>>>>>>>>>>>>> table depending on the organism? Or are there some modifications
>>>>>>>>>>>>>>> we have to make to the parser / gene info table?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Following is a snippet of the OrthoDB file and the GFF:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OrthoDB file (OrthoDB7_ALL_METAZOA_tabtext)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 286EOG72K577PB22745-PA*PB22745*Pogonomyrmex
>>>>>>>>>>>>>>> barbatus9HYMENULLNULLIPR000608 IPR016135
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gff file
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> scf71pbar1.7gene24356122436860.-.*ID=PB22745;Name=PB22745*;Alias=PB22745;
>>>>>>>>>>>>>>> scf71pbar1.7mRNA24356122436860.-.ID=PB22745-RA;Name=PB22745-RA;Alias=PB22745-RA
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Orthodb_config file
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 144034.geneid=primaryIdentifier
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Orthodb_keys file
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Gene.key_primaryidentifier=primaryIdentifier
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Gff keys file
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Gene.key_primaryidentifier=primaryIdentifier
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you
>


