[InterMine Dev] loading interpro.xml

HongKee Moon moon at mpi-cbg.de
Thu Dec 10 09:08:51 GMT 2015


Hi Joe,

We can look at this problem in two ways.
One is your locale setting, and the other is the Postgres client setting.

Firstly, most client applications depend on the user's environment settings.
So I am wondering what the locale setting is on your postgres/intermine machine.
Recently, I noticed that when LC_ALL=en_US.UTF-8 is not set, Java no longer uses UTF-8 encoding by default.
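
To illustrate the point, here is a minimal Python sketch (not InterMine's Java code; all function names are hypothetical). It shows that the byte 0xfd from the error message is a valid ISO-8859-1 character (ý) but not a valid UTF-8 sequence on its own, which is what one would expect if text is written out in a non-UTF-8 default encoding. The last helper assumes the usual POSIX precedence LC_ALL > LC_CTYPE > LANG to check whether the environment asks for UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def latin1_to_utf8(data: bytes) -> bytes:
    """Interpret the bytes as ISO-8859-1 text and re-encode them as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def locale_requests_utf8(env: dict) -> bool:
    """Check the effective character-type locale, using the POSIX precedence
    LC_ALL > LC_CTYPE > LANG (empty values are treated as unset)."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            # "en_US.UTF-8@modifier" -> "UTF-8"
            codeset = value.partition(".")[2].partition("@")[0]
            return codeset.lower().replace("-", "") == "utf8"
    return False


if __name__ == "__main__":
    import os

    bad = b"\xfd"  # the byte reported in the PSQLException
    print("valid UTF-8 as-is:", is_valid_utf8(bad))
    print("after transcoding:", latin1_to_utf8(bad))
    print("locale asks for UTF-8:", locale_requests_utf8(dict(os.environ)))
```

Running this in the same shell that starts the build should show whether that environment requests UTF-8; it does not inspect the JVM itself.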

The other option is to change the Postgres client setting.
In this case, please follow this link: http://www.postgresql.org/message-id/4B79EE13.1030906@tpf.co.jp

Hopefully, it will work for you.

Cheers,
HongKee

> On Dec 10, 2015, at 6:16 AM, Joe Carlson <jwcarlson at lbl.gov> wrote:
> 
> Hello,
> 
> Don’t you just hate it when code that you’ve been using for a year or so suddenly breaks?
> 
> I’ve just updated my version of interpro.xml (to 53.0) and am using your vanilla interpro data loader. I’m getting error messages because there are non-UTF8 characters:
>> 
>> [integrate] Caused by: java.sql.SQLException: Error writing to database, running statement COPY Attribute (name, intermine_value, itemId) FROM STDIN;
>> [integrate] , data size = 2682870
>> [integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
>> [integrate]     at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:461)
>> [integrate]     at java.lang.Thread.run(Thread.java:745)
>> [integrate] Caused by: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0xfd
>> [integrate]   Where: COPY attribute, line 42945
>> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
>> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:966)
>> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:828)
>> [integrate]     at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:59)
>> [integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
>> [integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
>> [integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)
>> [integrate]     ... 2 more
>>      [null] Exiting /global/u1/j/jcarlson/src/intermine/bio/sources/interpro/build.xml.
> 
> 
> The header of interpro.xml says it has encoding ISO-8859-1. I’ve looked at it and it does have some naughty characters in it.
> 
> My server encoding is UTF8. I’ve tried playing around with iconv to make the xml UTF8, but nothing has helped (so far). I’d rather not resort to hand edits.
> 
> Which version of interpro.xml do you load in flymine? Did you have this problem? Or, should I open a ticket?
> 
> Thanks,
> 
> joe


--
HongKee Moon
Software Engineer
Scientific Computing Facility

Max Planck Institute of Molecular Cell Biology and Genetics
Pfotenhauerstr. 108
01307 Dresden
Germany

fon: +49 351 210 2740
fax: +49 351 210 1689
www.mpi-cbg.de
