[InterMine Dev] loading interpro.xml

Joe Carlson jwcarlson at lbl.gov
Thu Dec 10 05:16:53 GMT 2015


Don’t you just hate it when code that you’ve been using for a year or so suddenly breaks?

I’ve just updated my version of interpro.xml (to 53.0) and am using your vanilla interpro data loader. I’m getting error messages because there are non-UTF8 characters:
> [integrate] Caused by: java.sql.SQLException: Error writing to database, running statement COPY Attribute (name, intermine_value, itemId) FROM STDIN;
> [integrate] , data size = 2682870
> [integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
> [integrate]     at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:461)
> [integrate]     at java.lang.Thread.run(Thread.java:745)
> [integrate] Caused by: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0xfd
> [integrate]   Where: COPY attribute, line 42945
> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:966)
> [integrate]     at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:828)
> [integrate]     at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:59)
> [integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
> [integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
> [integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)
> [integrate]     ... 2 more
>      [null] Exiting /global/u1/j/jcarlson/src/intermine/bio/sources/interpro/build.xml.

The header of interpro.xml says it has encoding ISO-8859-1. I’ve looked at it and it does have some naughty characters in it.
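One way to see exactly which bytes are involved is to scan the file for anything outside the ASCII range. The following is only a minimal sketch: `find_non_ascii` is a hypothetical helper name, not part of InterMine, and it reports raw byte positions without interpreting them in any encoding.

```python
def find_non_ascii(lines):
    """Yield (line_number, column, byte_value) for every byte >= 0x80.

    `lines` is any iterable of bytes objects, for example a file opened
    in binary mode, so a large XML file can be scanned line by line.
    """
    for line_no, line in enumerate(lines, start=1):
        if line.isascii():
            # Fast path: most lines contain only ASCII bytes.
            continue
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                yield line_no, col, byte
```

To run it against the real file, open interpro.xml with `open(path, "rb")` and pass the file object to `find_non_ascii`; formatting each byte with `f"0x{byte:02x}"` makes it easy to look for the `0xfd` named in the PostgreSQL error.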

My server encoding is UTF8. I’ve tried playing around with iconv to convert the XML to UTF-8, but nothing has helped so far. I’d rather not resort to hand edits.
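For reference, iconv changes the bytes but leaves the `encoding="ISO-8859-1"` declaration in the XML prolog untouched, so the declaration and the content can end up disagreeing. The sketch below does both steps together. It assumes the file really is ISO-8859-1 as declared, `latin1_xml_to_utf8` is a hypothetical helper name, and whether this alone resolves the loader error is not established here.

```python
import re

# Matches the encoding attribute inside the XML declaration, e.g.
# <?xml version="1.0" encoding="ISO-8859-1"?>
_ENCODING_DECL = re.compile(r'(<\?xml[^>]*?encoding=["\'])[^"\']+(["\'])')


def latin1_xml_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 XML bytes as UTF-8 and update the declaration."""
    # Every byte value is valid ISO-8859-1, so this decode cannot fail.
    text = raw.decode("iso-8859-1")
    # Rewrite only the first declaration so it matches the new bytes.
    text = _ENCODING_DECL.sub(r"\g<1>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")
```

The function works on an in-memory bytes object to keep the example short; for a file the size of interpro.xml, the same decode and encode steps could be applied in chunks, with the declaration rewritten in the first chunk only.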

Which version of interpro.xml do you load in FlyMine? Did you have this problem? Or should I open a ticket?

