[InterMine Dev] UniProt Parser problems.

Julie Sullivan julie at flymine.org
Wed Sep 14 11:39:26 BST 2011



On 14/09/11 11:09, James Blackshaw wrote:
> Hi,
>
> We would like to know if the UniProt parser could be changed to allow loading of
> fragments as an option, as we'd like to include them in MitoMiner and would
> rather not have to hack the parser to do it.

Yes, I'll add a parameter to the config.
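
For illustration only, a config-gated fragment filter could look roughly like
the sketch below. The flag name `load_fragments` and the `is_fragment` field
are hypothetical; they are not taken from the actual parser or its config.

```python
# Sketch only: `load_fragments` and `is_fragment` are hypothetical names,
# not the real InterMine parser option or data model.
def select_entries(entries, load_fragments=False):
    """Return the entries to load; fragments are skipped unless enabled."""
    return [e for e in entries if load_fragments or not e["is_fragment"]]


entries = [
    {"accession": "A00001", "is_fragment": False},  # made-up accessions
    {"accession": "A00002", "is_fragment": True},
]

# Default behaviour: fragments are dropped.
default_load = select_entries(entries)
# With the option switched on: everything is kept.
full_load = select_entries(entries, load_fragments=True)
```

Keeping the default at "off" would leave existing builds unchanged, so only
mines that set the option would start loading fragments.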

> Also, we've noticed a problem with the parser when it runs across entries that
> used to be covered by the same accession number. For example P06748 and P10276
> both used to be known as Q13440, but are now recorded as two separate proteins.
> As the UniProt entries both contain Q13440 as a secondary accession, when the
> parser encounters it again it considers the record to be a duplicate and does
> not populate any fields. It's only a handful of records in our dataset, which is
> why we didn't notice it until now. The problem is that if we take out the
> checking for duplicates, of course we wind up with a lot of duplicate entries.
> Could the parser be changed to have an option to only check against the first
> UniProt accession in a file?

Yes, will fix. The current duplicate check isn't very clever; I don't see a
problem with secondary accessions having duplicate values.
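
As a rough sketch of the idea (not the parser's actual code), duplicate
detection keyed on the primary accession only might look like this. The
`Entry` class and `dedupe_by_primary` function are made up for illustration;
the accessions are the ones from the example above.

```python
from dataclasses import dataclass, field


# Hypothetical record type: one primary accession plus any secondary ones.
@dataclass
class Entry:
    primary: str
    secondary: list = field(default_factory=list)


def dedupe_by_primary(entries):
    """Keep the first entry seen for each primary accession.

    Secondary accessions are ignored, so two proteins that share an old
    accession are both kept.
    """
    seen = set()
    kept = []
    for entry in entries:
        if entry.primary in seen:
            continue  # genuine duplicate: same primary accession
        seen.add(entry.primary)
        kept.append(entry)
    return kept


entries = [
    Entry("P06748", ["Q13440"]),
    Entry("P10276", ["Q13440"]),  # shares a secondary accession, still kept
    Entry("P06748", ["Q13440"]),  # repeated primary accession, dropped
]

kept = dedupe_by_primary(entries)
print([e.primary for e in kept])  # ['P06748', 'P10276']
```

With this approach P06748 and P10276 are both populated even though each
lists Q13440, while a record whose primary accession repeats is still
skipped.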

> Regards,
> James


