[InterMine Dev] Go-annotation parser

Julie Sullivan julie at flymine.org
Mon Jun 25 10:32:44 BST 2012


James

The GO loader assigns GO to proteins or genes.  The GO post-process (run as part 
of the post-process `do-sources`) assigns protein GO annotation to the genes 
associated with that protein.

The only thing I can think of is that your data files changed.  I assume you are 
loading updated files?  Can you send them to me?  It's possible the GO is being 
assigned to genes directly and never proteins.

By default, GO is assigned to genes.  You can change this to assign to proteins 
instead by editing the GO config file here:

	bio/sources/go-annotation/main/resources/go-annotation_config.properties

e.g. for human GO:

	9606.typeAnnotated=protein
	9606.identifier=primaryAccession

I'm assuming you have GAF 2.0 files (the current version).  If you have GAF 1.0, 
you can configure the type of object created to be taken from column 12 (DB 
Object Type):

	http://www.geneontology.org/GO.format.gaf-1_0.shtml#db_object_type
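
If you want to check what your files actually annotate before sending them, 
a rough sketch like the one below tallies the values in column 12.  It only 
assumes the standard GAF layout (tab-separated, comment lines starting with 
"!").  The function name and the sample rows are made up for illustration 
and are not part of InterMine:

```python
import collections


def count_object_types(lines):
    """Tally GAF column 12 (DB Object Type), skipping '!' comments and blank lines."""
    counts = collections.Counter()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # malformed row, not enough columns
        counts[cols[11] or "(empty)"] += 1
    return counts


def make_row(object_type):
    """Build a hypothetical 15-column GAF row with only columns 1 and 12 filled in."""
    cols = [""] * 15
    cols[0] = "UniProtKB"
    cols[11] = object_type
    return "\t".join(cols) + "\n"


# Made-up sample data, standing in for a real annotation file.
SAMPLE = [
    "!gaf-version: 1.0\n",
    make_row("protein"),
    make_row("protein"),
    make_row("gene"),
]

if __name__ == "__main__":
    for object_type, n in count_object_types(SAMPLE).most_common():
        print(object_type, n)
    # For a real file (the path is yours to fill in):
    #     with open(path) as handle:
    #         print(count_object_types(handle))
```

If every row comes back as "gene", that would fit with the protein tables 
being empty.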

Does that help you at all?

~~~~

Also, you can load GO from UniProt, but the data is necessarily going to be less 
up-to-date than the GO files.  The parser assigns the GO directly to the genes 
and not the proteins, and for some reason UniProt doesn't include the 
publication with the evidence code.

Here's a list of everything you can do with the UniProt source:

	http://intermine.org/wiki/UniProt#a2.2project.xml

(Two of those options were created for you!)

Thanks
Julie

On 22/06/12 13:16, James Blackshaw wrote:
> We have a separate go-annotation source. We didn't know it was possible to load
> from uniprot.
>
> -James
>
>
> On 22/06/2012 12:12, julie at flymine.org wrote:
>> Sorry for the delay James! I'm not in the office this week so I've fallen
>> a bit behind.
>>
>> Do you load GO from UniProt or do you have a separate GO source?
>>
>>> This is currently the only problem holding up my re-release. I don't
>>> think it's connected to the new uniprot format as we are using a version
>>> from last year at the moment. The go_annotation and all_go_annotation
>>> tables are filled, but the respective go_annotationprotein tables are
>>> empty.
>>>
>>> -James
>>>
>>> On 15/06/2012 17:06, James Blackshaw wrote:
>>>> Hi,
>>>> for some reason the Go-annotation parser isn't always creating links
>>>> from protein->goannotation. It worked the last time I ran an
>>>> integration once I'd run summarise-objectstore in postprocess, but not
>>>> this time. I haven't changed the go-annotation source at all. Are there
>>>> any other factors I'm not considering here?
>>>>
>>>> -James
>>>
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at intermine.org
>>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
>>>
>>>
>
>
>


