[InterMine Dev] Gff3 loader custom Gff3 Handler questions

Fengyuan Hu fh293 at cam.ac.uk
Wed Aug 28 11:58:46 BST 2013


Hi Pushkala,

I have just pushed a patch; please take a look here:
https://github.com/intermine/intermine/commit/08a07641b14c2b1b759bfb5cb0d3e5cde626e777

I don't have ncbiGeneNumber in my model, so could you please test it on 
your side?

Cheers
Fengyuan

On 27/08/13 17:02, Fengyuan Hu wrote:
> Hi Pushkala,
>
> Sorry for the delay. I think I know where it goes wrong in the converter, 
> but I need some time to fix it. Please bear with me.
>
> Fengyuan
>
> On 12/08/13 19:06, Jayaraman, Pushkala wrote:
>>
>> Hello,
>>
>> I have a couple questions regarding the Gff3 handler.
>>
>> I have a GFF3 file that looks like the sample below, and my 
>> gff_config.properties file has the following entries:
>>
>> 10116.terms=gene, mRNA, Exon, CDS, ThreePrimeUTR, FivePrimeUTR
>>
>> 10116.attributes.ID=primaryIdentifier
>>
>> 10116.attributes.ID=secondaryIdentifier
>>
>> 10116.attributes.Note=description
>>
>> 10116.attributes.Dbxref.EntrezGene=ncbiGeneNumber
>>
>> 10116.attributes.Dbxref.EnsemblGenes=synonym
>>
>> I'm having problems loading this data: the description does not get
>> loaded.
>>
>> From what I understood in the docs, adding the required fields to
>> the gff_config.properties file allows the GFF3 parser to extract those
>> values and assign them to the corresponding fields in the gene model. Am
>> I using the gff_config.properties file incorrectly?
>>
>> Or am I supposed to write a custom GFF3 parser regardless of what I
>> have in the gff_config.properties file?
>>
>> 10      RGD     gene    4816612 4817340 .       +       . Name=Tnp2;Alias=RGD3885,3885,transition protein 2;ID=RGD:3885;Note=ENCODES a protein that exhibits zinc ion binding AND  INVOLVED IN acrosome reaction (ortholog) AND  binding of sperm to zona pellucida (ortholog) AND  penetration of zona pellucida (ortholog) AND  FOUND IN nucleus AND  INTERACTS WITH 17alpha-ethynylestradiol AND ammonium chloride AND  cadmium dichloride;fullName=transition protein 2;Dbxref=EntrezGene:24840,UniGene:10430,IMAGE_CLONE:7131008,MGC_CLONE:BC078849,EnsemblGenes:ENSRNOG00000002566,UniProt:P11101,UniProt:B3LF38,EnsemblGenes:ENSRNOG00000002566;
>>
>> 10      RGD     gene    56399721 56411150        .       +       . Name=Tp53;Alias=RGD3889,3889,tumor protein p53;ID=RGD:3889;Note=ENCODES a protein that exhibits protein C-terminus binding AND sequence-specific DNA binding AND  ubiquitin protein ligase binding AND  INVOLVED IN aging AND  cellular response to organonitrogen compound AND  negative regulation of DNA biosynthetic process AND  PARTICIPATES IN altered p53 signaling pathway AND  endometrial cancer pathway AND  non-small cell lung cancer pathway AND  ASSOCIATED WITH Dementia  Vascular AND Diabetic Nephropathies AND  Ischemia AND  FOUND IN chromatin AND  cytoplasm AND  cytosol AND  INTERACTS WITH (-)-citrinin AND  (-)-epigallocatechin 3-gallate AND  (R)-lipoic acid;fullName=tumor protein p53;Dbxref=PharmGKB:PA36679,EntrezGene:24842,UniGene:54443,EnsemblGenes:ENSRNOG00000010756,KEGGPathway:04010,KEGGPathway:04110,KEGGPathway:04115,KEGGPathway:04210,KEGGPathway:04310,KEGGPathway:04722,KEGGPathway:05014,KEGGPathway:05016,KEGGPathway:05200,KEGGPathway:05210,KEGGPathway:05212,KEGGPathway:05213,KEGGPathway:05214,KEGGPathway:05215,KEGGPathway:05216,KEGGPathway:05217,KEGGPathway:05218,KEGGPathway:05219,KEGGPathway:05220,KEGGPathway:05222,KEGGPathway:05223,IMAGE_CLONE:7193583,MGC_CLONE:BC081788,IMAGE_CLONE:7384467,MGC_CLONE:BC098663,UniProt:P10361,EnsemblGenes:ENSRNOG00000010756,KEGGPathway:05160,KEGGPathway:05162,KEGGPathway:05166,KEGGPathway:05168,KEGGPathway:04151,KEGGPathway:05161,KEGGPathway:05203,KEGGPathway:05202,KEGGPathway:05169,KEGGPathway:05205,UniProt:Q9JLD9;
>>
>> 10      RGD     mRNA    4816612 4817340 .       +       . Name=NM_017057;Parent=RGD:3885;gene=Tnp2;RefSeqStatus=PROVISIONAL;Alias=RGD:2752358;ID=mRNARGD2752358_t00;isNon-Coding=N;
>>
>> 10      RGD     mRNA    56399721 56411150        .       +       . Name=NM_030989;Parent=RGD:3889;gene=Tp53;RefSeqStatus=REVIEWED;Alias=RGD:2752318;ID=mRNARGD2752318_t00;isNon-Coding=N;
>>
>> Pushkala Jayaraman
>>
>> Programmer/Analyst - Rat Genome Database
>>
>> Human and Molecular Genetics Center
>>
>> Medical College of Wisconsin
>>
>> 414-955-2229
>>
>> http://rgd.mcw.edu
>>
>>
>>
>> _______________________________________________
>> dev mailing list
>> dev at intermine.org
>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
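
For anyone testing the patch, here is a minimal Python sketch of what the quoted gff_config.properties entries appear intended to extract from a gene record. It is not InterMine's converter code. The function names, the abridged sample attribute column, and the "last entry wins" handling of the repeated 10116.attributes.ID key are assumptions made for illustration. java.util.Properties does keep only the last value for a repeated key, but this thread does not confirm that the converter reads the file that way.

```python
# Illustrative sketch only: hypothetical helpers, not InterMine's API.

CONFIG = """\
10116.attributes.ID=primaryIdentifier
10116.attributes.ID=secondaryIdentifier
10116.attributes.Note=description
10116.attributes.Dbxref.EntrezGene=ncbiGeneNumber
10116.attributes.Dbxref.EnsemblGenes=synonym
"""

# Abridged from the first gene record in the thread (Note and Dbxref shortened).
SAMPLE_ATTRIBUTES = (
    "Name=Tnp2;ID=RGD:3885;"
    "Note=ENCODES a protein that exhibits zinc ion binding;"
    "Dbxref=EntrezGene:24840,UniGene:10430,"
    "EnsemblGenes:ENSRNOG00000002566,UniProt:P11101;"
)


def parse_config(text, taxon_id):
    """Collect '<taxon>.attributes.<Tag>[.<Db>]=<field>' entries for one taxon."""
    prefix = f"{taxon_id}.attributes."
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, field = (part.strip() for part in line.split("=", 1))
        if key.startswith(prefix):
            # Assumption: a repeated key overwrites the earlier entry.
            mapping[key[len(prefix):]] = field
    return mapping


def parse_attributes(column9):
    """Split a GFF3 attribute column into {tag: raw value string}."""
    attrs = {}
    for pair in column9.strip().strip(";").split(";"):
        if "=" in pair:
            tag, value = pair.split("=", 1)
            attrs[tag.strip()] = value.strip()
    return attrs


def apply_mapping(mapping, attrs):
    """Return {model field: [values]} for the configured attributes."""
    fields = {}
    for key, field in mapping.items():
        tag, _, db = key.partition(".")
        if tag not in attrs:
            continue
        if db:
            # Dbxref-style entry: keep only the ids of values shaped '<db>:<id>'.
            values = [v.split(":", 1)[1] for v in attrs[tag].split(",")
                      if v.startswith(db + ":")]
        else:
            values = [attrs[tag]]
        if values:
            fields.setdefault(field, []).extend(values)
    return fields


if __name__ == "__main__":
    mapping = parse_config(CONFIG, "10116")
    for field, values in apply_mapping(
            mapping, parse_attributes(SAMPLE_ATTRIBUTES)).items():
        print(field, values)
```

Comparing this output with what ends up in the database after a load is one way to see which of the configured fields (description, ncbiGeneNumber, synonym) the converter is actually filling in. Under the last-entry-wins assumption, only secondaryIdentifier receives the ID value, so the repeated ID key may be worth checking as well.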
