[InterMine Dev] Gff3 loader custom Gff3 Handler questions

Fengyuan Hu fh293 at cam.ac.uk
Tue Aug 27 17:02:39 BST 2013


Hi Pushkala,

Sorry for the delay. I think I know where it goes wrong in the converter, 
but I need some time to fix it. Please bear with me.

Fengyuan

On 12/08/13 19:06, Jayaraman, Pushkala wrote:
>
> Hello,
>
> I have a couple questions regarding the Gff3 handler.
>
> I have a gff3 file that looks like the one below, and a 
> gff_config.properties file with the following attributes:
>
> 10116.terms=gene, mRNA, Exon, CDS, ThreePrimeUTR, FivePrimeUTR
>
> 10116.attributes.ID=primaryIdentifier
>
> 10116.attributes.ID=secondaryIdentifier
>
> 10116.attributes.Note=description
>
> 10116.attributes.Dbxref.EntrezGene=ncbiGeneNumber
>
> 10116.attributes.Dbxref.EnsemblGenes=synonym
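
One thing worth checking in the list above: the key 10116.attributes.ID appears twice. If the file is read with standard java.util.Properties semantics (an assumption about how the loader reads it), only the last value for a repeated key is kept. This may be unrelated to the missing description. A minimal Python sketch for spotting repeated keys; find_duplicate_keys is a made-up name for illustration, and it handles only simple key=value lines:

```python
from collections import defaultdict

def find_duplicate_keys(text):
    """Return {key: [values...]} for keys that occur more than once.

    Handles only simple 'key=value' lines. Blank lines and comment
    lines (starting with # or !) are skipped; line continuations and
    ':' separators are not supported in this sketch.
    """
    seen = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        seen[key.strip()].append(value.strip())
    return {k: v for k, v in seen.items() if len(v) > 1}

# The configuration quoted in this message.
config = """\
10116.terms=gene, mRNA, Exon, CDS, ThreePrimeUTR, FivePrimeUTR
10116.attributes.ID=primaryIdentifier
10116.attributes.ID=secondaryIdentifier
10116.attributes.Note=description
10116.attributes.Dbxref.EntrezGene=ncbiGeneNumber
10116.attributes.Dbxref.EnsemblGenes=synonym
"""

duplicates = find_duplicate_keys(config)
print(duplicates)
```
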
>
> I'm having problems loading this data: the description doesn't get 
> loaded.
>
> From what I understood in the docs, adding the required fields to the 
> gff_config.properties file allows the gff3 parser to extract those 
> values and assign them to the matching fields in the gene model. Am I 
> using the gff_config.properties file wrongly?
>
> Or am I supposed to write a custom gff3 parser irrespective of what I 
> have in the gff_config.properties file?
>
> 10      RGD     gene    4816612 4817340 .       +       .       Name=Tnp2;Alias=RGD3885,3885,transition protein 2;ID=RGD:3885;Note=ENCODES a protein that exhibits zinc ion binding AND  INVOLVED IN acrosome reaction (ortholog) AND  binding of sperm to zona pellucida (ortholog) AND  penetration of zona pellucida (ortholog) AND  FOUND IN nucleus AND  INTERACTS WITH 17alpha-ethynylestradiol AND ammonium chloride AND  cadmium dichloride;fullName=transition protein 2;Dbxref=EntrezGene:24840,UniGene:10430,IMAGE_CLONE:7131008,MGC_CLONE:BC078849,EnsemblGenes:ENSRNOG00000002566,UniProt:P11101,UniProt:B3LF38,EnsemblGenes:ENSRNOG00000002566;
>
> 10      RGD     gene    56399721 56411150        .       +       .       Name=Tp53;Alias=RGD3889,3889,tumor protein p53;ID=RGD:3889;Note=ENCODES a protein that exhibits protein C-terminus binding AND sequence-specific DNA binding AND  ubiquitin protein ligase binding AND  INVOLVED IN aging AND  cellular response to organonitrogen compound AND  negative regulation of DNA biosynthetic process AND  PARTICIPATES IN altered p53 signaling pathway AND  endometrial cancer pathway AND  non-small cell lung cancer pathway AND  ASSOCIATED WITH Dementia  Vascular AND Diabetic Nephropathies AND  Ischemia AND  FOUND IN chromatin AND  cytoplasm AND  cytosol AND  INTERACTS WITH (-)-citrinin AND  (-)-epigallocatechin 3-gallate AND  (R)-lipoic acid;fullName=tumor protein p53;Dbxref=PharmGKB:PA36679,EntrezGene:24842,UniGene:54443,EnsemblGenes:ENSRNOG00000010756,KEGGPathway:04010,KEGGPathway:04110,KEGGPathway:04115,KEGGPathway:04210,KEGGPathway:04310,KEGGPathway:04722,KEGGPathway:05014,KEGGPathway:05016,KEGGPathway:05200,KEGGPathway:05210,KEGGPathway:05212,KEGGPathway:05213,KEGGPathway:05214,KEGGPathway:05215,KEGGPathway:05216,KEGGPathway:05217,KEGGPathway:05218,KEGGPathway:05219,KEGGPathway:05220,KEGGPathway:05222,KEGGPathway:05223,IMAGE_CLONE:7193583,MGC_CLONE:BC081788,IMAGE_CLONE:7384467,MGC_CLONE:BC098663,UniProt:P10361,EnsemblGenes:ENSRNOG00000010756,KEGGPathway:05160,KEGGPathway:05162,KEGGPathway:05166,KEGGPathway:05168,KEGGPathway:04151,KEGGPathway:05161,KEGGPathway:05203,KEGGPathway:05202,KEGGPathway:05169,KEGGPathway:05205,UniProt:Q9JLD9;
>
> 10      RGD     mRNA    4816612 4817340 .       +       .       Name=NM_017057;Parent=RGD:3885;gene=Tnp2;RefSeqStatus=PROVISIONAL;Alias=RGD:2752358;ID=mRNARGD2752358_t00;isNon-Coding=N;
>
> 10      RGD     mRNA    56399721 56411150        .       +       .       Name=NM_030989;Parent=RGD:3889;gene=Tp53;RefSeqStatus=REVIEWED;Alias=RGD:2752318;ID=mRNARGD2752318_t00;isNon-Coding=N;
>
> Pushkala Jayaraman
>
> Programmer/Analyst - Rat Genome Database
>
> Human and Molecular Genetics Center
>
> Medical College of Wisconsin
>
> 414-955-2229
>
> http://rgd.mcw.edu
>
>
>
> _______________________________________________
> dev mailing list
> dev at intermine.org
> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
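
To check that the Note and Dbxref values are present in the input independently of the loader, the ninth GFF3 column can be split by hand. This is a minimal sketch, assuming each record sits on one tab-separated line. parse_attributes and split_dbxref are hypothetical helper names, not InterMine API, and the Note and Dbxref text is abbreviated from the first gene record in the quoted message:

```python
from urllib.parse import unquote

def parse_attributes(column9):
    """Split a GFF3 attribute column into {tag: [value, ...]}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field:
            continue  # tolerate the trailing ';'
        tag, _, value = field.partition("=")
        # Commas separate multiple values; percent-escapes are decoded
        # after splitting so an escaped comma stays inside its value.
        attrs[tag] = [unquote(v) for v in value.split(",")]
    return attrs

def split_dbxref(values):
    """Group 'DB:accession' entries by database name, dropping repeats."""
    grouped = {}
    for entry in values:
        db, _, accession = entry.partition(":")
        ids = grouped.setdefault(db, [])
        if accession not in ids:
            ids.append(accession)
    return grouped

# Abbreviated version of the first gene record, rebuilt with real tabs.
line = "\t".join([
    "10", "RGD", "gene", "4816612", "4817340", ".", "+", ".",
    "Name=Tnp2;ID=RGD:3885;Note=ENCODES a protein that exhibits zinc ion binding;"
    "Dbxref=EntrezGene:24840,UniGene:10430,EnsemblGenes:ENSRNOG00000002566,"
    "UniProt:P11101,EnsemblGenes:ENSRNOG00000002566;",
])

attrs = parse_attributes(line.split("\t")[8])
xrefs = split_dbxref(attrs["Dbxref"])
print(attrs["Note"])
print(xrefs["EntrezGene"], xrefs["EnsemblGenes"])
```

Because an unescaped comma is treated as a value separator, a Note that contains a literal comma comes back as several values in this sketch.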
