[InterMine Dev] Items xml specification suggestion

Richard Smith richard at flymine.org
Thu Sep 15 16:13:11 BST 2011


There's some inconsistency between the Java API and use of Perl/XML.

An empty string can be a valid field value: a particular source may be
given priority for a field precisely so that its empty string overwrites
any value provided by other sources (admittedly unlikely).

The Java code checks for an empty string unless a specific method is
used to force the empty string value, but there is no equivalent way
to force one when writing XML directly.
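
To make the asymmetry concrete, here is a minimal sketch in Python of the
behaviour described above. It is a model only: the class and method names
are hypothetical stand-ins, not the InterMine Java API, and it assumes the
default check rejects an empty string by raising an error.

```python
class Item:
    """Hypothetical stand-in for an item object; not the InterMine class."""

    def __init__(self, class_name):
        self.class_name = class_name
        self.attributes = {}

    def set_attribute(self, name, value):
        # Default path (assumed): an empty string is treated as a likely
        # data problem and rejected.
        if value == "":
            raise ValueError(f"empty value for attribute {name!r}")
        self.attributes[name] = value

    def set_attribute_to_empty_string(self, name):
        # Explicit opt-in: the caller states that the empty string is intended.
        self.attributes[name] = ""


gene = Item("Gene")
gene.set_attribute("primaryIdentifier", "CG42703")
gene.set_attribute_to_empty_string("FlyBaseID")
```

In a plain XML file there is only value="", so the two cases (an oversight
and a deliberate empty string) cannot be told apart.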

Anyway, Alex is right: the current preference is for explicitness, as it
helps to catch data problems.
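
For anyone generating items XML from their own scripts, the null check can
live in one helper instead of being repeated per field. A minimal sketch in
Python, assuming the element and attribute names from JD's example; the
function build_item and its skip_empty flag are made up for illustration and
are not part of the InterMine Perl modules or Java API.

```python
import xml.etree.ElementTree as ET


def build_item(item_id, class_name, fields, skip_empty=True):
    """Build an <item> element from a dict of attribute names to values.

    Hypothetical helper: with skip_empty=True, empty or missing values are
    left out of the XML; with skip_empty=False they raise an error, which
    keeps the explicit behaviour that catches data problems.
    """
    item = ET.Element("item", {"id": item_id, "class": class_name})
    for name, value in fields.items():
        if value is None or value == "":
            if skip_empty:
                continue  # omit the attribute entirely
            raise ValueError(f"empty value for {name!r} on item {item_id}")
        ET.SubElement(item, "attribute", {"name": name, "value": str(value)})
    return item


gene = build_item("2_1", "Gene", {
    "primaryIdentifier": "CG42703",
    "symbol": "CG42703",
    "FlyBaseID": "",  # empty, so it is skipped
    "FlyBaseCytogenicMap": "59C4-59C4",
})
xml_text = ET.tostring(gene, encoding="unicode")
```

Passing skip_empty=False turns the same call into an error, so the choice
between concision and explicitness is made in one place.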

Cheers,
Richard.






On 15/09/2011 15:40, Alex Kalderimis wrote:
> When I raised this question myself the current practice was described as
> favouring explicitness over concision - files should not by this
> reckoning mistakenly contain gaps in the information due to oversights.
>
> I could add a flag to the Perl modules that would cause empty field
> values to not be written, so that would give you that functionality on
> request if you needed it.
>
> Alex Kalderimis
>
> On Wed, 2011-09-14 at 17:01 -0400, JD Wong wrote:
>> Hi, I have a suggestion for a change to the items XML specification.
>> Currently we are not allowed to have null (empty) values as field values.
>>
>>
>> For example this is allowed:
>> <item id="2_1" class="Gene">
>>     <attribute name="primaryIdentifier" value="CG42703"/>
>>     <attribute name="symbol" value="CG42703"/>
>>     <attribute name="FlyBaseFeatureType" value="protein_coding_gene"/>
>>     <attribute name="FlyBaseAnnotationSymbol" value="CG42703"/>
>>     <attribute name="FlyBaseID" value="FBgn0038156"/>
>>     <attribute name="FlyBaseCytogenicMap" value="59C4-59C4"/>
>> </item>
>>
>> but not this:
>>
>> <item id="2_1" class="Gene">
>>     <attribute name="primaryIdentifier" value="CG42703"/>
>>     <attribute name="symbol" value="CG42703"/>
>>     <attribute name="FlyBaseFeatureType" value="protein_coding_gene"/>
>>     <attribute name="FlyBaseAnnotationSymbol" value="CG42703"/>
>>     <attribute name="FlyBaseID" value=""/>
>>     <attribute name="FlyBaseCytogenicMap" value="59C4-59C4"/>
>> </item>
>>
>> This lengthens any code we write that produces items XML files, since
>> we have to include a null check for each field, which takes
>> extra time and troubleshooting.  Is there any way the InterMine parser
>> can simply omit FlyBaseID (in this example) when presented with a null
>> value, instead of returning an error?
>>
>>
>> Cheers,
>> -JD
>> _______________________________________________
>> dev mailing list
>> dev at intermine.org
>> http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
>



