[InterMine Dev] Any way to pass an input file to a post-process task?

Joel Richardson Joel.Richardson at jax.org
Tue Jul 26 14:39:40 BST 2016


You could also do this as a separate source that runs after and adds the
extra data.
And if all you're doing is setting simple fields in existing objects, the
easiest way is to do it as a large-item-xml source.
Then your job is to generate the ItemXML-formatted input file. To set the
"foo" attribute of protein "Q8VBZ1", you'd generate a record like:
	<item class="Protein" id="1001_1">
	<attribute name="primaryAccession" value="Q8VBZ1"/>
	<attribute name="foo" value="bar"/>
	</item>
	</item>
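
If it helps, here is a minimal Java sketch of generating such a record. The
class and method names (ItemXmlSketch, itemRecord, escape) are made up for
illustration and are not part of InterMine. It only builds the <item> record
shown above; whatever enclosing root element or extra attributes the
large-item-xml loader expects are not covered here, so check the ItemXML
documentation for those.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an InterMine class: builds one ItemXML-style
// <item> record in the shape shown in the example above.
public class ItemXmlSketch {

    /** Escape the characters that are special inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Build a single item record; attributes are written in map order. */
    static String itemRecord(String className, String itemId,
                             Map<String, String> attributes) {
        StringBuilder sb = new StringBuilder();
        sb.append("<item class=\"").append(escape(className))
          .append("\" id=\"").append(escape(itemId)).append("\">\n");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append("<attribute name=\"").append(escape(e.getKey()))
              .append("\" value=\"").append(escape(e.getValue()))
              .append("\"/>\n");
        }
        sb.append("</item>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the key attribute first, as in the example.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("primaryAccession", "Q8VBZ1");
        attrs.put("foo", "bar");
        System.out.print(itemRecord("Protein", "1001_1", attrs));
    }
}
```

In practice you would loop over your input (interpro.xml in Sam's case),
give each record a distinct id, and write the records to the file that the
source reads.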


Joel

-- 
Joel E. Richardson, Ph.D.
Sr. Research Scientist
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, Maine 04609
207-288-6435
joel.richardson at jax.org





On 7/26/16, 8:59 AM, "dev on behalf of Sam Hokin"
<dev-bounces at lists.intermine.org on behalf of shokin at ncgr.org> wrote:

>Yeah, the interpro loader doesn't quite fit the bill. I only want to fill
>some attributes in the protein domains that are loaded
>from my chado database, partly for esoteric design reasons. I'm not
>adding any new items, so in a sense it's not unlike other
>post-processors like CreateReferences. Don't want to store interpro
>records in my mine. It's sort of between a data source (adding
>new data) and a post-processor (not creating any new items).
>
>Anyway, just thought I'd ask. Not a huge deal to leave the file name
>hardcoded. :)
>
>On 07/26/2016 02:02 AM, Julie Sullivan wrote:
>> Hi Sam,
>>
>> We already have a loader for interpro.xml, use that?
>>
>> http://intermine.readthedocs.io/en/latest/database/data-sources/library/proteins/interpro/
>>
>> You want to avoid loading new data in the post-processing stage, as you
>>want to include these data in the keyword search etc.
>>
>> Julie
>>
>> On 07/20/2016 09:45 PM, Sam Hokin wrote:
>>> Hi, devs. I'm writing a post-processor that takes an input file
>>> (interpro.xml) and adds a bunch of data from that to proteins and
>>> protein domains (which I already have from a different data source). I
>>> see that PostProcessOperationsTask.java has a method to set an output
>>> file (setOutputFile), presumably from project.xml, but there is none to
>>> set an input file. I naively added a setter to do so, but it does not
>>> work when I use:
>>>
>>>       <property name="input.file"
>>> location="/home/intermine/data/interpro/interpro.xml"/>
>>>
>>> in project.xml. The setter that I added is simply:
>>>
>>>     /**
>>>      * Set the value of inputFile
>>>      *
>>>      * @param inputFile an input file for operations that require one
>>>      */
>>>     public void setInputFile(File inputFile) {
>>>         this.inputFile = inputFile;
>>>     }
>>>
>>> Any suggestions? I thought I'd ask before I start digging deeper.
