Import namespace from text files
Last Post 02 Apr 2014 01:24 PM by jbowie. 20 Replies.

albagjimenez (New Member, Posts: 47)
13 Feb 2014 06:02 AM
    I'm reading the "Import Wizard User Guide" and now I'm lost.

    My goal is to import SNOMED CT into Apelon DTS 4.

    There are 2 ways to import namespaces: from Text and Excel Files, and from XML Files. I have the SNOMED CT text files, so I've chosen the first option, but I need an Import Specification File (.xml).

    My problem is that I don't know how to get the CTS2 .xml specification file. Any piece of advice would be appreciated.

    B.R.

    jbowie (Basic Member, Posts: 110)
    18 Feb 2014 10:01 AM
    As we mentioned before, we do not have Import Wizard scripts to load all the desired terminologies (in all their available formats). You will need to develop your own load specification (XML) file using the Import Wizard. The first step will be to design the logical object model you want for SCT. This process is described for a simple data set in the User Guide Appendix. Then you can author the load specification file using the wizard's first option (create/edit).

    Also, the load specification file has nothing to do with CTS2. CTS2 does not specify terminology exchange formats.

    albagjimenez
    13 Mar 2014 08:30 AM
    Thank you DiTS, but I have another problem. SNOMED CT is an Ontylog namespace, and with the DTS Editor I can't create Ontylog namespaces, only Ontylog Extension or Thesaurus namespaces. How am I going to be able to create the spec file for SNOMED CT? Is there another way to create an Ontylog namespace, which is the step that has to come before creating the spec file?


    Thanks in advance!

    jbowie
    14 Mar 2014 08:22 AM
    If you are looking to load SNOMED CT from the IHTSDO distribution files, then you would be loading this content into a local Thesaurus namespace. You can load a complete representation of SNOMED in this way: all hierarchies, relationships and attributes can be loaded. On the other hand, you will not be able to use Ontylog-specific features such as subsumption queries.

    Regarding your question about Description Logic modeling tools, a number of these tools are available. You might check out Protege from Stanford University or any of a number of OWL tools which are described on the web. If you would like information on Apelon's TDE modeling tool, please send a note to info@apelon.com and someone will get back to you.

    Hope this helps

    albagjimenez
    20 Mar 2014 09:00 AM
    Thank you very much.

    I'm trying to import SNOMED CT as a Thesaurus namespace. First of all I'm going to try importing the concepts. I've created 5 Property Types and I've created the specification file to import the concepts from a text file.

    My problem is that I don't understand the "Concept Key" attribute and the parameters of its components (Name and Namespace) at all.

    - For the "Namespace" attribute I've checked "Use existing Namespace Value: SNOMED CT"
    - For the "Name" attribute I don't know what to do. What does this name refer to?

    Also, I'm not sure whether the names of the Properties in the first row of the text file have to be the same as the names of the Property Types that I've created through the Apelon DTS Editor.

    Cheers

    jbowie
    20 Mar 2014 09:30 AM
    The objective of the Import Wizard specification file is to tell the importer where to find the associated information in the import/load file. For the Concept Key element:

    1. For Name, you should set the Field number parameter to the number of the column where the SNOMED Concept Name can be found. Also, be sure that the "Add" checkbox is selected so that the Concept will be created (if you are loading for the first time). You may also want to check some of the filters such as trim whitespace and remove quotes and control characters to ensure data quality.
    2. For the Namespace, assuming you specified SNOMED CT as the spec's namespace, you can just select "Use import namespace" from the dropdown.

    You should be able to add additional element blocks for each of your PropertyTypes, then complete the parameter sections for each of these types to specify the column (in the import file) of the PropertyType's value. The name of the PropertyType can be taken from the file, but normally, the type is just "hard coded" into the spec and the value's column is entered.

    Note that the Import Wizard does not use any information in the first "header" row, if such a row is present. Our experience is that the header information is not always sufficient for accurate loading.

    Hope this gets you going.
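
    As a rough illustration of the cleanup that filters like trim whitespace and remove quotes and control characters perform, a pre-cleaning pass over a tab-delimited load file could look like the sketch below. This is not the Import Wizard's own code: the function names `clean_field` and `clean_rows` and the sample row are made up for the example, and the wizard's filters may behave differently in detail.

```python
import re

# Hypothetical pre-cleaning helper, NOT part of Apelon DTS.
# Matches ASCII control characters (fields are already split on the tab delimiter).
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def clean_field(value: str) -> str:
    """Remove control characters, trim whitespace, and drop one pair of surrounding quotes."""
    value = CONTROL_CHARS.sub("", value).strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1].strip()
    return value


def clean_rows(lines, delimiter="\t", has_header=False):
    """Return cleaned rows; column N of each row is what a 'Field number' of N would point at."""
    rows = []
    for index, line in enumerate(lines):
        if has_header and index == 0:
            continue  # header content is not used for loading, so it is dropped here
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip empty lines
        rows.append([clean_field(field) for field in line.split(delimiter)])
    return rows


if __name__ == "__main__":
    sample = [
        "Name\tCode In Source\n",
        ' "SNOMED RT Concept (special concept)" \t100005\n',
    ]
    for row in clean_rows(sample, has_header=True):
        print(row)
```

    In this sketch the concept name ends up in column 1 and the code in column 2, which is the kind of column number the Name and PropertyType parameters in the spec would refer to.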



    albagjimenez
    21 Mar 2014 02:42 AM
    Yes it was very helpful!!

    This was my log after importing 2 concepts for testing purposes:

    DTSImport: 3/21/2014 8:33 AM
    [Progress] Encoding is 'UTF8' 0::00:00[11449640]
    [Progress] Delimiter is '	' 0::00:00[11449640]
    [Progress] Sheet name is '' 0::00:00[11336800]
    [Progress] Starting text import from 'C:\Proyectos\Apelon DTS\4.0\Carga SNOMED\concepts2.txt' using specification 'C:\Program Files\Apelon DTS 4.0\bin\load-snomedct2.xml' with options '/D='	'/E='UTF8'/S='''. 0::00:00[11111104]
    [Progress] Header is false 0::00:00[9646016]
    [Progress] Opening input file 'C:\Proyectos\Apelon DTS\4.0\Carga SNOMED\concepts2.txt' 0::00:00[9533176]
    [Progress] Concept 'SNOMED RT Concept (special concept)[SNOMED CT]' created. 0::00:01[18349600]
    [Progress] Property Code In Source='100005' added to 'SNOMED RT Concept (special concept)' 0::00:02[17689560]
    [Progress] Property Effective Time='2002-01-31 00:00:00' added to 'SNOMED RT Concept (special concept)' 0::00:02[17475464]
    [Progress] Property Active='0' added to 'SNOMED RT Concept (special concept)' 0::00:02[17259432]
    [Progress] Property Module='SNOMED CT core module' added to 'SNOMED RT Concept (special concept)' 0::00:02[17035112]
    [Progress] Property Definition Status='Necessary but not sufficient concept definition status' added to 'SNOMED RT Concept (special concept)' 0::00:02[16631720]
    [Progress] Concept 'SNOMED RT Concept[SNOMED CT]' created. 0::00:02[16382696]
    [Progress] Property Code In Source='100005' added to 'SNOMED RT Concept' 0::00:03[16158456]
    [Progress] Property Effective Time='2002-01-31 00:00:00' added to 'SNOMED RT Concept' 0::00:03[15942376]
    [Progress] Property Active='0' added to 'SNOMED RT Concept' 0::00:03[15728280]
    [Progress] Property Module='SNOMED CT core module' added to 'SNOMED RT Concept' 0::00:03[15503848]
    [Progress] Property Definition Status='Necessary but not sufficient concept definition status' added to 'SNOMED RT Concept' 0::00:03[15289752]
    [Completed] Load completed. 2 records processed, 0 warnings and 0 errors. 0::00:03[15108568]
     


    Thank you very much!!
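
    For anyone scripting several loads, the final [Completed] line of a log like the one above can be checked automatically. A minimal sketch, assuming the summary line always has the wording shown in this log (the function name `parse_summary` is made up, and other Import Wizard versions may word the line differently):

```python
import re

# Pattern based only on the single summary line shown in this thread;
# the optional "s" endings are a guess to tolerate singular forms.
COMPLETED = re.compile(
    r"\[Completed\] Load completed\. "
    r"(\d+) records? processed, (\d+) warnings? and (\d+) errors?\."
)


def parse_summary(log_text):
    """Return record/warning/error counts from the log, or None if no summary line is found."""
    match = COMPLETED.search(log_text)
    if match is None:
        return None
    records, warnings, errors = (int(group) for group in match.groups())
    return {"records": records, "warnings": warnings, "errors": errors}


if __name__ == "__main__":
    line = (
        "[Completed] Load completed. 2 records processed, "
        "0 warnings and 0 errors. 0::00:03[15108568]"
    )
    print(parse_summary(line))
```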

    albagjimenez
    21 Mar 2014 02:56 AM

    I have one more question, though. What is the maximum size of the file(s) to import?

    I'm asking because the DTS Editor is stuck analyzing the import file, which is 79.5 MB.

    Cheers

    jbowie
    21 Mar 2014 10:06 AM
    We are not aware of any absolute size limitations, but hangs of this type do sometimes occur due to memory constraints and this is a VERY large file. I can make two suggestions:

    1. Run the Import Wizard in "standalone" mode, by executing ImportWizard.bat in the bin/importwizard folder. This eliminates overhead associated with the DTS Editor GUI and Editor panel synchronization. ImportWizard operation is exactly the same. Large loads run significantly faster using the standalone version. You will need to first modify the connection parameters in the batch file to match your configuration.

    2. Allocate as much virtual memory to the JVM as possible. This is the second argument in the 'call' statement in the batch file. This should be at least 1024. If you are running a 64-bit JVM it can be even higher.

    Good luck

    albagjimenez
    24 Mar 2014 03:30 AM
    I'm going to load all the synonyms of my concepts. To do that I'm creating the spec file (through the Import Wizard Plugin). Beforehand I added (through the DTS Editor) all the Term Property Types and an Association Type (Synonym) to link each concept to its synonym term. But when I choose the Association Attribute in the Import Field Specification section, I can't see my new Association Type (Synonym) in the Association Component.

    There are 3 components to fill:
    - Type --> It should be possible to select Synonym, but I can't.
    - Name --> Name of what? The Term's name?
    - Namespace --> SNOMED CT

    Any piece of advice is welcome.

    Cheers

    jbowie
    24 Mar 2014 09:16 AM
    Try selecting "Synonym" in the Attribute dropdown ("Select attribute type") rather than "Association". You will need to specify the Term name to be used in the parameters.

    albagjimenez
    31 Mar 2014 06:40 AM
    But I'm confused now. I decided to do the load like this:
    1. Load the concepts and their properties
    2. Load the terms and their properties, together with the concept each term is a synonym of, using the association type "synonym"

    In fact I already have all the concepts loaded. Am I going to be able to load the synonyms of those concepts now?

    jbowie
    31 Mar 2014 09:00 AM
    The Import Wizard does not currently support loading of inverse Synonyms, which is what your step 2 would require. So you will need to load Synonyms from a "concept" load. You could do this either as a step 3 (after loading the Term information) or, in the future, load your terms first (Step 1), then load the concepts including their properties and synonyms.

    Does this help?

    albagjimenez
    01 Apr 2014 04:22 AM
    Yes it was helpful!!

    I have 2 more questions:

    1. If we needed to have concept names in several languages, how would Apelon support that?

    2. How can I set the Status of a Concept/Term in the import process? I can see that a Concept/Term can be ACTIVE, INACTIVE or DELETED.

    albagjimenez
    01 Apr 2014 05:19 AM
    Third question

    In the term import process, how can I set the 'Is Preferred' field?

    Thanks!

    jbowie
    01 Apr 2014 08:51 AM
    In the Import Wizard, specification of Preferred is currently a parameter in the creation (or updating) of a Synonym attribute in a Concept load. I'm not sure where your Preferred designation is available, but you may need to create a separate load file with concept name and preferred term name/code (likely need to specify the code, or a unique property, for term disambiguation).

    Import Wizard improvement requests/suggestions are always welcome if they would help your use-case.
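
    One way to prepare such a separate load file is sketched below, assuming the concept-to-term links are already available as tuples. The function names, the column order, and the term codes in the sample are all invented for illustration; how the Preferred parameter is actually set in the spec is not shown here. The check for a single preferred term per concept reflects the DTS restriction that a concept can have only one Preferred Term.

```python
def split_preferred(links):
    """Split (concept_name, term_name, term_code, is_preferred) tuples into two row lists.

    Returns (preferred_rows, other_rows) so that each list can be written to its own
    load file and imported with its own spec.
    """
    preferred_rows, other_rows = [], []
    concepts_with_preferred = set()
    for concept_name, term_name, term_code, is_preferred in links:
        row = [concept_name, term_name, term_code]
        if is_preferred:
            if concept_name in concepts_with_preferred:
                raise ValueError(f"more than one preferred term for {concept_name!r}")
            concepts_with_preferred.add(concept_name)
            preferred_rows.append(row)
        else:
            other_rows.append(row)
    return preferred_rows, other_rows


def to_tab_delimited(rows):
    """Serialize rows as tab-delimited text without a header row."""
    return "".join("\t".join(row) + "\n" for row in rows)


if __name__ == "__main__":
    # Placeholder data: the term names and codes below are made up.
    links = [
        ("SNOMED RT Concept (special concept)", "SNOMED RT Concept", "T-0001", True),
        ("SNOMED RT Concept (special concept)", "SNOMED RT", "T-0002", False),
    ]
    preferred, others = split_preferred(links)
    print(to_tab_delimited(preferred), end="")
    print(to_tab_delimited(others), end="")
```

    Including the term code in each row follows the suggestion above to use the code (or another unique property) for term disambiguation.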

    jbowie
    01 Apr 2014 09:52 AM
    Going back to your two questions:

    1. Concept names must be unique and you can have only one name. We typically use the fully-specified name in the primary language. Alternate languages can be supported by specific Synonym Types on the concept: "English Synonym", "French synonym", etc. Please note that DTS currently only supports one Preferred Term on a concept, not one per Synonym type.

    2. There is currently not a way to import status using the Import Wizard. This capability will be included in Import Wizard Version 4.1 to be released later this spring.

    albagjimenez
    01 Apr 2014 11:13 AM
    Thank you very much!

    Regarding possible Import Wizard improvements, I feel that the ability to import attribute values from several files could be useful. For example, I could load Synonyms and Concepts in a single step, taking attribute values for concepts from one file and for terms (synonyms) from another file.

    Regarding the status of a concept: can this status be updated from the Web Service instance of Apelon DTS?

    Finally, I'm confused about the "Is Preferred" value. I have decided to use the PFSN (Preferred Fully Specified Name) as the Concept Name, so maybe all the concepts could have "Is Preferred" checked. What is your opinion?


    jbowie
    01 Apr 2014 11:46 AM
    The Concept status can be updated from the API. See the ThesaurusConceptQuery.updateConcept(...) method (or its web service equivalent). This would be called after a concept has been fetched and its status updated by a call to DTSConcept.setStatus(ItemStatus status) on the local (client) concept object.

    The DTS "preferred" flag is an attribute on a Synonym Association instance. Only one Synonym Association instance (on a given Concept) can have the "preferred" flag set. The preferred Synonym Association instance is returned by DTSConcept.getFetchedPreferredTerm(). This model allows for preferred terms to not be unique in a Namespace (as is required for Concept names).

    Regarding your Import Wizard request, we will take this under consideration, but feel that your case may be most easily handled by "packaging" multiple Import Wizard runs into a "batch" due to potential differences in the characteristics (encodings, delimiters, etc.) of the import files.
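
    To make the "batch" idea concrete, the sketch below describes each run with its own file, spec, delimiter and encoding, and lists the runs in the order they should execute (terms first, then concepts with their synonyms). It only prints a plan and does not launch anything. The file names are hypothetical, and the option string simply mirrors the '/D=.../E=.../S=...' text echoed in the import log earlier in this thread; whether a launcher accepts options in exactly that form is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportRun:
    """Settings for one Import Wizard run (hypothetical structure, not an Apelon API)."""
    input_file: str
    spec_file: str
    delimiter: str = "\t"
    encoding: str = "UTF8"
    sheet: str = ""

    def options(self) -> str:
        # Mirrors the option text echoed in the log; treat the format as an assumption.
        return f"/D='{self.delimiter}'/E='{self.encoding}'/S='{self.sheet}'"


def plan(runs):
    """Return one human-readable line per run, numbered in execution order."""
    return [
        f"{number}. {run.input_file} using {run.spec_file} with options {run.options()}"
        for number, run in enumerate(runs, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical file names; each file may have different characteristics.
    batch = [
        ImportRun("terms.txt", "load-terms.xml", delimiter="|"),
        ImportRun("concepts.txt", "load-concepts.xml"),
    ]
    for line in plan(batch):
        print(line)
```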

    albagjimenez
    02 Apr 2014 04:11 AM
    And which tool is faster to import concepts and terms, the Import Wizard or the TQL Editor?

    I have deleted all concepts in my SNOMED namespace and it took 8 minutes while importing all the concepts with the Import Wizard took 5 days!!!

    Thanks

    jbowie
    02 Apr 2014 01:24 PM
    TQL does not support importing from a file. Regarding ImportWizard performance, try to run the IW from ImportWizard.bat, not from the Editor, and use as much memory as possible.


    ---