Import wizard for tab-separated lists

With this import routine you can import data from text files (tab-separated lists) into the database. For an introduction, see the short tutorial. Choose Data -> Import -> Wizard and then the type of data that should be imported, e.g. Import Series ..., from the menu. A window as shown below will open and lead you through the import of the data. The window is separated into three areas: on the left side, a list of possible data-related import steps according to the type of data you chose for the import; on the right side, the list of currently selected import steps; in the middle, the details of the selected import step.


Choosing the File

As a first step, choose the File from which the data should be imported. For an introduction, see the short tutorial. The currently supported format is tab-separated text. Then choose the Encoding of the file, e.g. Unicode; the preferred encoding is UTF-8. The Start line and End line will be set automatically according to your data; you may change these to restrict the data lines that should be imported. The parts of the file that will not be imported are indicated with a gray background, as shown below. If the First line contains the column definition, this line will not be imported either. If your data contain e.g. date information where notations differ between countries (e.g. 31.4.2013 vs. 4.31.2013), choose the Language / Country to ensure a correct interpretation of your data. Finally, you can select a prepared Schema (see the chapter Schema below) for the import.
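The effect of the encoding and Language / Country settings can be pictured with a small Python sketch. The file content and column names below are invented for the illustration; the wizard handles all of this internally:

```python
import csv
from datetime import datetime
from io import StringIO

# Invented sample content of a tab-separated import file; in the wizard
# the data would come from the chosen file, read with the selected
# encoding (preferably UTF-8).
raw = "AccessionNumber\tCollectionDate\nM-0012345\t31.12.2013\n"

reader = csv.reader(StringIO(raw), delimiter="\t")
header = next(reader)   # the first line contains the column definition
rows = list(reader)

# The Language / Country setting decides how an ambiguous date is read:
# with the German convention the order is day.month.year.
date = datetime.strptime(rows[0][1], "%d.%m.%Y")
```

Parsing the same value with the American order `"%m.%d.%Y"` would fail here, because there is no month 31; this is why choosing the correct Language / Country matters.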


Choosing the data ranges

In the selection list on the left side of the window (see below) all possible import steps for the data are listed according to the type of data you want to import.

The import of certain tables can be parallelized. To add parallels, click on the corresponding button (see below); to remove parallels, use the button next to it. Only selected ranges will appear in the list of steps on the right (see below).

To import the information of logging columns, e.g. who created and changed the data, click on the button in the header line. This will include an additional substep for every step, containing the logging columns (see below). If you do not import these data, they will be filled automatically with default values like the current time and user.


Attaching data

You can either import your data as new data or attach them to data already in the database. For an introduction, see the short tutorial. Select the import step Attachment from the list. All tables that are selected and contain columns at which you can attach data are listed (see below). Either choose the first option Import as new data or one of the attachment columns offered, like SeriesCode in the table Series in the example below.

If you select a column for attachment, this column will be marked with a blue background (see below and chapter Table data).


Merging data

You can either import your data as new data or merge them with data already in the database. For an introduction, see the short tutorial. Select the import step Merge from the list. For every table you can choose between Insert, Merge, Update and Attach (see below).

The Insert option will import the data from the file independently of existing data in the database.

The Merge option will compare the data from the file with those in the database according to the Key columns (see below). If no matching data are found in the database, the data from the file will be imported. Otherwise the data will be updated.

The Update option will compare the data from the file with those in the database according to the Key columns. Only matching data found in the database will be updated.

The Attach option will compare the data from the file with those in the database according to the Key columns. The data found will not be changed, but used as reference data in dependent tables.

Empty content will be ignored, e.g. for the Merge or Update option. To remove content, you have to enter the value NULL. As long as the column allows empty values, the content will be removed using the NULL value.
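The four merge handling options and the treatment of empty and NULL content can be sketched in Python. The table content and key column below are invented for the illustration; the actual wizard operates on the database:

```python
# Existing data in the database, matched via the key column AccessionNumber.
db = [{"AccessionNumber": "M-001", "Notes": "old"}]

def find(value, key="AccessionNumber"):
    """Return the database row matching the key column, if any."""
    return next((r for r in db if r[key] == value), None)

def merge_update(target, row):
    """Copy values into an existing row: empty content is ignored,
    the explicit value NULL removes the content."""
    for col, value in row.items():
        if value == "":
            continue
        target[col] = None if value == "NULL" else value

def apply(option, row, key="AccessionNumber"):
    existing = find(row[key], key)
    if option == "Insert":        # import regardless of existing data
        db.append(row)
    elif option == "Merge":       # update the match, otherwise insert
        if existing:
            merge_update(existing, row)
        else:
            db.append(row)
    elif option == "Update":      # only matching data are updated
        if existing:
            merge_update(existing, row)
    elif option == "Attach":      # found data stay unchanged, used as reference
        return existing

apply("Merge", {"AccessionNumber": "M-001", "Notes": "NULL"})
# db[0]["Notes"] is now None: the NULL value removed the old content
```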


Table data

To set the source for the columns in the file, select the step of a table listed underneath the Merge step. All columns available for importing data will be listed in the central part of the window. In the example shown below, the first column is used to attach the new data to data in the database.

A reminder in the header line will show you what actions are still needed to import the data into the table:

The handling of the columns is described in the chapter Columns.


Testing

To test whether all requirements for the import are met, use the Testing step. You can use a certain line in the file for your test and then click on the Test data in line: button. If there are still unmet requirements, these will be listed in a window as shown below.

If finally all requirements are met, the testing function will try to write the data into the database and display any errors that occurred, as shown below. All datasets marked with a red background produced an error.

To see the list of all errors, double-click on the error list in the header line (see below).

If finally no errors are left, your data are ready for import. The colors of the table nodes in the tree indicate the handling of the datasets: INSERT, MERGE, UPDATE, No difference, Attach, No data. The colors of the table columns indicate whether a column is decisive, a key column or an attachment column.

If you suspect that the import file contains data already present in the database, you can test this and extract only the missing lines into a new file. Choose the attachment column (see the chapter Attaching data) and click on the button Check for already present data. The data already present in the database will be marked red (see below). Click on the button Save missing data as text file to store the data not yet present in the database in a new file for the import. The import of specimens contains the option Use default duplicate check for AccessionNumber, which is selected by default.
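The extraction of the missing lines can be pictured with a short Python sketch. The accession numbers and file content are invented for the illustration; the actual check runs against the database:

```python
# Values already present in the database for the chosen attachment column
# (here AccessionNumber; all names are invented for the example).
present = {"M-001", "M-003"}

lines = [
    "M-001\tOphioglossum vulgatum",
    "M-002\tBotrychium lunaria",
    "M-003\tAsplenium viride",
]

# Keep only the lines whose attachment value is not yet in the database,
# mirroring the "Save missing data as text file" function.
missing = [ln for ln in lines if ln.split("\t")[0] not in present]
# Only the M-002 line remains and would be written to the new file.
```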

If you happen to get a file with content as shown below, you may have selected the wrong encoding, or the encoding is incompatible. Please try to save the original file as UTF-8 and select this encoding for the import.


Import

With the last step you can finally start the import of the data into the database. If you want to repeat the import with the same settings and data of the same structure, you can save a schema of the current settings (see below). You can optionally include a description of your schema, and with the corresponding button you can generate a file containing only the description.

Schedule for import of tab-separated text files into DiversityCollection
Target within DiversityCollection: Specimen
Schedule version: 1
Database version: 02.05.41
Lines: 2 - 3215
First line contains column definition: yes
Encoding: Unicode
Language: de

Tables

CollectionSpecimen (CollectionSpecimen)
Parent: CollectionEvent
Merge handling: Insert

Column in table          Used  Key  Copy  Pre  Post  File pos.  Transformations  Value  Source
CollectionSpecimenID                                                                    Database
AccessionNumber          x     x                     0                                  File

IdentificationUnit_1 (IdentificationUnit)
Parent: CollectionSpecimen
Merge handling: Merge

Column in table          Used  Key  Copy  Pre  Post  File pos.  Transformations  Value  Source
CollectionSpecimenID                                                                    Database
IdentificationUnitID                                                                    Database
LastIdentificationCache  x                           2                                  File
   +                                                 3                                  File
   +                                                 4                                  File
   +                                                 5                                  File
TaxonomicGroup           x                                                       fish   Interface

IdentificationUnitAnalysis_1_1 (IdentificationUnitAnalysis)
Parent: IdentificationUnit_1
Merge handling: Update

Column in table          Used  Key  Copy  Pre  Post  File pos.  Transformations  Value  Source
CollectionSpecimenID                                                                    Database
IdentificationUnitID                                                                    Database
AnalysisID                                                                       94     Interface
AnalysisNumber                                                                   1      Interface
AnalysisResult           x     x                     39                                 File

Lines that could not be imported will be marked with a red background while imported lines are marked green (see below).

If you want to save lines that produce errors during the import in a separate file, use the Save failed lines option. The protocol of the import will contain all settings according to the used schema and an overview containing the number of inserted, updated, unchanged and failed lines (see below).
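The bookkeeping behind the protocol and the Save failed lines option can be sketched in Python. The importer function below is an invented stand-in for the actual database import:

```python
def import_lines(lines, importer):
    """Sketch of the import loop: count the results per line and
    collect failed lines so they can be saved to a separate file."""
    protocol = {"inserted": 0, "failed": 0}
    failed = []
    for line in lines:
        try:
            importer(line)
            protocol["inserted"] += 1
        except ValueError:
            protocol["failed"] += 1
            failed.append(line)   # the "Save failed lines" option
    return protocol, failed

def importer(line):
    # Toy stand-in for the database import: reject lines without
    # an accession number in the first column.
    if not line.split("\t")[0]:
        raise ValueError("missing accession number")

protocol, failed = import_lines(["M-001\tA", "\tB"], importer)
# protocol is {"inserted": 1, "failed": 1}; failed contains the second line
```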

Description

A description of the schema may be included in the schema itself or, with a click on the corresponding button, generated as a separate file. This file will be located in a separate directory Description to avoid confusion with import schemas. An example of a description file is shown below, containing the common settings, the treatment of the file columns and the interface settings as defined in the schema.

Schedule for import of tab-separated text files into DiversityCollection
Target within DiversityCollection: Specimen
Schedule version: 1
Database version: 02.05.52
Lines: 2 - 5
First line contains column definition: yes
Encoding: Unicode
Language: de
Description: Import Schema fuer Literaturdaten (Bayernflora) aus Dörr & Lippert mit MTB Daten und max. 4 Sammlern


File columns

The colors in the overview indicate the merge handling of the table (Insert, Merge, Update, Attach) and the column usage (Decisive, Key).

Pos.  Name                 Table.Column                                              Example                   Transformed
0     ID                   CollectionSpecimen.ExternalIdentifier                     1
1     originalname         Identification_1_2.TaxonomicName                          Ophioglossum vulgatum
2     nameautor            Identification_1_1.TaxonomicName                          Ophioglossum vulgatum L.
3     taxnr                Identification_1_1.NameURI                                3949
      Prefix: http://tnt.diversityworkbench.de/TaxonNames_Plants/
4     mtb                  CollectionEventLocalisation_6.Location1                   8423
5     qu                   CollectionEventLocalisation_6.Location2                   2
6     unschärfe            CollectionEventLocalisation_6.LocationAccuracy
7     jahr_von             CollectionEvent.CollectionYear                            1902
8     jahr_bis             CollectionEvent.CollectionDateSupplement
      Prefix: bis
9     status               IdentificationUnitAnalysis_1_1.AnalysisResult
10    verwaltungseinheit   not imported
11    fundort              not imported
12    finder               not imported
13    ID_collector1        CollectionAgent_1.CollectorsAgentURI                      43708
      Prefix: http://snsb.diversityworkbench.de/Agents_BayernFlora/
14    ID_collector2        CollectionAgent_2.CollectorsAgentURI
      Prefix: http://snsb.diversityworkbench.de/Agents_BayernFlora/
15    ID_collector3        CollectionAgent_3.CollectorsAgentURI
      Prefix: http://snsb.diversityworkbench.de/Agents_BayernFlora/
16    ID_collector4        CollectionAgent_4.CollectorsAgentURI
      Prefix: http://snsb.diversityworkbench.de/Agents_BayernFlora/
17    primärquelle         not imported
18    ID_primärquelle      Annotation_1.ReferenceURI
      Prefix: http://id.snsb.info/references/
19    primärquelle_seiten  not imported
20    bestand              IdentificationUnitAnalysis_1_2.AnalysisResult
21    sonstiges            CollectionSpecimen.OriginalNotes
22    höhe                 CollectionEventLocalisation_7.Location1
23    herbar1              not imported
24    herbar2              not imported
25    herbar3              not imported
26    ID_herbar1           CollectionSpecimenRelation_1.RelatedSpecimenCollectionID
27    ID_herbar2           not imported
28    ID_herbar3           not imported
29    det                  not imported
30    ID_det               not imported
31    rev                  not imported
32    ID_rev               not imported
33    datenquelle          not imported
34    ID_datenquelle       CollectionSpecimen.ReferenceURI                           135
      Prefix: http://id.snsb.info/references/
35    project1             not imported
36    project2             CollectionSpecimen.AdditionalNotes                        O                         Beobachtung
      Transformations:
        Regular expression: O -> Beobachtung
        Regular expression: H -> Herbarauswertung
        Regular expression: L -> Literaturauswertung

Interface settings

Table                       Table alias                     Column                      Value
Annotation                  Annotation_1                    AnnotationType              Reference
                            Annotation_1                    Annotation                  Literaturauswertung: nach Dörr & Lippert (2004)
                            Annotation_1                    ReferenceDisplayText        Annotation
CollectionAgent             CollectionAgent_1               CollectorsName              Collector1
                            CollectionAgent_2               CollectorsName              Collector2
                            CollectionAgent_3               CollectorsName              Collector3
                            CollectionAgent_4               CollectorsName              Collector4
CollectionEvent                                             CountryCache                Germany
CollectionProject           CollectionProject_1             ProjectID                   37
                            CollectionProject_2             ProjectID                   149
CollectionSpecimen                                          ReferenceTitle              Reference
CollectionSpecimenRelation  CollectionSpecimenRelation_1    RelatedSpecimenURI
                            CollectionSpecimenRelation_1    RelatedSpecimenDisplayText
                            CollectionSpecimenRelation_1    Notes                       Herbarauswertung: nach Dörr & Lippert (2004)
Identification              Identification_1_1              IdentificationSequence      2
                            Identification_1_2              IdentificationSequence      1
                            Identification_1_2              Notes                       Originalname aus Dörr & Lippert (2004)
IdentificationUnit          IdentificationUnit_1            LastIdentificationCache     plant
                            IdentificationUnit_1            TaxonomicGroup              plant
IdentificationUnitAnalysis  IdentificationUnitAnalysis_1_1  AnalysisID                  2
                            IdentificationUnitAnalysis_1_1  AnalysisNumber              1
                            IdentificationUnitAnalysis_1_2  AnalysisID                  4
                            IdentificationUnitAnalysis_1_2  AnalysisNumber              2
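The regular-expression transformations defined in the example schema (e.g. O replaced by Beobachtung) can be reproduced with Python's re module. This is a sketch, not the wizard's actual implementation, and the anchored patterns are an assumption, since only the replacement pairs are visible in the schema:

```python
import re

# Replacement pairs as listed for file column 36 (project2); the ^...$
# anchors are an assumption so that only the whole cell value matches.
transformations = [
    (r"^O$", "Beobachtung"),
    (r"^H$", "Herbarauswertung"),
    (r"^L$", "Literaturauswertung"),
]

def transform(value):
    """Apply every regular-expression replacement in sequence."""
    for pattern, replacement in transformations:
        value = re.sub(pattern, replacement, value)
    return value
```

With these patterns, a cell containing "O" is imported as "Beobachtung", while values that match none of the patterns pass through unchanged.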