DeltaAccess –
a SQL interface to DELTA (Description Language for Taxonomy), implemented in Microsoft Access

Appendix 2: Development documentation

   Limitations
   Recognized DELTA directives
   General development information
   Co-development issues
   Multi-language support
   Next table of contents: Appendix 3: Information model and data dictionary

Previous table of contents    Main table of contents

 Limitations

This chapter consists of the following sections:

  Recognized DELTA directives

  Limitations which will be removed

  Problematic limitations

  Limitations which have low priority


 Recognized DELTA directives

Directive

Import

Export

Functionality/Remark (see also Legend and Notes)

Character list =
Character descriptions

Yes

Yes

OK.

Character types

Yes

Yes

OK, default UM is not exported. Imported into the Character Type attribute.

Numbers of states

N/A

Yes

OK, superfluous information, ignored during import. The default value 2 is not exported.

Multistate types

Yes

Yes*

OK. New Directive: and/or/intergrade/exclusive (i. e. EOM, EUM), see MultistateType attribute.

Mandatory characters

Yes

Yes

OK. See mandatory characters.

Implicit values

Yes

Yes

OK. Currently not observed during identification; see option under Analyze to insert implicit values as explicit data.

Character reliabilities
/ Character weights

Yes

Yes

OK. Reliabilities preferred, Weights are optionally converted during import.

Character availabilities

Yes

Yes*

OK. New directive, see Availability attribute

Absolute error

Yes

Yes

For numerical characters only. Imported to/exported from the Fuzziness/ FuzzinessIsPercent attributes.

Percent error

Yes

Yes

For numerical characters only. Imported to/exported from the Fuzziness/ FuzzinessIsPercent attributes.

Link characters

Yes

Yes

OK, see attribute LinkGroup/Link characters.

New paragraphs
at characters

Yes

Yes

OK. In natural-language descriptions, a new paragraph is started before each character where this attribute is set.

Replace semicolon
by comma

Yes

Yes

OK. In natural-language descriptions, characters linked by the LinkGroup attribute (see Link characters directive) are separated by a comma instead of a semicolon. Confor would also omit the commas which normally separate multiple character states in a single character. [*** Imported and exported, but not yet used in DeltaAccess natural language output.]

Omit final comma

Yes

Yes

In natural-language descriptions, an item description with more than 2 states (e. g. 1,1/2/3) is normally translated into "... state 1, state 2, or state 3". If this directive is set, the final comma before the "or" is omitted. [*** Imported and exported, but not yet used in DeltaAccess natural language output.]

Omit lower for characters

Yes

Yes

For numerical characters only. Specifies that only the upper range/maximum is output in natural language descriptions. Imported into/exported from OmitValues, but currently not supported in the natural language descriptions of DeltaAccess.

Omit upper for characters

Yes

Yes*

The complement to "Omit lower ...". For numerical characters which contain a minimum statement ("at least x mm wide"). Available in DeltaAccess, although not supported by Dallwitz's Confor program (why not?).

Character headings

Yes

Yes

Imported into HeadingName of _CHAR_Heading table; linked through character definition attribute CharHeading. See also Overview of uses of character headings.

Item subheadings

Yes

Yes

Imported into HeadingPhrase of _CHAR_Heading table, linked through character definition attribute ItemSubheading. See also Overview of uses of character headings.

Define characters

Yes

Yes

This directive is used by Intkey and usually found in the intkey.ini file. DeltaAccess imports this directive as named character groups (tables _CHAR_Heading and _CHAR_Heading_Link). They can be edited in the Character Headings form. Recursive definitions (a named group defined in terms of other group names instead of character numbers) are currently not supported. If an imported file contains a Character headings directive, but no Define characters directive, the character headings are also used as character groups and will be exported in a Define characters directive. See also Overview of uses of character headings.

Decimal places

Yes

Yes

Converted to the much more powerful character definition attribute FormatString during import, and converted back during export. Additional FormatString information is currently lost during export because it is not supported in DELTA.

Character names

Yes

Yes*

New Directive: DELTA does not define a unique character name; the character strings are optimized for natural language output and are, even including the comments, not necessarily unique. The export routine will export the attribute CharPhrase as the "normal" character string, and will import/export this string separately. *** Import: The behavior when importing non-unique character names (they must be unique) has not yet been tested; problems are expected!

Character notes

Yes

Yes

OK, see character definition attribute Notes.

Character images

No***

Yes

Not tested, import to be implemented. New feature: character images may be associated with a char. state. Ex-/import of this feature not yet implemented.

Dependent characters =
Inapplicable char

Yes

Yes

OK, see _DEP entity. Enforced in the editing forms, but not directly in queries/tables.

Applicable characters

Yes

N/A

OK, converted to Inapplicable during import.

Key states

Yes

Yes

OK, but as text only. The string can be edited directly and is exported, e. g., for use by Intkey. No consistency checks are implemented in the editing forms and they are not yet used in the internal identification. See Mapping Character states.

Key character list

No

No

Will be used in a future version for the character state names of key states only. Thus, in contrast to Confor, it must be present in addition to the Character list directive.

Item descriptions
= Item list

Yes

Yes

The Item names are imported into ItemName attribute, the item descriptions into the _DESCR table.

Item abundances
/ Item weights

Yes

Yes

Abundances are preferred, Weights are optionally converted during import.

Item notes

Yes

Yes*

New Directive: DELTA-comments after item name are not always author names. This field is used for other item comments. Comments, which differentiate multiple entries of the same taxonomic name, should ultimately be moved to the new literature reference and specimen unit fields. (Note: I am actually not sure how necessary or useful this is; I intend it mainly for internal comments.)

Taxon images

No***

N/A

Images will be fully supported in a later version of DeltaAccess, imported into the resource (_RSC) structure.

Character for
taxon images

*?*

*?*

Images will be fully supported in a later version of DeltaAccess, imported into the resource (_RSC) structure.

Comment

Yes

Yes

Stored in the ProjectComments property, see editing project properties.

Heading

Yes

Yes

Stored in the ProjectTitle property, see editing project properties. The closing literal separator ("/") required by Pankey is not required by DeltaAccess.

Project Authors
Project Editors
Project Copyright
Project Description
Project Version
Project Icon
Project DefaultPath
Project DefaultURL

Yes

Yes*

New Directive: Standard DELTA does not define general descriptor project information. These directives have been added to import and export the following project properties: ProjectAuthors, ProjectEditors, ProjectCopyright, ProjectDescription, ProjectVersion, ProjectIcon, ResourceDefaultPath, and ResourceDefaultURL.

Include characters
Exclude characters

Yes

Yes

OK. DeltaAccess equivalent: Character subsets.

Include items
Exclude items

Yes

Yes

OK. DeltaAccess equivalent: Item subsets.

Emphasize characters

No

No

Formatting command for natural language output, implementation planned in future version.

Emphasize features

No

No

Formatting command for natural language output, implementation planned in future version.

Formatting marks

No

No

Formatting command, very limited support by Dallwitz's Confor program itself.

Typesetting marks
Omit typesetting marks

N/A

N/A

Formatting command for natural language output, specific to the unique typesetting marks used by Dallwitz's Confor program.

Number of characters

N/A

Yes

OK, superfluous information, ignored during import.

Maximum number
of states

N/A

Yes

OK, superfluous information, ignored during import.

Maximum number
of items

N/A

Yes

OK, superfluous information, ignored during import.

Data compression

No

No

(See note.)

State codes
Replace state codes

No

No

(Use of letters instead of numbers for character states; use is not recommended for DELTA programs, see note.)

Input file =
Include file

Yes

N/A

During import, up to 3 files can be specified directly. Any file referenced in these files using the "Input file" directive is added to this list. All files are treated as a single stream of data; it does not matter to DeltaAccess which information is stored where. Only a few restrictions on the sequence of directives exist (far fewer than in the definition of the DELTA format).

Input delta file

N/A

N/A

(Like Input file, but searches the Confor directory; not applicable to DeltaAccess.)

Previous input file

N/A

N/A

(Return to previous file from which the include file was called; this directive is the default at end of an include file.)

End

N/A

N/A

(Stop processing; default at end of file. Ignored during import.)

Accept duplicate values

N/A

N/A

(Editing tool of Confor, not applicable to DeltaAccess.)

Data buffer size

N/A

N/A

Internal command for Dallwitz's Confor program.

Special storage

N/A

Yes

Internal command for Dallwitz's Confor program.

Error file

N/A

N/A

Error checking command of Dallwitz's Confor program, not applicable to DeltaAccess.

Show

N/A

Yes

(Text printed in Confor's error and listing files. Ignored during import, but supplied in export file.)

Data listing
No data listing

N/A

N/A

Error checking command of Dallwitz's Confor program, not applicable to DeltaAccess.

List characters
List heading
List items
Listing file

N/A

N/A

Error checking commands of Dallwitz's Confor program, not applicable to DeltaAccess.

Stop after item

N/A

N/A

(Not applicable; meaningful only for programs using DELTA files as their primary data repository, instead of importing them like DeltaAccess. To ignore certain items use the Exclude Items directives instead, see Item subsets.)

Print all characters

N/A

N/A

(Overrides subset definition with Include/Exclude characters.)

Print summary

N/A

N/A

(A form of data analysis performed by Dallwitz's Confor program. Compare the section on Data analysis.)

Print character list
Print comment
Print file
Print heading
Print item descriptions
Print item names
Print uncoded characters

N/A

N/A

(Compare the export functions of DeltaAccess.)

Print width

No

No

(Export fixed to 75, could be supported if necessary.)

Insert implicit values

N/A

N/A

Action command, not evaluated during import, but available in DeltaAccess as an action query.

Vocabulary

No***

N/A

The current versions of DeltaAccess are English-only. Preparations for multilingual data sets are already being made. The Vocabulary directive will be implemented at that point.
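The effect of the Omit final comma directive described in the table above can be sketched in a few lines of Python. This is only an illustration of the rule as stated there, not DeltaAccess code; the function name join_states and its parameter omit_final_comma are hypothetical.

```python
def join_states(states, omit_final_comma=False):
    """Join state phrases as described for natural language output."""
    states = list(states)
    if len(states) < 3:
        # One or two states: no list comma is involved at all.
        return " or ".join(states)
    # With three or more states, the directive drops the comma before "or".
    separator = " or " if omit_final_comma else ", or "
    return ", ".join(states[:-1]) + separator + states[-1]


phrases = ["state 1", "state 2", "state 3"]
print(join_states(phrases))                         # state 1, state 2, or state 3
print(join_states(phrases, omit_final_comma=True))  # state 1, state 2 or state 3
```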


 Recognized DELTA directives: Legend and Notes

Legend:

N/A = not applicable to the design of DeltaAccess (superfluous information, action commands for Confor which are specific to its command line interface, etc.).

Yes* = (in the export column) new directives defined by DeltaAccess. They are exported with the setting "DELTA text file", but not with "Strict DELTA" (the latter provides compatibility with Confor, which crashes when encountering directives not defined by itself). See also the chapter Variants of the DELTA format.

No*** (boldface with 3 asterisks) = Not yet; implementation has already started
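The difference between the two export settings mentioned in the legend can be illustrated with a small filter. The sketch below assumes that each directive carries the export status shown in the table; the dictionary holds only a few rows copied from it, and the function name directives_to_export is hypothetical rather than part of DeltaAccess.

```python
# Export column entries copied from a few rows of the directive table.
EXPORT_STATUS = {
    "Character types": "Yes",
    "Character availabilities": "Yes*",
    "Character names": "Yes*",
    "Item notes": "Yes*",
    "Print width": "No",
    "Error file": "N/A",
}


def directives_to_export(export_status, strict_delta):
    """Return the directive names written for the chosen export setting."""
    # "Strict DELTA" leaves out the DeltaAccess extensions marked Yes*.
    allowed = {"Yes"} if strict_delta else {"Yes", "Yes*"}
    return [name for name, status in export_status.items() if status in allowed]


print(directives_to_export(EXPORT_STATUS, strict_delta=True))
print(directives_to_export(EXPORT_STATUS, strict_delta=False))
```

With strict_delta=True only Character types remains from this sample; with the "DELTA text file" setting the three Yes* directives are written as well.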

Notes: The documentation of DELTA does not distinguish between directives containing data or formatting information and commands which control the actions of the programs in Dallwitz's suite of DELTA programs (Confor, Delfor, Intkey). Such a distinction would be very desirable for a standardization of DELTA. The entry N/A in the table above indicates that a feature is not applicable to the way DeltaAccess handles the data. These are program-specific directives of Dallwitz's DELTA suite, which are not needed in DeltaAccess. The list of directives is not complete. I am still working through Dallwitz's DELTA documentation to see which directives are commands for Confor that are inapplicable to DeltaAccess and which directives contain generally valuable information. Please notify me about any directive you are missing.

The directive table is currently not well sorted (***); I will organize it into topical sections in the future. An alphabetical list of directives can be found in the Windows help version of the documentation by opening the index on DELTA directives.


 The following limitations will hopefully be removed in a future release:

(if enough people are interested in DeltaAccess to make it worthwhile ...)

This is clearly not entirely adequate. In general, "fine tuned" data sets, optimized for natural language descriptions will suffer. Pankey-like data sets generated with a structured approach in mind, or data sets suitable for Intkey (interactive identification) should fare better. I would also like to add that comments in the item description (i. e. per character/character state) are extremely difficult to translate. A comment that has to be in front of the character state in one language may have to be after it in another language. In general, if you are using a lot of comments/notes among the character state descriptions, you should consider revising your character definitions.

If you have any ideas how to improve the parsing, please tell me. I have been thinking about some heuristics, e. g., assuming that preceding comments in "()" really belong to the previous item, but that is difficult to verify and implement. The parsing and strict association of character states (CS) and Notes, on the other hand, is absolutely necessary to make DELTA data database-compatible.


 The following limitations are problematic, but I think they should not be removed to make DELTA more structured:

a) Instead of using the '&' operator you should consider adding additional combined character states to the character definition. If the color of an insect consists of yellow and black stripes, I would much prefer to be offered this as a choice during interactive identification than to be asked whether it is either black or yellow, even if the identification process will correctly identify the insect if I only say yellow.

This is exactly what the import procedure will do when it encounters the and-operator (&) together with the or-operator (/). It leaves a combined character state like 1&2 in the data table and adds this state to the list of character states. If you find yourself with very many such states, check whether you really wanted all states combined with "and" and used the or-operator only accidentally once or twice. If so, revise the import file and re-import it. If only and-operators are present, no combined states are necessary; instead the MultistateType attribute will be set accordingly.

b) You can use an item description Modifier with the Operator attribute set to a value > 0. In the default modifier list, modifiers for "and", "or", and "to" are predefined; you only have to add them to those characters where you want to use them. Note that the "to" (= "-") operator for consecutive values of type OM is implicit if MultistateType is set to TO/OR or TO/AND. For UM type characters, import will add the "to" modifier to the list of available modifiers if the "-" operator is used. For other characters, you have to explicitly enable these modifiers for the character first. If you use modifiers, the sequence of character states becomes relevant. Although most database routines will assume that character states are commutative, i. e. that the sequence is irrelevant, you can manipulate it, and natural language output will adhere to the defined sequence. This method gives you almost all the freedom you used to have in DELTA. The major exception is that '34,2/1-2' is not supported: you would have to add the character state '2' twice, and each character state can be added only once. This last problem is the only one about which I feel real concern. You can solve it by using method a), i. e. by creating a new combined character state, but this is not recognized during import.
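Points a) and b) can be made concrete with a small Python sketch. It is not the DeltaAccess import routine; the function names are hypothetical, comments in angle brackets are assumed to be absent, and '&' is assumed to bind more tightly than '/', so that '1&2/3' reads as "(1 and 2) or 3". The first function reports which combined states an import as described under a) would have to keep; the second finds the repeated state that makes a value like '2/1-2' unsupported under b).

```python
import re
from collections import Counter


def combined_states(value):
    """Return the '&'-joined alternatives that would be kept as combined states."""
    alternatives = value.split("/")
    if len(alternatives) == 1:
        # Only and-operators (or a single state): no combined state is needed;
        # according to the text, MultistateType is set instead.
        return []
    return [alt for alt in alternatives if "&" in alt]


def repeated_states(value):
    """Return states that occur more than once in a multistate value."""
    counts = Counter(re.split(r"[/&-]", value))
    return [state for state, n in counts.items() if n > 1]


# Split the attribute '34,2/1-2' into character number and value.
char_no, _, value = "34,2/1-2".partition(",")
print(combined_states("1&2/3"))   # ['1&2']
print(repeated_states(value))     # ['2']
```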

Having multiple means or modes implies that the distribution of the measurement variable is multimodal, which is very hard to analyze. Although modes and extreme values can be defined, means, ranges, standard deviation, etc. are undefined. There is no statistically meaningful interpretation of multiple modes for a single item (unless you are thinking in terms of performing an ANOVA...). Analytically, multimodal distributions are usually the result of not differentiating objects, which in reality should be treated as two distinct objects. For example, if organism size is multimodal, this could be the result of not distinguishing between male and female organisms, or organisms of different distinct age classes.

Probably the most frequent use of multiple values for statistics of numeric characters results from the way many people code a genus when they intend to construct a key for the generic level alone. A genus might consist of species with different, distinct distributions. It might in fact be advantageous to define a character for a genus as, e. g., '12-15 or 30-45'. This indicates that although the character is quite variable for the whole of the genus, no species falls into the range of 16-29. I suggest in these cases adding the genus as two separate items and using comments to annotate which subgroup of the genus the measurements in question actually belong to.
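A short calculation shows why pooling the two ranges from the genus example is misleading. The sketch below is only an illustration with hypothetical names: it pools '12-15 or 30-45' into a single range and checks where the midpoint of that range falls.

```python
def pooled_midpoint_in_gap(ranges):
    """Return the midpoint of the pooled range and whether it misses every subrange."""
    low = min(lo for lo, _ in ranges)
    high = max(hi for _, hi in ranges)
    midpoint = (low + high) / 2
    in_gap = not any(lo <= midpoint <= hi for lo, hi in ranges)
    return midpoint, in_gap


genus_ranges = [(12, 15), (30, 45)]
print(pooled_midpoint_in_gap(genus_ranges))   # (28.5, True)

# The suggestion from the text: one item per subgroup, each with a single range.
# The item names are placeholders.
split_items = {
    "Genus (subgroup 1)": (12, 15),
    "Genus (subgroup 2)": (30, 45),
}
```

The midpoint 28.5 lies in the interval that no species occupies, whereas each of the two split items has a range that can be summarized meaningfully.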

Joseph H. Kirkbride, Jr., supplied another example for filament length in flowers of heterostylous species, where he coded the long and the short filaments as multiple values of character type RN (e. g. 0.2-0.4/1.1-1.8). This is fully satisfactory if the aim is primarily to produce natural language output. In the long run I think it is undesirable to code data in this way, because it creates problems in the analysis of data. Even a simple calculation of average filament length would give a confusing result, and proper similarity or relationship analysis would be impossible because in fact different things would be compared. A preferable way to code heterostyly would be:

#1. Flowers heterostylous (type: UM, MultistateType exclusive)
state 1: Homostylous (Inapplicable: 3 and 4)
state 2: Heterostylous, bistylous (Inapplicable: 3)
state 3: Heterostylous, tristylous
#2. length of normal or short filaments (type RN)
#3. length of intermediate filaments (type RN)
#4. length of long filaments (type RN)
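The dependencies in this coding scheme can be written down as a lookup table. The following Python sketch is a hypothetical illustration of the scheme above, not the _DEP structure of DeltaAccess: for each state of character 1 it lists the filament characters that become inapplicable.

```python
# State of character 1 -> filament characters made inapplicable, as listed above.
INAPPLICABLE_BY_STATE = {
    1: {3, 4},   # homostylous
    2: {3},      # heterostylous, bistylous
    3: set(),    # heterostylous, tristylous
}
FILAMENT_CHARACTERS = (2, 3, 4)


def applicable_filament_characters(stylous_state):
    """Return the filament length characters that may be scored for an item."""
    blocked = INAPPLICABLE_BY_STATE[stylous_state]
    return [c for c in FILAMENT_CHARACTERS if c not in blocked]


for state in (1, 2, 3):
    print(state, applicable_filament_characters(state))
```

Each filament length is now a separate RN character with a single range, so averages and comparisons always refer to the same kind of filament.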

A final example is spores which can have only 1, 3, or 5 septa because septation always occurs symmetrically. Thus, if a spore can have 1 or 3 septa, 1-3 is not as appropriate as 1/3. Strictly speaking, no average can be calculated from such values, although this might seem justifiable. A calculation of variance is definitely statistically meaningless. The character is in fact not of type IN, but of type OM.
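The difference between the two notations can be checked mechanically. In the sketch below (hypothetical names, for illustration only) a range read as an integer character implies every whole number in between, including septa counts that cannot occur.

```python
POSSIBLE_SEPTA = (1, 3, 5)  # the only septa counts that occur, treated as ordered states


def values_implied_by_range(low, high):
    """Integer (IN) reading of 'low-high': every whole number in between."""
    return list(range(low, high + 1))


def impossible_values(low, high):
    """Values that the range notation implies but that cannot occur."""
    return [n for n in values_implied_by_range(low, high) if n not in POSSIBLE_SEPTA]


print(values_implied_by_range(1, 3))   # [1, 2, 3]
print(impossible_values(1, 3))         # [2]
```

Coding the character as ordered multistate (OM) with the states 1, 3, and 5 avoids the impossible value 2 altogether.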

Do argue if you disagree; I am open to criticism. The only point that I will not follow is that everything that is possible in a book should be possible in structured data storage. Both media will have advantages and disadvantages, even in the best of all worlds...


 Removing the following limitations is either not planned or has low priority:


In the following points DeltaAccess handles the data in a different way, while providing a similar functionality:

*** Critical discussion, what still does not work is: Character1<comment>,1. This would result in Character1,1<comment> in the export, while Character1<comment>,1<comment2> is preserved.

Further information on these topics can be found in the following chapter Internal information.


 General development information

Who should read this: People interested in a discussion about design decisions and the internal processes of DeltaAccess. People interested in adding functions to DeltaAccess should also read the following chapter Co-development issues. Please refer also to the chapter Strategy and future development of DeltaAccess. Reading this chapter is not necessary for the use of DeltaAccess or Identify.


 Co-development issues

Who should read this: The following information is only relevant for people considering co-development. Co-development means that you either help in the development of DeltaAccess, or that you develop a separate application which interacts with DeltaAccess. You could help in the testing, as well as in sending me enhancements to be incorporated into a future version of DeltaAccess (compare Conditions of use). Further general development information is given in the previous chapter. Reading this chapter is not necessary for the use of DeltaAccess or Identify.


 Reporting errors and other feedback

I appreciate any feedback, like error reports, comments on the general design, or suggestions for improvement. Please do not hesitate to contact me in such cases.

This project has become far more complex than I anticipated. Some originally functional features may have become dysfunctional during development, or may not work under certain circumstances. I cannot possibly test every function over and over again. This application is free software and programmed in my spare time. Any help you can give me in finding these things is most welcome.

If you find errors, please try to reproduce them and describe the exact steps one must perform to recreate the error. Perhaps send me the data set or an excerpt which causes the error. If you are concerned about the copyright of the data, you could fingerprint your data set. Please report the version number of DeltaAccess, which is displayed in the copyright notice at the bottom of the DeltaAccess opening dialog box.

Also, I realize that parts of this documentation must be reworked. I am grateful for reports of errors in the documentation, including unclear language or misunderstandings of concepts from other people's work. Please read the chapter problematic limitations of DeltaAccess if you want to comment on design issues.

Regarding suggestions: I am aware of several shortcomings of the current version. I keep a list of suggestions and constantly try to improve the application. Yet, because of limited time, I can not promise to implement even obvious improvements immediately.

Thank you in advance for any help you can give me!

Gregor Hagedorn


 Multi-language support

One of the design goals of DeltaAccess is to allow reports in multiple languages from the same database. There are several options to do this. One would be to introduce an explicit layer for all descriptive strings like character / character state names, CharPhrase, etc. This would be the cleanest model and would have the advantage that changes would automatically occur in all language descriptions. I did not use this model, because I assume that most people will work 99% of their time with their preferred language only. The multilanguage support implemented in such a way would very significantly reduce the execution speed of DeltaAccess when executed on a currently common PC. Therefore, the multilanguage support is implemented using a link between the item data of an existing project and the character definition of a new project. The latter is first copied from the original data set and can then be translated.

It is already possible to do this in the current version, although you would have to create the link by hand. The combination of character data with an item link is disabled in the Management section of the DeltaAccess dialog box because the lack of relational integrity would be confusing or dangerous. You can currently present multilingual views, but you are fully responsible for synchronizing any changes yourself. This situation is entirely equivalent to keeping several chars files (DELTA text files) for multiple languages for use with Confor or Intkey.
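The linking approach can be pictured with a minimal Python sketch. It is not the DeltaAccess data model: the dictionaries, the sample character, and its German translation are invented for illustration. The item data is stored once and refers to characters and states by number only, while each language has its own copy of the character definition.

```python
import copy

# Item data is stored once and refers to characters and states by number only.
item_data = {"Item A": {1: 2}}   # item -> {character number: state number}

# Character definition of the original project.
chars_en = {1: {"name": "leaf shape", "states": {1: "round", 2: "oval"}}}

# Character definition of the new project: copied first, then translated.
chars_de = copy.deepcopy(chars_en)
chars_de[1]["name"] = "Blattform"
chars_de[1]["states"] = {1: "rund", 2: "oval"}


def describe(item, char_defs):
    """Render one item with whichever character definition is linked to it."""
    return "; ".join(
        f"{char_defs[c]['name']}: {char_defs[c]['states'][s]}"
        for c, s in item_data[item].items()
    )


print(describe("Item A", chars_en))   # leaf shape: oval
print(describe("Item A", chars_de))   # Blattform: oval
```

Nothing in this arrangement keeps the two definitions synchronized: adding a state to chars_en leaves chars_de unchanged, which is the lack of relational integrity mentioned above.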

In the future I plan to define special secondary language character definitions, which remain fully linked with the primary character definition in your principal language. A special character definition editing form would allow you to edit all linked character definitions in a single view.

The item description notes are currently not part of the translation scheme. I assume that nobody would be willing to constantly translate them into multiple languages. You can of course do it in a one-way process like you would have done

Entirely independent of these information model considerations is the fact that the DeltaAccess application is also largely able to adjust the user interface to a different language, but this functionality is not distributed yet. It will be opened up in a later version. Basically it will require a recompilation of the application with a new resource set. Messages, status area messages/tool tips, the descriptions and text in most pick lists, and most captions on the forms can be fully translated. Some parts are very difficult or impossible to translate: Names of objects like tables, queries, forms, and fields in tables. Because of this, I purposely kept these names non-descriptive, abbreviation-like (e. g. "X_DESCR" instead of "item description data", which Access would allow).

The Vocabulary directive will be supported with the first release of a multilingual version of DeltaAccess.
