Information model: Detailed attribute information

The following is a list of attributes, sorted by entity. You can either read it sequentially (in the Windows help file you can use the Browse buttons) or jump directly to a specific attribute. Use the overview list of entities and attributes. In the Windows help file you can also use the Index button below the menu.

Character definition and related entities:
   Attributes of the character definition (..._CHAR)
   Attributes of the character state definition (..._CS)
   Attributes of the modifier definition (..._MOD)
   Attributes of the dependent character definition (..._DEP)
   Attributes of the character heading definition (..._CHAR_Heading)
Item definition and description:
   Attributes of the item definition (..._ITEM)
   Attributes of the item description (..._DESCR)
   Attributes of the resource list (..._RSC)


Attributes of the character definition (..._CHAR)

CID (character definition attribute):

The character identifier is a positive integer number which forms the primary key of the _CHAR entity. Each character must have a unique CID, by which the character can be referred to from other entities. For example, in the Character states entity (_CS), a CID reference determines to which character the current state belongs.

The CID determines the sequence in which the character list is displayed, unless an explicit alphabetical or other sorting is requested. To change the sequence, the number can either be edited manually, or the special Reorganize definition form can be used.

If you enter a new character in the CharEdit form, the CID will automatically be preset to the highest CID + 1. You can change the CID number at any time to any positive integer value, which is not yet used by another character's CID.
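The preset and renumbering rules can be sketched as follows. This is only an illustration, not DeltaAccess code: the function names `next_cid` and `can_renumber` are hypothetical, and starting at 1 for an empty character list is an assumption.

```python
def next_cid(existing_cids):
    """Preset for a new character: highest CID + 1.

    Assumption: an empty character list starts at 1.
    """
    return max(existing_cids, default=0) + 1


def can_renumber(existing_cids, new_cid):
    """A CID may be any positive integer not yet used by another character."""
    return isinstance(new_cid, int) and new_cid > 0 and new_cid not in existing_cids
```

For example, with the CIDs 1, 2, and 7 in use, the preset for a new character would be 8, and renumbering a character to 5 would be allowed while renumbering it to 7 would not.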

See also: attributes of the CHAR entity and CharEdit form.

Character Name and Character Phrase (character definition attributes):

The definition of the DELTA format goes back to a time when it was prudent to be as parsimonious with computer resources as possible. One design decision taken at that time was to mix the function of a phrase that concisely identifies a character with the function of a phrase that makes sense in natural language output.

Although this design works well in a large number of cases, it often generates oblique wording when a character set is used for interactive identification, and displays the strange DELTA comment brackets to the user. Many DELTA users therefore prepare a second version of the character list for compiling an interactive key.

DeltaAccess keeps the functions separate, and requires the CharName attribute to contain a unique value. It may not be missing, should be short and concise, and phrased for use in interactive editing and identification tools. The first letter may be capitalized or not, but capitalizing it usually improves the readability of the dialog boxes.

CharName is limited to 255 characters. You can use any characters you need, but if you want to export your data to a DELTA coded text file, the name may not contain a '# ' or '*'. You can use / or > as long as they are not followed by a space, and you can use < as long as it does not follow a space or come at the beginning.
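The export restrictions above could be checked with a small helper before exporting. The following is a sketch only: the function name `delta_export_problems` is hypothetical, and it reads '# ' literally as a hash sign followed by a space, which is an assumption (a stricter check might reject every '#').

```python
def delta_export_problems(name):
    """Return a list of reasons why `name` might not survive DELTA export."""
    problems = []
    if len(name) > 255:
        problems.append("longer than 255 characters")
    if "# " in name:  # assumption: '#' followed by a space, read literally
        problems.append("contains '# '")
    if "*" in name:
        problems.append("contains '*'")
    for i, ch in enumerate(name):
        nxt = name[i + 1] if i + 1 < len(name) else ""
        prev = name[i - 1] if i > 0 else ""
        if ch in "/>" and nxt == " ":
            problems.append(f"'{ch}' followed by a space")
        if ch == "<" and (i == 0 or prev == " "):
            problems.append("'<' at the beginning or after a space")
    return problems
```

An empty list means that none of the restrictions listed above is violated. The same rules are stated for CharStateName, so the helper could be reused there.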

The CharPhrase attribute may contain any text that leads to the desired natural language output in DeltaAccess or after export in some other DELTA program like Confor or Pankey. It can be up to 255 characters long, or missing. Because natural language output uses a capitalization mechanism, the CharPhrase should start with lowercase letters, unless the first word must always be capitalized. Comments can be entered in the "<>"-brackets here, and the multiple '/' used to separate phrases (defined in DELTA version 4) can be used. CharPhrase is thus an attempt to bridge the gap between the free-text formatting definitions of DELTA and a database. Note that comments in a CharPhrase are not used for natural language output by the programs mentioned, but they may be convenient if data are compiled for Intkey. Since DeltaAccess assumes for its own purposes that if CharPhrase is not used at all, CharName shall be used for natural language output, you may have to enter <> to output no information at all.

During import the comments after the DELTA character string are not moved to the Notes attribute! See What happens to the character name during import? for more information. Compare also the Units attribute.

To remain as compatible with DELTA as possible, the CharPhrase attribute will be preferred during export over the character name CharName. The character names are exported in a second directive to conserve them.

See also: attributes of the CHAR entity and CharEdit form.

 

Character Notes (= comments, character definition attribute)

The character Notes are optional and can be as long as you like. You should not put the DELTA brackets (< and >) around your text. Usually, the notes contain development information and are not exported to natural language descriptions. They are imported from and exported to the Character notes directive, i.e. they are not added in DELTA brackets (< and >) after the character name.

See also: attributes of the CHAR entity and CharEdit form.

 

Measurement Unit (character definition attribute)

For numerical characters, an optional measurement unit like "cm", "mm", or "°C". It is allowable to use non-standard units like "times the width".

In DELTA format, a Unit is defined as text after the character name string and before the character states or the next character. It is thus delimited by the string literal separator (usually "/"). Example: "#69. ascospores/ µm long/". In DeltaAccess, you would enter "µm long" in the Units attribute.

DeltaAccess accepts Units for numerical characters only. Units of categorical characters are appended to the character state names during import, see What happened to my units during import?

See also: attributes of the CHAR entity and CharEdit form.

 

DELTA distinguishes five character types:

Categorical (or qualitative) character types; intermediate states are logically impossible or can not be measured:

UM (Unordered Multistate): Nominal scale. A sequence of states is not meaningful. Example: "red", "green", "blue".

OM (Ordered Multistate): Ordinal scale. A sequence of states (ascending or descending order) is meaningful. Examples: "first, second, third" or "absent, rare, frequent". The distance between these states is either logically not defined or not measurable by the observer. No statistics like average, standard deviation, etc. can be calculated from such data.

The "M" stands for multistate. Normally you can add more than one state, but you may also define the MultistateType attribute as exclusive, in which case only a single character state for such characters is allowed in each item.

Using an appropriate categorical character type and multistate type greatly improves the possibility to properly analyze your data and improves their usefulness in the production of printed keys and interactive identification. In binary characters (two states, e. g. present/absent), there is no difference between nominal and ordinal.

See also Overview of categorical character types.

Numerical (or quantitative) character types:

IN (Integer Numeric): Typically used to enter the number (or count) of something. Values must not be fractional, i.e. they must have no decimal point. Since the distance between values is constant (as opposed to the ordinal OM characters), the computation of an average is meaningful.

Note: *** The current version of DeltaAccess does not support IN characters in the same way as other DELTA programs: only min/range/mean/max type values are supported, but not state-like lists of integer values combined with OR.

RN (Real Numeric): For measurements which may have a decimal point (compare fractional/exponential values), such as "length of spore". Standard DELTA allows you to enter a single or mean value, a range of values ("1.2-3.4"), or to add minimum and maximum in brackets ("(1-) 1.2-2.3-3.4 (-5)"). The meaning of the range is not defined in DELTA. DeltaAccess allows more statistical attributes, including sample size, standard deviation or standard error, and it lets you specify the meaning of the range used (e. g. 95% confidence interval). Compare Standard and Extended Numerical attributes.

Text character types:

TE: standard DELTA definition, any kind of textual information, including annotations and remarks. Currently used frequently for literature references as well, but see below. Text characters can not be used in the construction of printed keys. DeltaAccess does allow them to be included in interactive identification, but using them is comparatively slow.

See also: attributes of the CHAR entity and CharEdit form.

 

Standard and extended numerical attributes:

The DELTA standard defines the following statistical attributes for numerical values:

Mean     Mean (= average)
-Low     Lower value of unspecified range (could be µ-s.d., but not known)
+High    Upper value of unspecified range (could be µ+s.d., but not known)
Min      Minimum value
Max      Maximum value

In a DELTA coded import file, these statistical attributes are entered as a single string in the following format: "(Min-) LowerRange-Mean-UpperRange (-Max)", where Min, Max, and either Mean or both range limits, may be missing.
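To illustrate the string format, the following sketch splits such a string into the five standard attributes. It is not the DeltaAccess import routine: the function name `parse_delta_numeric` is hypothetical, and the sketch assumes non-negative numbers without exponents and no embedded comments.

```python
import re

_NUM = r"\d+(?:\.\d+)?"  # assumption: plain non-negative decimal numbers only
_PATTERN = re.compile(
    rf"^\s*(?:\(\s*(?P<min>{_NUM})\s*-\s*\)\s*)?"   # optional "(Min-)"
    rf"(?P<mid>{_NUM}(?:\s*-\s*{_NUM}){{0,2}})"     # one, two, or three values
    rf"\s*(?:\(\s*-\s*(?P<max>{_NUM})\s*\))?\s*$"   # optional "(-Max)"
)


def parse_delta_numeric(text):
    """Split "(Min-) Low-Mean-High (-Max)" into a dict of attribute values."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognized numeric string: {text!r}")
    result = {"Min": None, "-Low": None, "Mean": None, "+High": None, "Max": None}
    if match.group("min"):
        result["Min"] = float(match.group("min"))
    if match.group("max"):
        result["Max"] = float(match.group("max"))
    parts = [float(p) for p in re.split(r"\s*-\s*", match.group("mid"))]
    if len(parts) == 1:      # a single value is treated as the mean
        result["Mean"] = parts[0]
    elif len(parts) == 2:    # both range limits, mean missing
        result["-Low"], result["+High"] = parts
    else:                    # lower range, mean, upper range
        result["-Low"], result["Mean"], result["+High"] = parts
    return result
```

With this sketch, "(1-) 1.2-2.3-3.4 (-5)" yields all five attributes, "1.2-3.4" yields only the two range limits, and a single number is stored as the mean.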

DeltaAccess defines numerical attributes as character states of numerical characters. You define them very similarly to the states of categorical characters, see the CharEdit form. The attributes listed above are called Standard or Minimal set of numerical attributes in DeltaAccess.

DeltaAccess offers a much wider spectrum of additional statistical attributes. Any desired statistical attribute can be added to the list of numerical character states during the definition of a numerical character. The following attributes are added if you select Extended set of numerical attributes during import of a DELTA file or from the add character states pick list in the header of the character state subform:

Median    Median
Mode      Mode
SD        Standard deviation of sample (df = n-1)
SE        Standard error of mean (= SD/Squareroot(N))
Val       Single value (i.e. sample size N=1)
N         Sample size, number of measurements used to calculate statistics
+SD       Range: mean plus 1 standard deviation
-SD       Range: mean minus 1 standard deviation
-CI 95    Lower limit of 95% confidence interval for mean
+CI 95    Upper limit of 95% confidence interval for mean
-CI 90    Lower limit of 90% confidence interval for mean
+CI 90    Upper limit of 90% confidence interval for mean
+Q90      Upper limit of 90% quantile (= between 5 and 95 %)
-Q90      Lower limit of 90% quantile (= between 5 and 95 %)
+Q80      Upper limit of 80% quantile (= between 10 and 90 %)
-Q80      Lower limit of 80% quantile (= between 10 and 90 %)
+Q50      Upper limit of 50% quantile (= between 25 and 75 %)
-Q50      Lower limit of 50% quantile (= between 25 and 75 %)
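
As an illustration of how some of these attributes relate to raw measurements, the following sketch derives a subset of them from a list of values. The function name `extended_attributes` is hypothetical; confidence intervals and quantiles are left out because their calculation method is not specified here.

```python
import math
import statistics


def extended_attributes(measurements):
    """Derive a subset of the extended attributes from raw measurements."""
    n = len(measurements)
    if n == 0:
        raise ValueError("at least one measurement is required")
    if n == 1:
        # A single measurement is stored as Val with sample size N = 1.
        return {"Val": measurements[0], "N": 1}
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)  # sample standard deviation, df = n-1
    return {
        "N": n,
        "Mean": mean,
        "Median": statistics.median(measurements),
        "SD": sd,
        "SE": sd / math.sqrt(n),  # standard error of the mean
        "-SD": mean - sd,
        "+SD": mean + sd,
        "Min": min(measurements),
        "Max": max(measurements),
    }
```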


Upon export, all ranges, quantiles, or confidence intervals will become unspecified ranges, and the distinction between mean, median, etc., will be lost. The sample size N and other extra attributes can not be exported because they are not defined in DELTA.

Also, DELTA does not specify whether a single numerical value is a single measurement, a mean, median, or mode. Import assumes that such values are means, but you can later change the attribute for each character separately within DeltaAccess.

 

MultiStateType (character definition attribute):

This attribute is available for categorical (UM/OM) characters only. It defines whether a categorical character may have multiple states in a single item. Setting MultistateType to '0, Exclusive' prohibits this.

Multiple states may occur if an item is polymorphic (e.g. different states apply to young and old organisms), variable (phenotypically or genotypically), or if it is a class (e.g. a genus with different species). In these cases, MultistateType defines how the states shall be combined in natural language output. By default, DeltaAccess will set the MultistateType such that the 'OR' operator is used for UM-type characters and the 'TO' operator for OM-type characters. For some characters, it will be more logical to use an AND operator.

The multistate type Exclusive is the equivalent to the DELTA EOM/EUM character types (which are converted accordingly during import). Currently the Exclusive restriction is not enforced during data entry. You can check for violations of this condition using the CS = exclusive-condition (Multistate=0) violated-analysis query. Presence/absence or yes/no data should preferably be coded as UM with Multistate = Exclusive.

See MultiStateType values for a table of possible values of this attribute. This attribute is proposed as a new DELTA directive. See also: character types, attributes of the CHAR entity and CharEdit form.

 

MultiStateType values (character definition attribute):

MultiStateType    Value    Description
Excl.             0        states are exclusive = only a single state may be present in each item
Or                1        combine states with OR operator (DELTA: '/')
And               2        combine states with AND operator (DELTA: '&')
To                3        states intergrade, e.g. '1 to 2 or 5 to 6' (DELTA: '1-2/5-6')
To/And            4        states intergrade, e.g. '1 to 2 and 5 to 6' (DELTA: '1-2&5-6')
With              5        states are a combination, e.g. 'green with some yellow'
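
The values above, and a check for violations of the Exclusive restriction (which is not enforced during data entry), can be sketched as follows. This is not the DeltaAccess analysis query itself: the names `MultiStateType` (as a Python enum) and `exclusive_violations`, and the layout of the `scores` dictionary, are assumptions made for this illustration.

```python
from enum import IntEnum


class MultiStateType(IntEnum):
    """Numeric values as listed in the table of MultiStateType values."""
    EXCLUSIVE = 0
    OR = 1
    AND = 2
    TO = 3
    TO_AND = 4
    WITH = 5


def exclusive_violations(multistate_types, scores):
    """Find (item, CID) pairs that score several states of an exclusive character.

    multistate_types: {cid: MultiStateType}
    scores: {(item_id, cid): set of scored state numbers}  (hypothetical layout)
    """
    return sorted(
        key
        for key, states in scores.items()
        if multistate_types.get(key[1]) == MultiStateType.EXCLUSIVE
        and len(states) > 1
    )
```

For instance, if character 1 is Exclusive and item 11 scores its states 1 and 2, the pair (11, 1) would be reported.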

 

Why the preferred operator for a character may be consistently either OR or AND: Probably most languages make a difference between the assertion of attributes and the listing of objects. Although the same formal logical operator may be involved, the natural language equivalent will be different depending on what you are referring to. Most people will say: "The flower color is red or orange". The alternative "red and orange" is meaningful preferably if both colors are present in different parts of the flower (which should be specified), or if the colors form a pattern. On the other hand, most people will say: "The plant occurs in France, Spain, and Italy", although, using predicate logic, this really means that the plant occurs in "France or Spain or Italy". Note that when querying a database, you usually have to use this second version, although it might not sound intuitive in natural language. DeltaAccess defines the MultiStateType character definition attribute to allow this distinction to be recognized in natural language output.

 

Mandatory (character definition attribute): The character must be used (= "entered" or "scored") in each item. If the character can not be specified for some items, the special character state 'U' for unknown must be inserted to explicitly state this fact. If you set this attribute, the item description editor can help you avoid unintentionally skipping characters that should be readily available for all or almost all items.

DeltaAccess can not enforce the Mandatory attribute on the database engine level. Thus it will be ignored if you edit your data directly in tables or queries. Editing forms issue a warning when mandatory characters are missing.
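
Because the database engine does not enforce this attribute, data edited directly in tables may need a separate check. A minimal sketch of such a check follows; the function name `missing_mandatory` and the data layout are hypothetical, and a character scored with the special state 'U' counts as scored.

```python
def missing_mandatory(mandatory_cids, scored_cids_by_item):
    """Return {item: sorted list of mandatory CIDs that were not scored}.

    scored_cids_by_item: {item_id: set of CIDs scored in that item}
    (a CID scored as 'U' is expected to be included in the set).
    """
    mandatory = set(mandatory_cids)
    report = {}
    for item, scored in scored_cids_by_item.items():
        missing = mandatory - set(scored)
        if missing:
            report[item] = sorted(missing)
    return report
```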

* Note: The current version of the ItemEdit form supports this check only in the rarely used All character view. In By-character view, only the "!" symbol in the character list indicates that a character is mandatory, but entering it is not enforced.

See also: attributes of the CHAR entity and CharEdit form.

 

Introduction to Reliability, Availability, and character weights:

The usefulness of a character for interactive identification purposes may be influenced by several factors:

In classical DELTA, all three factors are combined into a Character Reliability or Character Weight value. DeltaAccess uses character Reliability for the first two factors. It introduces a new character definition attribute, character Availability for the last factor, because it is relevant only during identification, and not in respect to the quality of the recorded data.

The impact of character Reliability on identification is modified by the following attributes: character Availability, modifier Reliability of the item description modifier, and item Abundance. See also: Calculation of combined reliabilities.

 

Reliability (character definition attribute):

The character Reliability influences the choice of characters for printed keys or during interactive identification. See Introduction to Reliability, Availability, and character weights for a discussion of the different factors influencing the usefulness of a character for identification purposes. Character Reliability is further used by Confor, as a weighting value when exporting data to Nexus, PAUP, etc.

Reliability is a positive integer between 1 and 10. The default value is 5 (= average reliability). Fractional values (1/32 to 32) from the Character Weights directive preferred by older versions of Confor, are normally converted during import. Optionally they may be imported unconverted, but this may lead to problems when using the Identify module of DeltaAccess. The attribute is required, but the default value will automatically be set.

See also: attributes of the CHAR entity and CharEdit form.

 

Availability (character definition attribute):

Availability is a new character attribute introduced by DeltaAccess. It specifies the ease with which the character can normally be used for identification purposes. See Introduction to Reliability, Availability, and character weights for a discussion of the different factors influencing the usefulness of a character for identification purposes and the reason for introducing the new attribute Availability.

Availability is a positive integer between 1 and 10. The default value is 5 (= average availability). During Import, the values of Reliability are copied to the Availability attribute as well. Availability is exported only as a special DeltaAccess directive.

See also: attributes of the CHAR entity and CharEdit form.

 

Fuzziness, FuzzinessIsPercent (character definition attributes):

Fuzziness is defined for numerical and OM type characters. It defines the error allowed during information retrieval. It is evaluated during Compilation of item descriptions for interactive identification, see interactive identification.

The DELTA directives Absolute Error and Percent Error are imported to and exported from these attributes. The values of both directives are stored in Fuzziness, while FuzzinessIsPercent distinguishes between the two directives. Standard DELTA (as implemented, e. g., in Confor and Intkey) restricts the use to numerical characters; the extension to OM type characters implemented here is specific to DeltaAccess.

See also: Definition of Fuzziness and FuzzinessIsPercent for numerical characters, Definition of Fuzziness for ordinal characters, and the attributes of the CHAR entity and CharEdit form.

 

The Fuzziness and FuzzinessIsPercent attributes and numerical (RN/IN) type characters:

If no range is available in an item for a numerical character, but a mean (or single measurement) is available, ranges are created based on the mean plus/minus Fuzziness. FuzzinessIsPercent determines whether the value in Fuzziness is to be interpreted as an absolute or a relative (percent) value. Items which already contain a range remain unchanged.

Example: In a specific item a RN character contains the value 200, and Fuzziness is defined as 20. During the compilation of the item description for identification, the mean (or single value) 200 is not written into the compiled table (ID_DESCR), but the range 180-220 (for FuzzinessIsPercent = False) or 160-240 (for FuzzinessIsPercent = True) instead.

Note that the algorithm used by DeltaAccess differs slightly from the one Intkey uses. The Confor/Intkey documentation says: "A percentage error r applied to a value v produces a range 100v/(100+r) to v(100+r)/100". In contrast, DeltaAccess uses the standard mathematical definition of percent and calculates v*(100-r)/100 and v*(100+r)/100.
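
The two calculations can be compared with a short sketch. The function names `deltaaccess_range` and `intkey_percent_range` are hypothetical; the formulas are the ones given in the text, with the DeltaAccess lower limit taken as v*(100-r)/100.

```python
def deltaaccess_range(value, fuzziness, is_percent):
    """Range created from a mean or single value, as described for DeltaAccess."""
    delta = value * fuzziness / 100 if is_percent else fuzziness
    return (value - delta, value + delta)


def intkey_percent_range(value, r):
    """Percent range as quoted from the Confor/Intkey documentation."""
    return (100 * value / (100 + r), value * (100 + r) / 100)
```

For the value 200 and Fuzziness 20, `deltaaccess_range` gives 180 to 220 (absolute) or 160 to 240 (percent), whereas the Intkey formula gives a lower limit of about 166.7 and the same upper limit of 240.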

The reliability in the compiled item description of the added ranges is identical with the reliability of the mean from which they were derived.

Return to Fuzziness, FuzzinessIsPercent, or Compiling item descriptions for interactive identification.

 

The Fuzziness attribute and ordinal (OM) type characters:

DeltaAccess allows the use of the Fuzziness attribute for OM type characters as well as for numerical characters. The value of Fuzziness must be an integer in this case.

Example: Fuzziness = 1, and state 3 has been scored in a specific item. During the compilation of the item description for identification, the states 2 and 4 are added to the compiled table (ID_DESCR). If a neighboring state does not exist (e. g. 3 might be the highest available state), no error occurs; that state is simply not added. The current algorithm limits the highest evaluated Fuzziness value to 4.

The reliability in the compiled item description of the fuzzed states is reduced by 2, i.e. if state 3 had reliability 7, the fuzzed states have reliability 5.
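
The behavior described for ordinal characters can be sketched as follows. This is not the DeltaAccess compilation code: the function name `fuzz_ordinal` is hypothetical, states are assumed to be numbered 1 to `num_states`, and applying the same reduction of 2 to every fuzzed state when Fuzziness is greater than 1 is an assumption (the text only gives an example for Fuzziness = 1). The source does not say what happens when the reduced reliability would fall below 1, so the sketch does not clamp it.

```python
MAX_FUZZINESS = 4        # highest Fuzziness value evaluated by the algorithm
RELIABILITY_PENALTY = 2  # reduction applied to fuzzed states


def fuzz_ordinal(state, reliability, fuzziness, num_states):
    """Return {state: reliability} for the scored state and its fuzzed neighbors."""
    compiled = {state: reliability}
    width = min(fuzziness, MAX_FUZZINESS)
    for offset in range(1, width + 1):
        for neighbor in (state - offset, state + offset):
            # Neighbors outside the defined states are skipped without error.
            if 1 <= neighbor <= num_states:
                compiled[neighbor] = reliability - RELIABILITY_PENALTY
    return compiled
```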

Return to Fuzziness, FuzzinessIsPercent, or Compiling item descriptions for interactive identification.

 

Weights conversion during import: Two different types of weight values are used in the DELTA definition: Values between 0.03125 and 32 (directives: Character weights and Item weights) as used in older versions of Confor, and positive integer values 1, 2, ..., 9, 10 (directives Character reliabilities and Item abundances) as used by Pankey. The latter definition is preferred by DeltaAccess and also in the new versions of Confor.

During import, you have the option to convert values from the Character weights and Item weights directives into Reliability/Abundance values, using the equation: Ln2(Weight)+5. DeltaAccess's export functions always use the reliabilities/abundances directives.
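
As a worked illustration of the equation, reading Ln2 as the logarithm to base 2: the function name `weight_to_reliability` is hypothetical, and whether and how DeltaAccess rounds or limits the result is not stated here, so the sketch returns the raw value.

```python
import math


def weight_to_reliability(weight):
    """Convert a Character/Item weight into a Reliability/Abundance value."""
    return math.log2(weight) + 5
```

A weight of 1 maps to 5 and a weight of 32 maps to 10. The lowest weight, 0.03125, maps to 0, which lies outside the range 1 to 10; how such values are handled during import is not described in this section.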

 

FormatString (character definition attribute):

This attribute in the character definitions defines a format for the output. The DELTA directive Decimal Places is imported into this attribute, but FormatString is much more powerful. You can either use a format from the pick list, or use any standard Windows format string (as used in most Windows programs, including Microsoft Access). You must use the international/English formatting codes. If you have a localized (non-English) version of Microsoft Access, you should check the help file for the definitions of format strings in Visual Basic, since national format strings can differ. Note: Do not use the standard setting "Currency". This will change depending on the computer on which the data are viewed. For example, 100 SFr entered in Switzerland would be reported as 100 $ by someone using the same data set in the USA, without any regard to real conversion rates. Instead, create your own currency format to preserve the currency information.

See also: attributes of the CHAR entity and CharEdit form.

 

OmitValues (character definition attribute):

This attribute applies to numerical characters only and is used in natural language descriptions. *** Currently DeltaAccess imports/exports this attribute but does not use it in the internal natural language descriptions yet! ***

If you leave OmitValues empty (the default) no special formatting is applied to the numerical character. If the value '-' is set (for "omit lower"), the lower range and the minimum are omitted from the natural language description. Conversely, the value '+' defines that the upper range and the maximum are omitted.

This attribute is used together with characters which define a maximum or minimum limit for some measurement variable. Example: "Conidia at most x µm long". In such characters, setting the lower range is useful for later analysis, but it should not be used in the natural language output.
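
The intended effect on natural language output can be sketched as a simple filter over the standard numerical attributes. The function name `visible_attributes` is hypothetical, and since DeltaAccess does not yet use this attribute in its own natural language descriptions, the sketch only illustrates the rule as described.

```python
def visible_attributes(values, omit_values=""):
    """Drop the attributes that OmitValues hides from natural language output.

    values: {attribute name: value}, using the names Min, -Low, Mean, +High, Max
    omit_values: '' (default), '-' (omit lower), or '+' (omit upper)
    """
    hidden = {"-": {"Min", "-Low"}, "+": {"+High", "Max"}}.get(omit_values, set())
    return {name: v for name, v in values.items() if name not in hidden}
```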

OmitValues is imported from and exported to the DELTA directives Omit lower for characters and Omit upper for characters. The latter directive is a new directive defined by DeltaAccess.

*** Note: It seems that Confor/Intkey do not observe this attribute in the compilation for interactive identification. This does not seem to be logical. DeltaAccess might use the attribute for identification in a later version.

See also: attributes of the CHAR entity and CharEdit form.

 

NewParagraph (character definition attribute):

Used in natural language descriptions: A new paragraph is inserted, starting with this character.

The attribute can be edited in the LinkGroup editor and in the CharEdit form. See also: attributes of the CHAR entity.

 

SemicolonToComma (character definition attribute):

Used in natural language descriptions: Linked characters (see LinkGroup) are separated by a comma instead of a semicolon. Compare Dallwitz & al 1995 for further notes and warnings on using this feature. In the new Confor version this directive is planned to be replaced by a PUNCTUATE directive.

The attribute can be edited in the LinkGroup editor and in the CharEdit form. See also: attributes of the CHAR entity.

 

OmitFinalComma (character definition attribute):

Used in natural language descriptions: The final comma between character states is omitted ('1,2 and 3' instead of '1,2, and 3'). Used to adjust the output to languages where putting a comma before the "and" or "or" would be a spelling error.
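
A minimal sketch of the effect follows. The function name `join_states` and its parameters are hypothetical, and the sketch puts a space after each comma, which differs slightly from the compact spacing of the example above.

```python
def join_states(states, conjunction="and", omit_final_comma=False):
    """Join state names, with or without a comma before the conjunction."""
    states = list(states)
    if len(states) <= 2:
        # With one or two states there is no list comma to omit.
        return f" {conjunction} ".join(states)
    last_separator = " " if omit_final_comma else ", "
    return ", ".join(states[:-1]) + f"{last_separator}{conjunction} " + states[-1]
```

With the states 1, 2, and 3 this produces "1, 2, and 3" by default and "1, 2 and 3" when the final comma is omitted.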

See also: attributes of the CHAR entity and CharEdit form.

 

CharHeading (character definition attribute):

If set, extra headings are written in front of a single or a group of characters. In contrast to the following attribute ItemSubheading, this one is used during output of the character definition, e.g. in the Character definition report.

Both the CharHeading and ItemSubheading attributes are links to the CHAR_Heading entity. In the character definition editor (CharEdit) you can only select headings already defined there. Use the heading editor (DeltaEditHeadings) to add, edit, or delete headings. See also: attributes of the CHAR entity.

 

ItemSubheading (character definition attribute):

If set, extra headings are written in front of a single or a group of characters in the item description output, esp. reports and natural language description. If you want to have a heading in front of a group of characters, esp. if the first character may not always be present, you have to copy the heading text to each character. If the characters are consecutive, the heading will be printed only once.

Both the CharHeading and ItemSubheading attributes are links to the CHAR_Heading entity. In the character definition editor (CharEdit) you can only select headings already defined there. Use the heading editor (DeltaEditHeadings) to add, edit, or delete headings. The HeadingPhrase is optional and will be used in preference to HeadingName during natural language output of ItemSubheadings. See also: attributes of the CHAR entity.

 

LinkGroup (character definition attribute):

Characters may be combined into a group using this attribute. The DELTA Link characters directive is imported into this attribute. All characters with the same number form a group. If the characters of a group are consecutive in a given item, the natural language description is changed: characters of the same group are output as one sentence, separated by ';'. The phrasing is shortened within a group by the elimination of all leading words identical with those of the first character.
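
The grouping rule can be illustrated with the following sketch. It is a simplified reading of the description, not the DeltaAccess output routine: the function names `strip_shared_lead` and `link_sentences` are hypothetical, phrases are compared word by word, and at least one word of each shortened phrase is always kept (an assumption).

```python
from itertools import groupby


def strip_shared_lead(first, other):
    """Remove leading words of `other` that are identical with those of `first`."""
    first_words, other_words = first.split(), other.split()
    i = 0
    # Assumption: always keep at least one word of the shortened phrase.
    while (i < len(first_words) and i < len(other_words) - 1
           and first_words[i] == other_words[i]):
        i += 1
    return " ".join(other_words[i:])


def link_sentences(entries):
    """entries: list of (link_group or None, phrase) in output sequence."""
    sentences = []
    for group, run in groupby(entries, key=lambda entry: entry[0]):
        phrases = [phrase for _, phrase in run]
        if group is None:
            # Characters without a LinkGroup each form their own sentence.
            sentences.extend(phrase + "." for phrase in phrases)
        else:
            first = phrases[0]
            rest = [strip_shared_lead(first, phrase) for phrase in phrases[1:]]
            sentences.append("; ".join([first] + rest) + ".")
    return sentences
```

Only consecutive characters of the same group are merged; a group member that follows an unrelated character starts a new sentence.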

A special editor is provided to edit LinkGroups more conveniently than in the CharEdit form, see Editing link groups.

Related attributes: SemicolonToComma, NewParagraph. See also: attributes of the CHAR entity.

Attributes of the character state definition (..._CS)

 

Character states:

In classical DELTA, only categorical (UM/OM), non-numerical characters can have states. The minimum number of states is 2, and you can have as many states as you need (some programs might limit the number of states; e. g. DEdit/Pankey allows a maximum of 31 states).

The special states (U/V/-) are available without being defined in the character definition of a DELTA coded text file. DeltaAccess defines these states explicitly, i.e. they appear in the list of character states. Nevertheless, these special states are excluded from the count of states defined for a character (NumStates attribute).

DeltaAccess uses character states for additional purposes. The special state TE is used to enter text information instead of a categorical character state. For text characters (Type = TE) this is the only state allowed.

For numerical characters (RN/IN), DeltaAccess treats the different numerical attributes (or statistics) as character states; see Standard and extended numerical attributes above.

Character states can be entered and edited in the character state subform of CharEdit. See also the remaining attributes of the CS entity.

 

Character state name (CharStateName, character state definition attribute):

The name of a character state should be as concise as possible. Normally (i.e. if the CharStatePhrase attribute, which is not accessible from the user interface in DeltaAccess, is not used), it should make an acceptable sentence when joined with its character name.

The CharStateName is limited to 255 characters. You can use any characters you need, but if you want to export your data to a DELTA coded text file, the name may not contain a '# ' or '*'. You can use / or > as long as they are not followed by a space, and you can use < as long as it does not follow a space or come at the beginning.

See also: Character state Notes, attributes of the CS entity, and character state subform of CharEdit.

 

Character state phrase (CharStatePhrase, character state definition attribute):

This attribute is currently not yet used. It provides a similar functionality to the CharName/CharPhrase pair for the character states. So far only the CharStateName attribute is accessible in the user interface.

See also: attributes of the CS entity.

 

Notes (character state definition attribute):

Character state Notes (or comments) are optional and can be as long as you like. You should not put the DELTA brackets (< and >) around your text.

In DeltaAccess, character state notes are currently used for internal notes only. If the data are exported, they may be displayed by another DELTA program like Intkey. This can be used to display an identification hint to the user, e. g. "at least some" or "use a hand lens". In many cases where state notes are currently used (e. g. CharStateName = "small", Notes = "less than 2 mm"), it is questionable whether the additional information should not go directly into the character state name or phrase.

See also: attributes of the CS entity, and character state subform of CharEdit.

 

Implicit (character state definition attribute):

When a character state is marked as implicit, it is assumed to be the default state for any item where no other state (including a special state U/V/-/TE) has been scored.

Note that although this defines the state as implicit, most analyses will not immediately recognize the implicit state. The calculation of implicit states is very time consuming and an online calculation would be impractical for the very large data sets DeltaAccess is intended for. You must use the analysis option insert implicit states as explicit data before the implicit states are fully recognized in all analyses.
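
The rule for implicit states can be sketched for a single item and character as follows. The function name `effective_states` is hypothetical; the sketch only shows the logic and says nothing about how the "insert implicit states as explicit data" analysis option is implemented.

```python
def effective_states(scored_states, implicit_state):
    """States that apply to one character in one item.

    scored_states: set of scored states, including special states such as
    'U', 'V', '-', or 'TE'; implicit_state: the state marked implicit, or None.
    """
    if scored_states:
        # Any scored state, including a special state, overrides the implicit one.
        return set(scored_states)
    return {implicit_state} if implicit_state is not None else set()
```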

See also: attributes of the CS entity, and character state subform of CharEdit.

Attributes of the dependent character definition (..._DEP)

Introduction to character dependency: Sorry, must still be written [******]

See also: attributes of the DEP entity and the section on defining inapplicable characters of the CharEdit form.

 

InapplicableCID (character dependency definition attribute):

Some characters depend on one another. For example, if leaves are absent, characters relating to leaf characteristics (overall form, size, form of tip, form of base, etc.) are logically impossible. The latter characters are inapplicable if the controlling character/character state combination is scored in an item. Another way to say this is that leaf form etc. is dependent on leaf presence. Inapplicable and dependent are used as synonyms in this documentation. All three DELTA directives (Inapplicable, Applicable, and Dependent) are imported into this entity.

A dependent or inapplicable character is identified by the attribute InapplicableCID in the ..._DEP entity. The controlling character and state are identified by the attributes CID and CS, respectively, in the same entity. The inapplicable/dependent character cannot be used in any item in which the controlling CID/CS combination is used (or scored).

DEdit/Pankey calls the controlling character a "parent character" and the dependent/inapplicable character a "child character".
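The dependency rule can be sketched in a few lines of Python. This is an illustration only; the function name and the tuple layout (CID, CS, InapplicableCID) are assumptions that mirror the three attributes named above, and the character numbers are hypothetical.

```python
def inapplicable_characters(dep_rows, scored):
    """Return the set of CIDs that are inapplicable for one item.

    dep_rows -- iterable of (CID, CS, InapplicableCID) rows of ..._DEP
    scored   -- set of (CID, CS) combinations scored in the item
    """
    return {inapp for cid, cs, inapp in dep_rows if (cid, cs) in scored}


# Hypothetical data: character 1 = "leaves", state 2 = "absent";
# characters 5 and 6 describe leaf form and leaf size.
dep_rows = [(1, 2, 5), (1, 2, 6)]
leaves_absent = inapplicable_characters(dep_rows, {(1, 2)})
leaves_present = inapplicable_characters(dep_rows, {(1, 1)})
print(sorted(leaves_absent), sorted(leaves_present))
```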

See also: attributes of the DEP entity and the section on defining inapplicable characters of the CharEdit form.

Attributes of the character heading definition (..._CHAR_Heading)

 

HID (character heading definition attribute):

Character Heading ID, a positive integer number. When entering a new heading, you must also enter a heading ID. If you enter a new heading in the Edit character headings dialog box, the HID will automatically be preset to the highest HID + 1.

If you import a DELTA coded text file data set containing heading directives, certain standard numbers are used in the import process. These numbers have no special significance beyond import. The HID number can be changed at any time to any positive integer value, which is not yet used by another character heading.

The sequence of HID numbers determines the sequence of the named character groups used in identification. Change the HID number to change that sequence. A convenient method to do this is to switch the Edit character headings dialog box to datasheet view. The HID sequence does not influence the heading sequence when used as CharHeading or ItemSubheading.

See also: attributes of the CHAR_Heading entity and Edit character headings.

 

HeadingName (character heading definition attribute):

Heading, used for character headings (see CharHeading) and identification character groups. Also used for item subheadings (see ItemSubheading), if no HeadingPhrase has been defined.

See also: attributes of the CHAR_Heading entity and Edit character headings.

 

HeadingPhrase (character heading definition attribute):

Optional phrase; if available it is preferred over the HeadingName for the use as ItemSubheadings in natural language output.

See also: attributes of the CHAR_Heading entity and Edit character headings.

 

Notes (character heading definition attribute):

Character heading Notes (or comments) are optional and can be as long as you like. You should not put the DELTA brackets (< and >) around your text.

Use the Notes attribute for internal notes only. Currently, heading notes can not be exported to or imported from DELTA text files.

See also: attributes of the CHAR_Heading entity and Edit character headings.

 

AutoGroup (character heading definition attribute):

Used for the definition of special Named character groups (which are used in identification). If the AutoGroup attribute is set to one of the predefined codes, the Identification will automatically form a group or switch to an alternative mode (e. g. entering ItemName conditions). The advantage of defining a group using this attribute is that changes in the character definition are immediately reflected in the Named character groups. Therefore, the manual definition of Character groups is not available once AutoGroup is set.

Alternatively, a full SQL query which returns a set of CID attributes can be entered in this attribute. Note that the length of the query string must be less than 256 characters. The SQL code can either be written for the current descriptor project, or for a project with the name "PRX", referring, e. g., to the character definition entity as PRX_CHAR. The latter makes the definition independent of the project name, which has advantages if the character definition is copied to a new project. Example: "Select CID From PRX_CHAR Where Availability>7;" would define a named character group consisting only of characters with high availability.
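How a project-independent query of this kind might be resolved can be sketched as follows. This is a hedged illustration, not the DeltaAccess implementation: the function name, the project name "MYPROJ", and the exact substitution rule are assumptions. Only the placeholder name PRX, the length limit, and the example query come from the text above.

```python
import re

MAX_QUERY_LENGTH = 255  # the query string must be shorter than 256


def resolve_autogroup_query(query, project):
    """Replace the placeholder project name PRX by the actual project name."""
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError("AutoGroup query must be shorter than 256 characters")
    # Replace PRX only where it is used as an entity-name prefix (PRX_...).
    return re.sub(r"\bPRX_", lambda match: project + "_", query)


query = "Select CID From PRX_CHAR Where Availability>7;"
resolved = resolve_autogroup_query(query, "MYPROJ")
print(resolved)
```

A query that already names the current project's entities contains no PRX_ prefix and is returned unchanged.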

Note that the AutoGroup attribute is specific to this implementation of DELTA and is currently neither exported to nor imported from DELTA files. Predefined default AutoGroups will be added to each project during import.

See also: attributes of the CHAR_Heading entity and Edit character headings.

 

Currently defined codes for the AutoGroup attribute are:

CODE     Description                                               Added during import
CATNUM   Categorical/numerical characters                          Yes
CAT      Categorical characters (type UM/OM)                       Yes
NUM      Numerical characters (type RN/IN)                         Yes
OM       Ordinal characters (type OM)                              No
UM       Nominal characters (type UM)                              No
RN       Real numerical characters (type RN)                       No
IN       Integer numerical characters (type IN)                    No
TE       Text characters (type TE)                                 Yes
XNOTE    Search in item description notes                          Yes
XITEM    Search in item name (e.g. specify a genus)                Yes
XN       Globally search Notes in all characters                   Yes
XT       Globally search text (excl. notes) in all characters      Yes
XA       Globally search any text (incl. notes) in all characters  Yes
ALL      All characters (any type)                                 No
ILL      All illustrated characters                                No
XSEP     --------------------------------------------------------  Yes

The XSEP AutoGroup provides a separation line. You can insert multiple separation lines to structure the list of named character groups presented during identification.

Attributes of the item definition (..._ITEM)

 

IID (item definition attribute):

The item identifier is a positive integer number which forms the primary key of the (..._ITEM) entity. Each item must have a unique IID, by which it can be referred to from other entities, e. g., to determine which item is currently described in the item description. Note that the ItemName is not necessarily unique, since the same item may be described multiple times, e. g., when collecting information from different literature sources.

The IID determines the sequence in which the item list is displayed, unless an explicit alphabetical or other sorting is requested. To change the sequence, the number can either be edited manually, or the special (*** not yet implemented, will be added in later version of DeltaAccess***) form can be used.

If you enter a new item in the ItemEdit form, the IID will automatically be preset to the highest IID + 1. You can change the IID number at any time to any positive integer value, which is not yet used by another item's IID.
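The numbering rule for identifiers (which applies in the same way to CID and HID) can be sketched as follows. The function names are hypothetical and the set of used identifiers is example data; this is an illustration of the rule, not DeltaAccess code.

```python
def next_id(used_ids):
    """Preset for a new record: highest existing identifier + 1 (1 if none)."""
    return max(used_ids, default=0) + 1


def is_valid_new_id(candidate, used_ids):
    """An identifier may be any positive integer not used by another record."""
    return isinstance(candidate, int) and candidate > 0 and candidate not in used_ids


used = {1, 2, 7}
preset = next_id(used)
print(preset, is_valid_new_id(5, used), is_valid_new_id(7, used))
```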

See also: attributes of the ITEM entity and ItemEdit form.

 

Item Name (item definition attribute):

The name of an item can be a taxon name or any phrase you wish. It can be up to 255 characters long and must not be left empty. In contrast to the character name, it does not have to be unique. Examples: "Microbotryum violaceum", "Combea De Not. (1846)", or "CBS 123.99 (violet strain variant)". If you want to re-export your data to other DELTA programs, you must not use the special DELTA characters '#' and '*', or the combination of '/' followed by a space.
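The restrictions on item names can be checked before export with a small routine such as the following. It is a minimal sketch of the rules stated above; the function name and the wording of the messages are assumptions, and DeltaAccess itself may check names differently.

```python
def item_name_problems(name):
    """Return a list of problems that would affect an item name."""
    problems = []
    if not name.strip():
        problems.append("name is empty")
    if len(name) > 255:
        problems.append("name is longer than 255 characters")
    # Characters with a special meaning in DELTA text files.
    for ch in "#*":
        if ch in name:
            problems.append("contains DELTA special character " + repr(ch))
    if "/ " in name:
        problems.append("contains '/' followed by a space")
    return problems


ok = item_name_problems("Microbotryum violaceum")
bad = item_name_problems("CBS 123.99 / strain #2")
print(ok)
print(bad)
```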

The item name should identify an item as well as possible. Ultimately, the item name should be the entry point to a taxonomic database subsystem. If the descriptor project is concerned with biological taxa, it may therefore contain a taxon name, with or without authors, place, and date of publication, depending on what information will be available in the taxonomic database subsystem.

Note that the ItemName is not necessarily unique, since the same item may be described multiple times, e. g., when collecting information from different literature sources.

The current version of DeltaAccess is still limited in its handling of multiple items with the same taxonomic name. The attributes necessary to distinguish items whose information was collected from different sources (e. g. literature references or direct data collection from different collection specimens) are already available in the item table, but not yet used in the user interface. The implementation of these attributes may yet change. You can either use the fields through the raw table or query interface of Access, or add such information after the ItemName and revise your data in a later version of DeltaAccess.

See also: ItemNotes, attributes of the ITEM entity, and ItemEdit form.

 

Item Notes (= comments):

The item name Notes are optional and may contain any text you wish. They can be up to 64000 characters long. You should not put the DELTA brackets (< and >) around them. Usually, the item name notes contain development information and are not exported to natural language descriptions. They are imported from and exported to the Item notes directive, i.e. during export they are not added in DELTA brackets (< and >) after the ItemName. They may, however, be imported from comments in brackets after the ItemName, depending on the setting of "Item Name comments are authors" in the import dialog box.

See also: character notes, attributes of the ITEM entity, and ItemEdit form.

 

Abundance (item definition attribute):

The Abundance of an item indicates how frequently this item will be encountered by a user using either a printed key generated from the DELTA data set, or an interactive identification program like Identify or Intkey. The abundance of items in nature can then be used to make the identification path for frequently encountered items shorter than that for rare items. See The operational concept of item abundance for further information.

DeltaAccess prefers a positive integer between 1 and 10, but fractional values, as preferred by older versions of Confor, may be used as well; compare Weights conversion. DeltaAccess requires a value in this attribute, but you can always leave the default value 5 (= average abundance) unchanged.

The item abundance is combined with character Reliability and Availability, and with the modifier Reliability; compare Calculation of combined reliabilities.

See also: attributes of the ITEM entity and ItemEdit form.

 

The operational concept of item abundance

To estimate the best sequence of identification questions in printed or interactive keys, one tries to find the shortest path for the average identification. The information about the frequency with which certain items will be identified ("abundance") can be used to weight this path.

Abundance should be interpreted operationally, i.e. how often will the item be identified. Some items may be very frequent, but unlikely to be encountered. They are unlikely to be identified and should receive a low abundance value. For instance, some deep sea fish species may be very abundant, but a key of fishes will profit more if abundance is interpreted as abundance in fishermen's catches.

Similarly, frequently encountered items may be relatively rarely identified using the software. A pathogenic organism which must be identified several times each day is probably immediately recognized even by technical personnel, without consultation of either a printed or a computer-aided identification tool.

Attributes of the item description (..._DESCR)

 

IID reference:

This attribute refers to the item identifier (IID attribute) of the (..._ITEM) entity. Such a reference or relation is used to link information from the current entity with information about a certain item. For example, in the item description the IID reference determines which item is described by the current record. All further information about the item, including the name, can then be found in the attributes of the ITEM entity.

 

CID reference:

This attribute refers to the character identifier (CID attribute) of the _CHAR entity. Such a reference or relation is used to link information from the current entity with information about a certain character. For example, in the item description the CID reference determines to which character the description in the current record belongs. All further information about the character, e. g. its type or the measurement unit in which the values stored in the item description attribute X are recorded, can then be found in the CHAR entity.

 

CS reference:

This attribute refers to the character state code. Together with the reference to the character identifier (CID attribute) it forms a relation to the character state (_CS) entity. Such a reference or relation is used to link information from the current entity with information about a certain character state. For example, in the item description the CS reference determines to which character state the code in the CS attribute of the item description entity belongs. All further information about the character state, e. g., the character state name, can then be found in the CS entity.

 

Modifier (Item description attribute, DESCR):

A modifier entry modifies the meaning of a character state. Typical entries are "rarely", "frequently", "slightly", "very", etc. To avoid using synonymous modifiers and to facilitate a translation of the item description into multiple languages, only values from the associated pick-list can be entered into the modifier attribute of the item description. You can use the Modifiers subform of the CharEdit form to define which modifiers are available for which character. For more information on modifiers see the Introduction to the concept of modifiers and Modifier definition and attributes, overview.

See also: attributes of the DESCR entity and ItemEdit form.

 

TXT (Item description attribute, DESCR):

Text information of text characters, or other character types used as a text character. Up to 64000 characters can be entered ("memo"-field). If the text becomes too large for the field on the screen, double-click to zoom the contents of the TXT attribute into a larger text-editing dialog box. In some situations you may have to use the key combination Shift-F2 for a similar dialog box.

TXT may be used only if the character state is set to the special character state TE. The use of the TXT and the X attribute is mutually exclusive. These conditions are enforced at the database engine level.
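The two conditions can be expressed as a simple check. The following Python sketch only mirrors the rules stated above; in DeltaAccess they are checked by the database engine itself, and the function name and parameter names are assumptions.

```python
def check_descr_record(cs, txt=None, x=None):
    """Return a list of rule violations for one item description record.

    cs  -- character state code of the record (e.g. 1, 2, "TE")
    txt -- content of the TXT attribute, or None if empty
    x   -- content of the X attribute, or None if empty
    """
    errors = []
    if txt is not None and x is not None:
        errors.append("TXT and X are mutually exclusive")
    if txt is not None and cs != "TE":
        errors.append("TXT requires the special character state TE")
    return errors


valid = check_descr_record("TE", txt="Leaves falling early.")
invalid = check_descr_record(2, txt="some text", x=3.5)
print(valid)
print(invalid)
```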

Note that the more obvious name "Text" for this attribute can cause problems on some SQL server systems, because TEXT is a reserved word of SQL.

See also: attributes of the DESCR entity and ItemEdit form.

 

X value (Item description attribute, DESCR):

A real number, into which the numerical values for the numerical/statistical attributes defined for this character can be entered. This attribute is used both for real and integer numerical characters (RN/IN). See Standard and extended numerical attributes for a list of the predefined numerical/statistical attributes with which this attribute is normally used. The use of the TXT and the X attribute is mutually exclusive.

See also: attributes of the DESCR entity and ItemEdit form.

 

Notes (Item description attribute, DESCR):

Each use of a character or character state in a specific item (e. g., scoring a character state for an item) may be accompanied by a Note (or comment). In contrast to character notes or character state notes, these notes will be included in natural language descriptions. DeltaAccess preserves new lines (line break, "return character") in the text, but they will be lost during export to the DELTA format.

In classical DELTA, this attribute is frequently used to add estimates of frequency such as 'often', 'rarely', and other modifiers to the item description. In DeltaAccess you should use the Modifier attribute instead.

Note that when you have to translate a data set into multiple languages, each item description note must be translated individually, while the character and modifier definitions are translated only once for the whole project. Also, information added in the item description notes is not usually available during identifications. For these reasons, item description notes should be used as sparingly as possible.

See also: attributes of the DESCR entity and ItemEdit form.

 

SEQ (Item description attribute, DESCR):

Occasionally users are interested in recording the sequence in which they have added multiple character states to an item description. The attribute SEQ stores such a sequence as a long integer number.

Note that the import procedure usually creates a sequence global to the entire descriptor project, but that this is not required. The sequence may start in each item/character combination with 1.

The sequence can be changed either by changing the SEQ attribute directly, or using the controls on top of the item description table (right hand side) in the ItemEdit form.

See also: attributes of the DESCR entity.

Attributes of the resource list (..._RSC)

 

Resource (Resources attribute, RSC):

The Resource attribute contains the name of the external resource. For an illustration, this would be the name of the file containing the illustration. The filename should contain an extension which identifies the type of the file.

Resource can optionally contain a path to the resource. The use of paths should be avoided if possible. Use the project properties ResourceDefaultPath and ResourceDefaultURL instead. If it is not desirable to keep all resources in a single folder (= directory), you can create multiple folders in the default path and enter the resource with a relative path, e. g. "maps/guanacaste.gif".

Recommendation: UNIX filenames are case sensitive. Even though you may currently be working on a case-insensitive operating system, you might later want to move information onto a UNIX web server. You should therefore consider following a rule in the naming of your files, e. g., making them all lowercase.

See also: attributes of the RSC entity.

 

Default pathnames to media resources are stored in the Property table under ResourceDefaultPath (compare Editing project properties). If no default pathname has been defined, the directory in which DeltaAccess is stored is used. The default path is used only if the Resource attribute of the RSC entity does not contain a complete path or URL.

ResourceDefaultURL allows you to define a second path, which is preferred for internet connections to the media directory. Example: A local path "N:\users\xyz\internet\images" could look like "http://www.organisation.domain.edu/Syrphidae/Identify/images" when accessed through the internet. Note that both paths should point to the same resources. This can be achieved by pointing to the same physical directory or folder, or by keeping identical copies of all resources in different places.
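The way a Resource entry and the two default properties fit together can be sketched as follows. This is an illustration under stated assumptions: the function name, the "internet" flag, and the test for a complete path or URL are hypothetical and not taken from DeltaAccess. The paths are the examples used in the text above.

```python
import ntpath
import posixpath


def resolve_resource(resource, default_path, default_url=None, internet=False):
    """Combine a Resource entry with ResourceDefaultPath or ResourceDefaultURL."""
    # A complete URL or an absolute path is used as it is.
    if "://" in resource or ntpath.isabs(resource) or posixpath.isabs(resource):
        return resource
    # For internet access the URL is preferred, if one has been defined.
    if internet and default_url:
        return default_url.rstrip("/") + "/" + resource.replace("\\", "/")
    # Otherwise the local default path is used (Windows-style path).
    return ntpath.join(default_path, resource.replace("/", "\\"))


local_dir = r"N:\users\xyz\internet\images"
url = "http://www.organisation.domain.edu/Syrphidae/Identify/images"
local_name = resolve_resource("maps/guanacaste.gif", local_dir, url)
web_name = resolve_resource("maps/guanacaste.gif", local_dir, url, internet=True)
print(local_name)
print(web_name)
```

Because the relative part of the name is the same in both cases, keeping resource names lowercase and free of absolute paths lets the same RSC records serve local and web access.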

Attributes of the modifier definition (..._MOD)

Modifiers are a new concept introduced by DeltaAccess. The following information is available:

  Introduction to the concept of modifiers
  Modifier definition and attributes, overview
  Reliability (modifier definition attribute)

 

Reliability (modifier definition attribute):

For each modifier a positive integer between 1 and 10 can be defined. The default value is 5 (= average reliability). The reliability of the modifier modulates the character reliability and availability; compare Calculation of combined reliabilities. Modifiers like "occasionally" can be defined to reduce the reliability of the character in this item, while optionally a modifier like "strongly" can be defined to increase the reliability.

See also: attributes of the MOD entity and CharEdit form.

 Introduction to the concept of modifiers

Descriptive data are commonly organized and analyzed using a character times item matrix. Most phylogenetic analysis packages use such a matrix (e.g. PAUP, Swofford 1990). If multiple character states are allowed for each item ("multistate characters"), the character states take the place of characters and the matrix becomes a character state times item matrix. For each character state either the presence or absence, or a probability value can be recorded.

DELTA in its current state supports only presence/absence of character states and not probability values. This was a major reason for the developers of LucID (@. @@@, pers. comm.) to develop a proprietary exchange format instead of using the DELTA data standard.

However, already in the original DELTA publication (Dallwitz 1980) the need for a modification of each character state x item instance was recognized, but at that time the provision of free-text comments was considered sufficient. A proposed new version of DELTA (Dallwitz et al. 1995b) offers several additional mechanisms, but the current article tries to show that these are at most partial solutions to the problem and that a different approach would be more promising.


In addition to free-text notes, as much information as possible should be collected and analyzed in a more structured format, called modifiers.

Modifiers add an additional dimension to the character/character states concept. Conceptually there are several quite different kinds of modifiers:

The first group is called frequency modifiers. Whenever more than a single object is described, it frequently happens that a description does not apply to all of them. This is not necessarily related to the description of abstract objects like taxa, but can happen just as well if several flowers are present on a single herbarium sheet. In the current definition of DELTA the only way to express information about inconsistency is to add a note to the character state. This is not very desirable for several reasons: Notes cannot be analyzed, and in particular cannot be used in an interactive identification (which is essentially a form of analysis). Notes tend to use different, but essentially homologous phrases for the same situation, which can be confusing. Also, it is difficult to generate multilingual data sets with many item description notes, because each note has to be translated separately. If a consistent set of modifiers is used, which is not confounded with further notes, this translation can be produced just as automatically as the translation of the character names and character state descriptions. A database can store multilingual descriptions and generate the data in any supported language. People from, e. g., Australia, Japan, or France can concurrently work with the same database over the internet.

While the frequency modifiers are not covered at all by DELTA, the following modifiers could all potentially be integrated into the character state list. Introducing them is mainly done for practical reasons, to avoid an undue inflation of characters or character states in large data sets that cover very diverse organisms.

The second group is called location modifiers. You can use them to indicate the place where a character occurs. Frequently an object is inhomogeneous in itself, so that you need to express that, e. g., a plant is hairy at the base of the stem, but glabrous at the very tip. Alternatively it would be possible to create two separate characters for the base and the tip of the stem. The disadvantage is that for the majority of plants which are, e. g., homogeneously glabrous, two characters would have to be scored. Also, the character list during interactive identification becomes more confusing, and you would have to introduce yet another character if you ever find a plant, which is actually hairy in the middle of the stem only...

The third group are quality modifiers. Examples are "very", "weakly", "slightly", etc. These modifiers should be used with care, since often it may indeed be advisable to include them in the list of character states. The KeyStates-mechanism offers a possibility to map very detailed character state sets to a more coarsely grained set for use during identification. Yet, analyzing existing DELTA data sets I found that notes are frequently used to express fine quality modifications. It should be left to the researcher to decide which steps of a continuum to include in the character state definition and which additional information she or he wants to express using the modifier mechanism.

Sometimes quality modifiers may be abused to give a character state an entirely different meaning (e. g. "almost never"). Although this may work well in natural language descriptions, it will usually lead to undesirable results during data analysis and interactive identification and should therefore be avoided. In DeltaAccess, the effect on identification can be compensated to some extent by setting the reliability value for this modifier to zero. Also, using a modifier for such purposes makes the detection of such problems much easier than if this information had been hidden inside free-text notes.


In the current implementation of modifiers (see the following chapter Modifier definition and attributes, overview) these modifier types are merged for the sake of simplicity. I want to avoid the reaction that it is all so confusing, having so many fields to fill in (although you can always leave them empty). This is clearly a compromise, and I would like to hear your comments about it. You currently cannot use a combination of frequency, location, and quality modifiers. Another option, instead of defining different modifiers separately, would be to allow several modifiers per character state.


In short, the advantages of formalizing the concept of modifiers, instead of using free-text comments in their place, are: Modifiers are part of the character definition. This allows easy translation into multiple languages for a descriptor project as a whole, and the definition of additional properties (formatting properties like whether they are to be printed before or after the states, a translation of frequency or probability wording into exact ranges, and a reliability value, which gives an indication of the influence of the modifier on the reliability of a given state in a given item). Since the modifier/character association can be defined, concise pick lists for modifiers are possible. A single set of color modifiers can be used for all color characters, but will not be visible in other characters. Pick lists are a considerable improvement of the user interface. Comments, in contrast, are specific to a single item/character (state) combination. They are necessary whenever unstructured information, particular to this or only a few occurrences, must be added. They should be used sparingly, because they are difficult to translate and inaccessible to any form of analysis, including interactive identification.

Modifier definition and attributes, overview

A new attribute Modifier is defined in item descriptions (_DESCR), which allows you to mark item data with predefined modifications or indications of uncertainty. The concept of Modifiers is discussed in more detail in the previous chapter.

No modifiers are active by default. For each character, a list of applicable modifiers is defined. Only these modifiers are selectable when entering the item descriptions, thus 'Color'-modifiers are applicable only to colors. The list of applicable modifier-character definitions is defined in the _MOD_Link table.
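How the list of selectable modifiers for a character follows from the link table can be sketched as follows. The data layout (pairs of CID and a modifier identifier, plus a dictionary holding the global modifier list) and all names are assumptions chosen for illustration; the actual attributes of _MOD_Link are not described here.

```python
def modifier_pick_list(cid, mod_link, modifiers):
    """Return the modifier wordings that may be selected for one character.

    cid       -- character identifier
    mod_link  -- iterable of (CID, modifier id) pairs, as in a link table
    modifiers -- dict mapping modifier id -> wording (global modifier list)
    """
    return [modifiers[mod_id] for link_cid, mod_id in mod_link if link_cid == cid]


# Hypothetical project: character 3 is a color character, character 4 is not.
modifiers = {1: "rarely", 2: "pale", 3: "dark"}
mod_link = [(3, 1), (3, 2), (3, 3), (4, 1)]
color_list = modifier_pick_list(3, mod_link, modifiers)
other_list = modifier_pick_list(4, mod_link, modifiers)
print(color_list, other_list)
```

A character without any link rows gets an empty pick list, which corresponds to the statement that no modifiers are active by default.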

The definition of Modifiers is part of the character definition. Each descriptor project has a global list of modifiers, which is initialized with a default list during creation of a project. The following attributes are defined for each global modifier:

During "Export as DELTA", UseBlank and Operator are ignored; they are used only for the natural language output. The following rules are used for export: Notes are always output after the character state. If both a Postfix modifier and a Note are present, the modifier is output after the note. (*** Which makes sense with '?', but is it meaningful otherwise? Perhaps add second flag 'AfterNotes'?)
