DescriptionsModel v3.00.10

From Workbench
Jump to: navigation, search

DiversityDescriptions Information Model (version 3.00.10, 17 July 2014)

Authors G. Hagedorn, A. Plank, A. Link, G. Rambold & D. Triebel 2014
License 80x15.png
Suggested citation G. Hagedorn, A. Plank, A. Link, G. Rambold & D. Triebel (2014). DiversityDescriptions information model (version 3.00.10). http://www.diversityworkbench.net/Portal/DescriptionsModel_v3.00.10
Notes The models currently reside in MS SQL Server, so knowledge of some SQL Server ER-diagram conventions will be helpful.

Besides the screen shots below, a Microsoft T-SQL-Script for the generation of the tables is provided.


ER Diagrams

Overview over all entities and relations used in the database model

ER

The colors in the diagram indicate the logical group of the tables as described in the legend. Some “housekeeping” fields and tables for the documentation when and by whom the data were inserted or changed are not shown in the overview or the detail diagrams.

A short introduction

  • The "Descriptive terminology" with the main tables "Descriptor" and "CategoricalState" provides the means to express the description details. The two fixed enumeration tables "DataStatus_Enum" and "StatisticalMeasure_Enum" provide values defined in the SDD standard 1.1 rev 5.
  • The "Descriptions" part with the main table "Description" reflects the items themselves. Each description must be assigned to a project (see part 3) and references entries from the descriptive terminology. These references are reflected by the tables "CategoricalSummaryData", "QuantitativeSummaryData" and "TextDescriptorData", where additional item specific data are stored.
  • The "Projects" part with the main tables "Project", "DescriptorTree" and "DescriptorTreeNode" build the anchor for the descriptions, which must be assigned to a project. Further functions are a hierarchical organization of the descriptors and restriction of the descriptive terminology to certain projects. Additionally it is possible, to define recommended values of statistical measure, modifier and frequency associated to certain branches of a descriptor tree or single descriptors. If for a certain project no descriptor tree is created, there is no restriction, i. e. all descriptors may be used.
  • The "Resources" part supports management of media links for descriptions, descriptors and categorical states.
  • The "Auxilliary" part provides unique key values for the main tables of the model by table BaseEntity. All relations are built by using those unique keys. By connecting support tables to the BaseEntity (here "Translation") it is possible to provide specific centralized features to all corresponding objects.


Tables

Table: BaseEntity

The BaseEntity is used within the database to provide unique keys

Column Data type Description
id int Database-internal object ID, unique across all tables in database (primary key)
table_id int References the table name of table entries associated with the BaseEntity; NULL indicates orphaned entries and may be used for garbage collection of otherwise deleted object ids
specific_rights_text nvarchar (255) To be used where the licence for an object differs from the default project licence
specific_licence_uri nvarchar (255) URI of licence, where different from project licence


Table: BaseEntityTable_Enum

The BaseEntityTable_Enum contains the names of tables that reference the BaseEntity table

Column Data type Description
id int Database-internal object ID of this record (primary key)
table nvarchar (255) Name of table that references the BaseEntity table


Table: CategoricalState

The categorical states available for categorical descriptors

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
descriptor_id int Reference to the descriptor to which the state belongs (foreign key)
label nvarchar (255) Short label (or name) of descriptor state
abbreviation nvarchar (255) Abbreviated label of descriptor state
detail nvarchar (MAX) Additional detail text explaining or commenting on the descriptor state definition
wording nvarchar (255) Optional separate wording for natural language generation (label will be used if this is missing)
display_order int A positive number defining the sequence in which descriptor states are displayed
DefaultValue: '0'


Table: CategoricalSummaryData

The categorical data of a description

Column Data type Description
id int Database-internal ID of this record (primary key)
description_id int Reference to the description to which these data belong (foreign key)
state_id int Reference to the state (the descriptor is implicit in state_id) (foreign key)
notes nvarchar (MAX) Free-form text detailing special cases or additional data beyond the values
modifier_id int Optional reference to a modifier definition (e.g. "strongly", "at the base", "in autumn") (foreign key)
frequency_id int Optional reference to a frequency modifier definition (e.g. "rarely", "usually", "mostly") (foreign key)


Table: DataStatus_Enum

Values of data status used for descriptions according to SDD 1.1 rev 5

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of data status
code nvarchar (255) Code of data status
abbreviation char (1) Single (intuitive) character
detail nvarchar (MAX) Additional detail text explaining or commenting on the data status definition


Table: Description

The description in the database

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of description
detail nvarchar (MAX) Additional detail text explaining or commenting on the description definition
project_id int Each description belongs to exactly one project (required, foreign key)


Table: DescriptionScope

The scope of the description

Column Data type Description
id int Database-internal ID of this record (primary key)
description_id int Reference to the description to which these data belong (foreign key)
label nvarchar (255) Short label (or name) of scope
dwbURI nvarchar (255) Reference to DiversityWorkbench component
type nvarchar (255) Scope type ("GeographicArea", "Citation", "Observation", "Specimen", "TaxonName", "OtherConcept", "Stage", "Part" or "Sex")
taxon_id int Reference to one of potentially several taxa described by the description (not used, foreign key)
specimen_id int Reference to one of potentially several specimens (collected and preserved) that are being described (not used, foreign key)
observation_id int Reference to one of potentially several observations (not preserved) that are being described (not used, foreign key)
source_reference_id int Reference to one or several literature references on which the description is based (not used, foreign key)
geographicarea_id int Reference to one of potentially several geographic areas refered to in a description (not used, foreign key)
other_scope_id int Reference to further scopes, e.g. stage, sex etc. (foreign key)


Table: Descriptor

Descriptor (= characters, features) define variables

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of descriptor
abbreviation nvarchar (255) Abbreviated label of descriptor
detail nvarchar (MAX) Additional detail text explaining or commenting on the descriptor definition
display_order int A positive number defining the sequence in which descriptors are displayed
DefaultValue: '0'
subclass nvarchar (255) The three character subclasses of SDD are all combined here in one entity and distinguished by this attribute ("categorical", "quantitative" or "text")
DefaultValue: 'categorical'
statistical_measurement_scale nvarchar (255) Scale of descriptor: Categorical may be nominal (unordered, “red/green/blue”) or ordinal (ordered, “bad/medium/good”); Quantitative may be interval (°C) or ratio (mass, length, time, K)
DefaultValue: 'nominal'
usually_exclusive tinyint Applicable to categorical (nominal/ordinal) descriptors only. If usually exclusive = 1, then by default the user interface allows only entering one state. Nevertheless, multiple states in the data are valid.
DefaultValue: '0'
state_collection_model nvarchar (255) Handling of multiple values: OrSet/AndSet: unordered set combined with or/and, OrSeq/AndSeq: ordered sequence combined with or/and, WithSeq: example is “green with brown”, Between: an example is “oval to elliptic”
DefaultValue: 'OrSet'
mandatory tinyint Is the scoring of this descriptor mandatory (required) in each item?
DefaultValue: '0'
repeatability nvarchar (255) How reliable and consistent are repeated measurements or scorings of the descriptor by different observers and on different objects? ("ignore", "very low", "low", "below average", "slightly below average", "standard", "slightly above average", "above average", "high" or "very high")
DefaultValue: 'standard'
availability nvarchar (255) How available is the descriptor or concept for identification? ("ignore", "very low", "low", "below average", "slightly below average", "standard", "slightly above average", "above average", "high" or "very high")
DefaultValue: 'standard'
measurement_unit nvarchar (255) A measurement unit (mm, inch, kg, °C, m/s etc.) or dimensionless scaling factor
measurement_unit_precedes_value tinyint Set to 1 if the measurement unit precedes the value
DefaultValue: '0'
min_plausible_value float Applicable to quantitative descriptors only; in support of a plausibility check for values. Example: for tree height this could be 0, i.e. only positive values allowed
DefaultValue: '-1.79e308'
max_plausible_value float Applicable to quantitative descriptors only; in support of a plausibility check for values. Example: for tree height this could be 99
DefaultValue: '1.79e308'
wording_before nvarchar (255) Representation for natural language output, inserted before the states/value
wording_after nvarchar (255) Representation for natural language output, inserted after the states/value
data_entry_note nvarchar (MAX) A note or prompt when entering or dealing with data


Table: DescriptorInapplicability

The descriptor dependency rules

Column Data type Description
id int Database-internal ID of this record (primary key)
controlled_descriptor_id int Reference to the descriptor to which the rule will apply, e.g. which will be inapplicable if controlling state applies to the same description (foreign key)
controlling_state_id int Reference to the controlling categorical state; if present in a description, controlled descriptor is affected according to rule
rule nvarchar (255) The kind of rule creating a descriptor inapplicability ("inapplicable-if", "applicable-if", "inapplicable-computed-from-applicable")
DefaultValue: 'inapplicable-if'


Table: DescriptorStatusData

The status data of a descriptor for a certain description

Column Data type Description
id int Database-internal ID of this record (primary key)
description_id int Reference to the description to which these data belong (foreign key)
descriptor_id int Reference to the descriptor to which the status value belongs (foreign key)
datastatus_id int Reference to a status definition. Status is like a marker why data is missing or in need of revision (examples: "unknown", "not applicable", "to be checked", "data withheld" etc.) (foreign key)
DefaultValue: '0'
notes nvarchar (MAX) Free-form text detailing additional information (rarely used, not multilingual)


Table: DescriptorTree

The root and definition of a descriptor tree

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
is_complete tinyint Set to 1 if this tree includes the largest possible set of descriptors for the taxon set associated with a project (i.e. the "master-descriptor tree", other trees being subsets)
DefaultValue: '0'
label nvarchar (255) Short label (or name) of descriptor tree
type nvarchar (255) "mixed", "part-of", "property" or "generalization"
project_id int Each tree belongs to exactly one project (required, foreign key))


Table: DescriptorTreeNode

The descriptor tree nodes representing either nodes of the tree or descriptors ("leafes" of the tree)

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
parent_node_id int Reference to a parent node, creating a true tree; NULL for a root node (foreign key)
label nvarchar (255) Short label (or name) of internal node associated with a concept; NULL for a descriptor node
abbreviation nvarchar (255) Abbreviated label of node
display_order int A positive number defining the sequence in which child nodes are displayed
DefaultValue: '0'
descriptortree_id int Reference to a descriptor tree; all nodes must have a direct link to the tree definition (foreign key)
descriptor_id int Reference to a descriptor; NULL if the present node is an inner concept node defining a hierarchy of nodes or descriptors (foreign key)


Table: DescriptorTreeNodeRecFrequency

Selection of recommended frequency values for descriptor tree parts or single descriptors

Column Data type Description
id int Database-internal ID of this record (primary key)
node_id int Reference to the descriptor tree node (foreign key)
frequency_id int Reference to the recommended frequency modifier (foreign key)


Table: DescriptorTreeNodeRecModifier

Selection of recommended modifier values for descriptor tree parts or single descriptors

Column Data type Description
id int Database-internal ID of this record (primary key)
node_id int Reference to the descriptor tree node (foreign key)
modifier_id int Reference to the recommended modifier (foreign key)


Table: DescriptorTreeNodeRecStatMeasure

Selection of recommended statistical measures for descriptor tree parts or single descriptors

Column Data type Description
id int Database-internal ID of this record (primary key)
node_id int Reference to the descriptor tree node (foreign key)
measure_id int Reference to the recommended measure (foreign key)


Table: Frequency

Definition of frequency modifier values

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of frequency modifier (e.g. "very rare", "rare", "usually" etc.)
lower_estimate float Each frequency modifier defines a frequency range, this is the estimated lowest frequency
DefaultValue: '0'
upper_estimate float As above, estimate of upper range for the border; note: ranges may overlap!
DefaultValue: '1'
display_order int A positive number defining the sequence in which frequency modifiers are to be displayed
DefaultValue: '0'


Table: Modifier

Definition of modifier values

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
class nvarchar (255) Grouping of modifiers into classes ("Certainty", "Seasonal", "Diurnal", "TreatAsMisinterpretation" or "OtherModifierClass")
label nvarchar (255) Short label (or name) of modifier (e.g. "strong", "probably" etc.)
display_order int A positive number defining the sequence in which modifiers are to be displayed
DefaultValue: '0'


Table: OtherScope

Scope values for description scope values for scope types "Other scope", "Part", "Stage" and "Sex"

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of scope (e.g. "male", "female", "adult", "juvenile" etc.)
detail nvarchar (MAX) Additional detail text explaining or commenting on the scope and delimitation of the present description
type nvarchar (255) Describes the scope type ("sex", "stage", "part" or "other")
sex_status_id int Reference to sex status value according SDD V5.1 (foreign key)


Table: Project

Projects define separated workareas in a single database

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
parent_project_id int Reference to a parent project, creating a subproject; NULL for a root project (foreign key)
label nvarchar (255) Short label (or name) of project (or subproject); each project defines a separate work environment within a shared database by means of defining sets of descriptors (through DescriptorTree)
rights_text nvarchar (255) The default rights and licence statement for the entire project; BaseEntity provides means to override this for individual objects
licence_uri nvarchar (255) URI of licence for the project; BaseEntity provides means to override this for individual objects
detail nvarchar (MAX) Additional detail text explaining or commenting on the project (or subproject)
restricted_to_descriptortree_id int Reference to a descriptor tree (not used, foreign key)
restricted_to_taxonset_id int Reference to a taxon set (not used, foreign key)
primary_language_code nvarchar (3) Language of primary text version (fields directly in tables, may be translated in table Translation); Examples: "de", "en" etc. (ISO 639 language code)
ProjectProxyID int Reference to a project definition of DiversityProjects that allows control of access rights
DefaultValue: (0)


Table: Project_AvailableScope

Scope values available for a certain project

Column Data type Description
id int Database-internal ID of this record (primary key)
project_id int Reference to the project for which a shall be available (foreign key)
scope_id int Reference to a scope definition in OtherScope (foreign key)


Table: QuantitativeSummaryData

The quantitative data of a description

Column Data type Description
id int Database-internal ID of this record (primary key)
measure_id int Reference to the statistical measure (minimum, mean, maximum, ranges, sample size, variance etc.) recorded in value (foreign key)
description_id int Reference to the description to which these data belong (foreign key)
descriptor_id int Reference to the descriptor which the values record (foreign key)
value float The value of the statistical measure
notes nvarchar (MAX) Free-form text detailing special cases or additional data beyond the values
modifier_id int Optional reference to a modifier definition (e.g. "strongly", "at the base", "in autumn") (foreign key)


Table: Resource

Hyperlinks to separate rich text/media objects

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
terminology_role nvarchar (255) Role in relation to descriptors or states ("iconic" = icon/thumbnail, needs text; "primary"=display always, informative without text; "diagnostic"=optimized for identification; "secondary"=display only on request)
DefaultValue: 'secondary'
ranking_for_terminology tinyint Ranking of the resource with respect to terminology; range: 0 to 10
descriptor_id int Optional reference to a descriptor (at least 1 of description_id, descriptor_id, state_id or taxon_id must be present) (foreign key)
state_id int Optional reference to a categorical state (at least 1 of description_id, descriptor_id, state_id or taxon_id must be present) (foreign key)
taxon_role nvarchar (255) Role in relation to taxa or descriptions ("iconic" = icon/thumbnail, needs text; "primary"=display always, informative without text; "diagnostic"=optimized for identification; "secondary"=display only on request)
DefaultValue: 'secondary'
ranking_for_taxon_use tinyint Ranking of the resource with respect to taxa or descriptions; range: 0 to 10
taxon_id int Optional reference to a taxon (at least 1 of description_id, descriptor_id, state_id or taxon_id must be present) (foreign key)
description_id int Optional reference to a description (at least 1 of description_id, descriptor_id, state_id or taxon_id must be present) (foreign key)
scope_id int Reference to the scope of the resource (e.g. "sex", "stage", "season" etc.)
display_embedded tinyint false/0: display as link, true/1: display as embedded media object
DefaultValue: '0'
label nvarchar (MAX) Short label of (or caption for) the resource, e.g. a text displayed below an image or instead of the link
language_code nvarchar (3) Language of the resource itself, use zxx for language neutral/no linguistic content (ISO 639-1)
display_order int A positive number defining the sequence in which multiple resources are displayed
DefaultValue: '0'


Table: ResourceVariant

Different resource variants/instances/service access points

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
resource_id int Reference to the resource to which these variants provide access points (foreign key)
variant_id int Reference to the definition of a variant class e.g. "tiny sample", "small sample", "lower quality", "medium quality", "good quality" or "best quality" (foreign key)
url nvarchar (255) URL of the resource variant
pixel_width int Where applicable (still- or moving image): the width in pixel
pixel_height int Where applicable (still- or moving image): the height in pixel
duration int Where applicable (sound or moving image): the duration in seconds
size int The size of the resource in bytes
mime_type nvarchar (255) Type of the resource as MIME type like image/jpeg; color as color/hexrgb


Table: ResourceVariant_Enum

Classes for resource variants, values are predefined in the database

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of resource variant class ("tiny sample", "small sample", "lower quality", "medium quality", "good quality" or "best quality")
code nvarchar (255) Brief code equivalent to the (longer) label
detail nvarchar (MAX) Additional detail text explaining the resource variant class
quality_order int A positive number providing a filter mechanism for "tiny sample" to "best quality"; the smaller the number, the less quality a resource has
DefaultValue: '0'


Table: SexStatus_Enum

Values of sex status predefined according to SDD 1.1 rev 5

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of sex status
code nvarchar (255) Code of sex status
abbreviation nvarchar (255) One or two (intuitive) characters
detail nvarchar (MAX) Additional detail text explaining or commenting on the sex status definition
display_order int A positive number defining the sequence in which sex status values are displayed
DefaultValue: '0'


Table: StatisticalMeasure_Enum

The statistical measures predefined according SDD 1.1 rev 5

Column Data type Description
id int Database-internal ID of this record, references BaseEntity (primary key)
label nvarchar (255) Short label (or name) of statistical measure
code nvarchar (255) Code of statistical measure according SDD 1.1 rev 5
abbreviation nvarchar (255) Abbreviation of statistical measure
display_order int A positive number defining the sequence in which statistical measures are displayed
DefaultValue: '0'


Table: TextDescriptorData

The text data of a description

Column Data type Description
id int Database-internal ID of this record (primary key)
description_id int Reference to description to which these data belong (foreign key)
descriptor_id int Reference to a text descriptor to which the free-form text belongs (foreign key)
content nvarchar (MAX) Free-form text referring to information on one descriptor
notes nvarchar (MAX) Free-form text detailing special cases or additional data beyond the values


Table: Translation

The translations of entries related to BaseEntity

Column Data type Description
id int Database-internal ID of this record (primary key)
object_id int Reference to the object to which the translation belongs, references BaseEntity (foreign key)
language_code nvarchar (3) Language of representation translation (ISO 639 language code)
column_id int References the translated column name
contents nvarchar (MAX) Translated contents


Table: TranslationColumn_Enum

The TranslationColumn_Enum contains the column names that are translated in the Translation table

Column Data type Description
id int Database-internal object ID of this record (primary key)
column nvarchar (255) Name of column that is translated in the Translation table



Footnotes: The following conventions and abbreviations have been used in the tables: Columns of primary key:underlined



Project related tables

The access to the data is managed via projects where every local project of DiversityDescriptions is assigned to a project from DiversityProjects. Every description is assigned to a local project and the descriptors are optionally assigned to several projects by the descriptor trees. DiversityDescription contains the tables UserProxy, ProjectUser and ProjectProxy to allow a independent administration of the basic functions related to projects and users.

ER

Table: ProjectProxy

The projects as stored in the module DiversityProjects

Column Data type Description
ProjectID int ID of the project to which the specimen belongs (Projects are defined in DiversityProjects)
Project nvarchar (50) The name or title of the project as shown in a user interface (Projects are defined in DiversityProjects)
ProjectURI nvarchar (255) URI of a project in a remote module, e.g. refering to database DiversityProjects


Table: ProjectUser

The projects that a user can access

Column Data type Description
LoginName nvarchar (50) A login name which the user uses for access the DivesityWorkbench, Microsoft domains, etc..
ProjectID int ID of the project to which the specimen belongs (Projects are defined in DiversityProjects)
ReadOnly bit If the user has only read access to data of this project
DefaultValue: (0)


Table: UserProxy

The user as stored in the module DiversityAgents

Column Data type Description
LoginName nvarchar (50) A login name which the user uses for access the DivesityWorkbench, Microsoft domains, etc..
CombinedNameCache nvarchar (255) The short name of the user, e.g. P. Smith
AgentURI nvarchar (255) URI of a user in a remote module, e.g. refering to database DiversityAgents
Queries xml (MAX) Queries created by the user



Footnotes: The following conventions and abbreviations have been used in the tables: Columns of primary key:underlined