A comprehensive overview of all available analysis/editing queries and reports can be found in the previous chapter List of all analysis queries and reports available. The following sections supply detailed information about most of the available queries:
Analysis of item relations
Character analysis
Character cross tabulation
Character cross tabulation, numerical characters
Character state analysis (char. state usage)
Checking data
Consistency checks if 'Referential Integrity' failed
Edit or view data
Inapplicable character (= character dependency)
Reorganize project
Repair and maintenance
Special character states
Unused items/characters
See also:
Chart/graph reports
Tabular reports
Note: Several queries need "helper queries" (e. g. named <Your Projectname>_IID_x_CID...), which will not be deleted after use. You can delete them if you have not saved any queries using them. If you open a query from the DeltaAccess dialog box, any necessary helper queries will be recreated as needed.
Please tell me if you are looking for other types of analysis (they may not be easy or possible though). If you create useful queries yourself and want to share them, I will be glad to integrate them into future versions of DeltaAccess. Also, please contact me if you would like to see more explanations for a specific analysis.
Analysis of item relations
Currently there is only a single, very rudimentary analysis: Simple-matching similarity for single IID. It calculates the number of matching character states between a specified item (you are prompted to enter an item IID when you run the query) and every other item. Non-matching character states are ignored. After switching to design view, you can remove the restriction to a single item and generate a full matrix of item relations. Be warned, though, that with a large data set this can be a very slow process. The size of the join will be approximately the square of the number of records in the _DESCR table, which may be 10^5 × 10^5 = 10^10 records; the index alone of such a table would be 74 GByte. Such analyses, and more sophisticated similarity calculations, cannot be executed with straightforward SQL; you have to use more sophisticated algorithms and program them separately.
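The counting idea behind the simple-matching query can be sketched outside the database. The following Python fragment is only an illustration, not the query DeltaAccess actually runs; the tuple layout (IID, CID, CS) is an assumed stand-in for the rows of the _DESCR table, and the function name is hypothetical.

```python
from collections import defaultdict

def simple_matching(descr, target_iid):
    """Count matching (CID, CS) pairs between target_iid and every other item.

    descr: iterable of (iid, cid, cs) tuples, an assumed stand-in for the
    rows of a project's _DESCR table.
    """
    states_by_item = defaultdict(set)
    for iid, cid, cs in descr:
        states_by_item[iid].add((cid, cs))

    target = states_by_item.get(target_iid, set())
    # Only matches are counted; non-matching states are ignored.
    return {
        iid: len(states & target)
        for iid, states in states_by_item.items()
        if iid != target_iid
    }

# Hypothetical data: three items scored for up to three characters.
descr = [
    (1, 1, "1"), (1, 2, "2"), (1, 3, "1"),
    (2, 1, "1"), (2, 2, "2"), (2, 3, "2"),
    (3, 1, "2"), (3, 2, "2"),
]
similarity = simple_matching(descr, target_iid=1)
print(similarity)  # {2: 2, 3: 1}
```

Grouping the states per item first avoids materializing the full join of the description table with itself, which is the step that makes the pure SQL version grow quadratically.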
Instead of programming multivariate analysis techniques like cluster analysis into DeltaAccess, an export format for the major statistics packages (SAS, SPSS) will be added in the next version.
Analysis queries
List of all analysis options
Character usage BY Item (crosstab) displays for each item and character the number of character states used. The character number (CID) is listed as column heading. This version excludes the special states (U/V/-/TE); all text characters (type TE) are therefore completely missing. A second version includes these states (Character usage BY Item, incl. U,V,- (crosstab)). Here the cell is empty if the character is not used at all, and an "X" appears if the character is textual or numeric, i. e. no categorical character states are defined.
The cross tabulations will work only up to 255 characters. To analyze descriptor projects with more characters, you can define character subsets (e. g., with categorical = UM/OM characters only), or you must use the list versions Character usage BY Item (list) and Character usage BY Item, incl U,V,- (list) and finish the analysis using the pivot analysis in Excel.
To see a total of the above queries, i.e. how many items have used 1, 2, or more states for a given character, use one of the char-state multiplicity BY Char analyses under Character state analysis. The analyses above and the multiplicity analyses are closely related. The 'Character usage BY Item' analyses above could also have been called 'Character state multiplicity BY Item'. The different naming conventions were chosen because the primary interest in the analyses above is whether states have been used at all. The multiplicity of character states is added information, which is only rarely useful.
The character usage cross tabulations do not provide row and column percentages. You might therefore want to export the DeltaAccess cross tabulation to Excel, and do the following:
Column percentages: In Excel, insert a new row above the data, enter an equation in the cell above the first column, containing: =COUNT(B3:B100)/ROWS(B3:B100), assuming that your Count of character state data for this column are in B3 to B100. Format the cell with Percentage, 1 decimal place (Format Cells in the context menu, select the first tab Number, Percentage). Copy it, mark all the empty cells above the columns to the right, and paste a copy of the first cell.
Row percentages: Insert a new column in front of the data, enter an equation like: =COUNT(C3:Z3)/COLUMNS(C3:Z3) into the first cell, format as percent. Copy the first cell, mark all the empty cells in front of the rows, and paste a copy of the first cell.
In combination with row and column percentages, the character usage table is an excellent tool to check the completeness of your data set. Any character with a column percentage close to 100% should be checked for the missing data (items which have not been filled in). It is likely that this character should in fact be available and scored for all items. Any item with a low row percentage stands out as poorly covered.
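The same fill percentages can be computed without Excel. The sketch below assumes the cross tabulation has been exported into a nested dictionary (item, then character number, then count, with None for an empty cell); this layout and the function name are assumptions made for the example.

```python
def fill_percentages(table):
    """Return (column percentages, row percentages) of non-empty cells.

    table: dict item -> dict character number -> count, None for an empty cell.
    Assumes at least one item and one character.
    """
    items = list(table)
    chars = sorted({cid for row in table.values() for cid in row})
    # Column percentage: share of items in which the character is used.
    col_pct = {
        cid: 100.0 * sum(table[i].get(cid) is not None for i in items) / len(items)
        for cid in chars
    }
    # Row percentage: share of characters used in the item.
    row_pct = {
        i: 100.0 * sum(table[i].get(cid) is not None for cid in chars) / len(chars)
        for i in items
    }
    return col_pct, row_pct

# Hypothetical export of a character usage cross tabulation.
usage = {
    "Item A": {1: 1, 2: 2, 3: 1},
    "Item B": {1: 1, 2: None, 3: 1},
    "Item C": {1: 2, 2: None, 3: None},
    "Item D": {1: 1, 2: 1, 3: 1},
}
col_pct, row_pct = fill_percentages(usage)
print(col_pct)  # {1: 100.0, 2: 50.0, 3: 75.0}
```

In this made-up example character 3 (75%) would be a candidate for checking the one unscored item, and Item C has the lowest row percentage.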
*** Note: it would be better to have character instead of character state counts, make additional analyses using the same helper queries used by cross tabulation below ***
While the above queries count how many states are used in each character, the following queries count the use of any state of a character:
Character usage cross tabulation (test scoring dependencies; U,V,- = missing) and ...; U,V,- = used): In this analysis, all characters are tabulated against all other characters. The table counts in how many items a pair of characters is used together. In the first analysis, a character is counted as used (scored) if either a normal character state or the special state 'V' was used. The special states 'U'/'-'/'TE' are excluded and counted as missing. The second analysis (...; U,V,- = used) counts all states as used.
This type of analysis is useful to detect hidden dependencies among characters. Note that this analysis will not show you characters dependent on a certain state. It will only show exclusive groups of dependent characters. For example, if character 2 is applicable only if character 1, state 1 has been set, and character 3 is applicable only if character 1, state 2 has been set, a result could look like:
  | 1  | 2  | 3
1 | 87 | 69 | 12
2 | 69 | 14 |
3 | 12 |    | 1
Compare also the queries in the following section Unused items/characters for more detailed information on which item/character combinations have not been used. If you want to know in which items you have used one of the special character states (U/V/-/TE) you can use queries from the section Special character states.
Introduction to character cross tabulation
If you want to analyze how characters interact, you can make a cross tabulation (or contingency table) of characters. Only for binary characters (which are not available in DELTA) you would cross the character directly. In categorical characters (UM/OM) you cross the character states instead. Since conceptually you want to compare characters rather than states, DeltaAccess uses the term character cross tabulation. The following chapter Character cross tabulation discusses the available analysis options for categorical characters.
If you want to compare numerical characters (IN/RN) you must first group the values into classes, to get a one-dimensional frequency distribution of each character you want to compare (see Frequency distribution of numerical characters). These frequency distributions can then be compared in a cross tabulation. Currently, DeltaAccess offers only limited support for cross tabulation analysis of numerical characters; you must use, e. g., MS Excel to finish your analysis.
You can further analyze cross tabulations statistically by exporting them to statistical analysis software (e. g. the SAS system) and running a chi-square test, G-test (loglinear test), or Fisher's exact test. The latter should be preferred if the sample size is small and many cells of the table contain fewer than 6 items. Although Fisher's exact test was originally defined only for 2 x 2 tables, most statistical software packages offer a generalized version of Fisher's exact test for n x m tables. On the other hand, Fisher's exact test will be very slow if the table becomes large, so you might have to resort to the approximate chi-square or G-test.
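To make the chi-square option concrete, the following sketch computes the Pearson chi-square statistic and its degrees of freedom for an exported cross tabulation. It is a plain illustration with made-up counts, it assumes that no row or column total is zero, and it leaves the comparison with the chi-square distribution (the p-value) to a statistics package.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    table: list of rows, each a list of observed counts.
    Assumes every row total and column total is greater than zero.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof

# Hypothetical 2 x 2 cross tabulation of two characters with two states each.
observed = [[10, 20], [30, 40]]
stat, dof = chi_square(observed)
print(round(stat, 4), dof)  # 0.7937 1
```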
These statistical tests test the null hypothesis that the characters are independent. Character dependence can have several causes:
Statistical methods of phylogenetic inference assume that the characters (whether morphological or molecular) are independently identically distributed (IID condition of random variables). Clearly, the assumption of independence is rarely fully satisfied. Even consecutive bases of DNA sequences of non-coding regions are not completely independent.
Note that the statistical tests mentioned above cannot test the validity of this assumption. They cannot distinguish between the first causes mentioned, which make the phylogenetic inference unreliable, and phylogenetic correlation.
See also the chapter on Comparing characters with external classifications, phylogenetic trees or cluster analyses.
Character cross tabulation
(See also Introduction to character cross tabulation.)
Char-state usage Char 1 x 2 (crosstab) will generate a cross-tabulation of the character states of two characters. You will be asked for the numbers (CIDs) of two characters when you execute the query. Note that although explicitly coded U/V/-/TE states are present, neither implicit character states nor Null-states (i. e. states you have never used) are. Thus when you look at the row-sum for, e. g., state 1, further items may have the state 1 of the first character you selected, but for these items the second character is not defined at all. Also, note that because of multiple character states, the row sum may exceed the number of items in your data set. A second version: Char-state usage Char 1 x 2 (crosstab, names) uses the character names instead of the character number (CID) as column headers. It is not possible to display both the number and the name as column label of a crosstab in the current version of Access. You can do it only with the following list versions and Pivot analysis in Excel.
To analyze 2 to 4-way interactions between characters, including "0"-entries, row sums, and column sums, the queries: Char-state usage Char 1 x 2 (list) to Char-state usage Char 1 x 2 x 3 x 4 (list) provide the information equivalent to a crosstab in list form. The queries named Char-state usage [...] (list, names) are identical with these queries, except that they add the character and character state names to the list. This can be convenient if you want to make reports. See the section about crosstab queries above for a discussion of how such lists can be used together with the Excel pivot wizard to create very flexible multidimensional crosstab analyses.
Note that the results will only take explicit 'U'-states into account. DELTA defines unused characters (for this item no character state has been set) as implicitly unknown. It is therefore possible that, if characters 7 and 12 are crossed and the independent count of items for which each character state has been used is as follows:
CID | CS | Count
7   | 1  | 8
7   | 2  | 5
7   | 3  | 1
12  | 1  | 1
12  | 2  | 13
12  | 3  | 1
The cross-tabulation of CS1 = CS(character 7) and CS2 = CS(character 12) looks like:
CS1 | CS2 | Count
1   | 2   | 8
1   | 3   | 1
2   | 2   | 5
2   | 3   | 1
3   | 2   | 1
This is so because missing states cannot be crossed. If you want an analysis that includes the missing states, you must use the reorganize queries to insert 'Unknown' into the item descriptions. Inserting 'U'-states guarantees that each item uses all characters, so the cross-tabulation will be complete.
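The effect of missing states on a cross tabulation can be reproduced with a few lines of Python. This is only a sketch of the counting logic; the (IID, CID, CS) tuples are an assumed stand-in for the item description table, and the data are invented.

```python
from collections import Counter, defaultdict

def cross_states(descr, cid1, cid2):
    """Count items per (state of cid1, state of cid2) combination.

    descr: iterable of (iid, cid, cs) tuples (assumed stand-in for _DESCR).
    """
    states = defaultdict(lambda: defaultdict(set))
    for iid, cid, cs in descr:
        states[iid][cid].add(cs)

    counts = Counter()
    for per_char in states.values():
        # An item lacking a state for either character yields no pair at all,
        # so it silently drops out of the cross tabulation.
        for cs1 in per_char.get(cid1, ()):
            for cs2 in per_char.get(cid2, ()):
                counts[(cs1, cs2)] += 1
    return counts

# Hypothetical descriptions: items 2 and 4 each score only one of the two characters.
descr = [
    (1, 7, "1"), (1, 12, "2"),
    (2, 7, "1"),
    (3, 7, "2"), (3, 12, "2"),
    (4, 12, "3"),
]
crossed = cross_states(descr, 7, 12)
print(dict(crossed))  # {('1', '2'): 1, ('2', '2'): 1}
```

Four items are described, but only two appear in the result, because items 2 and 4 have no state for one of the crossed characters.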
Analysis of states within a character: Note that you can use the analysis queries discussed above to analyze the relationship between the character states of a single character. If you start Char-state usage Char 1 x 2 (crosstab) and enter the same character number (CID) for the first and the second character, you will analyze multistate situations within a character.
In the following example, the results for a character with 3 states indicate that character state 1 occurs 74 times alone and 4 times in combination with state 2, which occurs 126 times without other states. State 2 also occurs 10 times together with state 3, which never occurs alone. Such a result should be reason to reassess the definition of state 3. It seems likely that state 3 is not independent of state 2. Note that this is always a symmetrical matrix, since, e. g., CS1 x CS2 must be identical with CS2 x CS1.
CS (char 1) | CS (char 2) -> | CS1 | CS2 | CS3
1           |                | 74  | 4   |
2           |                | 4   | 126 | 10
3           |                |     | 10  |
If you want to tabulate complex "features" of your items, which are defined through a combination of several characters, you can use the Multi-ItemEdit dialog box to create calculated characters. Compare also the chapter Comparing characters with external classifications.
As discussed in the Introduction to character cross tabulation, the analysis queries in this section cross the character states of several characters against each other. An analysis which indeed crosses all characters against all characters, counting only whether a character is used (scored) or not, is Character usage cross tabulation. It is listed under character analysis.
Character cross tabulation, numerical characters
In addition to the cross tabulations of categorical characters against each other, 3 analyses involving numerical characters are provided. They are provided in list format only (i.e. not crossed inside DeltaAccess) and intended for use with the pivot analysis in Excel.
Numerical x categorical char (list, names)
Numerical x numerical char (list, names)
Numerical x numerical x categorical (list, names)
You can cross two numerical (IN/RN) characters alone (Numerical x numerical) or under the condition of a categorical (UM/OM) character (Numerical x numerical x categorical). You can also tabulate a single numerical character against a categorical character.
For the numerical characters, the Min, Mean, Max, and lower and upper range are given. The "Mean" column may contain a mean, median, mode or a single value. A field CalcMean is calculated, which is defined as the Mean if present, or otherwise as the mean of the upper and lower range. The queries with two numerical characters also calculate the ratio of the calculated mean of the first numerical character to the calculated mean of the second numerical character.
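The CalcMean rule and the ratio can be written down as a short sketch. The function and parameter names are hypothetical; returning None when neither a mean nor a complete range is available, and when the divisor is zero, is an assumption of this example, since the text above does not say what the queries do in those cases.

```python
def calc_mean(mean=None, lower=None, upper=None):
    """Return the mean if present, otherwise the midpoint of the range.

    Returns None if neither is available (assumed behavior for this sketch).
    """
    if mean is not None:
        return mean
    if lower is not None and upper is not None:
        return (lower + upper) / 2
    return None

def ratio(first, second):
    """Ratio of the calculated mean of the first character to that of the second."""
    a, b = calc_mean(**first), calc_mean(**second)
    if a is None or b is None or b == 0:
        return None
    return a / b

# Hypothetical values: length given only as a range, width given as a mean.
length = {"lower": 3, "upper": 7}
width = {"mean": 2.5}
print(calc_mean(**length))   # 5.0
print(ratio(length, width))  # 2.0
```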
In contrast to the list versions for categorical characters ('Char-state usage Char 1 x 2...'), the result of this list is not a count of occurrences, but the original values. Because of this, the results contain the attribute ItemName. The results of the analyses involving categorical characters can be further analyzed using pivot analysis in Excel. Excel can, e. g., calculate the mean, min, max, count, etc. of the mean values of the items.
Compare also the chapter Frequency distribution (histogram) of numerical characters.
Character state analysis (char. state usage)
To analyze how often certain character states of each character have been used, you can use Char-state usage BY Char (list/percent). The percentage is relative to all states of the same character. It pertains to all items, even if the character in question is inapplicable to some items.
The option Char-state usage BY Char (crosstab) presents this as a cross-tabulation in more compact form, but without percentages. In the cross-tabulation the special states (U/V/-/TE) are excluded. Three more analysis variants are available to restrict the analysis to numerical (...only IN/RN) or categorical (...only UM/OM) characters only, or sort the crosstab by character name instead of character number (...sort by name).
Character state usage by item cross-tabulation would be a very large matrix containing only yes/no values. Since Access is limited to 256 columns in a cross-tabulation, only a few characters could be displayed as a matrix. The information in list form is available as Char-state usage BY Item (list). It is not very useful in itself, but provided here as the basis of further analysis steps. A similar query is, e. g., the basis of the checkbox report.
Multiplicity: If you want to know how often you have used multiple states of a categorical character (type UM/OM), you can use UM/OM char-state multiplicity BY Char (list/percent). This query counts the number of times 1, 2, 3, etc., states have been used (regardless of which states, except for the special states U/V/-/TE). The first query treats combined character states (which have been defined as, e. g., '1&2' in the character definition) as a single, additional state. UM/OM char-state multiplicity BY Char (list/percent, resolving combined CS) is more exact (but slower) and resolves these cases. Use the Character usage BY Item analysis queries from Character analysis to analyze the character state multiplicity by item.
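As an illustration of what the multiplicity count means, the sketch below tallies, for each character, how many items use one, two, or more normal states. Like the first query it treats a combined state as a single state; the (IID, CID, CS) layout and all names are assumptions for the example.

```python
from collections import Counter, defaultdict

SPECIAL = {"U", "V", "-", "TE"}  # special states excluded from the count

def multiplicity_by_char(descr):
    """For each character: how many items use 1, 2, 3, ... normal states.

    descr: iterable of (iid, cid, cs) tuples (assumed stand-in for _DESCR).
    """
    used = defaultdict(set)  # (iid, cid) -> set of normal states
    for iid, cid, cs in descr:
        if cs not in SPECIAL:
            used[(iid, cid)].add(cs)

    result = defaultdict(Counter)
    for (iid, cid), states in used.items():
        result[cid][len(states)] += 1
    return result

# Hypothetical descriptions: item 1 scores two states of character 1.
descr = [
    (1, 1, "1"), (1, 1, "2"),
    (2, 1, "1"),
    (3, 1, "U"),
    (1, 2, "3"), (2, 2, "3"),
]
multiplicity = multiplicity_by_char(descr)
print(dict(multiplicity[1]))  # {2: 1, 1: 1}
print(dict(multiplicity[2]))  # {1: 2}
```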
An interesting option to revise your character definition is: UM/OM char-state multiplicity: Potential additional exclusive (Multistate=0) characters. It checks which characters could potentially be changed to MultistateType exclusive (value=0), without any item descriptions violating this condition. Exclusive characters do not allow the scoring of multiple character states of a character. For example, '1,1/2' = character state 1 OR 2 of character 1, would not be allowed. Restricting your characters to exclusive simplifies many statistical analyses, which are difficult to interpret if multiple states can be present. It is therefore desirable to define as many characters exclusive as possible. Classical DELTA defines the EUM/EOM character types for this purpose, while DeltaAccess uses the MultistateType attribute. Under the analysis section Checking data a query to check your item descriptions for violations of the exclusive multistate type can be found; in a way the current query does the opposite and looks for a potentially more stringent character definition.
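The idea of this check can be sketched as follows: a character is a candidate for the exclusive type if it is not yet exclusive and no item uses more than one of its normal states. The data layout, the names, and the use of 1 as a non-exclusive MultistateType value are assumptions for the illustration; only the value 0 (exclusive) is taken from the text above.

```python
from collections import defaultdict

SPECIAL = {"U", "V", "-", "TE"}  # special states are ignored in this sketch

def potentially_exclusive(descr, multistate_type):
    """List characters that could be switched to exclusive (MultistateType 0).

    descr: iterable of (iid, cid, cs) tuples (assumed stand-in for _DESCR).
    multistate_type: dict cid -> MultistateType value, 0 meaning exclusive.
    """
    used = defaultdict(set)
    for iid, cid, cs in descr:
        if cs not in SPECIAL:
            used[(iid, cid)].add(cs)

    # Characters for which at least one item has several states scored.
    multi = {cid for (iid, cid), states in used.items() if len(states) > 1}
    return sorted(
        cid for cid, mtype in multistate_type.items()
        if mtype != 0 and cid not in multi
    )

# Hypothetical project: character 3 is already exclusive.
descr = [
    (1, 1, "1"), (1, 1, "2"), (2, 1, "1"),
    (1, 2, "1"), (2, 2, "2"),
    (1, 3, "1"),
]
types = {1: 1, 2: 1, 3: 0}
candidates = potentially_exclusive(descr, types)
print(candidates)  # [2]
```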
The related question of character state usage of multiple characters crossed is found under the analysis type Character cross tabulation. Such cross-tabulations or contingency tables analyze, e. g., for 2 characters with 2 states each, how many items have the character state combinations '1,1 x 2,1', '1,1 x 2,2', '1,2 x 2,1', and '1,2 x 2,2', respectively.
Note that you can also cross the states of a single character. If you enter the same character number both as first and second character of the cross-tabulation, you can see how often certain states occur single or together with another state. See Analysis of states within a character for more information.
Checking data
This group of queries is concerned with checking the integrity or plausibility of your data.
Exclusive character state condition (Multistate=0) violated: If a categorical character has been defined as MultistateType exclusive (value = 0), this query finds all items where more than one character state has been entered. Example: if character '1' is defined as exclusive, then '1,1/2' violates this restriction. Compare the section on Character state usage for a query to search for potential additional exclusive characters.
The group starting with List items where... checks certain character state situations in categorical characters.
List items where all states of a char. are used (states are 'saturated'): Lists all characters and items where all character states (except the special states U/V/-) have been used (= scored). Another expression for this situation is that the character states are "saturated" for this item. You might want to consider using the special state 'V' (variable) instead. If all states are scored, no information is available for identification.
List items where both "Present" and "Absent" states are used: If a binary character has two exclusive character states, they should normally not both be used (or scored) in a single item. The special state 'V' (variable) should be used instead. Note: The query searches for a combination of the strings "presen" and "absen" (to cover "present" and "presence"). It therefore works only with English wording. If you have used "Yes" and "No" or are using a different language, you can easily modify the query in design mode.
List items where both normal and special CS are used: The special states U/V/- should normally not be used together with other character states or with each other. Such combinations do not make sense in an analytical context. The query finds all cases where both special and normal states are defined for a single item, e. g., 5,4/- or 5,U/1. This should normally be considered an error, although in rare exceptions it might not be one.
List items where 'mandatory' characters are missing: Characters can be defined as mandatory (or 'required') in the character definition. The query lists all items where this condition is violated. Compare also the option to check for potential conflicts between mandatory and inapplicable definitions. *** Currently this includes mandatory characters which are inapplicable; must be revised! *** Note that if you use the internal forms provided you will be alerted during data entry that a mandatory character has not yet been entered. This option is relevant mainly after import and after you have made a character mandatory which was not mandatory before.
Checks specific to numerical characters: The queries testing Min/Max of each or any statistic across all items are not very valuable to analyze items, but very useful to catch errors in the data set.
Numerical chars, Max of each statistic (crosstab) and Numerical chars, Min of each statistic (crosstab) will not retrieve the minimum or maximum entered for an item, but the minimum or maximum of the statistic (= numerical attribute, e. g., minimum, mean, or maximum). This can be very useful for error checking. Normally you will know the probable range of your values. Also, if the minimum of all means is less than the minimum of all minima, you have probably made a mistake during data entry. Numerical chars, Min/mean/max of any statistic even pools all statistics, i.e. it calculates the minimum of any value entered for an item for this character. This is very useful for a fast initial search for errors which result in values of different orders of magnitude, e. g., the omission of a decimal point. Then use the detailed tests to see in which statistic the errors can actually be found.
An additional plausibility test that can be performed on the statistics entered for numerical values tests the sequence of attributes: Numerical statistics, sequence violated. It checks whether Min ≤ lower range limit ≤ Mean/median/mode ≤ upper range limit ≤ Max. Note that the conditions involving the range are not strict. If, e. g., the distribution is asymmetrical and the range is defined as mean plus/minus standard deviation, the minimum can be larger than the lower range limit, or the maximum smaller than the upper range limit.
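The sequence test can be sketched as a comparison of neighboring attributes. The attribute names used as dictionary keys are hypothetical, and skipping attributes that were not entered is an assumption of this example rather than documented behavior of the query.

```python
def sequence_violations(stats):
    """Return neighboring attribute pairs that break the expected order.

    Expected order: min <= lower <= mean <= upper <= max.
    stats: dict with any of the keys below; missing or None values are skipped.
    """
    order = ["min", "lower", "mean", "upper", "max"]
    present = [(name, stats[name]) for name in order if stats.get(name) is not None]
    return [
        (a, b)
        for (a, va), (b, vb) in zip(present, present[1:])
        if va > vb
    ]

# A consistent entry, and one where a decimal point was probably omitted.
ok = {"min": 2.0, "lower": 3, "mean": 4.5, "upper": 7, "max": 11}
suspect = {"min": 2, "mean": 45, "max": 11}
print(sequence_violations(ok))       # []
print(sequence_violations(suspect))  # [('mean', 'max')]
```

A reported pair involving the range limits is a prompt to look at the data rather than proof of an error, for the reason given above.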
Test for use of "-" ("to"-operator, modifier) in UM type characters: If the '-' operator is detected in the imported data set (e. g. '2,1-3', where 2 is a UM character), it will be translated using a 'to'-modifier (unless the 'to'-operator is always used, in which case the MultistateType is changed). Frequently, the use of '-' in DELTA data sets turns out to be unintentional. If a character is of type UM, the DELTA code '1-3' is equivalent to state '1' or '3', not '1' or '2' or '3'. You have probably coded your imported data correctly, but it is worth checking. The query will list all occurrences of this situation, so that they can be confirmed or corrected.
The queries Inapplicable chars defined as mandatory and Inapplicable chars used (not updateable) from the section Inapplicable characters are useful for error checking as well.
Consistency checks if 'referential integrity' failed
Who should read this: The queries in this section are only needed if you received a message during the import process that a relation could not be created.
Referential integrity protects your data set from becoming inconsistent. For instance, the referential integrity between the character definition and item description guarantees that you cannot add a character number to item descriptions which does not exist. If you change a character number (e. g., to change the sequence of characters in the Reorganize character dialog box), the character number in the item description is automatically updated to maintain the integrity of the link.
Referential integrity can only be installed if the conditions that are to be protected are initially fulfilled. If your imported data violate these conditions, you will receive an error message stating that referential integrity could not be created.
If the relations are properly installed, these queries cannot return any records!
Only a few queries are currently provided, because many potential integrity violations will not occur in data sets imported from DELTA coded files. While it is possible to use characters/states in the item descriptions that are not defined in the character definition, it is not possible to define a character state without a corresponding character number.
If you find cases where additional referential integrity checks are needed, please inform me and send me a data set where such an error occurs.
Edit... queries
The queries starting with Edit... are editable views of one or several tables. They are provided more as an example of how to join the base tables of the DeltaAccess information model than because they are particularly useful.
Starting with DeltaAccess 1.6, several queries will be created with each project as default queries. These include queries useful for editing, like ..._StatesPlus_Ordered. However, in general it is recommended to use these queries only as a basis for your own analyses, and to use the Character and Item editor for editing.
Currently the only exception is:
Edit Modifier templates: This query is also used by the button in the CharEdit-form, since the form itself only supports enabling modifiers that are already present.
Numerical data
There are 3 queries (Mean, ranges, min/max of numerical characters...) available to view the major numerical attributes in a more compact format. All three are not updateable (i.e. you cannot use them for editing). Two queries display each attribute in a column of its own, with and without Notes; the third formats the attributes in the usual way as "(Min-) LowerRange-Mean-UpperRange (-Max)", e. g. "(2.0-) 3-4.5-7 (-11)". Since each numerical attribute may have Notes of its own, the Notes shown are collected and concatenated.
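The formatted display can be imitated with a small function. This is a sketch of the notation only; how the actual query handles missing attributes or number formatting is not described here, so leaving out absent parts is an assumption, and the function and parameter names are hypothetical.

```python
def format_numeric(min_=None, lower=None, mean=None, upper=None, max_=None):
    """Format numerical attributes as "(Min-) LowerRange-Mean-UpperRange (-Max)".

    Attributes that are None are left out (assumed behavior for this sketch).
    """
    core = "-".join(str(v) for v in (lower, mean, upper) if v is not None)
    parts = []
    if min_ is not None:
        parts.append(f"({min_}-)")
    if core:
        parts.append(core)
    if max_ is not None:
        parts.append(f"(-{max_})")
    return " ".join(parts)

print(format_numeric(2.0, 3, 4.5, 7, 11))  # (2.0-) 3-4.5-7 (-11)
print(format_numeric(lower=3, upper=7))    # 3-7
```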
The numerical data lists are sorted first by character, then by item. To sort by item first, select the column labeled IID and click on the ascending sort button in the toolbar. To restrict the list to a single character or item, mark this character or item and click on the AutoFilter button (see Filtering data).
Mean, or middle of ranges/min/max of 1 numerical char. Asks for the character number (CID) of a single numerical character. In addition to the formatted display (identical with the last query discussed above), for all characters where no mean is given, the middle value of the ranges is calculated. This query is especially useful to create a frequency distribution (histogram) in Microsoft Excel, see Frequency distribution (histogram) of numerical characters.
Sorry, this chapter is incomplete! See the description column in the List of all analysis options for some additional information on the remaining queries.
Inapplicable characters (= character dependency)
After you have defined some characters as dependent on certain character states (i.e. they become inapplicable in any item where this character state has been selected), you can view your definition using:
To check for potential error situations, you can use:
Queries involving the explicit inapplicable character-code ('-') can be found under the heading Special character states.
Reorganize project
Most reorganize functions of DeltaAccess are found in the Reorganize project definition form (reorganize characters, items, headings, and character states) or directly in the item-, heading-, or character-editor. The queries provided here only deal with implicit character states, for which no other dialog exists yet in DeltaAccess.
Implicit character states are not converted to data during import or after editing data in DeltaAccess. Treating them like actual data will be desirable for analysis and interactive identification purposes. Adding the implicit states is a comparatively complex process, and it would be very slow to integrate this into the other queries. A database draws its power from the use of indexes, and although generating a view including implicit states (union query) is not too slow, all subsequent operations will be much slower because they cannot use the indices any more.
Implicit states: Check applicable (= no other state used)
Implicit states: Insert permanently into item description (_DESCR)
The first option will give you a list of characters that would be inserted if the second option is selected. Implicit character states will then be inserted as actual data (for any item where no other character states of the same character are already set). This is equivalent to the DELTA directive "Insert implicit values".
The special state 'U' for Unknown of a character is implicitly present in all items that are unused or not recorded, i.e. no other state of a character has been scored in this item. The behavior is equivalent to defining the special state 'U' as implicit. Therefore, similar queries to the ones described above for user-defined implicit states are provided for the default implicit state 'U':
Unused chars: Insert 'U' into item description (all types)
Unused chars: Insert 'U' into item description (OM/UM only)
These queries insert the special character state 'U' (unknown) into all item/character combinations not yet used (as found by Unused item-char combinations... from the Unused Items/characters query group). This can be useful for analysis purposes, but normally "no entry" (= unused) should indicate that the character has not yet been checked, while an explicit 'U' entry indicates that it has been checked, and either positively no information is available, or contradictory evidence could not be resolved.
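What these append queries do can be pictured with the following sketch, which adds an explicit 'U' row for every item/character combination that has no state at all. It works on an assumed list of (IID, CID, CS) tuples and returns a new list instead of changing a table; the names are hypothetical.

```python
def insert_unknown(descr, items, characters):
    """Return descr plus an explicit 'U' row for every unused item/character pair.

    descr: list of (iid, cid, cs) tuples (assumed stand-in for _DESCR).
    items, characters: all item IDs and character IDs of the project.
    """
    used = {(iid, cid) for iid, cid, _ in descr}
    missing = [
        (iid, cid, "U")
        for iid in items
        for cid in characters
        if (iid, cid) not in used
    ]
    return list(descr) + missing

# Hypothetical project: item 2 has not been scored for character 12.
descr = [(1, 7, "1"), (1, 12, "2"), (2, 7, "1")]
completed = insert_unknown(descr, items=[1, 2], characters=[7, 12])
print(completed[-1])  # (2, 12, 'U')
```

After this step every item uses every character, so the distinction between "not yet checked" and "checked, but unknown" is lost; this is the trade-off described above.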
Repair and maintenance
Most of these options should not be needed, because they are either set by default, executed during import or editing of characters in the CharEdit-form, or simply not recommended. They are provided as help in system maintenance, or as repair options only.
Note that the options presented here are only concerned with local problems in a correctly functioning database. Read the section on database repair if the database itself has become corrupted, e. g., if Microsoft Access cannot open it any more.
Add character heading default AutoGroups: Add default AutoGroup headings to the character headings entity (special named character groups for identification). Automatically executed when a new descriptor project is imported or created! Note that executing this query may either add only the missing AutoGroup headings (if the default HIDs remained unchanged) or may add duplicates (if the default HIDs had been changed).
Add special CS-Queries: In normal DELTA several special character states, representing the unknown/variable/not applicable codes (U/V/-), are considered implicit. They need not be present in the character definition and are added automatically. DeltaAccess adds them as explicit character states automatically during the import process of external DELTA files. The option Add CS: U,V,- for all OM/UM/RN/IN repeats this at a later time. It might be useful, if you have accidentally deleted the special states.
Similarly DeltaAccess uses several internal character states which are not part of the normal DELTA character definition for the numeric attributes (min, mean, max, etc.) and a special default state for text characters (TE). If you experience problems, you can always run the queries starting with Add special CS: ..., which will insert possibly missing states. Usually you will receive an error message about duplicates (i.e. records to be added are already present), which can be ignored. The difference between the number of records quoted in the error message and the number originally quoted by the append query is the number of internal character states which have actually been added.
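The record arithmetic for such an append query can be sketched as follows. The example uses Python's sqlite3 module, where INSERT OR IGNORE plays the role of the Access append query that skips duplicates; the table cs and its key are hypothetical and only serve the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical character state table; the key rejects duplicate states.
    CREATE TABLE cs (cid INTEGER, code TEXT, PRIMARY KEY (cid, code));
    INSERT INTO cs VALUES (1, 'U'), (1, 'V'), (2, 'U');
""")

# The append query tries to add U, V and - to both characters.
candidates = [(cid, code) for cid in (1, 2) for code in ("U", "V", "-")]

before = conn.execute("SELECT COUNT(*) FROM cs").fetchone()[0]
conn.executemany("INSERT OR IGNORE INTO cs VALUES (?, ?)", candidates)
after = conn.execute("SELECT COUNT(*) FROM cs").fetchone()[0]

attempted = len(candidates)    # number quoted by the append query
added = after - before         # states that were really missing
rejected = attempted - added   # number quoted in the duplicate message
```

In this miniature case six records are offered, three already exist, and three are added.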
The only query in this group that is more generally useful is Add CS: Full numeric attributes for all RN/IN. If during import you did not check the extended numeric attributes checkbox, you can later do this using this query.
Note that instead of adding states globally, you can also add them to selected characters in the CharEdit-form, using the pick list in the header of the character states section, which adds default states for some numeric characters instead of all.
Modifiers: The query Item description: Move recognized Notes to Modifier is executed during setup. You can try to repeat it at a later time, but note that the modifiers will not be enabled automatically. You should enable modifiers either singly or using the predefined modifier sets.
The number of states options allow you to check whether any inconsistency has occurred in this calculated field (it is normally calculated when the CharEdit-form is used, but you might have added states directly through tables or queries), and to recalculate the NumStates attribute.
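The check and the recalculation of a stored state count can be sketched like this, again with Python's sqlite3 module and hypothetical tables chars and cs. Whether the real NumStates attribute counts the special states is not assumed here; the sketch simply counts all state records of a character:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chars (cid INTEGER PRIMARY KEY, num_states INTEGER);
    CREATE TABLE cs (cid INTEGER, code TEXT);
    INSERT INTO chars VALUES (1, 2), (2, 5);   -- character 2 is out of date
    INSERT INTO cs VALUES (1, '1'), (1, '2'), (2, '1'), (2, '2'), (2, '3');
""")

# Correlated subquery: the actual number of states of each character.
ACTUAL = "(SELECT COUNT(*) FROM cs WHERE cs.cid = chars.cid)"

# Check: characters whose stored count differs from the actual count.
mismatches = conn.execute(
    f"SELECT cid, num_states, {ACTUAL} FROM chars "
    f"WHERE num_states <> {ACTUAL} ORDER BY cid"
).fetchall()

# Repair: overwrite the stored count with the actual count.
conn.execute(f"UPDATE chars SET num_states = {ACTUAL}")
repaired = conn.execute(
    "SELECT cid, num_states FROM chars ORDER BY cid"
).fetchall()
```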
If you want to know in which items you have used one of the special character states (U/V/-) you can use:
Not applicable ('-') used (Item list)
Unknown ('U') used (Item list)
Variable ('V') used (Item list)
Searching for 'U' states only returns items where a 'U' has been explicitly entered, not the characters for which no states have been recorded (unused). Compare the section about inserting 'Unknown' into item descriptions.
The code for not applicable ('-') should only be entered explicitly where it is not already evident from the inapplicable character (character dependency) definitions (= implicitly not applicable). Two queries allow you to edit or delete cases in which the not applicable ('-') code is used redundantly:
Not applicable ('-') used explicit where already implicitly N/A
Not applicable ('-') used explicit where already implicitly N/A: delete!
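The idea behind these two queries can be sketched as follows, using Python's sqlite3 module. The tables are hypothetical: dep stores one character dependency per record (a controlling character, the controlling state, and the character that becomes inapplicable), which is an assumption made for this example and not the DeltaAccess structure. An explicit '-' is redundant if the same item has scored a controlling state:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE descr (iid INTEGER, cid INTEGER, code TEXT);
    CREATE TABLE dep (ctrl_cid INTEGER, ctrl_code TEXT, dep_cid INTEGER);
    -- State '2' of character 1 makes character 5 inapplicable.
    INSERT INTO dep VALUES (1, '2', 5);
    INSERT INTO descr VALUES
        (100, 1, '2'), (100, 5, '-'),   -- '-' already implied by 1 = '2'
        (101, 1, '1'), (101, 5, '-');   -- '-' is real information here
""")

# Row ids of explicit '-' entries that a dependency already implies.
REDUNDANT = """
    SELECT d.rowid FROM descr AS d
    WHERE d.code = '-'
      AND EXISTS (
          SELECT 1
          FROM dep AS p
          JOIN descr AS c ON c.cid = p.ctrl_cid AND c.code = p.ctrl_code
          WHERE p.dep_cid = d.cid AND c.iid = d.iid
      )
"""

# First variant: list the redundant entries for inspection.
redundant = conn.execute(
    f"SELECT iid, cid FROM descr WHERE rowid IN ({REDUNDANT}) ORDER BY iid, cid"
).fetchall()

# Second variant: delete them.
deleted = conn.execute(
    f"DELETE FROM descr WHERE rowid IN ({REDUNDANT})"
).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM descr").fetchone()[0]
```

As with the two queries above, it is safer to run the listing variant first and to delete only after inspecting the result.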
The special state 'TE' may be allowed to directly enter textual information, in effect turning a categorical character into a text character. This might be convenient, yet it is questionable in terms of data integrity. Therefore you can not only check in which items you have used the feature, but also where you have allowed it in the character definition:
Text ('TE') instead of categorical (UM/OM) data allowed (Char. def.)
Text ('TE') instead of categorical (UM/OM) data used (Item list)
Similarly, you can check where you have used combined character states, and where you have defined them:
Combined categorical char. states, e. g. '1&3' (Char. def.)
Combined categorical char. states, e. g. '1&3' (Item list)
The analyses in this group check in various combinations whether any character state has never been used so far, or more specifically, which item/character combinations do not occur.
Unused Char-states (...) checks which character states have not been used in any item. Two versions are available: ...(U/V/- included) lists unused special states together with normal states, ...(U/V/- excluded) ignores them.
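The difference between the two versions can be sketched with Python's sqlite3 module. The tables cs (defined states) and descr (states scored in items) are hypothetical and chosen only for this illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cs (cid INTEGER, code TEXT);
    CREATE TABLE descr (iid INTEGER, cid INTEGER, code TEXT);
    INSERT INTO cs VALUES (1, '1'), (1, '2'), (1, 'U'), (1, 'V');
    INSERT INTO descr VALUES (100, 1, '1'), (101, 1, 'U');
""")

def unused_states(conn, include_special):
    """List defined character states that no item description uses."""
    sql = """
        SELECT s.cid, s.code
        FROM cs AS s
        LEFT JOIN descr AS d ON d.cid = s.cid AND d.code = s.code
        WHERE d.iid IS NULL
    """
    if not include_special:
        # Corresponds to the '(U/V/- excluded)' version.
        sql += " AND s.code NOT IN ('U', 'V', '-')"
    sql += " ORDER BY s.cid, s.code"
    return conn.execute(sql).fetchall()

with_special = unused_states(conn, include_special=True)
without_special = unused_states(conn, include_special=False)
```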
The queries Unused item-char combinations (CID-order) and Unused item-char combinations (item-order) list all character/item combinations which have not been used. That is, for any given item they list all characters that have not been used in the item description. For large data sets these queries can be quite slow and return many records, which can be confusing to analyze further. Therefore try to use the following queries first:
The queries starting with Unused items per char ... list only the number of items which do not use a given character. Queries starting with Unused chars per item ... are the "character-complement" of Unused items per char... and list the number of characters not used in a given item. They are especially useful to check the extent to which an item can be regarded as 'finished'.
Both query types are available in different sorting modes (percent-order, or item-order/CID-order) and in two versions each, which differ in the way the special character states (U/V/-/TE) are treated. If the query description ends with UV- = unused, the special states are counted as if no character states had been entered, while queries ending in UV- = used count special character states just like normal character states. Use Unused item-char combinations... if you want to see which items or characters are missing. The Unused item-char combinations queries always work with the UV- = used assumption.
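How the two treatments of the special states change the counts can be sketched as follows, using Python's sqlite3 module and hypothetical tables item, chars and descr (not the DeltaAccess structure). The function returns the number of unused characters per item:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (iid INTEGER PRIMARY KEY);
    CREATE TABLE chars (cid INTEGER PRIMARY KEY);
    CREATE TABLE descr (iid INTEGER, cid INTEGER, code TEXT);
    INSERT INTO item VALUES (1), (2);
    INSERT INTO chars VALUES (10), (20), (30);
    INSERT INTO descr VALUES (1, 10, '1'), (1, 20, 'U'), (2, 10, '2');
""")

def unused_chars_per_item(conn, specials_count_as_used):
    """Count, for every item, the characters without a counted state."""
    # In the 'UV- = unused' mode, special states are dropped in the join
    # condition, so a character scored only with 'U' still counts as unused.
    special_filter = (
        "" if specials_count_as_used
        else "AND d.code NOT IN ('U', 'V', '-', 'TE')"
    )
    sql = f"""
        SELECT i.iid,
               (SELECT COUNT(*) FROM chars) - COUNT(DISTINCT d.cid)
        FROM item AS i
        LEFT JOIN descr AS d ON d.iid = i.iid {special_filter}
        GROUP BY i.iid
        ORDER BY i.iid
    """
    return conn.execute(sql).fetchall()

uv_used = unused_chars_per_item(conn, specials_count_as_used=True)
uv_unused = unused_chars_per_item(conn, specials_count_as_used=False)
```

Item 1 has one unused character when its explicit 'U' counts as used, and two when it does not.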
If you want to insert the special character state 'U' (unknown) into all item/character combinations not yet used (as found by Unused item-char combinations...), you can use queries from the Reorganize descriptor project query group; compare inserting 'Unknown' into item descriptions. This can be useful if you want to analyze certain kinds of questions. Normally "no entry" (= unused) should indicate that the character has not yet been checked, while an explicit 'U' entry indicates that it has been checked, but the question could not be resolved.