DeltaAccess –
a SQL interface to DELTA (Description Language for Taxonomy), implemented in Microsoft Access

Appendix 1: Advanced database administration information

   Fingerprinting data sets
   Securing an Access database
   Repairing a corrupted Access database
   Linking to other data sources
   Linking to other database subsystems
   Merging descriptor projects (data sets)
   Next table of contents: Appendix 2: Development documentation

Previous table of contents    Main table of contents

 Fingerprinting data sets

Scientific data should be shared as much as possible. Yet, since the procedures of electronic publication, authorship and copyright to data sets are not yet settled (note that major changes to data set copyright were introduced around 1997), this if often not the case. The protection of the intellectual property rights of databases is a difficult problem. Regarding DELTA data sets, it seems that currently almost nobody is willing to share her or his work, because of concern about fraudulent use.

One method, which can be used to share data sets without loosing control over them, is to fingerprint them. Fingerprinting allows other people to use them, but it allows the author to recognize that somebody is in fact using her or his work. There are two variants that can be used:

Destructive fingerprinting

This method arbitrarily changes data in the data set. You could simply exchange some item names, swap parts of item descriptions, or arbitrarily change character states. This is a rough method, which seriously reduces the usefulness of the data set. It can be used to send out an example, or to send (e. g. me) a data set which produces a certain kind of error with DeltaAccess or some other DELTA software package.

Steganographic fingerprinting

With a little bit more effort, a data set can be changed in a way which is neither obvious nor creates major problems in using the data set, e. g., for identification methods. Depending on the size of the data set, change 2 to 10 items for each type of character:

Text characters: It is very simple to change textual information in a way that is very difficult to find: the spelling of the publication reference of a species, final page of literature references, deviating abbreviations of journals, etc. The idea is that the changes should be such that they are not likely to arise if somebody else collects the data independently, and that they are not easily removed by rewriting a few sentences.

Numerical characters: Change the least significant digit of the mean: For example changing 2.5-3.8-4.6 to 2.5-3.7-4.6 has almost no influence on the usability but uniquely identifies the data set.

Categorical characters: These are most difficult to change without impeding identification results when the data set shall be used. You can always change Modifiers or Notes, and you could change some character which is usually not used in identification, but present for analysis purposes only (e. g. biochemical data).


The logic behind fingerprinting is: if anybody invests enough work to find the minor errors introduced by the fingerprinting, she or he has probably also discovered some real errors present in the data set, and spent enough time improving the quality of the data, to demand a kind of authorship her- or himself.

Fingerprinting is commonly used in commercial data collections. For example, any serious address collection will have some incorrect entries, which possibly point to private addresses of employees of the publisher. If anybody start using these addresses in forbidden mailings, or incorporates them into another address collection, the source can be identified.

Appendix 1: Table of contents    Main table of contents

 Securing an Access database

Who should read this: Experienced administrators who want to restrict access to their data to specific users or user groups.

DeltaAccess databases can be protected from being accessible to anybody, using the Microsoft Jet security model. Since DeltaAccess is distributed under a GNU General Public License, any user is allowed to secure her or his version of DeltaAccess.

To secure your database, you must have a full version of Microsoft Access. It is not possible to do the following with a run-time version. Please consider the alternative method of copyright protection using data fingerprinting.

Database password protection: A simple method of protecting data is setting a password for the entire database file. Close all forms of DeltaAccess, leaving the database explorer open. In the Tools menu, select Security and then Set Database Password. All users must now enter this password before they are allowed to open the database. This is an easy way to prevent unwanted users from opening a database. However, no other security measures are provided once it is open.

User-level security model: A much more powerful security model is the user and group level security of the Microsoft Jet Database engine. Users and groups can be given very specific rights or restrictions for each object separately. Some users may be allowed to modify the character definitions, while other users may only read them.

Using user-level security requires a good knowledge of the security model and can be complicated. I recommend studying the documentation provided by Microsoft (see Security in the Microsoft Access help contents). The following chapter Tips on user-level security may help you to get started and avoid some potential pitfalls. It is, however, not a step by step introduction.


Note that DeltaAccess exports data during the backup process to an unprotected database. This is intentional. It avoid problems during upgrading or when exchanging data with other users. (It can be quite confusing to unprotect a database.) If you use database security, keep all backup copies made by DeltaAccess in a secure place, and protect them by other measures.

Appendix 1: Table of contents    Main table of contents

 Tips on user-level security

This chapter is no introduction to user-level security. It may clarify some points, after you have read the MS Access documentation on the topic.

User and group definitions are stored in the system.mdw database. Each installation of MS Access (or any other application using Jet) installs this file. If multiple users in a network use databases concurrently, they must use a single security workgroup database. The MS Access Workgroup Administrator, a link to which can be found in the main office directory, allows the administrator to change the security workgroup database to which an installation of Access is connected.

The default account is called, e. g., Admin or Administrator (the name depends on the language version of Access that you have installed). Any new installation of MS Access uses that account. As long as this account does not have a password, you are not given the option to log in under a different user name. To start using user-level security, you thus must enter a password for the administrator account.

Enter additional users and groups. (Unfortunately, user passwords can not be set from the administrator account. You must close Access, and log in under the new user name to set a password.) It is good practice to give rights only to groups and not to users. Create the groups you consider necessary, e. g., "Developer", "Scientist", and "Typist".

During the creation of groups and users, you are asked for a personal security ID (PID) for this user or group. This is a crucial point of the security model. You should make a note of the string you have entered here and keep it at a secure place. Note that the PID is case sensitive. The string is used to create an internally unique identifier in the workgroup. If another workgroup has defined the same group names, but did not use the same PIDs, the groups will be treated as different by Access. However, if you install your database in a new workgroup, and create groups with the same name and PID as in your original workgroup, the new groups will have the same rights as they used to have in the previous workgroup.

Thus, if anybody gets to know the names of groups and their PIDs, e. g., by stealing your written note, she or he can make a copy of the database, and create the appropriate groups in a new installation of Access. It is not necessary to know any user names or passwords, since she or he can create any user, and assign this user to the duplicated groups. On the other hand, if you do not know the PIDs of your own groups, it may be impossible to update your data to a new version of DeltaAccess or Microsoft Access at a later point. Also, your system.mdw might become corrupted or lost, in which case you will be locked out from your own security. If you know the PIDs and group names, you can recreate the most vital part of the system.mdw. The easiest way to understand the PID is to view them as a password for the right to create a specific user or group.

Now revoke or grant your new groups for each object in the database the desired rights. A good strategy would be to make all forms, reports and modules read only. Grant read or write rights to the project tables (or queries in the case of subsets).


If you have followed these steps so far, you have secured your users. This will avoid unintentional changes or deletions by users, which should not be allowed to do certain changes. Unfortunately, the database itself is not yet secure! The reason is that anybody who does not use your workgroup definition (system.mdw), but has access to the database (whether by stealing it or by normal access over a network) has still all original administration rights. The following two steps describe how to create a truly secure database.

The database and objects in it have an owner, who has always the right to the object, whether she or he has explicit user rights or not. The owner of DeltaAccess, as distributed, is the administrator! (This is necessary, if I want to give you full access, including the chance to fully secure a version of DeltaAccess. Only these administrator rights, give you the right to secure DeltaAccess.) Again, as in the situation above, any newly installed version of Access has full rights to the database! You must therefore change the ownership as well, to secure the database. Note that a user with administration rights can assume ownership of all objects in the current database. Unfortunately, you can not assume ownership of the database itself. Only this process in non-reversible, and is therefore required. To fully secure a database, you must create a new database while you are logged in under a username you have created yourself, i.e. not as the database administrator. Import all objects from the other database into your new database. Doing this, you will become the owner of these objects.

Another common error which is easier to fix than the ownership issue, is to remove users and groups from the default groups (like the administrators group). The effects is that indeed user rights are restricted to the rights given to the user itself or to the groups to which she or he still belongs. The problem is, that although the operation in the current workgroup (system.mdw) has been secured, the database has not been secured. Any user accessing the database (e. g. DeltaAccess.mdb) with a newly installed version of Microsoft Access will be an administrator in her or his own system.mdw. Thus, it is important to revoke all rights the default groups have to the objects in the database you want to protect. Select the administrator user, select all tables, queries, etc., and revoke all rights. Do the same for the default groups which are present in a newly installed version of Access.


DeltaAccess has not yet been tested in a fully secure mode. If you have problems, please contact me. You may have to set more differentiated rights for some forms. If you figure out what you have to change, I would appreciate a note. If sufficient interest exists, I could create default groups in the distributed version of DeltaAccess, which could be used as templates for your own groups.

My recommendation about implementing security is, not to secure DeltaAccess.mdb itself. Create a new database under a non-administrator user account as described above. Import only the tables and queries of your base and subset projects. Secure these fully in the new database. Open the original DeltaAccess.mdb again and delete the tables and queries you have imported to your new database. (If you use the management option Delete project, forms will be deleted as well; use the management option "add editing forms" at a later time to recreate the forms.) Create links back into the original DeltaAccess.mdb. You now have table like objects, which function almost like a table, but do not contain any data. The process is similar to linking tables from SQL-server into DeltaAccess. You do not have to revoke any right from these links. Even if a link has full rights, the external source of the data, i.e. the mdb file which you have secured, will limit the rights. The advantage of this solution is that it is easy to upgrade to new versions of DeltaAccess. You do not have to test which application objects require which rights for which users.

Note that it would be very difficult to create a DeltaAccess user interface for administering user-level security. A considerable knowledge of the local situation (in a network) is required, and security would not be complete unless the programmer has no ownership of the database anymore. To administer any security model in a local area network, a database administrator is required.

Appendix 1: Table of contents    Main table of contents

 Repairing a corrupted Access database

If your system is stable, database corruption occurs very rarely under Access 97. It usually occurs if your computer or the operating system crashes, and you could not close your programs normally. I have not had a case yet, where the options 2.a) or 2.c) did not work.

The following is a short list of things you should do to repair a corrupted Access database (*.MDB) file:

1. Close Access and make a copy of the database file. This is important, because sometimes a failed repair attempt can prohibit another attempt, which would otherwise have been successful. In a network, make sure no other user is accessing the database.

2. Try one of the following options. Always work on a new copy of your original database. Each option may repair your database. Use the next option only, if the previous option was not successful.

2.a) Reopen the database in Access, and select from the Tools menu, Database Utilities the option Repair database. This is the automatic repair mechanism build into Access, which rebuilds all internal data structures. If a message tells you that the repair was successful, select the option "Compact database". Usually these two steps are sufficient.

2.b) If this was not successful, make a new copy from your original corrupted file, and reverse the steps: First compact, then repair. This may in fact not be different from 2.a) anymore. My experience with it goes back to Access 95 and it probably was a bug in the repair process of Microsoft Access 95, which may have been corrected in Access 97.

2.c) If the error still persists, it might be due to corrupted code. Change the VBA project name, which can be found in the Tools menu, Options, Advanced tab, in the middle lower right of the dialog box. Change the name to any name, e. g. "TMP". This causes all modules to decompile. Execute step 2.a). If the corruption has been solved, you can change the VBA project name back.

2.d) If neither of these steps has helped, the last things you can try before you decide that you have to go back to a backup, is the following: Get a fresh copy of the DeltaAccess application and use the restore option, using your old, corrupted database file (DeltaAccess.mdb) as a backup file. You should rename the file first, to avoid confusion with the fresh copy of the same name. If the corruption involves only forms, reports, or code but not your data, you could be successful. If the automatic restore option of DeltaAccess fails, you could even try to create a fresh, empty database in Access (File, New) and manually import the database tables of your descriptor project (all tables starting with your project name; use the standard Access tools found under the File menu) into that. If that was successful, you could try to repeat the automatic restore explained above. I had no occasion, where this was necessary, so I explain this is just as a possibility.

3. Rename the copy back to your original filename (usually DeltaAccess.mdb).

Appendix 1: Table of contents    Main table of contents

 Linking to other data sources

Who should read this: This chapter is relevant to you only if you want to store your data outside of DeltaAccess, e. g. on a SQL server in a network, but still want to use the analysis and editing functions of the DeltaAccess application. Storing data on a SQL server can be useful if many users access the data (e. g., an average of more than 20 concurrent users) or if you want to publish your data online on the internet.

There is no automatic routine to link external data sources (from Microsoft SQL-Server, or other databases) into DeltaAccess. The exact procedures depend on the database server available. I assume that this task is performed only by experienced users and therefore provide the following instructions to do it manually instead. Linking data sources is not possible with a run-time version of Microsoft Access; you need a full version. Two tasks can be distinguished:

a) Linking tables which already exist in an external database into DeltaAccess

Linked tables are a special feature of Microsoft Access. A linked table can be opened like an internal table, but the data remain in the external source. The table is not imported. Changes in the external data are immediately reflected inside Microsoft Access. Linked tables are represented in the database explorer by an arrow in front of the table symbol (e. g. ). The linking of existing tables is necessary, e. g., if you integrate other database subsystems manually; see also the following chapter Links to other database subsystems.

To manually link a table, select the database explorer window and use the menu entry Get external data, Link tables in the File menu of Microsoft Access. The following process depends on the driver for the external database. Consult your Microsoft Access documentation if you can not find your data source or the linking does not work (see the Microsoft Access help topic Import or link data). You may have to install additional drivers either from the Microsoft Access package (not all drivers are installed in the default setup) or you may have to install an ODBC driver supplied by the manufacturer of your database server. Perhaps try to link any table from another Access database (mdb-file; use, e. g., a backup of a descriptor project) to get acquainted with the process and make sure your problems are driver specific.

b) Porting the information model used by DeltaAccess to an external database server

If you want to keep the data of a descriptor project on a SQL server database server and use DeltaAccess as a front end to it, you must first create appropriate data structures on the server, before you can link these tables back into DeltaAccess. Porting data to a server usually requires administrator privileges.

To create a functional descriptor project with data residing on a SQL server, first create an empty new project. Now create a table on your database server for each table created in DeltaAccess. This must be done manually, unless you migrate to Microsoft SQL-Server. A very useful porting tool for MS Access databases (or other ODBC sources) is the dbScripter (see www.dkgadvancedsolutions.com). It can convert the table structure to a SQL script, which can be run on the database server.

The foreign tables must have the same structure as the ones in Microsoft Access, especially regarding the field names. The sequence of fields (= attributes) does not matter. The field types should be equivalent to the types used in DeltaAccess. Replacing some types should work well (e. g. using double precision real numbers where DeltaAccess uses single precision, or integer where DeltaAccess uses Boolean = Yes/No). The most difficult situation occurs, if you have no equivalent type to the memo type used by DeltaAccess (= VARCHAR; up to 64000 characters). You should test the behavior of DeltaAccess and inform me about errors. I may be able to improve the error handling in a future version of DeltaAccess.

Some functionality of DeltaAccess described in this manual relies on relational integrity rules being implemented. Beware if you use a database which does not implement these rules.

After you have created the tables of the DeltaAccess information model in the external database, you must link these tables back into DeltaAccess, see the task discussed above.

Next, delete each table created for your empty new descriptor project inside Microsoft Access (select it in the database explorer and press the Delete key) and rename the equivalent linked tables from the foreign source to these names. The table names in the foreign source (SQL server) and inside Microsoft Access can be different, but you must use the standard names for the linked tables inside DeltaAccess. To rename a linked table click twice on it in the database explorer.

You now have an empty descriptor project, where the data reside on the SQL server and the editing and dialog forms are available in DeltaAccess. Avoid to delete this descriptor project using the delete management option. This will break the connection to the SQL server (it will not delete the data though).

Note that you can not import DELTA files directly into such a server based descriptor project! The import process always deletes existing projects and creates a new project. To import data from DELTA into your SQL server tables, import the data set to a different (temporary) project name, and use the overwrite management action, to move the data to the SQL server. This action should preserve the connection to the server (this has not been tested yet, please inform me if problems occur!).


Note that the information model and thus the table structure may be changed in new versions of DeltaAccess. When upgrading to a new version of DeltaAccess, the restore backup mdb function changes old projects to conform with the current information model. If you have moved data to a server, you may have to do this manually. The information model on the client and on the server side must be identical. You have several options:

c) Possible problems

Experiments were made with linked external tables, but porting descriptor projects to a SQL database server has not been tested. I will try to help you if you experience problems. The following problem was already found:

When you create a linked character subset, a special table containing the CID numbers included in the subset is created. DeltaAccess tries to protect this table through referential integrity to make it immune to changes of CID. Such changes occur, e. g., if you reorganize the sequence of characters. If the subset is based on a descriptor project linked in from an external data source, referential integrity will not be possible. In contrast to character subsets, linked item subsets should be no special problem. Static subsets are vulnerable to changes of IID, but dynamic subsets should work as expected.

Appendix 1: Table of contents    Main table of contents

 

Migrating to Microsoft SQL-Server: If you consider using Microsoft SQL-Server as your database server, a migration tool called upsizing wizard (http:// www.microsoft.com/ AccessDev/ProdInfo/ AUT97dat.htm) is available free of charge from Microsoft. This tool will create tables with a compatible data structure on the MS SQL-Server and will create triggers simulating the relational integrity rules that were defined in Microsoft Access. It also will automatically move your data, and give you much information about the cooperation of Microsoft Access and Microsoft SQL-Server. A good book about the interaction between both systems is Access and SQL Server by Viescas, Gunderloy and Chipman (Sybex). If you are using MS SQL-Server 6.x, be sure to select "Use Triggers" and not "DRI" (= declarative referential integrity) when exporting the referential integrity. Only triggers will export the cascaded updates and deletes, which are necessary for the functionality of DeltaAccess.

 Linking to other database subsystems

Who should read this: Experienced administrators who want to integrate other database subsystems into DeltaAccess. See also the general introduction to the concept of database subsystems in the introduction. The previous chapter Linking to other data sources contains additional technical information about the process of linking data sources into Microsoft Access.

The attributes CollUnit, LitRef, and ItemName of the Item entity should link into separate database subsystems for collection management, literature references, and nomenclature respectively. Currently these attributes are simple text fields, not protected by relational integrity. You can enter any text, whether it is available in your specimen, literature, or nomenclature database or not. This clearly is not how it should be. The entries should be selected using pick lists linked directly into the foreign database, and a double-click should open the linked record in the foreign subsystem, e. g. a literature reference.

I can do this for my own literature and collection subsystem database (an unpublished Microsoft Access application), but other databases must provide an ODBC driver and you have to do the linking manually using the standard Microsoft Access tools. If enough experience is accumulated, it may be possible to generalize this process sufficiently, that it can be performed with the help of a wizard like tool. Anybody volunteering in this area is more than welcome!

I would very much like to link commercial literature/reference software into DeltaAccess. I would be grateful to anyone who could help me with this, by checking if his or her software allows Microsoft Access to link to any part of it (preferably a list of unique reference identifiers, if possible user readable like using the Harvard format). In general this would either mean that the reference manager uses a standard PC database format, or that it provides an ODBC driver. I currently do not know of any reference manager that provides this.

Some technical hints about using links in Microsoft Access can be found in the previous chapter Linking to other data sources.

Appendix 1: Table of contents    Main table of contents

 Merging descriptor projects (data sets)

Who should read this: Experienced administrators who manually want to merge existing descriptor projects into a joint project, possibly using subsets to offer multiple views on the single data set. The general rationale of using separate projects has already been discussed in the introduction.

DeltaAccess supports multiple independent descriptor projects. This corresponds with the current use of DELTA in taxonomy. Yet, in many cases it will make sense to join existing projects concerning closely related subjects into a single project. Multiple subset views, restricting the visible characters or items, can be created for different researchers, taxonomic groups, or regions. In contrast to the text-file-based classical DELTA programs, DeltaAccess supports multiple, concurrent users as long as they are connected through a local area network.

The main problem in merging data sets is that usually the character definitions of the descriptor projects are incompatible. If they are compatible, you only need to append the item definition and item descriptions, see the chapter Appending data to a project.

If the character definitions of the descriptor projects are dissimilar, you have to merge the character definitions first. This can be a difficult process, because usually only experts can decide which characters are identical and which are not. It is not sufficient that the character name is identical: the same term might be used for entirely different features if the item groups are sufficiently distinct.

A true merge, which will append those characters which have identical meanings, but append all other characters is not yet available among the options of the append management action. Two semi-manual methods are therefore outlined in the following paragraphs:

Semi-manual method to merge two descriptor projects, inside DeltaAccess

To achieve a merging of two projects with only partially compatible character definitions, you can do the following:

Examples:

 

First project "P1":

Second project "P2":

Situation: Character 1 and 2 are compatible, character 3 has to be manually resolved, while the color has been recorded only in the second project.

 

1. notes
2. substrate
3. spore length
4. spore width

1. notes
2. substrate
3. spore size
4. color

Step 1. Append the character definition of the second project to the first project using the Append management action (leave "append item definition from" empty). The characters of the second project will be appended using consecutive ID numbers after the last number used by the first project. Note that the project name of the second project will be appended to the character names, to guarantee that the names are unique.

 

1. notes
2. substrate
3. spore length
4. spore width
5. notes [P2]
6. substrate [P2]
7. spore size [P2]
8. color [P2]


 (unchanged)

Step 2. Move all characters in the second project up, using the Reorganize characters tab page of the Reorganize project definition form. If you set the Steps-value to the last character number in the first project ('4' in the example), you can do this with a single click on the Move-down button. The second character set should now have identical character numbers with the newly imported additional characters in the first project.

 


 (unchanged)

5. notes
6. substrate
7. spore size
8. color

Step 3. Delete all newly imported characters in the first project, which have already a corresponding CID coming from the first project. You can do this in the Reorganize tab page used above, where you can select multiple characters and delete them in a single operation.

 

1. notes
2. substrate
3. spore length
4. spore width
7. spore size [P2]
8. color [P2]


 (unchanged)

Step 4. In the second project, edit each character you have deleted in the first project, and change its CID to the CID of the corresponding character in the first project. The simplest way to do this is to directly edit the CID number in the character editor, since no other character will use the same number.

 


 (unchanged)

1. notes
2. substrate
7. spore size
8. color

Step 5. Append only the items of the second project to the first project using the Append management action.

     


Manual method to merge two descriptor projects, using DELTA coded text files (export/re-import)

Open the Reorganize method, select all characters of the second project and "move" them as many steps down as the first project has characters. The result is that all characters will be renumbered in a way that the two projects have separate sets of character numbers, which do not overlap. If you have verified that some characters are identical, you can move these characters down again, so that their CID number equals the CID number of the corresponding character in the first project. Export both projects to DELTA coded text files, and join the two projects by manually appending the additional characters of the second project to the character definition of the first project. Do the same with the item descriptions and re-import the combined DELTA file. Use the Reorganize character states method to join those characters which are identical indeed (join all character states of these characters). Beware that some DeltaAccess features might be lost when you export to and re-import from DELTA coded text files!

Appendix 1: Table of contents    Overview manage projects    Main table of contents

 Troubleshooting during installation

Who should read this: Users who have problems installing DeltaAccess the first time.

So far I found the following sources for problems you might have:

Installation    Main table of contents    Next