FreeCEN Administration Tools—dat2csv


WARNING: Always work on copies of your data files with these programs and keep copies of the originals safe. The programs have had the bare minimum of testing and we would not want to damage many hours of painstaking work.

If you come across a bug/problem/invalid output related to the program, I will need a copy of the data file that failed and the error listing to diagnose what went wrong.

I reserve the right to fix errors by changing the documentation.

See revision log for current version and bug fixes.

Overview

The primary function of dat2csv is to check the integrity of an IN-CENS data file. In the process it generates a .CSV spreadsheet data file and a message listing which can be used (in conjunction with csv2dat) to fix any problems found. It does a lot of checking of the overall structure of the data file looking for record corruption (such as that caused by system crashes). It also does a small amount of checking of the actual data but most of this is left to csv2dat.

A secondary function, for co-ordinators that wish to use it, is to convert a FreeCEN file into HTML format for posting on a web site. This is intended as an interim solution pending the availability of the FreeCEN database.

Running dat2csv (command line)

For running dat2csv via a menu, see FC-TOOLBOX

Create a folder on your disk (we suggest \fctools). Put the program dat2csv.exe into this folder. Copy your data file into this folder (e.g. rg121854.dat).

Run the program by double-clicking on it.

A black console window (DOS) will open which says

Enter command :

This is a prompt for a command line which must look like one of the following

Year        ENG/WLS          SCT

1841        ho41nnnn.dat     hs4zzmmm.dat (or .che or .vld)
1851        ho51nnnn.dat     hs5zzmmm.dat
1861        rg09nnnn.dat     rs6zzmmm.dat
1871        rg10nnnn.dat     rs7zzmmm.dat
1881        rg11nnnn.dat     rs8zzmmm.dat
1891        rg12nnnn.dat     rs9zzmmm.dat
1901        rg13nnnn.dat     rs0zzmmm.dat

Some pieces are rather large. To allow these to be divided so that the work can be delegated, the following is possible.

Unpadded numbers (without leading zeros) will be accepted but converted to the correct form for output files.

It works out the year and piece number from the file name (so don't change them!)
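As an illustration of the naming convention in the table above, the following Python sketch derives the census year and piece number from an ENG/WLS file name and pads an unpadded piece number. It is not the code dat2csv itself uses. The function name `parse_eng_wls_name` is hypothetical, the four-digit padding is inferred from the `nnnn` pattern in the table, and SCT names and divided pieces are deliberately not handled.

```python
import re
from pathlib import PureWindowsPath

# Prefix-to-year mapping copied from the ENG/WLS column of the table above.
ENG_WLS_YEARS = {
    "ho41": 1841, "ho51": 1851, "rg09": 1861, "rg10": 1871,
    "rg11": 1881, "rg12": 1891, "rg13": 1901,
}

def parse_eng_wls_name(path):
    """Return (year, piece_number, padded_stem) for an ENG/WLS file name.

    Hypothetical helper: assumes the piece number is 1 to 4 digits and
    that the padded form has exactly four digits (the 'nnnn' pattern).
    """
    stem = PureWindowsPath(path).stem.lower()
    match = re.fullmatch(r"(ho41|ho51|rg09|rg10|rg11|rg12|rg13)(\d{1,4})", stem)
    if match is None:
        raise ValueError(f"not a recognised ENG/WLS file name: {path}")
    prefix, digits = match.groups()
    return ENG_WLS_YEARS[prefix], int(digits), prefix + digits.zfill(4)
```

For example, `parse_eng_wls_name("rg121854.dat")` gives the year 1891 and piece 1854, while an unpadded name such as `rg1254.dat` is normalised to the stem `rg120054`.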

When you hit Enter the program will run. It is not as fast as it was, but it is still pretty quick. A scrolling record count shows how it is getting along.

When it finishes it will say how many records have been output and say

<hit return to exit>...

When you do, the window will close.

Now look in the messages file (*.erd). It gives some information about the run, says again how many records have been output, and lists any other problems detected. Keep this file with the output *.csv file; it will be useful for correcting the data if necessary.

The types of messages are “Fatal” which stop the program immediately and may not produce an output file; “Error” which may produce an invalid output file (the .CSV will be ok but it may not generate a good IN-CENS file when processed with csv2dat); and “Warning” which will attempt to correct the error and produce a valid output file. See also “Info” messages switched on with the -v VERBOSE switch.

If any “Error” messages are seen (and “Fatal” of course) you will certainly need to correct the data before accepting it. Warnings may or may not need fixing immediately.

Note that one error can sometimes generate other spurious ones. A lot of things that are warnings in dat2csv are errors in csv2dat and vice-versa because they are concentrating on different problems.

Error & warning messages

Running dat2csv (Advanced instructions)

These extra instructions are for advanced use of the program, not for novices.

The file name can be a full path specification, e.g. D:\folder\rg12nnnn.dat. The output and message files will be put in the same path.
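To find the output and message files after a run, a small Python sketch like the one below can build the paths to look for. It rests on an assumption: that the file name stem is kept and only the extension changes to .csv and .erd. The text above confirms the folder and the two extensions but not the exact names, and the function name `expected_outputs` is hypothetical.

```python
from pathlib import PureWindowsPath

def expected_outputs(data_file):
    """Guess where the spreadsheet and message files will appear.

    Assumption: same folder and same stem as the input, with the
    extensions .csv and .erd. Check against an actual run before
    relying on it.
    """
    source = PureWindowsPath(data_file)
    return str(source.with_suffix(".csv")), str(source.with_suffix(".erd"))
```

Under that assumption, `expected_outputs(r"D:\folder\rg121854.dat")` points at `D:\folder\rg121854.csv` and `D:\folder\rg121854.erd`.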

The program can also read VALD-REV data files, which have the extension .VLD, or CHECK-CENS data files, which have other extensions (normally .CHE).

The default mode creates a brief (skeleton) .CSV file with a lot of repetition omitted and a selection of messages (Error and Warning) relating to problems with the file, if any.

There are alternatives available by using a switch before the file name e.g.

-w rg12nnnn.dat

The switches are

These switches (above) are mutually exclusive and cannot be combined.

These switches (above) are not available via the FC-TOOLBOX menu.

The command can also be run from a DOS (Command) console from the start by following the command name with the options and file name. e.g.

D:\fctools>dat2csv -v -a rg129876.che

DOS style / switches can be used if preferred, but they are always placed before the file name.

Finally, the program can also be invoked by drag-and-drop of a data file onto the program or a shortcut, e.g. on the desktop. No opportunity is given to enter switches, but it is a very quick way to make default runs. The output and error files are sent to the same folder as the source.

The return code from dat2csv is 12 if any Fatal errors occurred, 8 if any Errors occurred, 4 if any Warnings occurred and zero otherwise.
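The return code makes it possible to drive dat2csv from a script. The Python sketch below is one way this might be done, not part of the FreeCEN tools. The names `describe_return_code` and `run_dat2csv` are hypothetical, and sending a newline on standard input is only a guess at how to answer the "hit return to exit" prompt should it appear in this mode.

```python
import subprocess

# Severity for each return code, as listed in the paragraph above.
SEVERITY_BY_CODE = {0: "OK", 4: "Warning", 8: "Error", 12: "Fatal"}

def describe_return_code(code):
    """Translate a dat2csv return code into its severity name."""
    return SEVERITY_BY_CODE.get(code, "Unknown")

def run_dat2csv(exe, args):
    """Run dat2csv and report the worst severity it signalled.

    Assumption: a single newline on standard input is enough to get
    past any final prompt; this has not been checked against the
    real program.
    """
    completed = subprocess.run([exe, *args], input="\n", text=True)
    return describe_return_code(completed.returncode)
```

A call might look like `run_dat2csv(r"D:\fctools\dat2csv.exe", ["-v", "-a", "rg129876.che"])`, with switches placed before the file name as described above.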

Web page generation

Handling of deleted records

What is done with records that are marked in the data file as deleted depends on which options are chosen. All modes put out a suitable warning message. In the default (BRIEF) mode the word “DELETED” is also inserted into the “Notes” column of the spreadsheet, which can be used to find and check them. With -a (ALL) mode, this is not necessary because column A of the spreadsheet already contains the Deletion Marker Flag = ‘D’. If the -z (STRIP) option is chosen (possibly with any other switches) then the deleted records are removed from the output file entirely. It should be noted, however, that in this case the correspondence between the record numbers (and their associated warning messages) and the spreadsheet rows is broken, which can make diagnostics difficult. If the -w (WEB) switch is chosen then -z (STRIP) is assumed and cannot be switched off. This is to guard against deleted records inadvertently making it onto the web page.
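The rules in the paragraph above can be summarised as a small decision function. The Python sketch below only restates those rules; it does not reproduce dat2csv's internals. The function name `deleted_record_handling` is hypothetical, and treating switch letters as case-insensitive is an assumption.

```python
def deleted_record_handling(switches):
    """Describe what happens to deleted records for a set of switches.

    Accepts switches written as '-z' or DOS style '/z'. Lower-casing
    the letters is an assumption, not documented behaviour.
    """
    chosen = {s.lower().lstrip("-/") for s in switches}
    # -w (WEB) always implies -z (STRIP), so both remove deleted records.
    if "w" in chosen or "z" in chosen:
        return "removed from the output file"
    # -a (ALL) already carries the deletion flag in column A.
    if "a" in chosen:
        return "kept, flagged 'D' in column A"
    # Default (BRIEF) mode marks the record in the Notes column.
    return "kept, 'DELETED' inserted in the Notes column"
```

In every case a warning message is still written to the messages file, as stated above.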


©2002–04 (last updated 10 Sep 2004) Rick Parsons, Bristol, England