This information has been obtained by studying the format of actual files. Some subtle details may have been missed.
(The first letter in each row is the column name in "dat2csv -a" output; the second letter is the column name in the default output.)
-a  DEF  Offset  Len  Description
A             0    1  Deletion marker (D or blank)
B             1    4  Not used. Was Registration district - this usage is discontinued.
C             5    6  A six digit number (leading zeros) which counts the households
D            11    4  A four digit number (leading zeros) which counts members in each household
E   A        15   20  Parish name (I don't check this, it is up to you to get it right!)
F   B        35    4  Enumeration district (3n+1a, the remaining numeric fields have trailing spaces)
G   C        39    5  Folio number (4n+1a)
H   D        44    4  Page number (4n)
I   E        48    4  Schedule number (3n+1a)
J   F        52    5  House number (4n+1a)
K   G        57   30  House/Street name (default -)
L   H        87    1  Uninhabited flag (b, u, v, n, x or -)
M   I        88   24  Surname (capitals, default -)
N   J       112   24  Forenames (default -)
O   K       136    1  Flag for name fields (x or -)
P   L       137    6  Relationship (default -)
Q   M       143    1  Condition (M, S, U, W or -)
R   N       144    1  Sex (M, F or -)
S   O       145    3  Age (no default but 999=unknown/unreadable)
            148    1  Age unit (y, m, w, d or -)
T   P       149    1  Flag for detail fields i.e. rel/cond/sex/age (x or -)
U   Q       150   30  Occupation
    R                 Employed Status (extracted from occupation field)
V   S       180    1  Flag for occupation (x or -)
W   T       181    3  County code (3a capitals, no default but UNK if not known)
X   U       184   20  Birth place (default -)
Y   V       204    1  Flag for birth place (x or -)
Z   W       205    6  Disability (default blank)
AA  X       211    1  Language (W, E, B, G or blank)
AB  Y       212   44  Notes (default blank, no case conversion)
            256       Start of new record
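The offsets and lengths above can be turned into a small reader. The following is only a sketch: the field names, the function names and the choice of latin-1 decoding are my assumptions, and it presumes that records follow one another with no separator, as the "256 Start of new record" row suggests. Only a selection of the fields is listed.

```python
# (name, offset, length) taken from the table above.
# The short names are illustrative; they are not part of the file format.
RG_FIELDS = [
    ("deleted", 0, 1),
    ("household", 5, 6),
    ("member", 11, 4),
    ("parish", 15, 20),
    ("enum_district", 35, 4),
    ("folio", 39, 5),
    ("page", 44, 4),
    ("schedule", 48, 4),
    ("house_number", 52, 5),
    ("address", 57, 30),
    ("surname", 88, 24),
    ("forenames", 112, 24),
    ("age", 145, 3),
    ("age_unit", 148, 1),
    ("occupation", 150, 30),
    ("county", 181, 3),
    ("birth_place", 184, 20),
    ("notes", 212, 44),
]
RECORD_LENGTH = 256


def parse_record(raw: bytes) -> dict:
    """Slice one 256-byte record into a dict of stripped strings."""
    if len(raw) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(raw)}")
    # Assumption: one byte per character; latin-1 never fails to decode.
    text = raw.decode("latin-1")
    return {name: text[off:off + size].strip() for name, off, size in RG_FIELDS}


def read_records(path: str):
    """Yield parsed records from a file of back-to-back 256-byte records."""
    with open(path, "rb") as handle:
        while True:
            raw = handle.read(RECORD_LENGTH)
            if len(raw) < RECORD_LENGTH:
                break  # end of file (or a truncated final record)
            yield parse_record(raw)
```

Keeping the layout in a list rather than in hard-coded slices makes it easy to add the remaining fields, or to extend the list for the longer record formats described later.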
Most of the alphabetic fields are converted so that each word starts with a capital letter. The exceptions are shown above. The algorithm used in IN-CENS does not allow for initial punctuation such as ( or ", but csv2dat does a better job.
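To illustrate the kind of conversion meant here, the sketch below capitalises the first letter of each word while stepping over any leading punctuation. It is not the actual algorithm from IN-CENS or csv2dat, which I have not reproduced; the function name and the rule of leaving words that start with a digit untouched are my own assumptions.

```python
def capitalise_words(text: str) -> str:
    """Capitalise each space-separated word, skipping leading punctuation
    such as an opening bracket or quote."""
    words = []
    for word in text.split(" "):
        lowered = word.lower()
        for index, char in enumerate(lowered):
            if char.isalnum():
                # Stop at the first letter or digit; only a letter is changed,
                # so a word such as "2nd" is left alone.
                if char.isalpha():
                    lowered = lowered[:index] + char.upper() + lowered[index + 1:]
                break
        words.append(lowered)
    return " ".join(words)
```

For example, `capitalise_words('(late) "farm" LABOURER')` returns `(Late) "Farm" Labourer`.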
The same as the RGnn file format except where shown:
-a  Offset  Len  Description
B        1    4  Book number (4n) (ENG/WLS). Was 1841 CON Parish number - this usage is discontinued.
C        5    6  Household number (sometimes with trailing spaces)
D       11    4  Members in household (sometimes with trailing spaces)
I       48    4  Blank (schedule not used)
J       52    5  Blank (house number not used)
P      137    6  Blank (relationship not used)
Q      143    1  - (condition not used)
W      181    3  County code (3a capitals - INC, OUC, IRL, SCT, OVF, UNK)
X      184   20  Blank (birthplace not used)
Z      205    6  Blank (disability not used)
AA     211    1  Blank (language not used)
When downloaded this is UKCENSIN.TXT until the tutorial has been completed.
Offset  Len  Description
     0    1  A Double Quote
     1    3  The County Code
     4    5  The PRO Reference = RG12 or HO107
     9    9  The Date of the Census = ddMonyyyy (sometimes just the year)
    18    6  Flags as set up in the Settings part of IN-CENS
    24   41  Blank (Spaces)
    65    1  A Double Quote
    66    1  A New Line Character
This file remains fixed once it is set up by the initial start of IN-CENS.
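As a rough illustration of this layout, the sketch below pulls the fields out of the 67-character record. The function and key names are mine, the flags are returned uninterpreted because their meaning is not described here, and latin-1 decoding is an assumption.

```python
def parse_settings(raw: bytes) -> dict:
    """Extract the fields of the quoted 67-character settings record."""
    text = raw.decode("latin-1")  # assumption: one byte per character
    if len(text) < 66 or text[0] != '"' or text[65] != '"':
        raise ValueError("not in the expected quoted layout")
    return {
        "county": text[1:4],
        "pro_reference": text[4:9].strip(),  # RG12 or HO107
        "census_date": text[9:18].strip(),   # ddMonyyyy, or just the year
        "flags": text[18:24],                # left uninterpreted
    }
```

The check on the two double quotes is a cheap way to reject a file in some other layout before trusting the slices.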
Where xxx is the county code.
I originally thought that this file was also fixed at the same time as the one above. However, I have since discovered that it is written to each time IN-CENS runs.
Record 1
--------
Offset  Len  Description
     0    4  The Transcriber ID = e.g. Mine is PARI
     4    4  The number of records in this file (*)
     8   56  The Transcriber's name
    64       End of Record (another good power of two record length!)

Census Piece Records
--------------------
Offset  Len  Description
     0    4  Piece Number
     4    4  Parish code in 1841, null in 1891
     8   20  Registration District (1891) or Hundred/Parish (1841)
    28    8  LDS Film Number
    36    2  NULL (*)
    38    6  Number of Households Transcribed for this Piece
    44    6  Number of Records in the RG12pppp.DAT file
    50   13  NULL
    63    1  The Letter "a" (= Census Piece Record)
    64       End of Record

Civil Parish Records
--------------------
Offset  Len  Description
     0    4  Piece Number
     4    4  NULL
     8   20  Civil Parish
    28    5  NULL (*)
    33    1  Space (*)
    34    1  Asterisk = indicates that this Parish has been started (**) Sometimes a "C".
    35    4  The Transcriber ID (from Record 1) (**)
    39    9  The Date last written (**)
    48    4  NULL (*) Checker ID ?
    52    9  NULL (*) Date checked ?
    61    2  NULL
    63    1  The Letter "b" (= Civil Parish Record)
    64       End of Record
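Because every record is 64 bytes long and the last byte distinguishes the two later record types, the file can be walked as in the sketch below. The function names are mine, and treating the piece number and parish name as plain text padded with NULs or spaces is an assumption rather than something I have confirmed.

```python
RECORD_LENGTH = 64


def split_records(data: bytes) -> list:
    """Cut the file contents into complete 64-byte records."""
    last_start = len(data) - RECORD_LENGTH
    return [data[i:i + RECORD_LENGTH] for i in range(0, last_start + 1, RECORD_LENGTH)]


def classify(record: bytes) -> str:
    """Use the type letter at offset 63 to tell the record types apart."""
    marker = record[63:64]
    if marker == b"a":
        return "piece"
    if marker == b"b":
        return "parish"
    return "other"


def parse_parish(record: bytes) -> dict:
    """Pull the main fields out of a Civil Parish record."""
    text = record.decode("latin-1")  # assumption: one byte per character
    started = text[34] in "*C"       # asterisk, or sometimes a "C"
    return {
        "piece": text[0:4].strip("\x00 "),
        "civil_parish": text[8:28].strip("\x00 "),
        "started": started,
        # These two fields are all NULL unless the parish has been started.
        "transcriber": text[35:39].strip("\x00 ") if started else "",
        "last_written": text[39:48].strip("\x00 ") if started else "",
    }
```

The first record should be handled by position rather than by `classify`, since its final byte is part of the transcriber's name and not a type letter.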
(*) These fields are defined in the program but don't seem to be used. They were probably intended for CHECK-CENS but never used.
(**) Only present if the Parish has been started for transcribing, all NULL otherwise.
An anomaly is that there are Civil Parishes mentioned which don't exist in the census.
The same as the .DAT file format except that there is one extra field, making the record 20 characters longer. This means that, for the default output from dat2csv, all the columns after the first need to move along one to allow for the Ecclesiastical District in column B. The following fields also change meaning slightly:
-a  DEF  Offset  Len  Description
A             0    1  Deletion marker? (N after validation change, D for deleted record or blank)
C             5    6  Household number (may have leading zeros or trailing spaces)
D            11    4  Members in household (ditto)
AC  B       256   20  Ecclesiastical District name (blank for 1841)
            276       Start of new record
Differs from the IN-CENS contents as follows:
Offset  Len  Description
     0    1  A Double Quote
     1    3  The Chapman Code for the County = CON
     4    5  The PRO Reference = RG12 or HO107
     9    5  Spaces
    14    4  The Year of the Census (actually in the same place as before without the date)
    18    6  Flags as set up in the Settings part of IN-CENS
    24   34  Blank (Spaces)
    58    6  A number with leading zeros (unknown use)
    62    4  NULL
    66    4  Checker/Validator ID (as transcriber above)
    70    4  Blank
    74   56  The Checker's/Validator's name
   130       End of record
There is no closing quote or new line character any more.
This is the most free form file of the bunch and contains the comments made by the checker. Each record spans several lines and takes the following form:
Rnn x/y - Parish/a/b/c/d/e/House name SURNAME/Forenames
Free form comments on one line.
============ (12 characters)
Where
This is the original validation program output. The main data file is the same format as the .CHE file above, except that the Birth County and Place fields have been "normalised". The versions that the transcriber used are put in placechg.dat as described here.
There are multiple lines of this form. There can be zero or more lines for each record number, but they are in ascending sequence.
Offset  Len  Description
     0    6  Record number in .che file (with leading zeros)
     6   10  Date of change in the format yyyy-mm-dd
    16    8  Time of change in the format hh:mm:ss
    24    3  Transcribed County code of birth
    27   20  Transcribed Place name of birth
    47    1  New line character
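A line in this layout can be unpacked as sketched below. The function and key names are my own; the sketch assumes the file has already been read as text, one line at a time.

```python
from datetime import datetime


def parse_place_change(line: str) -> dict:
    """Split one placechg.dat line into its fixed-width fields."""
    line = line.rstrip("\r\n")
    if len(line) < 27:
        raise ValueError("line too short for the expected layout")
    # The date and time fields are adjacent, so join them with a space to parse.
    stamp = line[6:16] + " " + line[16:24]
    return {
        "record": int(line[0:6]),  # record number in the .che file
        "changed": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
        "county": line[24:27],
        "place": line[27:47].strip(),
    }
```

Since there may be several lines for one record number, a caller would typically collect the results into a list per record rather than a single value.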
This is the revised and improved validation program. All the output is in one file.
The same as the .CHE file format except that there are two extra fields, making the record 23 characters longer. This means that, for the default output from dat2csv, all the columns after the Birth Place Flag need to move along two to allow for the Alternate (enumerated) County and Birth Place fields:
-a  DEF  Offset  Len  Description
B   AC        1    4  1841 Book number (4n) (ENG/WLS)
W   U       181    3  Transcribed County code (3a capitals, no default but UNK if not known)
X   V       184   20  Transcribed Birth Place (default -)
AD  X       276    3  Normalised County Code (3a capitals)
AE  Y       279   20  Normalised Birth Place
            299       Start of new record
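With both versions of the birth place held in one 299-byte record, they can be compared directly. The sketch below shows one way to do that; the function names are hypothetical, latin-1 decoding is an assumption, and the comparison is a plain string test that treats any difference, including one of letter case, as a change.

```python
VAL_RECORD_LENGTH = 299


def birth_fields(raw: bytes) -> dict:
    """Return the transcribed and normalised birth county and place."""
    if len(raw) != VAL_RECORD_LENGTH:
        raise ValueError(f"expected {VAL_RECORD_LENGTH} bytes, got {len(raw)}")
    text = raw.decode("latin-1")  # assumption: one byte per character
    return {
        "transcribed_county": text[181:184],
        "transcribed_place": text[184:204].strip(),
        "normalised_county": text[276:279],
        "normalised_place": text[279:299].strip(),
    }


def was_normalised(raw: bytes) -> bool:
    """True if the normalised county or place differs from the transcribed one."""
    fields = birth_fields(raw)
    before = (fields["transcribed_county"], fields["transcribed_place"])
    after = (fields["normalised_county"], fields["normalised_place"])
    return before != after
```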