Census data processing has always been the area of census work most affected by new technology, in particular computer technology. Computers began to be used to process census data nearly 60 years ago. Since then computer technology has developed rapidly, and far more data processing options are available now than ever before. A specific study of the data processing options currently available on the market should therefore be conducted early in the planning stage.
Data processing tasks consist of data editing, data coding, data capture, data review, data verification, data evaluation, consistency checks, detailed data analysis and data tabulation. The data processing operations that lead to tabulations and other census products commence as soon as completed census questionnaires/forms start arriving at the data processing facility.
Stages of Data Processing
- Data Entry from Summary sheet for Preliminary Results
- Preparation for Scanning
- Character Inspection and Key Correction
- Editing and Coding
- Data analysis and Tabulation
Summary Sheet for Enumeration Area (Completed by Enumerator)
1. Preparation of the Preliminary Report ran from the first week of May 2014 to the second week of July
- 8 staff, 8 hours a day, 81,750 EAs
- Microsoft Access database for data capture
- Totals of households, males, females and population were produced down to township level.
- Information captured: geo-code, household type, number of males and number of females
Preliminary results – Only totals
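The roll-up from EA summary sheets to township totals can be illustrated with a short Python sketch. It is not the Census Office's Access database: the record layout, the field names and the sample figures are all hypothetical, and household type is left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SummarySheet:
    """One enumeration-area (EA) summary sheet; field names are illustrative."""
    township_code: str  # hypothetical part of the geo-code identifying the township
    ea_code: str
    households: int
    males: int
    females: int


def township_totals(sheets):
    """Sum households, males, females and total population per township."""
    totals = defaultdict(
        lambda: {"households": 0, "males": 0, "females": 0, "population": 0}
    )
    for s in sheets:
        t = totals[s.township_code]
        t["households"] += s.households
        t["males"] += s.males
        t["females"] += s.females
        t["population"] += s.males + s.females
    return dict(totals)


# Made-up sample sheets, for illustration only.
sheets = [
    SummarySheet("T01", "001", 120, 250, 270),
    SummarySheet("T01", "002", 95, 210, 205),
    SummarySheet("T02", "001", 140, 300, 310),
]
result = township_totals(sheets)
print(result["T01"])
```

The same grouping could be repeated on a shorter geo-code prefix to obtain district or State/Region totals.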
Preparation for Scanning
- Preparation for scanning started during the first week of May with 15 permanent staff
- 70 temporary staff were recruited for preparation work for 5 months starting from 1 July
- Finished on 1st October 2014
- Removal of all foreign objects (paper clips, staples, tapes, etc.).
- Removal of any damaged (torn or soiled) questionnaires and transcribing them onto new questionnaires.
- Checking that the fields for State/Region, District, Township, Ward/VT, EA and Household number are correctly filled.
- Ensuring that there are no duplicate or missing households.
- Separating (splitting) the two parts of the questionnaire so that they become legal-size forms that can fit into the scanning machine.
- Recording any new Ethnicities that were coded as 914, so that they can be assigned a code at a later stage.
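The identification-field and duplicate/missing-household checks listed above were carried out by staff on paper forms. As a rough illustration of the same logic, here is a minimal Python sketch. The field names, the assumption that households in an EA are numbered 1..N with one form each, and the sample batch are all hypothetical.

```python
from collections import Counter

# Hypothetical names for the identification fields on a questionnaire.
ID_FIELDS = ("state_region", "district", "township", "ward_vt", "ea", "household_no")


def is_blank(value):
    """True when a field is missing or contains only whitespace."""
    return value is None or str(value).strip() == ""


def check_batch(forms, expected_households):
    """Check one EA batch and return (incomplete, duplicates, missing).

    incomplete: positions of forms with at least one blank ID field
    duplicates: household numbers that appear more than once
    missing:    household numbers in 1..expected_households with no form
    """
    incomplete = [
        i for i, form in enumerate(forms)
        if any(is_blank(form.get(name)) for name in ID_FIELDS)
    ]
    numbers = [f["household_no"] for f in forms if not is_blank(f.get("household_no"))]
    counts = Counter(numbers)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    missing = sorted(set(range(1, expected_households + 1)) - set(numbers))
    return incomplete, duplicates, missing


def make_form(household_no):
    """Build a made-up form with every ID field filled in."""
    return {"state_region": "01", "district": "02", "township": "03",
            "ward_vt": "004", "ea": "005", "household_no": household_no}


batch = [make_form(1), make_form(2), make_form(2), make_form(4)]
batch[3]["district"] = ""  # simulate a blank District field
incomplete, duplicates, missing = check_batch(batch, expected_households=4)
print(incomplete, duplicates, missing)
```

In this made-up batch the fourth form has a blank field, household 2 appears twice and household 3 has no form.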
The DRS P900 Scanner
Scanning – the test run started during the second week of May 2014
- 30 DOP staff worked as operators in 2 shifts of 8 hours each from 14 July
- 3 computers for registration
- 2 state-of-the-art scanners were procured and 6 were hired
- 1 scanner was kept in reserve
- All the scanners were networked
- The scanners use both Optical Mark Reader (OMR) and Intelligent Character Recognition (ICR)
- The system also captures texts and makes them available for further coding/sub-coding.
- The scanners are hired from DRS Services Limited, a company contracted by UNFPA.
- Average – 140,000 forms per day
- A total of 11.5 million questionnaires were scanned
- Process was completed on 1 October 2014
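The scanning figures above can be cross-checked with simple arithmetic. The Python sketch below assumes that all questionnaires were scanned between 14 July and 1 October 2014 and that every calendar day was a working day. Neither assumption is stated in the text, and some forms may have been scanned during the test run.

```python
import math
from datetime import date

TOTAL_FORMS = 11_500_000    # questionnaires scanned (from the text)
AVG_PER_DAY = 140_000       # reported average forms per day (from the text)
START = date(2014, 7, 14)   # start of two-shift scanning (from the text)
END = date(2014, 10, 1)     # completion date (from the text)

# Days needed at the reported average rate.
days_needed = math.ceil(TOTAL_FORMS / AVG_PER_DAY)

# Calendar days available, counting both the start and end dates.
calendar_days = (END - START).days + 1

# Average daily rate implied by the total and the calendar window.
implied_rate = TOTAL_FORMS / calendar_days

print(days_needed, calendar_days, round(implied_rate))
```

Under these assumptions the reported average implies about 83 days of scanning against an 80-day window, so the figures are broadly consistent.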
Character Inspection and Key Correction
- The system highlights all captured handwritten characters that it recognized with low confidence (mostly due to bad handwriting)
- Operator examines these characters in bulk (Character Inspection), flags them and passes them on to Key Correction
- Key Correction – correcting the flagged character or accepting it as valid.
- Operator is presented with an image of the appropriate part of the form and an indication of the error.
- Operator either corrects the error, accepts the character as it is, or passes it to a Supervisor for resolution.
- Character inspection and key correction started in the second week of May with 20 staff and continued until the end of June
- Two shifts started on 15 July, with 65 staff in each shift and 7 hours per shift
- An average of 130,000 forms were completed per day
- The process finished on 17 October 2014
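The routing of low-confidence characters through inspection and key correction can be sketched in Python as follows. This is not the scanner vendor's software. The confidence score, the 0.85 threshold and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The three choices open to a key-correction operator."""
    CORRECT = "correct"
    ACCEPT = "accept"
    ESCALATE = "escalate"


@dataclass
class RecognizedChar:
    field: str         # questionnaire field the character belongs to
    value: str         # character as read by the recognition engine
    confidence: float  # hypothetical recognition score between 0.0 and 1.0


CONFIDENCE_THRESHOLD = 0.85  # assumed cut-off, not taken from the source


def low_confidence(chars, threshold=CONFIDENCE_THRESHOLD):
    """Return the characters the system would highlight for inspection."""
    return [c for c in chars if c.confidence < threshold]


def key_correct(char, action, corrected_value=None):
    """Apply an operator decision; return (final_value, needs_supervisor)."""
    if action is Action.CORRECT:
        if corrected_value is None:
            raise ValueError("a corrected value is required for CORRECT")
        return corrected_value, False
    if action is Action.ACCEPT:
        return char.value, False
    # ESCALATE: keep the value unchanged and hand it to a Supervisor.
    return char.value, True


# Made-up recognition results.
chars = [
    RecognizedChar("age", "7", 0.99),
    RecognizedChar("age", "1", 0.40),
    RecognizedChar("sex", "2", 0.70),
]
flagged = low_confidence(chars)
print(key_correct(flagged[0], Action.CORRECT, "7"))
print(key_correct(flagged[1], Action.ESCALATE))
```

A lower threshold would send fewer characters to operators at the cost of letting more misread characters through.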
Character Inspection: Examining characters in bulk
Editing and Coding
- Ethnicity (other, “914”), Occupation and Industry were written as free text during enumeration and hence cannot be captured using OMR or ICR technology
- Enumerators wrote the description of the Occupation and Industry in the questionnaire
- The Census Office developed the coding index for occupation and industry based on ISCO-08 and ISIC Rev. 4
- To improve quality of coding, each questionnaire is being coded by two operators (“double-blind” coding).
- If the two operators have coded differently, the questionnaire automatically goes to the Supervisor to make a determination on the correct code.
- 153 staff were trained in coding. The process started on 20 October and is to be finished in late 2015
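The double-blind rule above, where two independent codes are compared and disagreements go to a Supervisor, can be sketched in Python as follows. The function name, the questionnaire IDs and the code strings are illustrative only and are not taken from the census coding index.

```python
def compare_codings(codes_a, codes_b):
    """Compare two operators' codes for the same set of questionnaires.

    codes_a, codes_b: dicts mapping questionnaire ID -> assigned code.
    Returns (agreed, to_supervisor): agreed maps ID -> code where both
    operators match; to_supervisor lists (ID, code_a, code_b) otherwise.
    """
    if codes_a.keys() != codes_b.keys():
        raise ValueError("both operators must code the same questionnaires")
    agreed, to_supervisor = {}, []
    for qid in sorted(codes_a):
        a, b = codes_a[qid], codes_b[qid]
        if a == b:
            agreed[qid] = a
        else:
            to_supervisor.append((qid, a, b))
    return agreed, to_supervisor


# Made-up occupation codes assigned independently by two operators.
codes_a = {"Q1": "6111", "Q2": "5221", "Q3": "9211"}
codes_b = {"Q1": "6111", "Q2": "5223", "Q3": "9211"}
agreed, to_supervisor = compare_codings(codes_a, codes_b)
print(agreed)
print(to_supervisor)
```

Only the questionnaire on which the two operators disagree reaches the Supervisor queue, which keeps the Supervisor workload proportional to the disagreement rate.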
A typical operator coding screen
A typical Supervisor coding screen
Census processing flow
Data analysis and Tabulation
The results of the 2014 Census have been published in a number of volumes. The first was the Provisional Results (Census Volume 1), released in August 2014. The Census Main Results were launched in May 2015. These included The Union Report (Census Report Volume 2), Highlights of the Main Results (Census Report Volume 2-A), and reports of each of the 15 States and Regions (Census Report Volume 3[A - O]). The reports on Occupation and Industry (Census Report Volume 2-B) and Religion (Census Report Volume 2-C) were launched in March 2016 and July 2016, respectively.
The results of the 2014 Census have also been published in thirteen thematic reports and a Census Atlas. The thematic reports address Fertility and Nuptiality; Mortality; Maternal Mortality; Migration and Urbanization; Population Projections; Population Dynamics; the Elderly; Children and Young People; Education; Labour Force Dynamics; Disability; Gender Dimensions; and Housing Conditions, Amenities and Household Assets. Their preparation involved collaboration with local and international experts as well as various Government Ministries, Departments and research institutions.