Completeness of colorectal cancer registration in the Danish hereditary non-polyposis colorectal cancer (HNPCC) register

Registries

The Danish Central Population Register (CPR) was established in 1968 to give every citizen in Denmark a unique personal identifier, the CPR number, which is used in all contacts with public authorities, including health care, and with all private companies handling personal information such as social security, education, and income [14]. The CPR number allows data on the same individual to be linked across registers with 100% accuracy and without loss to follow-up when a citizen moves between different parts of Denmark.

The HNPCC-R was established to decrease cancer morbidity and mortality in individuals and families with a hereditary predisposition to colorectal cancer. In the early 1990s, families suspected of a hereditary predisposition to colorectal cancer on the basis of family history were included in the HNPCC-R. Since then, the genetic cause of Lynch syndrome has been discovered and the term “HNPCC” has been widely abandoned. However, the HNPCC-R kept its name, and the families were classified according to their identified pathogenic germline variant or family cancer history into three main groups: (1) Lynch syndrome due to a pathogenic or likely pathogenic variant in one of the four MMR genes MLH1, MSH2/EPCAM, MSH6, or PMS2, regardless of family history; (2) Familial colorectal cancer with fulfilment or close to fulfilment of the Amsterdam I criteria, defined by either a) three relatives with colorectal cancer (or two colorectal cancers and one adenoma with high grade neoplasia), where one relative is first degree to the two others, b) three relatives with colorectal cancer, where two are first degree and one is second degree and diagnosed younger than 50 years, or c) two first-degree relatives with colorectal cancer, where at least one is diagnosed younger than 50 years; and (3) Moderate familial risk with either one family member diagnosed with colorectal cancer before age 50 or two first-degree relatives diagnosed at age ≥ 50 years [15].

Reporting to the HNPCC-R is voluntary but has gained national reach, accelerated in particular by the EU co-financed project InfoBioMed in 2007 [16], in which registrations are done electronically. Clinical data are collected from surgeons, pathologists, clinical geneticists, colonoscopists, and genetic laboratories. Variables recorded include data at the individual level (such as CPR number, sex, genetic tests, surveillance procedures, cancers, and polyps including histopathological analyses) and data at the family level (such as risk classification, recommended surveillance programme, and pedigree).

Relevant relatives are checked manually in other registers for cancers and surveillance procedures. Cancer diagnoses are verified at the HNPCC-R, and the type of documentation is coded as pathology report, medical record, death certificate, or “family report” if information given by a family member cannot be verified in any of the former three sources. If the family reports a condition that they believe is colon cancer but that is verified to be benign, such as diverticulitis coli, the registration is deleted.

Data are collected systematically at entry into the HNPCC-R (retrospectively) and are continuously updated (prospectively), e.g., when a family member consults a department of clinical genetics or when new cancers are reported to the HNPCC-R. Registrations of colorectal cancer include date of diagnosis, ICD9 code, location, morphology, surgical procedure, and type of documentation. Date of diagnosis is defined as the date of cancer surgery, the date of biopsy for unresectable tumors, or the best estimated date if the diagnosis is based on a death certificate.

The DCR was founded in 1943 as a national cancer register with the aim of collecting data that enable reliable incidence statistics across time, area, occupation, and therapeutic approach [17, 18]. The register includes data on primary cancers from medical records and pathology reports, secured by computerized and visual quality control routines [18]. Reporting from hospital departments was initially voluntary but was made compulsory in March 1987 [19]. In 2004, data recording was automated, with only 10% of diagnoses collected through manual registration and contact with the relevant clinical departments. The register has high data completeness, gained through annual linkage with other national registers such as the National Patient Register and the Danish Pathology Register, and with death certificates [20]. The DCR collects data on incident carcinomas, sarcomas, leukemias, and lymphomas as one diagnosis per organ, with exceptions made for multiple cancers of the skin and of paired organs with similar morphological characteristics [18, 21]. “One diagnosis per organ” is defined by ICD codes, which allows synchronous and metachronous colorectal cancers in different colonic segments to be registered, as each segment of the colorectum has a unique ICD code. Variables recorded in the DCR include CPR number, sex, date of diagnosis, ICD7/ICD10 code, location, and morphology. Date of diagnosis is defined as the date of the first admission during which the cancer was diagnosed [20].

The study population

The study period ran from January 1943 to December 2014. The study population included members of families registered in the HNPCC-R on or before 20 February 2018. For families with Lynch syndrome, we included carriers of a (likely) pathogenic variant (n = 1798), non-carriers (n = 2360), and untested relatives with a 50% risk of carrying the variant (n = 3925). For familial colorectal cancer and moderate familial risk, we included individuals at risk, defined as colorectal cancer patients and their first-degree relatives (n = 26,800 and n = 14,916, respectively). Only individuals with a valid CPR number were eligible for the study.

Matching of colorectal cancer registrations

Colorectal cancer registrations were retrieved from both the HNPCC-R and the DCR and matched by CPR numbers.

ICD7 and ICD10 diagnoses from the DCR were converted to ICD9 diagnoses to match the diagnoses in the HNPCC-R. The data were then merged in SAS [22] by CPR number, creating all possible combinations of registrations in the two registers for the same individual. To judge which combinations correctly matched the same colorectal cancer, the registrations in each combination were compared using a 13-step algorithm designed to match the diagnoses as correctly as possible in patients with multiple diagnoses (Supplementary Table 1). In brief, for each patient, the colorectal cancer registrations from both registers were matched by date of diagnosis, localization, and morphology with stepwise decreasing agreement, ranging from a maximum of 12 months' difference between the dates of diagnosis in identical segments of the colon to more than 12 months' difference in adjacent segments. For simplicity, all cases that could be matched by one of the 13 steps in the algorithm were pooled and are referred to as matched cases. Detailed matching criteria are presented in Supplementary Table 1.
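The tiered matching idea can be illustrated with a minimal Python sketch. It is not the authors' 13-step SAS algorithm, whose criteria are given only in Supplementary Table 1. Only the strictest tier (identical segment, at most 12 months apart) and the loosest tier (adjacent segment, more than 12 months apart) follow the description above; the two intermediate tiers, the `Registration` class, the ordinal `segment` encoding (adjacent segments differ by 1), the 365-day approximation of 12 months, and the omission of morphology are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from itertools import product


@dataclass(frozen=True)
class Registration:
    cpr: str         # personal identifier (dummy values below, not real CPR numbers)
    diagnosed: date  # date of diagnosis
    segment: int     # hypothetical ordinal code; adjacent segments differ by 1


def days_apart(h, d):
    return abs((h.diagnosed - d.diagnosed).days)


# Hypothetical tiers, ordered from strictest to loosest agreement.
TIERS = [
    lambda h, d: h.segment == d.segment and days_apart(h, d) <= 365,
    lambda h, d: abs(h.segment - d.segment) == 1 and days_apart(h, d) <= 365,
    lambda h, d: h.segment == d.segment,
    lambda h, d: abs(h.segment - d.segment) == 1,
]


def match_registrations(hnpcc, dcr):
    """Pair registrations tier by tier; each registration is used at most once."""
    pairs, used_h, used_d = [], set(), set()
    for tier_no, rule in enumerate(TIERS, start=1):
        # All combinations of registrations; only same-person pairs can match.
        for (i, h), (j, d) in product(enumerate(hnpcc), enumerate(dcr)):
            if i in used_h or j in used_d:
                continue
            if h.cpr == d.cpr and rule(h, d):
                pairs.append((i, j, tier_no))
                used_h.add(i)
                used_d.add(j)
    only_h = [i for i in range(len(hnpcc)) if i not in used_h]
    only_d = [j for j in range(len(dcr)) if j not in used_d]
    return pairs, only_h, only_d


hnpcc_regs = [
    Registration("A", date(2001, 5, 1), 3),
    Registration("A", date(2005, 2, 1), 6),
    Registration("B", date(1999, 1, 1), 2),
]
dcr_regs = [
    Registration("A", date(2001, 3, 15), 3),
    Registration("A", date(2005, 1, 10), 5),
    Registration("C", date(2010, 1, 1), 1),
]
pairs, only_h, only_d = match_registrations(hnpcc_regs, dcr_regs)
print(pairs, only_h, only_d)
```

Running the stricter tiers first means that a patient with several tumours has each registration paired with its closest counterpart before looser criteria are tried, which is the motivation the text gives for a stepwise algorithm.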

Registrations in the DCR were considered matched if they could be matched with a registration in the HNPCC-R made up to one year after the end of the study period. This accounts for dates in the DCR being systematically earlier than dates in the HNPCC-R, because the DCR records the date of first admission whereas the HNPCC-R records the date of treatment.

Registrations of a colorectal cancer diagnosis in one register without a match in the other register were checked for possible matches with registrations of other neoplasia. When the two registers held synchronous registrations of different neoplasia, the original documentation in the HNPCC-R was reevaluated to reveal possible errors in the registration of diagnosis, localization, or histology in either register. Errors in the HNPCC-R were corrected in the process. Registrations of colorectal cancers in the HNPCC-R without a matching registration in the DCR were also checked manually by reevaluating the original documentation. We could not retrieve documentation for the cases found only in the DCR.

Statistics

The number of unique colorectal cancer registrations was defined as the number of cases registered in both the HNPCC-R and the DCR plus the number of cases registered in only one of the registers.

Agreement between the HNPCC-R and the DCR was defined as the number of colorectal cancer registrations found in both registers divided by the number of unique registrations.

Completeness of a register was defined as the number of colorectal cancer registrations in that register divided by the number of unique registrations.
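As a worked illustration of these three definitions, the following Python sketch computes them from the counts of matched and unmatched cases. The function name `register_metrics` and the counts in the example are made up for illustration and are not study results.

```python
def register_metrics(both, only_hnpcc, only_dcr):
    """Apply the definitions of unique registrations, agreement, and completeness."""
    unique = both + only_hnpcc + only_dcr
    if unique == 0:
        raise ValueError("no registrations to compare")
    return {
        "unique": unique,
        # registrations found in both registers / unique registrations
        "agreement": both / unique,
        # registrations present in the register / unique registrations
        "completeness_hnpcc": (both + only_hnpcc) / unique,
        "completeness_dcr": (both + only_dcr) / unique,
    }


# Made-up counts: 80 matched cases, 5 only in HNPCC-R, 15 only in DCR.
metrics = register_metrics(both=80, only_hnpcc=5, only_dcr=15)
print(metrics)
```

With these counts there are 100 unique registrations, so agreement is 0.80 and completeness is 0.85 for the HNPCC-R and 0.95 for the DCR. Agreement can never exceed the completeness of either register, because the matched cases are a subset of each register's cases.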

P-values were estimated in SAS with the chi-square test, or with Fisher’s exact test if group sizes were smaller than 5. Adjusted p-values were estimated with stepwise logistic regression in SAS using the LOGISTIC procedure (PROC LOGISTIC).
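The choice between the two unadjusted tests can be sketched in Python for a 2 × 2 table. This is a standard-library illustration, not the SAS procedure used in the study. It assumes that "group sizes smaller than 5" means any observed cell count below 5, uses the Pearson chi-square statistic without continuity correction, and uses the common two-sided Fisher definition (summing all tables no more probable than the observed one). All function names are hypothetical.

```python
from math import comb, erfc, sqrt


def chi_square_p(a, b, c, d):
    """Pearson chi-square p-value (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(stat / 2))


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def two_by_two_p(a, b, c, d, threshold=5):
    """Use Fisher's exact test when any cell is below the threshold (assumed rule)."""
    if min(a, b, c, d) < threshold:
        return "fisher", fisher_exact_p(a, b, c, d)
    return "chi-square", chi_square_p(a, b, c, d)


print(two_by_two_p(3, 1, 1, 3))      # small cells: Fisher's exact test
print(two_by_two_p(10, 20, 20, 10))  # larger cells: chi-square test
```

A more conventional rule bases the switch on expected rather than observed cell counts; the threshold is kept as a parameter so that either reading can be tried.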
