Public Health Agency of Canada / Agence de santé publique du Canada



Volume 16, No. 2 - 1995


Status Report:
Cancer Projection in Canada
(A Joint Effort of the National Cancer Institute of Canada and Health Canada)

Kathy Clarke

Abstract

The National Cancer Institute of Canada (NCIC) and the Laboratory Centre for Disease Control (LCDC) at Health Canada are collaborating to develop long-term cancer projection techniques using national and provincial cancer incidence data for the major cancer sites. Theoretical models developed at LCDC have been generalized for use in the provinces. A user-friendly software package is now being developed. Progress will be reviewed at the third Canadian Cancer Projection Workshop in St John's, Newfoundland, in August 1995.

Key words:
Age-period-cohort; Canada; forecasting; incidence; mortality; neoplasms; regression analysis

Background

The first Canadian Cancer Projection Workshop was held in Toronto in June 1991 (1). The recommendations of this workshop were to continue to improve the quality, timeliness and completeness of cancer incidence data; to standardize projection methods used by the provincial cancer registries;(a) and to implement a technique that could be used by cancer registries across the country.

Development of a long-term cancer projection technique was initiated, using national and provincial cancer incidence data for the major cancer sites. At the second national meeting, held in Quebec City in May 1993, representatives from provincial cancer registries, the National Cancer Institute of Canada (NCIC) and the Laboratory Centre for Disease Control (LCDC) of Health Canada reviewed the progress of cancer projection strategies in the provinces, examined the long-term cancer projection technique developed at LCDC and discussed how software should be developed for this purpose.

Provincial Cancer Registries
Linear regression is being used in most provincial cancer registries for all cancer projections, although log-transformed rates, time-series models and trend analysis have been implemented in some cases. Some provincial cancer agencies do not have biostatisticians on staff and thus are restricted to simple analysis. Participants from a number of cancer registries have expressed interest in projecting mortality for palliative care planning and in projecting survival.

In provinces with more established registry systems, there is interest in moving beyond simple estimation of frequencies, for instance, to evaluation of the impact of covariates and of changes in detection methods and definitions, the estimation of confidence limits and small area analysis. In provinces where registry data are in the process of being computerized, the quality of cancer incidence data is the primary concern.

Statistics Canada
Statistics Canada conducts mainly short-term projection, projecting rates up to the current year of publication, normally a three-year period. Data from 1981 onward are used, adjusted for known anomalous trends in the data. Incidence rates by site and sex are modelled using a linear regression assuming a Poisson error distribution with two age groupings, 1-44 and 45+. For some sites, a quadratic model is used.
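The short-term approach described above can be sketched as follows. This is only an illustration and not Statistics Canada's implementation: it assumes a single age group, a rate that is linear in time (identity link) fitted by iteratively reweighted least squares, and made-up counts and populations. The function name `fit_linear_poisson` is hypothetical.

```python
def fit_linear_poisson(times, counts, pops, iterations=20):
    """IRLS fit of counts ~ Poisson(pop * (a + b * t)) with an identity link."""
    mus = [max(y, 0.5) for y in counts]  # starting values for the fitted means
    a = b = 0.0
    for _ in range(iterations):
        w = [1.0 / m for m in mus]  # Poisson variance equals the mean
        # Weighted normal equations for the two columns: pop and pop * t
        s00 = sum(wi * p * p for wi, p in zip(w, pops))
        s01 = sum(wi * p * p * t for wi, p, t in zip(w, pops, times))
        s11 = sum(wi * p * p * t * t for wi, p, t in zip(w, pops, times))
        r0 = sum(wi * p * y for wi, p, y in zip(w, pops, counts))
        r1 = sum(wi * p * t * y for wi, p, t, y in zip(w, pops, times, counts))
        det = s00 * s11 - s01 * s01
        a = (s11 * r0 - s01 * r1) / det
        b = (s00 * r1 - s01 * r0) / det
        mus = [p * (a + b * t) for p, t in zip(pops, times)]
    return a, b

# Made-up data for one age group: t = years since 1981.
times = list(range(10))
pops = [2_000_000] * 10
counts = [402, 409, 421, 428, 441, 452, 458, 471, 479, 492]

a, b = fit_linear_poisson(times, counts, pops)
# Project the rate three years past the last observed year (t = 12).
projected_rate_per_100k = 100_000 * (a + b * 12)
print(round(projected_rate_per_100k, 2))
```

In practice the fit would be repeated for each site, sex and age grouping, and a quadratic term in `t` could be added as a third column where a straight line fits poorly.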

LCDC
LCDC has developed a non-linear model for major cancer sites. While at LCDC, Dr Ian MacNeill developed a model that is an alternative to the age-period-cohort models previously proposed. Confidence intervals can be calculated to give a degree of certainty to total incidence or mortality. The method is flexible enough to use covariates and to restrict the analysis to give specific age, sex and site estimates of rates and total incidence counts. Dr James Myles of Queen's University has evaluated how the LCDC approach could be implemented in a user-friendly software package.

Models Used in Scandinavia

Dr Timo Hakulinen
A number of models have been used in Scandinavia for forecasting, prediction and projection. Models used in administrative planning and as baselines for prevention strategies require acceptable confidence levels. The development in the incidence of cancer at specific sites is dependent upon etiologic factors, for instance, the effect of smoking on lung cancer, of fertility on breast cancer or of mass screening for early detection of cervical cancer; varying latency periods between exposure and incidence; and methodology, i.e. changes in criteria (detection, facilities, definitions) over time. Often the etiologic factors are unknown.

The simplest model to use for cancer projections is normal linear regression on a log scale. Using the absolute scale can be a problem when the trend is decreasing and the prediction or the confidence limits fall below zero. On the other hand, linear models on a log scale may lead to implausibly high predictions due to the specified exponential growth. Good regression models can be less effective where population projections and unexpected changes in etiology influence predictions. The non-identifiability problem in age-period-cohort analysis does not apply with linear predictions since a linear trend in cohort can be reparametrized into a linear trend in period and vice versa.
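The contrast between the absolute and log scales can be illustrated with a small sketch. The rates below are made up for illustration, and the helper names (`fit_line`, `project_absolute`, `project_log`) are hypothetical. With a decreasing trend, the absolute-scale line eventually crosses zero, whereas the log-scale fit stays positive.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up rates per 100,000 that fall steadily over time.
years = [1980, 1982, 1984, 1986, 1988, 1990]
rates = [10.0, 8.0, 6.0, 4.0, 2.0, 1.0]

a_abs, b_abs = fit_line(years, rates)
a_log, b_log = fit_line(years, [math.log(r) for r in rates])

def project_absolute(year):
    """Linear trend on the absolute scale; can fall below zero."""
    return a_abs + b_abs * year

def project_log(year):
    """Linear trend on the log scale; always positive after exponentiation."""
    return math.exp(a_log + b_log * year)

for year in (1995, 2000):
    print(year, project_absolute(year), project_log(year))
```

The same exponential form that keeps the log-scale projection positive is what drives implausibly high values when the fitted slope is positive and the horizon is long.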

Poisson regression can also be used to model cancer incidence. In large countries, the variation exceeds the Poisson expectation, and consequently confidence intervals widen. Other sources of error are the uncertainty about the regression parameters and the random error in the future observations. A Taylor series expansion and the delta method can be used to calculate confidence intervals for total incidence. In addition to additive models, multiplicative and power models can be used, although a power model is more difficult to interpret. These models are easily fitted by GLIM statistical software.
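A minimal sketch of this calculation follows, assuming a log-linear Poisson trend with a population offset, fitted by Newton-Raphson. It is not the GLIM procedure itself; the function names and all data are hypothetical. The interval combines the two error sources named above: parameter uncertainty (through a first-order Taylor expansion, i.e. the delta method) and the Poisson variation of the future counts. Extra-Poisson variation is not handled here.

```python
import math

def _information(times, pops, b0, b1):
    """Fitted means and the entries of the Fisher information X' diag(mu) X."""
    mus = [p * math.exp(b0 + b1 * t) for t, p in zip(times, pops)]
    i00 = sum(mus)
    i01 = sum(t * m for t, m in zip(times, mus))
    i11 = sum(t * t * m for t, m in zip(times, mus))
    return mus, i00, i01, i11

def fit_poisson_trend(times, counts, pops, iterations=25):
    """Fit log(mu) = log(pop) + b0 + b1 * t; return (b0, b1) and covariance."""
    b0 = math.log(sum(counts) / sum(pops))  # start from the overall rate
    b1 = 0.0
    for _ in range(iterations):
        mus, i00, i01, i11 = _information(times, pops, b0, b1)
        u0 = sum(y - m for y, m in zip(counts, mus))  # score for b0
        u1 = sum(t * (y - m) for t, y, m in zip(times, counts, mus))  # score for b1
        det = i00 * i11 - i01 * i01
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    _, i00, i01, i11 = _information(times, pops, b0, b1)
    det = i00 * i11 - i01 * i01
    cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
    return (b0, b1), cov

def project_total(beta, cov, future_times, future_pops, z=1.96):
    """Projected total count with an approximate prediction interval."""
    b0, b1 = beta
    mus = [p * math.exp(b0 + b1 * t) for t, p in zip(future_times, future_pops)]
    total = sum(mus)
    # Gradient of the total with respect to (b0, b1).
    g0 = total
    g1 = sum(t * m for t, m in zip(future_times, mus))
    var_params = g0 * g0 * cov[0][0] + 2 * g0 * g1 * cov[0][1] + g1 * g1 * cov[1][1]
    var_future = total  # Poisson variance of the future observations
    half_width = z * math.sqrt(var_params + var_future)
    return total, total - half_width, total + half_width

# Made-up annual counts and populations; t = years since the first data year.
times = list(range(10))
pops = [1_000_000 + 10_000 * t for t in times]
counts = [100, 104, 109, 115, 118, 125, 131, 134, 142, 147]

beta, cov = fit_poisson_trend(times, counts, pops)
total, low, high = project_total(beta, cov, [12, 13, 14],
                                 [1_120_000, 1_130_000, 1_140_000])
print(total, low, high)
```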

Birth cohort is important in modelling because etiology is often related to birth cohort. Regional parallelism also gives more power to the model. The question remains as to whether different regions have different slopes. A curvilinear approach may also be assumed.

Dr Hakulinen has examined the relationships of factors leading to the prediction of future prevalence and mortality. The age-period-cohort-residence (APCR) model of past incidence can be used to predict future incidence, while the age-period-follow-up (APF) model of past relative survival can be used to predict future relative survival and excess mortality. Both these measures, along with population size and structure forecasts and general mortality forecasts, can be combined to predict future prevalence and mortality. An example of note is colon cancer in Finland, where incidence is increasing while mortality is decreasing.

Confidence intervals indicate the accuracy of predictions and the range of likely future outcomes; they can be used to eliminate clearly inaccurate data; and they provide a basis for evaluating cancer prevention actions and for ongoing monitoring. When choosing a model for incidence, one should remember that simple models with few parameters reduce the width of the confidence intervals and that non-linearity may surface in the future even if it is not observable at present, or it may exist in the present with a lower power.

GLIM macros and fitted values can provide a practical implementation of APCR and APF models.

Development of a Canadian Methodology

Dr Ian MacNeill
Dr MacNeill has developed a methodology for cancer projections. Although rates of incidence and mortality are sometimes relatively stable, there are several factors to consider.
  • The exponential growth in incidence and mortality with age
  • The age distribution of the population, particularly as the baby-boom generation reaches the high-risk age group and the 85+ cohort increases
  • Relatively low rates and "noise" in young age groups, where areas are small or where data are incomplete
  • Break-points such as Clemmensen's hook (change in rates of increase in breast cancer at menopause)
  • Covariates such as tobacco use or fertility rates
Different models may be required for different sites. If growth with age is exponential and growth over time is linear, then a multiplicative exponential-linear model may be used provided there is no interaction between age and period. A more complicated combination may be necessary if long-term projections are required, e.g. a multiplicative exponential-logistic model for melanoma (2).

Point and interval estimates of total mortality are obtained by multiplying an age-period-specific rate by an age-period-specific population estimate and then summing over the appropriate age interval. A Taylor series expansion is used to calculate the variance of the estimates of total mortality and the variance of age-standardized rates. A model can be transformed easily to an age-cohort model by replacing period with cohort, using the linear relationship between the two.
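The calculation of a projected total under a multiplicative exponential-linear model can be sketched as follows. The parameterization, the parameter values, the covariance matrix and the population figures are all assumptions chosen for illustration; they are not the LCDC estimates. The variance uses a first-order Taylor expansion with a numerical gradient.

```python
import math

def rate(age, period, params):
    """Multiplicative exponential-linear rate per person-year.

    Assumed form: exp(a + b * age) * (1 + c * (period - 1990)).
    """
    a, b, c = params
    return math.exp(a + b * age) * (1.0 + c * (period - 1990))

def total_deaths(period, population, params):
    """Sum of age-period-specific rate times population over the age groups."""
    return sum(rate(age, period, params) * n for age, n in population.items())

def delta_variance(period, population, params, cov, step=1e-6):
    """First-order Taylor (delta method) variance of the projected total."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += step
        down[i] -= step
        diff = total_deaths(period, population, up) - total_deaths(period, population, down)
        grad.append(diff / (2 * step))
    k = len(params)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))

# Hypothetical parameter estimates and (diagonal) covariance matrix.
params = (-12.0, 0.09, 0.01)
cov = [[1e-4, 0.0, 0.0], [0.0, 1e-8, 0.0], [0.0, 0.0, 1e-6]]

# Hypothetical population estimates for 2005, keyed by age-group midpoint.
population_2005 = {45: 400_000, 55: 350_000, 65: 250_000, 75: 150_000, 85: 50_000}

total = total_deaths(2005, population_2005, params)
se = math.sqrt(delta_variance(2005, population_2005, params, cov))
print(total, total - 1.96 * se, total + 1.96 * se)
```

Because the age and period terms multiply, the ratio of rates between two ages is the same in every period, which is the "no interaction" condition stated above.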

A number of concerns are being addressed.

  • Transformation of sketchy data about birth cohorts to fit the model
  • Fitting covariates involving dramatic changes where later cohorts may be more or less affected than current ones (e.g. lung cancer or female breast cancer)
  • Latency, for instance, a potential 50-year latency for lung cancer from smoking
  • Instability of future trends
  • Changes in diagnosis and treatment
  • Heteroscedasticity (unequal variances) in the data

A Model for Prostate Cancer

Dr James Myles
Dr Myles has used the model developed at LCDC for prostate cancer in Canada and generalized it to individual provinces. The model is exponential for age and linear over period.

Although this model may be reliable when counts are high, caution must be used when generalizing to the provincial level. Irregularities in the data are often not "noise" but, rather, explainable trends. For example, low registry numbers in Quebec due to underregistration prior to 1980 necessitate the removal of years before 1980 when using data for projection. There is some concern about estimated percentages of underregistration as well as overregistration, for example, where new prostate-specific antigen testing recognizes indolent foci. Such changes in screening techniques and definitions further complicate reporting. Although there is no apparent problem of convergence, the determinants of trends affecting the future are not always obvious.

The model used for Canada as a whole converged for each of the provinces. However, there are a number of problems that should be resolved. The smaller the incidence, the higher the variability of the parameter estimates. This will be the case for the small provinces that have fewer cases. For example, the standard error of the age effect is smaller for Ontario and Canada than for smaller regions. Another problem occurs when there are a few years of unusual incidence. This can have a large effect on estimates of the trends. A lack of homogeneity across the country is possible.

One solution to the high variability in the parameter estimates for the small provinces is to use small area estimation techniques, including synthetic estimates and empirical Bayes' estimation. To determine whether unusual trends are "noise" or systematic fluctuations, collaboration with the provinces to discuss data quality is required.
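As an illustration of the empirical Bayes idea, the sketch below shrinks each region's raw rate toward the overall rate, with more shrinkage where person-years are few. It uses a simple method-of-moments estimator, which is only one of several possible choices and is not necessarily the one contemplated at the workshop. Region labels and figures are made up, and the function name is hypothetical.

```python
def empirical_bayes_rates(cases, person_years):
    """Shrink raw regional rates toward the overall rate (method of moments)."""
    total_py = sum(person_years.values())
    overall = sum(cases.values()) / total_py
    raw = {k: cases[k] / person_years[k] for k in cases}
    mean_py = total_py / len(cases)
    # Population-weighted variance of the raw rates around the overall rate.
    weighted_var = sum(person_years[k] * (raw[k] - overall) ** 2 for k in cases) / total_py
    # Remove the part expected from Poisson sampling alone; floor at zero.
    between = max(weighted_var - overall / mean_py, 0.0)
    shrunk = {}
    for k in cases:
        sampling_var = overall / person_years[k]
        weight = between / (between + sampling_var) if between > 0 else 0.0
        shrunk[k] = overall + weight * (raw[k] - overall)
    return raw, shrunk, overall

# Made-up regions of very different size.
cases = {"region_A": 900, "region_B": 260, "region_C": 12}
person_years = {"region_A": 1_000_000, "region_B": 250_000, "region_C": 20_000}

raw, shrunk, overall = empirical_bayes_rates(cases, person_years)
for region in cases:
    print(region, raw[region], shrunk[region])
```

The smallest region's estimate moves most of the way toward the overall rate, while the largest region's estimate changes little. A synthetic estimate corresponds to the limiting case in which the weight is zero and the overall rate is applied directly.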

A software package needs to be developed for use in the provinces. This package would be written using the statistical package S-Plus, a personal computer software package that can support three-dimensional graphics. Input data would include cancer site with observed incidence and mortality for each province and for all of Canada, population counts, years to forecast, population estimates for years to be forecast and possibly covariate data. Possible outputs are projections of cancer incidence and mortality rates by sex for the major cancer sites, point and interval estimates of total mortality and incidence, point and interval estimates of age-standardized mortality and incidence rates, graphics and other reports. A consensus has yet to be reached to determine specific outputs.

The comparative value of this model and others, such as the Poisson model, has been discussed. Although the linear model seems appropriate, a statistical comparison should be performed. As further incidence data are collected over the next decade, models could then be adequately evaluated. Generalization to a power model is a possibility. When developing a software package, practicality seems preferable to methodologic appeal.

Priorities for Ongoing Projection Strategies

The following four areas of concern have been identified for ongoing work leading to the next Canadian Cancer Projection Workshop.

Methodology
Collaboration with consultants in statistics and epidemiology is essential to evaluate the proposed models and to compare them with the ones currently being used. Sensitivity analyses should be performed to assess the impact of underregistration or overregistration, missing data or changes and variations in definitions and diagnosis. Covariates, risk factors and proxy information for risk factors (e.g. dietary) should be incorporated into existing models if possible. Further methodologic work is also needed to determine the accuracy of confidence limit estimates. Projection must be validated using historical data.
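Validation against historical data can be sketched as a simple hold-out exercise: fit the model to the earlier years, project the most recent years and compare the projections with what was observed. The example below uses a log-linear trend as a stand-in for whichever model is under evaluation. The rates are made up and the function names are hypothetical.

```python
import math

def fit_log_linear(offsets, rates):
    """Least-squares fit of log(rate) = intercept + slope * offset."""
    n = len(offsets)
    logs = [math.log(r) for r in rates]
    mean_x = sum(offsets) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(offsets, logs))
    sxx = sum((x - mean_x) ** 2 for x in offsets)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def backtest(years, rates, holdout):
    """Mean absolute percentage error when the last `holdout` years are projected."""
    first = years[0]
    train_x = [y - first for y in years[:-holdout]]
    intercept, slope = fit_log_linear(train_x, rates[:-holdout])
    errors = []
    for year, observed in zip(years[-holdout:], rates[-holdout:]):
        predicted = math.exp(intercept + slope * (year - first))
        errors.append(abs(predicted - observed) / observed)
    return sum(errors) / len(errors)

# Made-up age-standardized rates per 100,000 for 1975 to 1990.
years = list(range(1975, 1991))
rates = [40.1, 41.0, 41.6, 42.9, 43.5, 44.8, 45.2, 46.9,
         47.3, 48.8, 49.9, 50.7, 52.3, 53.0, 54.6, 55.4]

error = backtest(years, rates, holdout=5)
print(f"Mean absolute percentage error over held-out years: {100 * error:.1f}%")
```

Repeating the exercise with different hold-out lengths, and with years of known underregistration excluded, gives a rough form of the sensitivity analysis described above.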

Data Quality
Quality and timeliness of data are the ongoing challenge of provincial registries. Their expertise and knowledge of cancer incidence and mortality must be utilized. The system developed to project cancer incidence and mortality should allow for adjustments by data quality indicators (e.g. a change in registration or detection) and the exclusion of specific years of data in which the quality is limited.

Software Package
A cancer projection package remains a high priority. A menu-driven approach that is user-friendly, incorporates a flexible model or several models, displays graphics, prints reports and has options for input and output is the ideal. Hardware and software features have not yet been chosen, although a study of user requirements has been suggested.

Collaboration
LCDC continues to provide support and is developing a cancer projection software package to be used by the provincial registries. Cancer registries continue to provide high-quality data and collaborate with LCDC in documenting changes in diagnostic techniques, definitions and under- or overregistration.

Recent Developments in Cancer Projection

Dr Hakulinen's paper about confidence intervals for GLIM models has been published in Statistics in Medicine (3). Dr Hakulinen has also co-authored an article applying the earlier version of the American National Cancer Institute (NCI) CAN*TROL program to the Nordic countries (4).

Dr MacNeill has applied surface model techniques to cancer incidence projections. He has written and published papers on the theory, including segmented models and modelling heteroscedastic age-period-cohort data (5-7), and on specific sites, including melanoma (2) and prostate cancer (8).

Dr Myles has prepared a report comparing the surface model techniques and GLIM models with Poisson regression. The report indicates that the surface model techniques were more frequently superior.

Chris Waters of LCDC is generalizing the surface model software to allow a general user interface. This interface, as well as GLIM modules and their interface, is being developed by Bob Parkes of the Ontario Cancer Treatment and Research Foundation (OCTRF). A poster on colorectal cancer projections was presented at the OCTRF Cancer Epidemiology Seminar in Toronto on May 2, 1995.

Upcoming National Workshop

The third Canadian Cancer Projection Workshop will be held August 16, 1995, in St John's, Newfoundland. Developments and comparisons of cancer projection methods identified at earlier workshops will be presented. Progress in the development of software and user needs and requirements will also be reported. This workshop will include a presentation of the updated CAN*TROL program by Dr Potosky of the NCI.

(a) Specified requirements of the standardized projection methods include the following: focus on intermediate to long-term projections; provide results for specific age groups, sex and major cancer sites; make use of Poisson regression models; provide confidence intervals of estimates; incorporate appropriate covariate information where possible; and use widely available software packages.

References

1. McLaughlin JR, Morgan P, Mao Y. Executive summary: first Canadian Cancer Projection Workshop. Chronic Dis Can 1992;13(3):47-51.

2. MacNeill IB, Elwood JM, Miller D, Mao Y. Trends in mortality from melanoma in Canada and prediction of future rates. Stat Med 1995;14:821-39.

3. Hakulinen T, Dyba T. Precision of incidence predictions based on Poisson distributed observations. Stat Med 1994;13:1513-23.

4. Wiklund K, Hakulinen T, Sparén P. Prediction of cancer mortality in the Nordic countries in 2005: effects of various interventions. Eur J Cancer Prev 1992;1:247-58.

5. MacNeill IB, Mao Y. Change-point methods for mortality and morbidity data. J Appl Stat Sci 1993;1:359-77. Reprinted in: Applied change-point problems in statistics. Commack (New York): Nova Science Publishers, 1994.

6. MacNeill IB, Mao Y, Xie L. Modeling heteroscedastic age-period-cohort cancer data. Can J Stat 1994;22:529-39.

7. MacNeill IB, Mao Y, Xie L, Tang SM. Segmented models for age-period-cohort cancer data. Int J Math Stat Sci 7. In press.

8. Morrison HI, MacNeill IB, Miller D, Mao Y. The impending Canadian prostate cancer epidemic. Can J Public Health. In press.

Author References

Kathy Clarke, Cancer Division, Bureau of Chronic Disease Epidemiology, Laboratory Centre for Disease Control, Health Canada, Tunney's Pasture, Postal locator: 0700B2, Ottawa, Ontario K1A 0L2

Last Updated: 2002-10-29