Dr. Weeks’ Comment: A variation on “don’t ask, don’t tell”: don’t look, don’t diagnose. From autopsy studies: 38% of women at age 50 have breast cancer; 98% of women at age 50 have thyroid cancer; 50% of men at age 50 have prostate cancer. Whether these cancers ever express themselves is another story.
Overdiagnosis: An Underrecognized Cause of Confusion and Harm in Cancer Screening
The Mayo Lung Project (MLP) was a National Cancer Institute-funded randomized clinical trial designed to determine the effectiveness of intensive screening with chest radiography and sputum cytology in comparison with usual care (1). The trial was begun in 1971 and was completed in 1983, when the average follow-up after the last screen was about 3 years. Although the 5-year survival for lung cancer was much higher in the screened group than in the control group, there was no difference in lung cancer mortality. This apparent discrepancy between survival and mortality along with an excess of 46 lung cancer cases in the screened group (206, as compared with 160 in the usual-care arm) has been the source of much controversy. Marcus et al. (2), in an attempt to resolve this controversy, used the National Death Index-Plus search to extend the follow-up of the MLP participants through 1996. The investigators report their findings in this issue of the Journal (2).
After more than 76 000 person-years of observation in each group, there was still no statistically significant difference in lung cancer mortality (4.4 deaths per 1000 person-years in the intervention group versus 3.9 deaths per 1000 person-years in the control group); the mortality rates from all other causes were virtually identical. The authors acknowledge the possibility of contamination, noting that many of the subjects in the control group did have chest radiographs during the intervention period; they point out, however, that it is not known what proportion of these radiographs were obtained for screening rather than for evaluating specific symptoms. Furthermore, they point out that the markedly higher 5-year survival and excess cases in the screened group indicate that this group did, in fact, undergo more intensive screening than did the control group. In addition, the investigators found no baseline differences in age, smoking habits, or other lung cancer risk factors in the two groups (3). Thus, the authors provide compelling evidence that a major reduction in lung cancer mortality was not missed because of insufficient follow-up, contamination, or faulty randomization (4).
Marcus et al. (2) reason that the apparent discrepancy between survival and mortality is due largely, if not completely, to some combination of lead time, length, and overdiagnosis biases. Furthermore, they demonstrate that, when lung cancer survival is measured from the time of randomization rather than from the time of diagnosis and thereby adjusted for lead time, the survival advantage in the screened group persists. Thus, by process of elimination, they conclude that the discrepancy between survival and mortality is mainly due to the tendency for screening to detect the more slowly progressive forms of a disease (length bias), some of which would not have become clinically significant (overdiagnosis bias). (An analysis of lung cancer incidence after the completion of the trial could help determine the relative contribution of these two biases, but National Death Index-Plus does not provide incidence data.) Although it is sometimes argued that the dismal prognosis for lung cancer is inconsistent with the overdiagnosis hypothesis (4), this reasoning is flawed because it confuses symptomatic cases of lung cancer with asymptomatic cases, which are detectable only through screening.
Overdiagnosis occurs with the detection of “pseudodisease” (5), a subclinical condition that would not have produced signs or symptoms before the individual died of other causes. In any screening program, some proportion of screen-detected cases will be pseudodisease simply because of competing mortality. In the MLP, a substantial proportion of screen-detected cases were probably pseudodisease for three reasons: 1) the mortality rate from all causes in smokers is high, about threefold that in nonsmokers (6); 2) some squamous cell carcinomas detectable by sputum cytology are very small; and 3) some primary adenocarcinomas detectable by chest radiography grow very slowly (7).
It should be pointed out that pseudodisease is almost impossible to document in a living individual. When pseudodisease is treated, as it almost always is, long-term survival is attributed to the treatment and is labeled a cure. In the rare instances when it is not treated because of old age or other contraindication, pseudodisease cannot be confirmed as such while the patient is still alive because, by definition, it must remain asymptomatic until the patient dies of other causes. These problems with documentation probably explain why pseudodisease has received relatively little attention. However, autopsy studies provide irrefutable evidence that pseudodisease is abundant, both for cancer in general (8) and for lung cancer in particular. In a 30-year review of all adult autopsies on hospital deaths at the Yale New Haven Hospital (New Haven, CT) (9), about one in six lung cancers observed at autopsy had not been recognized before the death of the patient. In the 10 most recent years of the review, about 1% of the men had had previously unsuspected lung cancer, most cases of which were resectable and presumably asymptomatic. In a more recent study of smokers being considered for lung reduction surgery (10), unsuspected primary lung cancer was found by preoperative chest radiography in 2% of the patients. Thus, it is not unreasonable to expect 6 years of intensive screening to detect 46 cases of pseudodisease among 4618 high-risk subjects in the intensively screened group of the MLP.
Overdiagnosis can also occur with the detection of a nonmalignant condition that is misclassified as malignant, that is, a pathologic false-positive error. Although the authors specifically exclude this type of error from their definition of overdiagnosis, pathologic false-positive results probably occur not infrequently in cancer screening. Even under the microscope, the distinction between malignancy and inflammation (11) or hyperplasia (12) can sometimes be very subtle, and the pretest probability of malignancy is usually low in screen-eligible subjects. In the MLP, the subset of patients with squamous cell carcinomas detected by sputum cytology alone, who had a 5-year survival of 83% (1), probably included some instances of pathologic false-positive results as well as pseudodisease.
Overdiagnosis plays havoc with our understanding of cancer statistics. Because overdiagnosis effectively changes a healthy person into a diseased one, it causes overestimations of the sensitivity, specificity, and positive predictive value of screening tests and the incidence of disease (13). As the MLP and a recent analysis of Surveillance, Epidemiology, and End Results (SEER) data illustrate (14), overdiagnosis also markedly increases the length of survival, regardless of whether screening or associated treatments are actually effective. However, overdiagnosis does not reduce disease-specific mortality because treating subjects with pseudodisease does not help those who have real disease. Consequently, disease-specific mortality is the most valid end point for the evaluation of screening effectiveness.
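The contrast between survival and mortality can be made concrete with a small calculation. The sketch below is illustrative only: the cohort sizes and death counts are hypothetical numbers chosen for the example, not MLP data, and the function names are invented here. It assumes that screening adds cases of pseudodisease to the diagnosed group but changes nothing about who dies of the disease.

```python
def five_year_survival(diagnosed_cases, deaths_within_5_years):
    """Fraction of diagnosed cases still alive 5 years after diagnosis."""
    return 1 - deaths_within_5_years / diagnosed_cases


def mortality_per_1000(deaths, person_years):
    """Disease-specific deaths per 1000 person-years of observation."""
    return 1000 * deaths / person_years


# Hypothetical numbers for illustration only (not taken from the MLP).
person_years = 50_000   # same observation time in both groups
real_cases = 100        # cancers that would have surfaced clinically
real_deaths = 85        # deaths among those real cases within 5 years
pseudo_cases = 50       # extra screen-detected cases that never progress

# Control group: only real disease is diagnosed.
control_survival = five_year_survival(real_cases, real_deaths)
control_mortality = mortality_per_1000(real_deaths, person_years)

# Screened group: pseudodisease enlarges the denominator of the survival
# statistic, but the number of deaths is unchanged.
screened_survival = five_year_survival(real_cases + pseudo_cases, real_deaths)
screened_mortality = mortality_per_1000(real_deaths, person_years)

print(f"5-year survival: control {control_survival:.0%}, screened {screened_survival:.0%}")
print(f"mortality per 1000 person-years: control {control_mortality:.1f}, screened {screened_mortality:.1f}")
```

Under these assumptions the survival figure nearly triples while the mortality rate does not move, which is the pattern the editorial describes.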
For individuals who undergo cancer screening, overdiagnosis is also highly relevant because it is the most serious side effect. False-positive results, which have received much more attention, may cause the screenee to worry for months about having cancer and may lead to an invasive procedure, such as a percutaneous needle biopsy, in the case of lung cancer screening. In contrast, overdiagnosis gives the screenee a false diagnosis of cancer for life and leads to definitive treatment, such as a lobectomy in the case of lung cancer screening. However, the public is much less informed about overdiagnosis than about false-positive results. In a recent nationwide survey of women (15), 99% of the respondents were aware of the possibility of false-positive results from mammography but only 6% were aware of either ductal carcinoma in situ by name or the fact that mammography could detect a form of “cancer” that often doesn’t progress.
One apparent paradox in the MLP is that the lung cancer mortality was 11% higher in the screened group than in the control group. Although this excess mortality could be explained by chance alone (P = .18, two-tailed Fisher’s exact test), overdiagnosis could also have contributed to it in both real and spurious ways. Unnecessary surgery for pseudodisease or a pathologic false-positive result could have led to some deaths in the screened group that were correctly attributed to lung cancer. (In a randomized clinical trial of screening, deaths from treatment should be attributed to the target disease.) In addition, overdiagnosis could have led to a spurious increase in lung cancer deaths in the screened group because of misclassification of the cause of death, i.e., “sticking diagnosis bias.” It is not difficult to imagine that a diagnosis of lung cancer could have influenced subsequent testing and reporting in a patient’s medical record, which, in turn, could have influenced the cause of death that appeared on the death certificate. Deaths from various causes could have been misclassified as deaths from lung cancer, but there are two good reasons to suspect that this misclassification involved metastatic adenocarcinoma, in particular. The primary site of this disease is often difficult to determine. Moreover, adenocarcinoma was the only cancer cell type for which patients in the screened group actually had a shorter median survival than those in the control group (2), despite the effects of lead-time, length, and overdiagnosis biases.
Misclassification because of sticking diagnosis bias would have biased the MLP results against screening. However, because the mortality rates for other causes of death were virtually identical in the two groups, an equally large misclassification of death in favor of screening, probably related to treatment complications, must have also been present. For example, some deaths due to surgery may have been attributed to diseases other than lung cancer, such as pneumonia. Regardless, the fact that the all-cause mortality rates were nearly identical (2% higher in the screened group) makes it extremely unlikely that any major net benefit of screening was missed.
The negative results of the MLP and the problem of overdiagnosis do not exclude the possibility that screening for lung cancer with low-dose helical computed tomography (CT) could be highly effective and worthwhile. CT is far more sensitive than chest radiography. In a recent screening study (16), CT detected almost six times as many stage I lung cancers as chest radiography, and most of these tumors were 1.0 cm or less in diameter. However, for this very reason, overdiagnosis and false-positive results could be a much bigger problem with chest CT than they were with chest radiography. In a recent study of small (<3 cm) surgically resected peripheral adenocarcinomas that had been followed by CT (17), tumor volume doubling times ranged from 42 to 1486 days and one half of the tumors had doubling times over 1 year. With a volume doubling time of 1 year, it takes nearly 8 years for a tumor to increase in diameter from 5 mm to 3 cm, plenty of time for the screenee to die of other causes.
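The growth-time figure above follows from simple geometry. The sketch below is a rough check, assuming a spherical tumor (so volume scales with the cube of the diameter) and a constant volume doubling time; the function name is invented for this example.

```python
import math


def years_to_grow(start_diameter_mm, end_diameter_mm, doubling_time_days):
    """Years for a spherical tumor to grow between two diameters.

    Assumes constant exponential growth: volume doubles every
    `doubling_time_days`, and volume scales with diameter cubed.
    """
    volume_ratio = (end_diameter_mm / start_diameter_mm) ** 3
    doublings = math.log2(volume_ratio)
    return doublings * doubling_time_days / 365.0


# 5 mm to 3 cm with a 1-year volume doubling time.
years = years_to_grow(5, 30, 365)
print(round(years, 2))  # 7.75
```

Growing from 5 mm to 30 mm is a 216-fold increase in volume, or about 7.75 doublings, which matches the “nearly 8 years” stated in the text. The same function can be applied to the reported extremes of 42 and 1486 days.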
Because the potential for overdiagnosis and false-positive results will be so great with helical CT, it is essential that there be some mechanism in the screening process to minimize these side effects, such as a mandatory observation period for small nodules. Randomized clinical trials should be performed, and all causes of mortality should be closely monitored to avoid missing a major benefit or harm from the screening process. Finally, a balanced presentation of the potential benefits and risks, including overdiagnosis, should be made to all prospective screenees to ensure that they can make an informed decision about being screened or enrolled in a randomized trial of screening.
Affiliations of author: Department of Radiology, Dartmouth-Hitchcock Medical Center, Lebanon, NH, and Center for the Evaluative Clinical Sciences, Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH.
Correspondence to: William C. Black, M.D., Department of Radiology, Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr., Lebanon, NH 03756.
Using Autopsy Series To Estimate the Disease “Reservoir” for Ductal Carcinoma in Situ of the Breast: How Much More Breast Cancer Can We Find?
Purpose: To determine how many cases of breast cancer might be found if women not known to have the disease were thoroughly examined (the disease “reservoir”).
Data Sources: MEDLINE search from 1966 to the present.
Study Selection: Hospital-based and forensic autopsy series examining women not known to have had breast cancer during life.
Data Extraction: Observed prevalence of occult invasive breast cancer or ductal carcinoma in situ (DCIS) in which the number of women who were given a diagnosis was the numerator and the number of women examined was the denominator. For each autopsy series, we attempted to ascertain the level of scrutiny (sampling method, number of slides examined) given to the pathologic specimens.
Data Synthesis: Among seven autopsy series of women not known to have had breast cancer during life, the median prevalence of invasive breast cancer was 1.3% (range, 0% to 1.8%) and the median prevalence of DCIS was 8.9% (range, 0% to 14.7%). Prevalences were higher among women likely to have been screened (that is, women 40 to 70 years of age). The mean number of slides examined per breast ranged from 9 to 275; series that reported higher levels of scrutiny tended to discover more cases of cancer.
Conclusions: A substantial reservoir of DCIS is undetected during life. How hard pathologists look for the disease and, perhaps, their threshold for making the diagnosis are potentially important factors in determining how many cases of DCIS are diagnosed. The latter has important implications for what it means to have the disease.
AND FROM THE READER’S DIGEST…
Cancer Screening: Doing More Harm than Good?
Shannon Brownlee, New America Foundation
Suzanne Bull always half expected that she’d get cancer. After all, she lived in Marin County, California, where breast cancer rates are among the highest in the country. Still, she was determined to do whatever she could to protect herself. She ate right and exercised, and every year, she went into San Francisco to get a mammogram.
Last year, when Bull was 54, she got the news she’d been dreading. An ultrasensitive digital mammogram showed a suspicious spot on her left breast. A biopsy confirmed it was cancer. Fortunately, the surgeon told her, it had been caught early: She had ductal carcinoma in situ, or DCIS, which meant that the cancer was still confined to a single milk duct. And it might well stay there, he added, since DCIS generally doesn’t become invasive. That all sounded great, Bull recalls, until the surgeon told her that there was no way to know whether her cancer would turn out to be the lazy, nonthreatening type of DCIS or the potentially invasive kind. She needed a lumpectomy, he told her, and should also consider undergoing radiation and taking the drug tamoxifen.
Bull agonized over the decision for two weeks but in the end went ahead with the lumpectomy and radiation. “I had to do everything I could to stop this disease,” she says. With two clean mammograms behind her, Bull feels lucky. “I’m just glad I had access to digital mammography,” she says. “It finds things so much earlier.”
It’s hard to believe, but some researchers wouldn’t call Bull lucky at all. They say that yearly mammograms are not nearly as effective at reducing the risk of dying of breast cancer as most women think, and that mammography leads many women to get unnecessary treatment — especially those diagnosed with DCIS. The problem is bigger than just mammography: They say the prostate-specific antigen (PSA) test may do men more harm than good if they don’t already have symptoms of prostate cancer. And they have similarly grim things to say about other widely used cancer screening tests.
Their view stands in stark contrast to the message being put out by groups like the American Cancer Society and even the federal government, which say that finding and treating tumors as early as possible is the surest way to avoid a cancer death. But a growing group of scientific heretics — published in highly respected medical journals, working at some of the most august institutions — strongly believe that it’s time to rethink our whole approach to cancer screening.
That’s because screening tests pick up many small cancers that would never have caused any symptoms. “Screening for cancer means that tens of thousands of patients who never would have become sick are diagnosed with this disease,” says H. Gilbert Welch, MD, codirector of the Outcomes Group at the Veterans Affairs Medical Center in White River Junction, Vermont, and a leading expert in cancer screening. “Once they’re diagnosed, almost everybody gets treated — and we know that treatment can cause harm.” Tamoxifen for breast cancer can trigger life-threatening clots in the lungs, for instance. Surgery for prostate cancer leaves 60 percent of men unable to have an erection. For that matter, some of the screening tests themselves carry risks: Up to 5 out of every 1,000 people who get a colonoscopy have a serious complication, such as a colon perforation or major bleeding.
Most people diagnosed with cancer undoubtedly see these risks as the price they must pay to avoid dying of cancer. “The reality is not so simple,” says Dr. Welch. Screening tests are very good at catching tumors that would never bother us, he notes, but they’re actually pretty bad at catching the fastest-growing and most deadly cancers in time to cure them. The bottom line, says researcher Floyd Fowler, Jr., PhD, president of the Boston-based nonprofit Foundation for Informed Medical Decision Making: “Screening’s power to cut your risk of dying has been wildly overinflated.”
How Cancer Can Fool a Screening Test
The idea that getting tested for cancer might be useless or even harmful may strike you as completely wrongheaded. After all, smaller cancers are easier to cut out. They’re also less likely to have metastasized, or spread to other parts of the body — and metastasis is generally what makes cancer deadly. Sure, it’s possible for a tumor to kill without metastasizing: A brain tumor, for example, can cause devastating harm when it grows big enough to squeeze healthy tissue inside the skull. But most cancers threaten life only after a few cells break free and travel through the bloodstream or lymph fluid to set up shop in another part of the body. Once that’s happened, a surgeon can no longer cure a patient by removing the tumor. And even powerful chemotherapy drugs are often unable to kill every last errant cell.
Physicians used to think that a tumor needed to get to a certain size before it would spread. But that’s not necessarily so, says Barnett S. Kramer, MD, associate director for disease prevention at the National Institutes of Health. “Some tumors spread extremely early,” he says. They begin metastasizing when they consist of only a few million cells, which sounds like a lot but is smaller than the period at the end of this sentence — too small to detect with most screening tests. By the time this kind of cancer is big enough to be seen on a mammogram or other test, it’s already sent seeds to other parts of the body.
The flip side of this problem is that many screening tests do a great job at catching cancers that would never have caused problems and could simply have been left alone. This notion violates most of what we think we know about cancer, says Dr. Kramer, because most of what we know is based on the tumors that cause harm. If you think of all the different varieties of cancer as making up an iceberg, cancers that cause symptoms represent only the part of the berg above the waterline. For most of human history, these were the only tumors we knew anything about: the breast cancer that had grown big enough to feel, the lung cancer that was causing shortness of breath.
Screening allows us to look under the water, at the tumors that haven’t yet become symptomatic. We assume they will eventually cause symptoms, but increasing evidence suggests that’s not always the case. Evidence from autopsies, for instance: In one study, postmortem exams showed that nearly 9 percent of women of all ages who died of any cause other than breast cancer had undiagnosed DCIS. Among women from Denmark, where mammography is not as common as it is here, a whopping 39 percent of middle-aged women who died of other causes had undetected breast cancers. Similarly, says outcomes researcher Dr. Welch, a 1989 study found that 60 percent of men over age 60 have undetected prostate cancer — yet only about 3 percent of deaths in men are due to prostate cancer.
So screening tests raise red flags about cancers destined to loll about quietly, causing no problems. But there’s more. They also blare the alarm about cancers that would actually go away on their own — because, in fact, some cancers simply disappear.
Brandon Connor, now age seven, was suspected of having cancer even before he was born. It had been a difficult pregnancy, and Brandon’s mother, Kristin, then 35 and a lawyer in Atlanta, was undergoing regular ultrasounds. One of the tests picked up what looked like a tumor on Brandon’s spine. Doctors made a tentative diagnosis of neuroblastoma, a nervous system cancer.
Neuroblastoma comes in two forms, one of which is deadly. But there was no way of knowing if Brandon’s tumor was indeed a neuroblastoma, much less whether it was dangerous, without doing a biopsy, and its location made that risky. The Connors opted instead to keep a close watch to see if the cancer grew; the doctors said Brandon’s tumor should regress within his first year if it was going to. It didn’t, and by the time Brandon was two years old, he’d undergone more than a dozen MRI scans.
Finally, the doctors advised the Connors to go ahead with surgery. The day before the operation, though, the surgeon ordered one last imaging test. The neuroblastoma was gone. “We couldn’t believe it,” says his mother. Today, physicians know that many neuroblastomas regress on their own during infancy or early childhood.
“People kept telling us, ‘Thank God they found it on the ultrasound,’” Kristin Connor says. Looking back on the years of worry, she adds, “In hindsight, I’d say it was more like a curse.”
The Damage Screening Can Do
Forget the fact that unnecessary therapies for cancer are a tremendous drain on our health care budget, already strained to the breaking point. “Many oncologists would probably tell you that they’ve had patients who suffered serious side effects, even death, from treatment that they might not have needed,” says William C. Black, MD, a professor of radiology at Dartmouth-Hitchcock Medical Center. No one intentionally prescribes unnecessary treatment, of course. But it’s often difficult to know if a patient really needs to be treated, so the tendency is to be aggressive, just in case.
Treatment can exact a profound toll. Take the case of George Brown. At 75, Brown was still a practicing lawyer in Denver last year when he was diagnosed with prostate cancer. His doctor prescribed Lupron to block production of testosterone (which many prostate tumors need in order to grow). “I didn’t realize that Lupron was chemical castration,” says Brown. “I was extremely depressed. I was having hot and cold flashes. I cried at everything.” Radiation therapy damaged his rectum and left him with little control of his bladder or bowels. He is now facing another round of a different testosterone-blocking drug.
Despite his troubles, Brown believes his care was lifesaving. And there’s no way to know in any particular case. But the fact is that most men diagnosed with this cancer have invasive therapy, even though statistics say that many men could safely choose “watchful waiting”: getting PSA tests to monitor the cancer and treating it only if it begins to grow rapidly.
Does Screening Save Lives?
For many people, even serious side effects like the ones Brown suffered would be worth putting up with if the treatment reduced their risk of dying of cancer. That’s the point of getting screened, isn’t it? Yet only one cancer screening test, the venerable Pap smear, has truly slashed the risk of death. Between 1955 and 1992, according to the American Cancer Society, Pap smears cut the death rate for cervical cancer by 74 percent, and deaths have continued to decline each year.
But no other test has had such a powerful effect. The PSA test has been widely used in the United States since the late 1980s, but it’s not clear that it’s had a big impact on the death rate for prostate cancer. Between 1975 and 2005, the latest year for which statistics are available, the death rate dropped from 31 per 100,000 men to 24.6. That’s a real decline, but many experts doubt that PSA testing deserves all the credit — especially given what happened during a “natural experiment” in Seattle and the state of Connecticut in the late 1980s.
Medicare patients in Seattle were five times more likely than those in Connecticut to get PSA testing between 1988 and 1990 and were also more likely to have surgery and radiation for prostate cancer. But when researchers followed up through 1997, they found the Seattle men were just as likely to die of prostate cancer.
“Prostate screening seems to make sense,” says Nortin M. Hadler, MD, a professor of medicine at the University of North Carolina at Chapel Hill and the author of Worried Sick: A Prescription for Health in an Over-treated America. “If only it worked.”
Mammograms also offer a smaller benefit than many patients — and doctors — assume. Mammography’s effectiveness has been hotly debated, but a carefully conducted 2005 analysis suggests it cuts the risk of dying of breast cancer by 15 percent, says the NIH’s Kramer. That means a 60-year-old who gets regular mammograms shaves her risk of dying of the disease in the next decade from 7 per 1,000 to 6 per 1,000.
As for colonoscopy: It allows the doctor to remove polyps, growths that can turn into cancer. The best estimates suggest that colonoscopy can cut the risk of death from colon cancer by as much as 60 percent. (We don’t know for sure if it reduces the risk of death, because those studies haven’t been done.) Sixty percent sounds great, until you realize that the chances of dying of colon cancer aren’t all that big to start with. The average woman has a 2.1 percent risk of dying of colorectal cancer. (So of all the things that can kill her, this will be the culprit about 2.1 percent of the time.) The average man’s risk is a little higher, about 2.3 percent. Knocking a 2.3 percent risk down by 60 percent means it drops to 0.9 percent — a benefit, yes, but not necessarily big enough to outweigh all other considerations.
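The gap between a relative risk reduction and the absolute change it produces can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the article for mammography and colonoscopy; the function name is invented for this example, and the calculation assumes the relative reduction applies uniformly to the stated baseline risk.

```python
def apply_relative_reduction(baseline_risk, relative_reduction):
    """Return (remaining_risk, absolute_reduction) for a baseline risk.

    Both arguments are fractions, e.g. 0.023 for 2.3 percent.
    """
    remaining_risk = baseline_risk * (1 - relative_reduction)
    absolute_reduction = baseline_risk - remaining_risk
    return remaining_risk, absolute_reduction


# Colonoscopy: 2.3 percent lifetime risk for men, cut by up to 60 percent.
colon_remaining, colon_saved = apply_relative_reduction(0.023, 0.60)
print(f"colon cancer: {colon_remaining:.1%} remaining, {colon_saved:.1%} absolute drop")

# Mammography: 7 per 1,000 ten-year risk at age 60, cut by 15 percent.
breast_remaining, breast_saved = apply_relative_reduction(7 / 1000, 0.15)
print(f"breast cancer: {breast_remaining * 1000:.1f} per 1,000 remaining")
```

A 60 percent relative reduction leaves a risk of about 0.9 percent, an absolute drop of roughly 1.4 percentage points, and a 15 percent reduction moves 7 per 1,000 to about 6 per 1,000, as the article states.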
To Screen or Not to Screen
The fact is, there’s no single answer. It depends on many factors, including how old you are, what other diseases you have, and what you value most in terms of your health. Dennis Fryback, PhD, is a former member of the U.S. Preventive Services Task Force, a group of experts convened by the federal government to make recommendations about screening. The task force recommends colonoscopy every ten years for people between the ages of 50 and 75, yet the 61-year-old Fryback has concluded it does not make sense for him to get screened.
He came to that decision in part because he has no family history of colon cancer. If he did, his chances of getting it would increase, and so would the odds he’d benefit from the test. He also knows that getting the exam requires at least a day of taking laxatives to clean out the colon and then facing the possibility of a perforation from the procedure, a risk that goes up with age. He balanced the possible reduction in his chances of dying of colon cancer against his other health problems. He had a heart attack last year and suspects he will die of heart disease before a colon polyp has a chance to kill him.
Given his circumstances, Fryback figures, colonoscopy “is like an expensive lottery ticket. I might get some extra time, but chances are much better that I won’t get anything. It’s like paying, say, $5 to have a very long-shot chance at a few hundred dollars.”
When looking at his odds, Fryback has an advantage: He’s an expert in medical decision making. Most of us, of course, are much less familiar with medical statistics, but there are tools to help average patients come to a decision that’s right for them. Called patient decision aids, these tools come in the form of brochures, videos, and Web-based interactive programs; some include interviews with cancer survivors and people considering getting screened, who discuss their own decisions. Patients can sometimes take them home to study at their own pace.
Decision aids aren’t widely available yet, but some insurance companies and a handful of medical centers offer them. Suzanne Bull used a patient decision aid DVD before opting to undergo radiation treatment for her breast cancer. “Watching it was the best thing I did,” she says.
Eventually, researchers and doctors hope, better screening tests will be able to distinguish between cancers that need to be treated and those that don’t. But until then, many experts believe, the decision to get screened should rest on an individual’s values and his or her ability to handle uncertainty. “We have come to fear dying from disease more than dying at the hands of overzealous doctors,” says Dartmouth’s Dr. Black. The fact is, both are risks when we get screened for cancer.