Hematopathology / LYMPHOMA DIAGNOSIS IN SMALL SAMPLES
The Reliability of Lymphoma Diagnosis in Small Tissue
Samples Is Heavily Influenced by Lymphoma Subtype
Patricia L. Farmer, MD,1 Denis J. Bailey, MD,2 Bruce F. Burns, MD,3 Andrew Day, MSc,4
and David P. LeBrun, MD1
Key Words: Lymphoma; Tissue microarray; Needle biopsy; Immunophenotyping
DOI: 10.1309/J7Y74D9DXEAJ9YUY
Abstract
A specific pathologic diagnosis is important in
malignant lymphoma because the diverse disease
subtypes require tailored approaches to clinical
management. Reliance on small samples obtained with
cutting needles has been advocated as a less invasive
alternative to using larger, excised samples. Although
published studies have demonstrated the safety and
apparent sufficiency of this approach in informing
clinical care, none have systematically determined the
accuracy of pathologic lymphoma subtyping based on
very small samples. We used a tissue microarray
representing 67 cases of malignant lymphoma and 17
samples of nonneoplastic lymphoid tissue to model
lymphoma diagnosis in small samples. Overall, 73.8%
of the cases were diagnosed with a level of confidence
deemed sufficient for directing clinical management;
85.9% of these diagnoses were accurate. Small cell
lymphomas with highly distinctive immunophenotypes,
including small lymphocytic, mantle cell, and T-lymphoblastic lymphoma, were recognized most
consistently and accurately in the small samples. In
contrast, follicular lymphoma and marginal zone
lymphoma were especially difficult. Our results indicate
that the reliability of lymphoma diagnoses based on
small samples is heavily influenced by lymphoma
subtype.
Am J Clin Pathol 2007;128:474-480
The optimal care of patients with cancer requires a precise pathologic diagnosis. This is generally obtained through
histologic examination of tumor tissue, often supplemented
with the results of immunohistologic or other ancillary tests.
Obtaining an adequate biopsy sample sometimes requires an
invasive surgical procedure, such as thoracotomy or laparotomy, which may be associated with considerable morbidity or
expense. Therefore, increasing reliance has been placed on
small biopsy samples obtained with a cutting needle, often
under radiologic guidance.
Nowhere is the need for diagnostic precision greater than in
the management of patients with malignant lymphoma. For
example, the 3 most common lymphoma types, diffuse large B-cell lymphoma (DLBCL), follicular lymphoma, and Hodgkin
lymphoma (HL), are associated with distinct biologic characteristics demanding tailored approaches to patient management.
The pathologic diagnosis and classification of the numerous
lymphoma types that are currently recognized involves the application of morphologic and immunophenotypic criteria, as outlined in the World Health Organization (WHO) system.1 These
criteria are based on relatively large tissue samples because the
pathologic distinction between some hematolymphoid processes requires consideration of tissue architecture, a parameter that
is difficult or impossible to assess in very small samples.
Several studies have evaluated the usefulness of needle core
biopsy specimens obtained clinically in the diagnosis of lymphomas. They have found that the technique is well tolerated by
patients, associated with a low rate of complications, and sufficient for clinical decision making in between 80% and 90% of
cases.2-10 Although a few such studies have considered the relative amenability of HL vs non-Hodgkin lymphoma (NHL) or
high- vs low-grade lymphomas to diagnosis by cutting-needle
biopsy, none have systematically evaluated the various specific
types of NHL.8,11 Furthermore, most previous studies used lymphoma classification systems that antedate the current WHO
scheme; this system incorporates morphologic and
immunophenotypic criteria that are likely to affect the
amenability of some lymphomas to accurate diagnosis in small
specimens. Finally, most previous studies assessing the usefulness of the needle biopsy procedure defined a successful outcome according to the apparent sufficiency of the pathologic
diagnosis for clinical decision making. Surprisingly, no study
has systematically used the larger, excised sample from the
same lesion as a “gold standard” to evaluate the accuracy of
lymphoma classification based on very small tissue samples.
Tissue microarrays (TMAs) are constructed by arraying in
a single recipient paraffin block small cores of tissue harvested
from multiple donor blocks such that histologic sections prepared from the recipient block are representative of all of its constituent cores. Thus, TMAs permit efficient histopathologic and
immunohistochemical evaluation of large numbers of small,
formaldehyde-fixed, paraffin-embedded tissue specimens.
We used a TMA to evaluate the relative amenability of
lymphomas of different types to accurate recognition based on
small tissue samples. Diagnoses were made based on the
examination of histologic and immunostained sections from a
TMA representing 67 cases of malignant lymphoma and 17
cases of nonneoplastic lymphoid hyperplasia. The TMA
core–based diagnoses were then correlated with those based on
the original, larger samples.
Materials and Methods
Construction of a Randomized TMA
Paraffin-embedded lymphoma samples were identified by
searching the laboratory information system of Kingston
General Hospital, Kingston, Canada. The hematoxylin-phloxin-saffron (HPS)-stained and immunostained histologic slides
from each case were retrieved and reviewed by 2 of us (P.L.F.
and D.P.L.). Lymphomas were classified and (for follicular
lymphomas) graded by consensus according to the WHO system.1 All cases included in the study met the following criteria:
adequate tumor representation in at least 1 paraffin block of
formaldehyde-fixed tissue and unequivocal correspondence of
the morphologic and immunophenotypic findings with the
pathologic criteria used in lymphoma classification. The proportions of cases representing the various lymphoma types were
selected so as to approximate the frequency with which they are
encountered in North American practice.
Representative areas were marked on the glass slides for
subsequent sampling in the TMA. The TMA was assembled
using a tissue-arraying instrument (Beecher Instruments, Silver
Spring, MD) as described previously.12 Two 0.6-mm-diameter
tissue cores were harvested from each case. The pairs of cores
were arrayed in a recipient paraffin block in random order, as
determined using a random number generator. Histologic sections were cut from the TMA block using a Leica microtome
(Richmond Hill, Canada). The sections were mounted on adhesive glass slides (Surgipath Canada, Winnipeg, Canada), stained
with HPS, and immunostained for CD45, CD20, CD3, CD5,
CD10, CD15, CD21, CD30, CD79a, CD43, cyclin D1, CD23,
bcl-2, bcl-6, terminal deoxynucleotidyl transferase, and Ki-67.
Antibody sources, dilutions, and clones are shown in ❚Table 1❚.
Antigen retrieval was performed by incubation of slides in
EDTA for 30 minutes at 95°C in a kitchen steamer (T-Fal
Canada, Scarborough, Canada). Immunohistochemical staining
was done on a Ventana automated immunostainer (Ventana,
Tucson, AZ) using the biotin-avidin immunoperoxidase technique. The signal was generated using a hydrogen
peroxide–activated diaminobenzidine solution intensified with
copper, and the sections were counterstained with hematoxylin.
All antibody incubations were for 30 minutes at 37°C.
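The random placement of core pairs described above can be illustrated with a short sketch. It is only an assumption about how such an ordering might be generated with a random number generator; the case identifiers, the seed, and the function name `randomize_core_layout` are hypothetical and are not taken from the study.

```python
import random


def randomize_core_layout(case_ids, cores_per_case=2, seed=None):
    """Return a list of (position, case_id, core_number) tuples.

    Cores from the same case stay adjacent (as a pair), but the order of
    the cases in the recipient block is shuffled.
    """
    rng = random.Random(seed)  # local generator so the layout is reproducible
    order = list(case_ids)
    rng.shuffle(order)         # random order of cases

    layout = []
    position = 1
    for case_id in order:
        for core_number in range(1, cores_per_case + 1):
            layout.append((position, case_id, core_number))
            position += 1
    return layout


# Hypothetical identifiers for the 100 cases originally retrieved.
cases = [f"case-{i:03d}" for i in range(1, 101)]
layout = randomize_core_layout(cases, cores_per_case=2, seed=2007)
print(len(layout))  # 200 core positions
```

Recording the seed alongside the block map would allow the layout to be regenerated later, which is one reason to prefer a seeded generator in a sketch like this.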
Diagnostic Evaluation
The HPS-stained and immunostained TMA sections were
evaluated independently by 4 pathologists (P.L.F., D.J.B.,
B.F.B., and D.P.L.), all with expertise in lymph node pathology.
Two of the pathologists had reviewed the original histologic
slides retrieved from the archive; the others were told only that
the samples were of lymphomas and nonneoplastic lymphoid
tissues. Originally, 100 cases were retrieved; 16 cases were
excluded from diagnostic evaluation, before scoring by the
pathologists, because of inadequate (ie, <1 complete core) representation in the HPS-stained TMA section or, in 1 case, persistent uncertainty as to the lymphoma subtype based on review
of the slides from the original large sample.
❚Table 1❚
Primary Antibodies Used

Antigen   Clone        Supplier                    Dilution
CD45      RP2/18       Ventana, Tucson, AZ         1:100
CD20      L26          Ventana                     1:200
CD3       PS1          Ventana                     1:100
CD5       4C7          Ventana                     Prediluted
CD21      2G9          Vector, Burlingame, CA      1:10
CD79a     JCB117       Cell Marque, Rocklin, CA    Prediluted
CD43      L60          Ventana                     Prediluted
CD23      1B12         Ventana                     Prediluted
CD10      56C6         Ventana                     Prediluted
CYCD1     SP4          Neomarkers, Fremont, CA     1:100
CD15      MMA          Ventana                     1:20
CD30      Ber-H2       DAKO, Carpinteria, CA       1:20
TdT       Polyclonal   Supertechs, Bethesda, MD    1:10
bcl-6     P1F6         Vector                      1:20
Ki-67     MM1          Ventana                     Prediluted
bcl-2     100/D5       Ventana                     Prediluted

TdT, terminal deoxynucleotidyl transferase.
Cores from the following cases were scored: B-cell chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL),
7 cases; follicular lymphoma grade 1 or 2 (FL), 7 cases; marginal zone B-cell lymphoma (MZL), 10 cases; DLBCL, 8
cases; follicular lymphoma grade 3 (FL3), 3 cases; mantle cell
lymphoma (MCL), 6 cases; classical HL, 14 cases; nodular
lymphocyte predominance HL, 1 case; anaplastic large cell
lymphoma, 1 case; peripheral T-cell lymphoma (PTL), 7 cases;
precursor T-lymphoblastic leukemia/lymphoblastic lymphoma
(TLL), 3 cases; and reactive lymphoid hyperplasia (RL), 17
cases. For each case, the pathologists examined the HPS-stained
and immunostained sections and recorded their favored diagnosis on a standard form along with a numeric score that reflected
their level of confidence in their diagnosis (see the “Results”
section for the definitions of these scores).
For each specimen, the TMA core–based diagnosis from
each pathologist was compared with the excision diagnosis. The
raw percentage of agreement and the standard unweighted κ
scores were calculated after pooling the assessments from each
of the 4 raters so as to create a combined sample of 336 (84 ×
4) assessments.13 Agreement is provided overall and by diagnosis. Comparison of the agreement among raters based on the 84
specimens, regardless of the gold standard diagnosis, was
accomplished by calculating the multirater κ score.
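The unweighted κ statistic mentioned above can be sketched as follows. This is a minimal illustration of Cohen's unweighted κ for paired labels, not the authors' actual analysis code; the function name `cohen_kappa` and the eight toy label pairs are hypothetical. In the study, the pairs would be the 336 pooled (excision diagnosis, core diagnosis) assessments.

```python
from collections import Counter


def cohen_kappa(reference, test):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reference)

    # Observed agreement: fraction of pairs with identical labels.
    p_observed = sum(r == t for r, t in zip(reference, test)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    ref_counts = Counter(reference)
    test_counts = Counter(test)
    p_expected = sum(
        ref_counts[label] * test_counts[label] for label in ref_counts
    ) / (n * n)

    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when only one label is used
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 8 paired diagnoses, 6 of them concordant.
excision = ["CLL", "CLL", "FL", "FL", "MZL", "MZL", "RL", "RL"]
core = ["CLL", "CLL", "FL", "MZL", "MZL", "RL", "RL", "RL"]
kappa = cohen_kappa(excision, core)
print(round(kappa, 3))
```

In the toy data the raw agreement is 6/8 and the chance agreement is 16/64, so κ is lower than the raw agreement, which is the correction for chance that the statistic is meant to provide.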
Results
The performance of individual pathologists in making
diagnoses based on the TMA cores that agreed with those based
on the corresponding excised samples was reasonably consistent. The 4 pathologists made correct, specific diagnoses in 61,
62, 66, and 68 (73%-81%) of the 84 cases represented in the
TMA. The rate of correct diagnosis was not significantly different between pathologists (P = .48). Moreover, the average concordance rate of the 2 pathologists who had made the gold standard diagnoses based on the excised samples was almost identical (<1% difference) to that of the 2 pathologists who had not
seen the excised samples, suggesting that prior exposure to the
larger samples provided little relative advantage.
In subsequent analysis, each pathologist-sample encounter
was considered an independent data point. Thus, the results
from 336 encounters (ie, 84 samples × 4 pathologists) were tabulated in 2-dimensional tables (❚Table 2❚ and ❚Table 3❚) in which
the rows correspond to the gold standard diagnoses (excision
diagnosis) and the columns to the diagnoses based on the TMA
cores (TMA core diagnosis). Thus, concordant diagnoses fall
on the diagonal across the first 11 diagnostic categories.
Columns outnumber rows so as to accommodate additional,
less specific designations applied by some pathologists to some
of the TMA cores, a single encounter in which a diagnosis of
sarcoma was made, and the encounters in which a pathologist
was unable to make any diagnosis.
Table 2 shows the pathologists’ results overall. For each
cell that falls on the diagonal across the first 11 rows and
columns, the middle number (row percentage) corresponds to
the sensitivity and the bottom number (column percentage) to
the positive predictive power (ie, the proportion of positive test
results that are true positives) of the TMA core–based diagnosis
relative to the gold standard, excision diagnosis.
Concordant, specific diagnoses (ie, those in which the
diagnosis made based on the TMA cores corresponded exactly
to the gold standard, shown in bold in the tables) were made in
76.5% of the encounters (257/336; κ = 0.73). Sometimes
pathologists applied less specific diagnostic terms, including
“DLBCL or grade 3 follicular lymphoma,” “indolent-appearing
B-cell lymphoma,” and “lymphoma, not otherwise specified.”
Some cores were interpreted as “B-cell lymphoma, not otherwise specified”; these are included in the “lymphoma, not otherwise specified” column in the table. These less specific terms
were applied correctly in the vast majority of cases (18/20) in
which they were used. If the 18 correct core-based diagnoses
are considered as concordant in the analysis, the overall concordance rate increases to 81.8% (275/336).
Three lymphoma types seem especially amenable to correct
recognition in the small core samples: CLL, MCL, and TLL.
These entities were recognized in the TMA cores with sensitivities ranging from 100% (12/12) for TLL to 86% (24/28) for
CLL, and very high positive predictive powers of 100% for TLL
(12/12) and CLL (24/24) and 95% (21/22) for MCL. In contrast,
cases of FL, FL3, and MZL seemed relatively difficult to recognize in small samples. Among these cases the sensitivities of
TMA core–based diagnoses ranged from 68% (19/28) for FL to
8% (1/12) for FL3, and positive predictive powers were 76% for
MZL (16/21), 70% for FL (19/27), and 33% for FL3 (1/3). The
pathologists achieved an intermediate level of performance in
recognizing DLBCL, HL, PTL, and RL in the TMA cores.
Among these categories, it is noteworthy that the positive predictive power for PTL, a diagnosis that can be difficult to make even
based on large samples, was 100% (21/21).
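The sensitivities and positive predictive powers quoted in this paragraph are row and column percentages of the concordance table, and the arithmetic can be checked with a short sketch. The function name `sensitivity_and_ppv` is hypothetical. The counts are those quoted in the text, except the MCL row total, which is assumed here to be 6 cases × 4 pathologists = 24 encounters.

```python
def sensitivity_and_ppv(concordant, gold_total, core_total):
    """Return (sensitivity, positive predictive power) as fractions.

    concordant : encounters in which the core diagnosis matched the excision diagnosis
    gold_total : encounters with that excision (gold standard) diagnosis (row total)
    core_total : encounters given that core-based diagnosis (column total)
    """
    return concordant / gold_total, concordant / core_total


# (concordant, row total, column total)
counts = {
    "TLL": (12, 12, 12),
    "CLL": (24, 28, 24),
    "MCL": (21, 24, 22),  # row total assumed: 6 cases x 4 pathologists
    "FL": (19, 28, 27),
    "FL3": (1, 12, 3),
}

for name, (hit, gold, core) in counts.items():
    sens, ppv = sensitivity_and_ppv(hit, gold, core)
    print(f"{name}: sensitivity {sens:.0%}, positive predictive power {ppv:.0%}")
```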
The unweighted κ score is an alternative measure of agreement that corrects for agreement by chance.13 The second to last
row in Table 2 shows the κ scores calculated based on the agreement between the gold standard and TMA core–based samples
for each diagnostic category. A value of 1 indicates perfect
agreement between the methods, whereas a value of 0 represents no agreement beyond what would be expected by chance;
negative values suggest agreement worse than expected by
chance. The κ scores for CLL, MCL, and TLL are 1 or nearly
1, confirming a high rate of agreement with the gold standard
diagnoses for these lymphoma types, whereas the lowest κ
scores were obtained for FL, FL3, and MZL.
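One way to obtain a κ score for a single diagnostic category is to collapse the table to "this category" versus "any other result" and apply the unweighted κ formula to the resulting 2 × 2 table. The sketch below assumes that construction; the paper does not spell out the calculation, and the function name `category_kappa` is hypothetical. The MCL row total (24) is assumed from 6 cases × 4 pathologists.

```python
def category_kappa(concordant, gold_total, core_total, n_total):
    """Cohen's kappa for one category treated as present/absent.

    concordant : encounters where both diagnoses were this category
    gold_total : encounters with this excision diagnosis
    core_total : encounters with this core-based diagnosis
    n_total    : all encounters
    """
    both_absent = n_total - gold_total - core_total + concordant
    p_observed = (concordant + both_absent) / n_total
    p_expected = (
        gold_total * core_total
        + (n_total - gold_total) * (n_total - core_total)
    ) / n_total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# (concordant, row total, column total) out of 336 encounters
per_category = {
    "CLL": (24, 28, 24),
    "MCL": (21, 24, 22),
    "TLL": (12, 12, 12),
}

for name, (hit, gold, core) in per_category.items():
    print(name, round(category_kappa(hit, gold, core, 336), 3))
```

Under these assumptions the sketch gives 0.917 for CLL, 0.907 for MCL, and 1.0 for TLL, which agrees with the values printed for those categories in Table 2. That agreement suggests, but does not prove, that a one-versus-rest construction of this kind was used.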
Provision of an unequivocal diagnosis in a pathology report
implies a high degree of confidence on the part of the pathologist in the accuracy of his or her opinion. In difficult cases,
❚Table 2❚
Concordance at All Confidence Scores*
TMA Core Diagnosis
Excision
Diagnosis ALCL CLL
ALCL
DLBCL
FL
FL3
3
75
100
CLL
HL
MCL
MZL
PTL
RL
DLB/
FL3
IBCL
Sar- No
LNOS coma Dx
1
25
2
1
4
1
29
91
81
1
4
3
1
8
3
FL
FL3
1
3
4
19
68
70
5
42
19
1
4
33
1
8
33
HL
MCL
1
4
3
1
3
3
2
7
6
MZL
PTL
1
3
33
2
7
10
1
4
1
1
8
2
50
83
82
7
18
11
2
7
3
3
11
38
1
3
20
2
7
40
2
17
40
1
4
13
3
5
4
21
88
95
1
3
5
1
4
5
16
40
76
1
4
13
3
8
38
7
18
9
1
4
1
21
75
100
TLL
28
1
3
14
1
4
14
1
8
14
1
2
14
32
28
1
2
100
1
8
9
5
8
45
3
0.856
0.664
24
0.917
0.850
1
1
3
36
0.836
0.710
2
3
7
27
3
61
0.663 0.114
0.788
0.557 –0.009 0.846
22
0.907
0.903
2
3
10
21
0.477
0.407
21
0.846
0.780
12
1.000
1.000
12
60
24
2
5
29
1
4
14
2
5
18
1
4
9
12
100
100
RL
Total
4
24
86
100
DLBCL
Total
κ†
κ‡
TLL
40
28
12
61
90
82
74
0.821
0.780
5
–0.008
–0.015
8
–0.012
–0.024
7
1
–0.011 NC
–0.021 NC
2
3
18
11
–0.017
0.091
68
336
0.733
0.673
ALCL, anaplastic large cell lymphoma; CLL, chronic lymphocytic leukemia/small lymphocytic lymphoma; DLBCL, diffuse large B-cell lymphoma; DLB/FL3, diffuse large B-cell
lymphoma or follicular lymphoma grade 3/3; FL, follicular lymphoma grades 1/3 or 2/3; FL3, follicular lymphoma grade 3/3; HL, Hodgkin lymphoma; IBCL, indolent B-cell
lymphoma; LNOS, lymphoma, not otherwise specified; MCL, mantle cell lymphoma; MZL, marginal zone B-cell lymphoma; NC, not calculated; No Dx, no diagnosis made;
PTL, peripheral T-cell lymphoma; RL, reactive lymphoid hyperplasia; TLL, T-lymphoblastic lymphoma; TMA, tissue microarray.
* For each diagnosis, the top number indicates the number of diagnoses given as that diagnosis, the middle number is the row percentage (ie, the number of diagnoses in the cell
expressed as a percentage of the total number of “gold standard” diagnoses in the corresponding row), and the bottom number indicates the column percentage (ie, the number of
diagnoses in the cell expressed as a percentage of the total number of TMA core diagnoses in the corresponding column). Bold type indicates concordant diagnoses.
† Agreement between excision diagnosis and TMA core diagnosis.
‡ Agreement between raters’ TMA core diagnoses.
pathologists typically include a statement expressing how confident they feel. The intent is to provide clinicians with an indication as to whether patients may safely be managed based on
the interpretation provided or, alternatively, whether clinical
decisions should be deferred pending the availability of additional information. This convention was modeled in our study
by requiring the pathologists to accompany each of their TMA
core–based diagnoses with a confidence score defined as follows: 3, confident opinion sufficient for planning patient
management; 2, opinion may be sufficient for planning management, as determined by additional, perhaps clinical, considerations; 1, “suggestive” findings, definitely requiring more
sampling or investigation to influence management decisions;
and 0, no diagnosis. Not surprisingly, a strong correlation was
evident between confidence score and the proportion of cases
that were diagnosed correctly in the TMA: overall concordance
between the core-based and gold standard diagnoses at confidence scores of 3, 2, and 1 was, respectively, 93.7%
(136/145), 74.8% (77/103), and 57% (44/77); 11 cases were
given a confidence level of 0.
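The relationship between confidence score and concordance can be re-derived from the counts given above. The sketch below is plain arithmetic on those reported counts; the variable names are hypothetical.

```python
# (concordant, total) encounters at each confidence score, as reported above.
by_score = {3: (136, 145), 2: (77, 103), 1: (44, 77)}
no_diagnosis = 11  # encounters given a confidence score of 0

total_encounters = sum(total for _, total in by_score.values()) + no_diagnosis

concordance = {
    score: correct / total for score, (correct, total) in by_score.items()
}

# Diagnoses made at a score of 2 or 3 are the ones treated as confident.
confident_total = by_score[3][1] + by_score[2][1]
confident_correct = by_score[3][0] + by_score[2][0]
confident_fraction = confident_total / total_encounters
confident_accuracy = confident_correct / confident_total

for score in (3, 2, 1):
    print(f"score {score}: {concordance[score]:.1%}")
print(f"confident: {confident_total}/{total_encounters} = {confident_fraction:.1%}")
print(f"accurate among confident: {confident_correct}/{confident_total} = {confident_accuracy:.1%}")
```

The totals reproduce the 336 encounters, the 73.8% of diagnoses made confidently, and the 85.9% accuracy among confident diagnoses given in the abstract. Evaluated this way, 136/145 is 93.79%, so the 93.7% quoted above appears to be truncated rather than rounded.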
TMA core–based diagnoses made at confidence scores 2
or 3 deserve particular attention because they might be expected to form the basis for clinical decision making. These results
are given in Table 3. All TMA core–based diagnoses that
received a confidence score of 1 or 0 are accounted for in the
“No Diagnosis” column. Of the TMA core–based diagnoses,
❚Table 3❚
Concordance at Confidence Scores of 2 or 3*
TMA Core Diagnosis
Excision
Diagnosis
ALCL
ALCL
CLL
DLBCL FL
FL3
3
75
100
CLL
HL
MCL
MZL
PTL
RL
DLB/
FL3
IBCL
LNOS
No
Dx
24
86
100
1
4
2
28
88
85
FL
16
57
80
3
25
15
FL3
1
4
50
1
8
50
HL
1
4
10
1
4
2
44
73
94
MCL
1
4
3
1
3
3
2
7
6
MZL
PTL
2
7
40
1
3
25
1
4
25
2
17
50
1
3
33
1
2
2
20
83
100
2
5
4
1
4
10
7
18
70
1
4
20
2
5
40
4
10
8
16
57
100
TLL
1
3
33
1
4
33
1
4
1
2
6
2
8
29
9
6
50
7
15
25
17
1
4
1
23
58
26
9
32
10
12
100
100
RL
3
Total
1
25
2
DLBCL
Total
TLL
24
1
1
3
33
1
1
5
20
2
47
20
1
1
10
10
16
12
28
32
28
12
60
24
40
28
12
42
62
86
49
4
5
3
23
34
26
88
68
336
ALCL, anaplastic large cell lymphoma; CLL, chronic lymphocytic leukemia/small lymphocytic lymphoma; DLBCL, diffuse large B-cell lymphoma; DLB/FL3, diffuse large B-cell lymphoma or follicular lymphoma grade 3/3; FL, follicular lymphoma grades 1/3 or 2/3; FL3, follicular lymphoma grade 3/3; HL, Hodgkin lymphoma; IBCL, indolent B-cell lymphoma; LNOS, lymphoma, not otherwise specified; MCL, mantle cell lymphoma; MZL, marginal zone B-cell lymphoma; No Dx, no diagnosis made; PTL, peripheral
T-cell lymphoma; RL, reactive lymphoid hyperplasia; TLL, T-lymphoblastic lymphoma; TMA, tissue microarray.
* For each diagnosis, the top number indicates the number of diagnoses given as that diagnosis, the middle number is the row percentage (ie, the number of diagnoses in the cell
expressed as a percentage of the total number of “gold standard” diagnoses in the corresponding row), and the bottom number indicates the column percentage (ie, the number
of diagnoses in the cell expressed as a percentage of the total number of TMA core diagnoses in the corresponding column). Bold type indicates concordant diagnoses.
248 (73.8%) were made with a confidence level of 3 or 2.
Among these cases, the lymphomas could be distinguished
from the nonneoplastic lymphoid infiltrates with a sensitivity
of 96.6% (196 of 203 lymphoma cases were detected) and a
specificity of 93% (42 of 45 lymphoid infiltrates recognized as
nonneoplastic). RL was misdiagnosed as lymphoma (ie, false-positives) in 3 cases (once each as FL, MZL, and DLBCL); all
of these diagnoses were associated with confidence scores of 2
(implying less than complete confidence on the part of the
pathologists). The 7 false-negative diagnoses (ie, lymphomas
misinterpreted as RL) included 4 cases of MZL and 1 case each
of CLL, FL, and HL.
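The sensitivity and specificity for separating lymphoma from nonneoplastic tissue follow directly from the false-negative and false-positive counts listed above. The sketch below re-derives them; the function name `binary_rates` is hypothetical, and the totals (203 lymphoma and 45 reactive encounters diagnosed confidently) are taken from the text.

```python
def binary_rates(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Confident (score 2 or 3) encounters, by excision diagnosis.
lymphoma_encounters = 203
reactive_encounters = 45

# Lymphomas misread as RL, and RL misread as lymphoma, as itemized in the text.
false_negatives = {"MZL": 4, "CLL": 1, "FL": 1, "HL": 1}
false_positives = {"FL": 1, "MZL": 1, "DLBCL": 1}

fn = sum(false_negatives.values())
fp = sum(false_positives.values())
tp = lymphoma_encounters - fn
tn = reactive_encounters - fp

sensitivity, specificity = binary_rates(tp, fn, tn, fp)
print(f"sensitivity {tp}/{lymphoma_encounters} = {sensitivity:.1%}")
print(f"specificity {tn}/{reactive_encounters} = {specificity:.1%}")
```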
Restricting the analysis to TMA core–based diagnoses
made with confidence scores of at least 2 is expected to increase
the positive predictive power at a cost of reduced sensitivity.
The results shown in Table 3 indicate that this is generally
what occurred. However, it is noteworthy that for the lymphoma types associated with the highest concordance rates,
CLL, MCL, and TLL, almost all of the correct diagnoses (1
case of MCL was the exception) were made with a confidence
score of at least 2, indicating that these lymphoma types are
generally recognized with an especially high degree of confidence. Although confident TMA core–based diagnoses of HL
and PTL were made with relatively low sensitivity (73%
[44/60] and 57% [16/28], respectively), both were associated
with high positive predictive power (94% [44/47] and 100%
[16/16], respectively), indicating that these diagnoses were reasonably reliable once made. Confident TMA core–based diagnoses of DLBCL were associated with reasonably high sensitivity (88%) but, importantly, modest positive predictive power
(85%) resulting from misapplication of the diagnosis to cases
of MCL, MZL, PTL, and RL. Confident, TMA core–based
diagnoses of FL, FL3, MZL, and RL were associated with low
sensitivity (8%-62%) and low to modest positive predictive
power (50%-86%).
The relative amenability of different lymphoma types to
accurate diagnosis in small samples may also be evaluated based
on the interobserver agreement achieved when several pathologists review the same TMA sample. Because this measure does
not rely on reference to an external standard (in this case, the
gold standard diagnosis based on the large, excised sample), it
complements our other results. Multirater κ scores representing
the agreement between the pathologists’ TMA core–based diagnoses are shown in the last row of Table 2. They show that interobserver agreement was greatest for CLL, MCL, and TLL; lowest for FL, FL3, and MZL; and intermediate for DLBCL, HL,
PTL, and RL. Therefore, these relative rates of interobserver
agreement seem to correspond to the relative amenability of
these lymphoma types for accurate recognition in small samples.
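A multirater κ of the kind referred to here can be sketched with Fleiss' formulation, which handles a fixed number of raters per specimen. This is an assumption: the paper cites Fleiss for its statistics but does not state which multirater κ was computed. The function name `fleiss_kappa` and the toy ratings are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of rater labels.

    Every subject must be rated by the same number of raters (at least 2).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this subject.
        agreement_sum += (
            sum(c * c for c in counts.values()) - n_raters
        ) / (n_raters * (n_raters - 1))

    p_observed = agreement_sum / n_subjects
    total_ratings = n_subjects * n_raters
    p_expected = sum((c / total_ratings) ** 2 for c in category_totals.values())
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 2 cores, each read by 4 pathologists.
toy = [
    ["CLL", "CLL", "CLL", "CLL"],  # unanimous
    ["FL", "FL", "MZL", "MZL"],    # split 2 vs 2
]
print(round(fleiss_kappa(toy), 3))
```

Because no gold standard enters the calculation, a statistic of this kind measures only how consistently the raters agree with one another, which is why it complements the comparison against the excision diagnosis.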
Slides were reviewed from cases in which erroneous diagnoses were associated with high confidence scores to determine
why potentially distinctive immunophenotypic profiles failed to
result in accurate diagnosis. CD10, bcl-6, and bcl-2 are
expressed in most cases of FL and may be expected to assist in
their recognition. However, CD10 and bcl-6 are also expressed
in cases of DLBCL and in nonneoplastic lymphoid follicles, and
the diagnostic significance of bcl-2 expression is largely limited
to cases in which it occurs within recognizable follicle centers.
Thus, several diagnostic errors in our study seemed to have been
related to the difficulty of correlating expression of these markers with the tissue architecture in very small specimens. Both
cases of MZL that were misdiagnosed as HL were associated
with large, B-lineage immunoblasts or centroblasts that
expressed CD30; these seem to have been misinterpreted as
Hodgkin cells. Taking careful note of the lineage of the predominant, small cell lymphoid infiltrate when interpreting the CD30
staining in putative Hodgkin cells would seem potentially helpful in avoiding this pitfall because background T cells are likely
to predominate in classical HL, whereas B cells will generally
predominate in most B-lineage NHLs. Misdiagnosis of cases of
PTL as DLBCL seemed to have resulted from interpreting the B
cells in the sample, which are often quite abundant and may be
cytologically alarming, as the neoplastic component.
Discussion
Because we, as pathologists, are asked rather frequently to
make lymphoma diagnoses based on small specimens obtained
using cutting needles, we have been dismayed by the relative
dearth of objective data available to guide this practice. The usefulness of the needle biopsy technique is influenced by several
parameters, some of which cannot be studied by using a TMA
core–based modeling approach such as ours. For example,
beyond such strictly clinical issues as patient morbidity, clinically obtained needle biopsy specimens may be associated with
an increased risk of sampling error or crush artifact. Conversely,
the practice of sending one or more cores for analysis by flow
cytometry may improve the accuracy of the technique as used
in the clinical setting. However, considering the fundamental
importance of tissue architecture in the conventional pathologic
diagnosis and classification of lymphomas and the demonstrated importance of sample size in determining diagnostic success
among clinically obtained needle biopsy specimens, it is clear
that sample size is the single most important parameter distinguishing needle biopsy specimens from excised samples, at
least from the pathologist’s viewpoint.14
Each of our TMA cores had a cross-sectional area of 0.3
mm² such that 0.3 to 0.6 mm² (ie, from 1 or 2 cores) was available for histologic examination from each case. In our experience, the size of the cutting needle biopsy specimens obtained
clinically varies widely such that most (although by no means
all) are larger than our TMA cores. In this respect, caution is
required in applying our results to clinical practice. For example, whereas only 73.8% of our TMA core–based diagnoses
were made confidently, this proportion is likely to be higher in
a set of clinical specimens that includes larger samples. Small
samples are likely to be especially problematic in dealing with
lymphoid processes, such as HL or PTL, in which cellular heterogeneity impedes recognition of key cellular subpopulations,
and in distinguishing between the various lymphomas of follicle center cell origin, including FL, FL3, and DLBCL, in which
tissue architecture is of cardinal importance. These caveats are
offset by, and largely inseparable from, the opportunity provided by our modeling approach to maintain a measure of control
over the critical size variable, maintaining it conservatively
toward the lower extreme of what might be considered acceptable in a clinical specimen, to more effectively isolate the potentially important variable of lymphoma subtype in determining
diagnostic success in small samples.
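The core area quoted at the start of this paragraph follows from the 0.6-mm core diameter given in the Materials and Methods section, assuming a circular cross-section. A short check of that arithmetic (variable names are hypothetical):

```python
import math

core_diameter_mm = 0.6  # diameter of each TMA core
core_area_mm2 = math.pi * (core_diameter_mm / 2) ** 2

print(round(core_area_mm2, 2))      # area available from 1 core
print(round(2 * core_area_mm2, 2))  # area available from 2 cores
```

Rounded to one decimal place, these values correspond to the 0.3 and 0.6 mm² stated in the text.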
We found that CLL, MCL, and TLL were diagnosed with
particularly high sensitivity, positive predictive power, interobserver reproducibility, and confidence in the small samples.
These are “small cell” lymphomas that may be morphologically similar to one another; however, their immunophenotypic
profiles are distinctive. In contrast, FL, FL3, and MZL, lymphomas that also generally contain a small cell component but
lack distinctive immunophenotypes discernible using commonly available antibodies, were difficult to diagnose. These observations support the notion that pathologists place especially
heavy reliance on immunophenotypic and cytologic data when
faced with very small specimens. It follows that diagnostic
accuracy in such specimens will be heavily influenced by the
relative amenability of lymphoma subtypes to diagnosis based
on immunologic criteria and that the identification of new diagnostic antibodies useful in the specific recognition of lymphoma
types that are difficult to recognize in small samples is highly
desirable. Furthermore, the routine application of a broad panel
of immunostains, relative to the more tailored panels generally
applied to larger specimens, seems justified when dealing with
small specimens in which limited appreciation of tissue architecture is likely to impede the generation of morphologically
based diagnostic hypotheses.
Fluorescence in situ hybridization (FISH) is applicable to
small samples of paraffin-embedded tissue and useful in detecting cytogenetic abnormalities that correlate with lymphoma
subtype. For example, FISH-based assays are available for
chromosomal translocations associated with follicle center
cell–derived lymphomas, MCL, and MZL.15 Although it was
beyond the scope of our study, a TMA core–based approach
similar to ours might be useful for evaluating the relative potential of various combinations of FISH probes to complement
immunostains in the classification of small lymphoma samples.
The results shown in Table 3 include several confident but
erroneous diagnoses based on the small samples that, in a clinical setting, might have been expected to contribute to inappropriate clinical management, including several cases in which
RL was misdiagnosed as lymphoma (ie, confident false-positives). We infer that reliance on very small samples for the
definitive pathologic diagnosis of lymphoma is associated with
a greater risk of diagnostic error than that associated with larger samples. Although aspects of our modeling approach prevent
us from quantifying this risk precisely, it is nevertheless apparent that exclusive reliance on needle biopsy specimens for clinical decision making in a given case of lymphoma should be
justified by a reasonable expectation that the benefit to the
patient outweighs the risk of diagnostic error. In documenting
the relative amenability of different lymphoma types to accurate
diagnosis in small tissue samples, we believe that our work will
also be useful in informing clinical practice in more specific
ways. For example, our results provide objective guidance for
pathologists in assessing the relative reliability of the specific
lymphoma diagnoses that they make based on small samples
and for clinicians in considering which of their patients with
previously diagnosed lymphomas may most reliably be monitored using cutting needle biopsies.
From the 1Department of Pathology and Molecular Medicine,
Queen’s Cancer Research Institute, Queen’s University, Kingston,
Canada; 2Department of Laboratory Medicine and Pathobiology,
University of Toronto, Toronto; 3Department of Pathology and
Laboratory Medicine, University of Ottawa, Ottawa; and 4Clinical
Research Centre, Kingston General Hospital, Kingston.
Supported by a Kingston General Hospital Clinical Research
Grant (Dr LeBrun).
Address reprint requests to Dr LeBrun: Dept of Pathology
and Molecular Medicine, Richardson Laboratory, Queen’s
University, Kingston, Ontario, Canada K7L 3N6.
Acknowledgments: We gratefully acknowledge the technical
assistance of Shakeel Virk and Margaret Morrow and technical
support from the Experimental Pathology Unit, Queen’s University
Department of Pathology and Molecular Medicine.
References
1. Jaffe ES, Harris NL, Stein H, et al, eds. Pathology and Genetics
of Tumours of Haematopoietic and Lymphoid Tissues. Lyon,
France: IARC Press; 2001. World Health Organization
Classification of Tumours.
2. Pappa VI, Hussain HK, Reznek RH, et al. Role of image-guided core-needle biopsy in the management of patients
with lymphoma. J Clin Oncol. 1996;14:2427-2430.
3. Quinn SF, Sheley RC, Nelson HA, et al. The role of
percutaneous needle biopsies in the original diagnosis of
lymphoma: a prospective evaluation. J Vasc Interv Radiol.
1995;6:947-952.
4. Screaton NJ, Berman LH, Grant JW. Head and neck
lymphadenopathy: evaluation with US-guided cutting-needle
biopsy. Radiology. 2002;224:75-81.
5. Demharter J, Muller P, Wagner T, et al. Percutaneous core-needle biopsy of enlarged lymph nodes in the diagnosis and
subclassification of malignant lymphomas. Eur Radiol.
2001;11:276-283.
6. Zinzani PL, Corneli G, Cancellieri A, et al. Core needle
biopsy is effective in the initial diagnosis of mediastinal
lymphoma. Haematologica. 1999;84:600-603.
7. Sklair-Levy M, Amir G, Spectre G, et al. Image-guided
cutting-edge-needle biopsy of peripheral lymph nodes and
superficial masses for the diagnosis of lymphoma. J Comput
Assist Tomogr. 2005;29:369-372.
8. de Kerviler E, Guermazi A, Zagdanski AM, et al. Image-guided
core-needle biopsy in patients with suspected or recurrent
lymphomas. Cancer. 2000;89:647-652.
9. Balestreri L, Morassut S, Bernardi D, et al. Efficacy of CT-guided percutaneous needle biopsy in the diagnosis of
malignant lymphoma at first presentation. Clin Imaging.
2005;29:123-127.
10. Agid R, Sklair-Levy M, Bloom AI, et al. CT-guided biopsy with
cutting-edge needle for the diagnosis of malignant lymphoma:
experience of 267 biopsies. Clin Radiol. 2003;58:143-147.
11. Sklair-Levy M, Polliack A, Shaham D, et al. CT-guided
core-needle biopsy in the diagnosis of mediastinal lymphoma.
Eur Radiol. 2000;10:714-718.
12. Parker RL, Huntsman DG, Lesack DW, et al. Assessment of
interlaboratory variation in the immunohistochemical
determination of estrogen receptor status using a breast cancer
tissue microarray. Am J Clin Pathol. 2002;117:723-728.
13. Fleiss JL. Statistical Methods for Rates and Proportions. New
York, NY: John Wiley; 1981.
14. Hesselmann V, Zahringer M, Krug B, et al. Computed tomography–guided percutaneous core needle biopsies of suspected
malignant lymphomas: impact of biopsy, lesion, and patient
parameters on diagnostic yield. Acta Radiol. 2004;45:641-645.
15. Gascoyne RD. Hematopathology approaches to diagnosis and
prognosis of indolent B-cell lymphomas. Hematology (Am Soc
Hematol Educ Program). 2005:299-306.
© American Society for Clinical Pathology