Original article

The Accuracy of Physicians’ Quantitative Estimates

Dtsch Arztebl Int 2025; 122: 145-50. DOI: 10.3238/arztebl.m2025.0010

Knipps, L M; Fischer, I; Klenzner, T

Background: Doctors often describe sizes by comparison with everyday objects, e.g., a pinhead-sized tympanic defect or a dehiscence the size of a penny. But do they really know how big a pinhead is? We used an internet-based questionnaire to study whether quantities are accurately estimated and whether comparisons with everyday objects improve accuracy.

Methods: In a prospective, single-center study conducted by internet-based questionnaire, physicians estimated the size of everyday objects, such as a pea or a one-euro coin, and SI units as they appeared on a computer screen and then estimated their own accuracy of estimation.

Results: On average, the sizes of everyday objects and SI units were underestimated by 15% (95% confidence interval, [–17; –13]). The physicians’ self-assessment was not correlated with their actual degree of accuracy. Board-certified specialists considered themselves better estimators than others; however, no difference in accuracy was found between specialists and resident physicians. Nor did the particular specialty have any effect on the accuracy of estimation, even though the participating radiologists and neurosurgeons considered themselves especially good estimators. The frequent use of aids such as rulers in clinical practice was not associated with a better accuracy of estimation.

Conclusion: Underestimates of size, such as were frequently observed in this study, can cause inaccurate descriptions and faulty decision-making in clinical practice. We therefore recommend that quantities should be measured with the appropriate instruments, and that physicians should refrain from making eyeball estimates wherever possible, regardless of their medical specialty or degree of clinical experience.

Cite this as: Knipps LM, Fischer I, Klenzner T: The accuracy of physicians’ quantitative estimates. Dtsch Arztebl Int 2025; 122: 145–50. DOI: 10.3238/arztebl.m2025.0010

How big is big? On a daily basis, physicians are faced with the necessity of providing concrete descriptions of sometimes abstract structures. For example, decisions on the treatment of cancer patients depend crucially on precise quantification of tumor extent. Evaluation of the progress of wound healing requires accurate documentation of the defect, and an exact account of the patient’s circumstances is needed on every handover. Our clinical experience is that sizes are often described with reference to objects. Comparative terms such as “a pea-sized node” or “golf ball-sized contusion” are used all the time, even though to our knowledge there have been no publications on how frequently such size estimates are used in clinical routine and there are bound to be regional differences. Otorhinolaryngologists often have eardrum defects described to them as “pinhead-sized,” new patients report “cherry pit-sized” cervical lymph nodes, on rounds the ward physician talks of “penny-sized” wound defects, and before surgery the resident describes the potentially malignant mass at the lateral border of the tongue as “large.” But just how large is “large”?

All such descriptions are potentially sources of error:

  • Subjective estimation of size (in the absence of objective measuring devices, e.g., [sterile] tape measures)
  • Expression of this estimated size in terms of one’s personal perception of the dimensions of an everyday object (e.g., a pea)
  • Subjective estimation of the size by the person to whom the description is addressed

This gives rise to questions on the extent to which physicians (divided by specialty, experience, and sex) develop an ability to estimate sizes accurately in the course of their daily work and whether specialists in disciplines where measurement is a frequent occurrence, such as radiology, are more skilled in this respect. Previous investigations into accuracy of estimation mostly related to individual specialties or very narrow study questions (Table).

Table
Overview of the research into the accuracy of quantitative estimates in the clinical setting

The aim of this prospective single-center study was to explore the accuracy of physicians’ size estimates in relation to metric units and everyday objects.

Methods

A prospective, single-center, anonymized questionnaire study with 206 participants, all physicians, was conducted at University Hospital Düsseldorf, Germany, from March to June 2022. After approval from the local ethics committee, a total of 1096 physicians employed at the hospital were invited by e-mail to take part in an online survey using the web application SoSciSurvey.de (Leiner, 2019, program version 3.3.17).

The first step was to set the monitor size with the aid of a reference (“bank card,” ISO Standard 7810) to confirm the automatically captured screen size. The following demographic data were also documented:

  • Specialty
  • Time spent working as physician (<5, 5–10, 10–15, >15 years)
  • Status (specialist in training, board-certified specialist)
  • Sex
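
The bank-card calibration mentioned before this list can be illustrated with a short sketch. It is not the SoSciSurvey implementation: the function names and pixel values are hypothetical, and only the ID-1 card width of 85.60 mm is taken from ISO/IEC 7810.

```python
# Width of an ID-1 card (bank card) as specified in ISO/IEC 7810.
ID1_CARD_WIDTH_MM = 85.60


def pixels_per_mm(card_width_px: float) -> float:
    """Screen scale derived from the on-screen width that the participant
    matched to a physical bank card (hypothetical calibration step)."""
    if card_width_px <= 0:
        raise ValueError("card width in pixels must be positive")
    return card_width_px / ID1_CARD_WIDTH_MM


def px_to_mm(length_px: float, scale_px_per_mm: float) -> float:
    """Convert a slider length in pixels to millimetres."""
    return length_px / scale_px_per_mm


# Hypothetical example: the card outline matched the real card at 428 px,
# and the participant then set the slider to 106 px.
scale = pixels_per_mm(428.0)
estimate_mm = px_to_mm(106.0, scale)
print(f"{scale:.2f} px/mm, estimate = {estimate_mm:.1f} mm")
```

With a scale of this kind, every slider position can be expressed in millimetres regardless of the monitor used.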

The questionnaire was divided into three sections:

  • Estimation of the size of everyday objects
  • Estimation of the size of metric units
  • Subjective assessment of estimation skills

The sizes of everyday objects (pea, golf ball, walnut, tennis ball, pinhead, hen’s egg (grade M), 1-eurocent coin, 1-euro coin) were estimated with the aid of a sliding scale. The normal values were based on DIN norms or statutory regulations, with the exception of the pinhead, for which there is no standard size. To enable comparison between the estimation accuracy for everyday objects and for SI units, after the participants had estimated the 1-euro coin they were told its true diameter (23.25 mm). Knowing the dimension of the coin in metric units, the participants again had to estimate its size.

Metric units of 5 mm to 10 cm were estimated with the assistance of the sliding scale to test for differences in precision between metric units and everyday objects.

In the concluding part of the survey, the participants reflected on their estimation skills, again using a sliding scale (from “Completely agree” to “Completely disagree”). They were also asked whether they regularly used measuring devices (e.g., rulers).

The completed questionnaires were then statistically evaluated. For analysis of accuracy of measurement the participants were divided by sex, specialty, and degree of occupational experience. We also investigated whether there was any correlation between self-perception and actual measurement accuracy.

All statistical analyses were performed using Python 3.9.7. The data were expressed as mean ± standard error of the mean.
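
As an illustration of the "mean ± standard error of the mean" format, the following sketch uses only the Python standard library. The sample values are hypothetical, and the 95% interval uses a normal approximation (mean ± 1.96 × SEM), which is an assumption; the article does not state how its intervals were computed.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a sample."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return mean(values), stdev(values) / sqrt(len(values))


def normal_ci95(values):
    """Approximate 95% CI as mean ± 1.96 × SEM (normal approximation)."""
    m, sem = mean_and_sem(values)
    return m - 1.96 * sem, m + 1.96 * sem


# Hypothetical relative deviations (estimate / true size - 1).
deviations = [-0.20, -0.10, -0.25, -0.05, -0.15]
m, sem = mean_and_sem(deviations)
low, high = normal_ci95(deviations)
print(f"{m:.3f} ± {sem:.3f}, 95% CI [{low:.3f}; {high:.3f}]")
```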

We tested for differences in accuracy of measurement by sex, specialty, and degree of occupational experience by means of one-way ANOVA with post-hoc analysis and Šidák’s corrected multiple comparison test. Multivariable analysis to verify the results showed agreement. The 95% confidence interval was calculated for all analyses.
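
The two building blocks named here, the one-way ANOVA F statistic and the Šidák adjustment for multiple comparisons, can be sketched with the standard library alone. The group data and raw p-values below are hypothetical, and the code is not the authors' analysis script. Turning F into a p-value needs the F distribution, so in practice a library routine such as `scipy.stats.f_oneway` would normally be used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and n > k")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    ss_within = sum((x - gm) ** 2
                    for g, gm in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def sidak_adjust(p_values):
    """Šidák-adjusted p-values for m comparisons: 1 - (1 - p)^m."""
    m = len(p_values)
    return [1 - (1 - p) ** m for p in p_values]


# Hypothetical relative deviations grouped by occupational experience.
groups = [
    [-0.20, -0.10, -0.15, -0.25],  # < 5 years
    [-0.18, -0.12, -0.22, -0.08],  # 5-10 years
    [-0.10, -0.16, -0.14, -0.20],  # > 15 years
]
print(f"F = {one_way_anova_f(groups):.3f}")

# Hypothetical raw p-values from three pairwise post-hoc comparisons.
print([round(p, 4) for p in sidak_adjust([0.04, 0.30, 0.01])])
```

The Šidák adjustment is slightly less conservative than the Bonferroni adjustment and assumes independent comparisons.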

Results

A total of 1096 physicians were invited to take part, of whom 266 started answering the questions in the survey. Of these, 206 persons from 29 specialties completed the questionnaire in full (response rate 18.8%, 108 women [52.4%], 98 men [47.6%]). Among the responders, 46.1% gave their status as specialist in training, while 53.9% were board-certified specialists. With regard to occupational experience, 29.6% had been working in their specialty for less than 5 years, 30.6% for 5–10 years, 12.6% for 10–15 years, and 27.2% for more than 15 years.

The sizes of eight everyday objects were estimated, with the 1-euro coin estimated two times. Four SI units were also estimated, resulting in a total of 2678 estimates.
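
The participation figures and the total number of estimates follow directly from the counts reported above. The short check below uses only numbers from the text; the variable names are illustrative.

```python
invited = 1096
completed = 206
women, men = 108, 98

everyday_objects = 8   # pea, golf ball, walnut, tennis ball, ...
repeated_coin = 1      # the 1-euro coin was estimated a second time
si_lengths = 4         # four metric lengths

response_rate = completed / invited
estimates_per_person = everyday_objects + repeated_coin + si_lengths
total_estimates = completed * estimates_per_person

print(f"response rate: {response_rate:.1%}")
print(f"women: {women / completed:.1%}, men: {men / completed:.1%}")
print(f"estimates per person: {estimates_per_person}")
print(f"total estimates: {total_estimates}")
```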

The average degree of underestimation for both everyday objects and metric units was 15% [−17; −13]. The participants’ self-assessments did not correlate with their actual precision [–0.001; 0.001]. The basis for measurement was the percentage deviation of the actual estimate from the self-assessment. This self-assessment was documented on a scale from 0% (“Completely disagree”) to 100% (“Completely agree”) with regard to the statement “I am good at estimating” (Figure 1).
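
One way to test whether self-assessment tracks actual accuracy is a correlation coefficient between each participant's agreement score and his or her mean absolute deviation. The article does not state which measure was used, so Pearson's r is an assumption here, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant samples")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: agreement with "I am good at estimating" (0-100)
# and mean absolute relative deviation per participant.
self_rating = [90, 60, 75, 40, 85]
abs_deviation = [0.20, 0.10, 0.25, 0.15, 0.05]
print(f"r = {pearson_r(self_rating, abs_deviation):.2f}")
```

A coefficient near zero would correspond to the finding reported above: confidence says little about accuracy.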

Figure 1
Relative accuracy of estimation

Both men and women tended towards underestimation. On average, men underestimated by 12%, women by 18% (mean difference: –0.06, [–0.11; –0.015]). The men rated their estimation skills higher than did the women. On a scale of 0–100 (in response to the statement “I am good at estimating”: “Completely disagree – 0%” to “Completely agree – 100%”), the average agreement was 70.65% for men and 60.65% for women, a difference of –10% [–16; –4].

Occupational status (specialist in training versus board-certified specialist) had no effect on accuracy of estimation (mean difference: –0.04 [–0.09; 0.001]). However, the board-certified specialists had a higher opinion of their estimation skills. On a scale of 0–100 (in response to the statement “I am good at estimating”: “Completely disagree – 0%” to “Completely agree – 100%”), the average response was 60.89% for specialists in training and 69.35% for board-certified specialists, a difference of –8.5% [–15; –2].

There was no difference between the group of physicians with less than 5 years’ occupational experience and those with more than 15 years’ experience (mean difference: 0.044 [–0.02; 0.14]). Similarly, no difference was detected between the group of persons with 5–10 years’ occupational experience and those with at least 15 years’ experience (mean difference: 0.07 [–0.008; 0.15]).

Specialty had no effect on accuracy of estimation in our survey. Additional multivariable analysis, with specialty, clinical experience, sex, and occupational status as variables, also showed no difference in estimation accuracy among the different disciplines (R² = 0.18), between surgical and conservative specialties (R² = 0.04), or between those with more or less clinical experience (R² = 0.10). However, radiologists and neurosurgeons rated their own estimation skills particularly highly. On a scale of 0 (“Completely disagree”) to 100 (“Completely agree”) the neurosurgeons’ average agreement was 88.83%, compared with a mean of only 60.70% across other specialties. This difference of 28.13% (0.28 [0.18; 0.38]) clearly shows neurosurgeons’ high confidence in their estimation skills. Radiologists’ faith in their estimation skills was also above average, with a mean difference of 15% (0.15 [0.03; 0.27]) from other specialties (Figure 2).

Figure 2
Average self-assessment by stated specialty

The initial estimate of the diameter of a 1-euro coin (stated by the German Federal Bank as 23.25 mm) was 21.20 mm ± 4.35 mm. The estimate barely changed after disclosure of the actual diameter (21.11 mm ± 4.80 mm; difference: 0.088 mm [–1.04; 1.22]). Overall the diameter was underestimated by an average of around 10% (Figure 3).
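
The relative deviation behind the coin result can be recomputed from the reported mean estimates. The sketch below uses only the figures from the text; the function name is illustrative.

```python
TRUE_DIAMETER_MM = 23.25  # 1-euro coin, as stated in the article


def relative_deviation(estimate_mm, true_mm):
    """Signed relative deviation; negative values are underestimates."""
    if true_mm <= 0:
        raise ValueError("true size must be positive")
    return (estimate_mm - true_mm) / true_mm


first = relative_deviation(21.20, TRUE_DIAMETER_MM)   # before disclosure
second = relative_deviation(21.11, TRUE_DIAMETER_MM)  # after disclosure
print(f"first estimate: {first:.1%}, second estimate: {second:.1%}")
```

Both values are underestimates of roughly 9%, in line with the "around 10%" given in the text.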

Figure 3
Average estimates of the size of a 1-euro coin

The average estimate of the diameter of a pinhead, defined as the spherical protrusion at the end of a pin by which it is gripped, was 1.46 ± 0.8 mm. Pins are produced and sold with heads ranging from 1 mm to 6 mm in diameter. There are no clearly defined standards by which to assess the participants’ estimates. However, the participants’ individual perceptions differed from the commercially available size range.

The findings were similar for estimation of the diameter of a standardized extra-fine pea (stated by the German Federal Ministry of Food and Agriculture as 7.5 mm). The average estimate was 5.81 mm ± 3.45 mm, 23% (−1.725 mm [−2.00; −1.45]) below the true size. Other standardized sizes for peas (very fine: 8.2 mm, fine: 9.3 mm) are larger again than the participants’ estimates (Figure 4).

Figure 4
Absolute size estimation of a pea

The self-reported regular use of aids such as rulers and SI units in clinical routine had no detectable effect on actual estimation proficiency.

Discussion

Assessment of the sizes of everyday objects by the participants in our study, regardless of their occupational experience and specialty, was potentially inaccurate, with a tendency towards underestimation.

Limitations

This was a single-center study at a tertiary-care hospital, so the number of participants was limited. Nevertheless, more than 2600 measurements were generated with the aid of the 206 physicians who took part.

The survey tool was restricted to two-dimensional representation, but that seems sufficient for evaluation of basic estimation accuracy; indeed, estimated sizes are often documented two-dimensionally in clinical contexts (e.g., wound defects). As far as possible, the colors of the objects to be measured in the survey tool were realistic (e.g., the pea was green). However, coloration might affect perception of size. McConnell et al. took account of this in their investigation of the accuracy of volume estimation by doctors and nurses, in that they presented the participants with fluids in two colors, red (suggestive of blood) and green (2). Accuracy turned out to be independent of color, but, just as we observed, there was consistent underestimation of all volumes (25–600 mL).

The sample size (206 participants, 29 specialties) did not permit comprehensive comparison of specialties, but grouping into surgical and conservative disciplines and into individual specialties sufficed for statistical evaluation.

Comparison of accuracy between everyday objects and metric units

The study found no difference in size estimation accuracy between everyday objects and metric units. Counter to the expectation that metric units would be estimated more accurately because of their objective definition, the degree of uncertainty was comparable. Metric dimensions such as 5 mm or 2 cm are clearly defined and are regularly used in the field of medicine. In contrast, descriptions such as “the size of a golf ball” or “walnut-sized” rely on subjective perception and everyday experience.

The results indicate that without measuring devices, physicians find it difficult to visualize dimensions accurately, regardless of whether they are stated in metric units or with reference to everyday objects. Comparisons in terms of such objects (such as a pea or a golf ball) represent readily comprehensible references and may be swifter and easier to communicate, e.g., when describing the findings of palpation, which may be hard to measure.

This has practical implications for medical communication, particularly in the documentation of findings and in oral handovers to colleagues.

Occupational experience

Estimation accuracy is not affected by the degree of occupational experience, despite the fact that board-certified specialists rate their accuracy more highly. This corresponds to the findings of other studies, e.g., one in which a cohort of Belgian orthopedists were asked to assess the degrees of freedom of the knee joint (4). There too, surgical experience made no significant difference. Increasing clinical expertise may give medical specialists a subjective feeling of confidence without improving their actual accuracy of size estimation.

Specialty

The choice of specialty had no detectable effect on estimation accuracy in our study. Although the self-assessment data showed that radiologists and especially neurosurgeons were particularly confident of their estimation proficiency, this was not reflected in their actual accuracy.

Sex-specific differences

The women in our study underestimated the sizes by more than the men. In a study on evaluation of wound defects by 50 female and male physicians of greatly varying experience from a wide variety of specialties published in 2012 (5), Peterson et al. found even more pronounced differences. Men tended to overestimate, women to underestimate the size of the defects. At the risk of reinforcing a stereotype, it seems important to be aware of this possible sex-specific difference in perception in our daily communications.

Pinhead

A pinhead was included in the survey as an example of an everyday object with no standardized size. For this reason, it was not part of the overall analysis of estimation accuracy. Pins can be divided into those with flat or spherical heads (typically 1–6 mm in diameter); the classic round heads of pins used in sewing measure about 3–4 mm. However, the average size estimate in our study was only 1.5 ± 0.8 mm. Conversations with the participants revealed that many of them had erroneously estimated the size of the pin’s point or shaft. This illustrates both the problems that may arise and the low interrater reliability when using the sizes of such everyday objects: It is clear neither how the person making the estimate interprets the size of the object named, nor how the hearer/reader comprehends it. This can lead to misunderstanding. For instance, a “pinhead-sized hole in the eardrum” could be interpreted as anything from a minute to a subtotal defect and is therefore not a reliable basis for decisions regarding treatment.

Clinical relevance

The sizes of the everyday objects and SI units presented in our study were predominantly underestimated. Previous studies also concluded that underestimation of, for example, liver volumes or intraoperative blood loss tends to occur more commonly than overestimation (2, 6, 7, 8). In light of our findings, this is no surprise: in our two-dimensional setting, much more accessible to human perception than three-dimensional space, there was underestimation by 15%, so one can anticipate that underestimation of volumes will occur more often than overestimation.

Underestimation of sizes may thus have negative consequences in daily clinical routine, particularly if they are the sole criterion on which decisions are based. We therefore recommend precise measurement whenever possible, using measuring devices as appropriate.

Even everyday surgical decisions such as the distance between sutures demand accuracy of estimation. The STITCH study (2015) showed that a space of 5 mm between sutures was superior to 1 cm with regard to the postoperative hernia rate after laparotomy (9). Physicians should be aware of the potential uncertainty involved in the estimation of such distances. One needs to take particular care and use aids as appropriate in order to make estimates as accurate as possible (Table).

The influence of routine use of measuring devices on the accuracy of estimations

There was no difference in accuracy of estimation between physicians who regularly use and measure metric units, such as radiologists, and those from other disciplines. It was unexpected that this subgroup of physicians, who considered themselves particularly good estimators, was not better at estimating sizes. The hypothesis that frequent measurement improves the accuracy of eyeball estimation was not confirmed: our findings suggest that frequent use of metric units and measuring devices does not automatically lead to better accuracy of eyeball estimation. Regular use of measuring devices may fail to promote, or may even impair, proficiency in subjective size assessment, because one relies on the precision of the devices. The discrepancy between self-assessment and actual accuracy may be explained by an optimism bias. A perceived training effect may also play a part: frequent interaction with precise measurements may reinforce the assumption that one is therefore bound to be good at visual estimation, although this is not necessarily the case.

Acknowledgment

The authors are grateful to Bernd Neumann for technical support and to all their colleagues at University Hospital Düsseldorf who so enthusiastically took part in the study.

Conflict of interest statement
TK is the current president of the German Society of Computer- and Robot-Assisted Surgery (Deutsche Gesellschaft für Computer- und Roboterassistierte Chirurgie, CURAC e.V.).

The remaining authors declare that no conflict of interest exists.

Manuscript received on 23 July 2024, revised version accepted on 15 January 2025

Translated from the original German by David Roseveare

Corresponding author
Dr. med. Lisa Margarete Knipps
Lisa.Knipps@uni-wh.de

References

1. Schuld J, Kollmar O, Seidel R, Black C, Schilling MK, Richter S: Estimate or calculate? How Surgeons rate volumes and surfaces. Langenbeck’s Arch Surg 2012; 397: 763–9.
2. McConnell JS, Fox TJ, Josson JP, Subramanian A: “About a cupful” – a prospective study into accuracy of volume estimation by medical and nursing staff. Accid Emerg Nurs 2007; 5: 101–5.
3. Conway RG, O’Neill N, Brown J, Kavic S: An educated guess – distance estimation by surgeons. Surg Open Sci 2020; 2: 113–6.
4. Shetty GM, Mullaji A, Lingaraju AP, Bhayde S: How accurate are orthopaedic surgeons in visually estimating lower limb alignment? Acta Orthop Belg 2011; 77: 638–43.
5. Peterson N, Stevenson H, Sahni V: Size matters: how accurate is clinical estimation of traumatic wound size. Injury 2014; 45: 232–6.
6. Bose P, Regan F, Paterson-Brown S: Improving the accuracy of estimated blood loss at obstetric haemorrhage using clinical reconstructions. BJOG An Int J Obstet Gynaecol 2006; 113: 919–24.
7. Meiser A, Casagranda O, Skipka G, Laubenthal H: [Quantification of blood loss. How precise is visual estimation and what does its accuracy depend on?]. Anaesthesist 2001; 50: 13–20.
8. Yoong W, Karavolos S, Damodaram M, et al.: Observer accuracy and reproducibility of visual estimation of blood loss in obstetrics: how accurate and consistent are health-care professionals? Arch Gynecol Obstet 2010; 281: 207–13.
9. Deerenberg EB, Harlaar JJ, Steyerberg EW, et al.: Small bites versus large bites for closure of abdominal midline incisions (STITCH): a double-blind, multicentre, randomised controlled trial. Lancet 2015; 386: 1254–60.
10. Bundesministerium für Ernährung und Landwirtschaft: Leitsätze für Gemüseerzeugnisse. www.bmel.de/SharedDocs/Downloads/DE/_Ernaehrung/Lebensmittel-Kennzeichnung/LeitsaetzeGemueseerzeugnisse.pdf?__blob=publicationFile&v=2 (last accessed on 17 February 2025).

Department of Otorhinolaryngology, Medical Faculty of Heinrich Heine University Düsseldorf and University Hospital Düsseldorf: Dr. med. Lisa Margarete Knipps, Prof. Dr. med. Thomas Klenzner
Department of Neurosurgery, Medical Faculty of Heinrich Heine University Düsseldorf and University Hospital Düsseldorf: Dr. rer. nat. Igor Fischer