
IEC 60601-2-37 is the particular standard that takes the general safety rules for electrical medical equipment and writes the specific ones for diagnostic ultrasound. It governs the surface of the probe that touches the patient and the energy that probe pushes into tissue. Two hazards sit at its centre: the heat the device delivers to skin and to deeper tissue, and the acoustic pressure that can act on tissue mechanically. The standard turns both into limits a maker measures against and into numbers a clinician reads on the screen while scanning. It is a collateral to the general electrical-safety standard rather than a replacement for it, so a probe meets the broad rules first and then answers this narrower document for the hazards that are unique to putting sound into a body. The edition in force has been refined over years of use, and a probe certified to an older one meets a tighter and better-defined set of checks when it comes back for renewal.
Meeting it means a probe’s warming and its output have been bounded and disclosed rather than left to trust.
The applied part is the surface that meets the patient: the acoustic lens and the housing around it. The standard caps how warm that surface may become, because skin held against a hot surface for minutes is injured.
The figure people usually quote is forty-three degrees Celsius for a part in contact with a patient, and the standard is far more careful than that single number suggests. A part resting against the patient is held to a tighter rise than a part that only sits in air, and an ophthalmic application pulls the ceiling lower still, because the eye carries almost no blood flow to wash heat away and a few extra degrees there does lasting harm. None of this is read off a datasheet. The maker runs the probe at its hottest realistic setting against a phantom built to mimic the thermal behaviour of tissue, lets the surface climb until it stops rising, and reads it with a fine thermocouple placed at the single spot that runs warmest rather than at a convenient average point. The heat being measured arrives from two directions at once, the acoustic absorption inside the lens and matching layers, and the electronics packed a few millimetres behind the array, and the test captures the sum of the two rather than either one in isolation. The ambient temperature of the room is recorded and the result is referred to a patient at body temperature, so a probe measured in a cool laboratory cannot quietly borrow the chill of the room to pass. A device that clears the limit only because it was measured idling, powered but barely transmitting, has been tested in a state no clinic will ever use, and a careful assessor reads the test conditions before he trusts the figure printed beside them.
The reference point matters as much as the ceiling. A rise is measured against a defined starting temperature, so the same probe that looks calm in a cool room still has to prove it stays under the limit against a patient whose skin already sits near body heat, which is the condition that holds in a clinic rather than a bench.
The lens is the surface that gives the standard its teeth.
The standard does more than cap the output. It folds in the output display rules, so the device puts two indices on the screen in real time and the operator manages the dose as it happens rather than learning it afterward from a manual.
The mechanical index is the peak rarefactional pressure divided by the square root of the frequency, a single figure that tracks the likelihood of cavitation, the sudden collapse of gas pockets that can tear tissue. The thermal index estimates heating, and it arrives in three forms for three situations, soft tissue all the way down, bone sitting at the focus, and bone right at the surface as in a scan through the skull, because one beam warms those three targets very differently and a single number would mislead. Both indices update live as the operator turns the gain and the depth, so a clinician who watches the thermal index rise through a long Doppler study can ease the output before the tissue heats, which is the entire reason the figure lives where the hand can see it.
The display itself is governed rather than left to the maker’s taste. The standard sets which index must appear for each mode, requires it to stay legible during the scan instead of hiding in a sub-menu, and fixes how finely it steps so that a small change in output registers as a change on the screen. It also names a floor below which an index need not be shown at all, since a mode that cannot come near the threshold of harm has no need to nag the operator with a number, and that floor keeps the readout meaningful rather than cluttered with figures that never matter. A probe that shows an index which never moves, or that lags the control by a noticeable beat, has satisfied the wording and defeated the purpose, because a number the operator stops believing is a number the operator stops reading, and the dose drifts back out of sight.
An index the operator can watch is a dose the operator can manage.

Beneath the on-screen indices sits a measured dataset the maker cannot talk its way around. The acoustic output is mapped point by point in a water tank with a calibrated hydrophone, then derated to estimate what reaches tissue rather than what leaves the lens face.
The derating follows a fixed rule, close to a third of a decibel per centimetre per megahertz, which stops a maker from quietly applying a gentler figure to drag a hot mode back under the line. The headline ceiling is a derated spatial-peak temporal-average intensity at or below seven hundred and twenty milliwatts per square centimetre, with a far lower cap reserved for the eye, and the mechanical index held at or below one point nine for every application but the eye. Each mode is measured on its own terms, since pulsed Doppler and colour flow drive the tissue much harder than a plain grayscale sweep, and the worst case among all of them is the one the report has to clear rather than some flattering average across the gentle ones. The hydrophone carries its own calibration curve and its own positioning uncertainty, and a laboratory that ignores them reports a figure with a false precision that a reviewer quietly discounts. This acoustic dataset is the heaviest single part of the whole submission, a body of physics measured against a defined method rather than a page of assurances, and a reviewer who finds one mode over the ceiling, or a derating that wanders from the rule, returns the file with the clock stopped until it is fixed.
The figure on the screen is honest only because the dataset under it was forced to be.
The same measurements feed the safety case in other jurisdictions, since the global ceiling the standard names is the one the United States adopted for its application-independent route and the one the Chinese adoption tracks, so a maker that measures once to this method carries the result into three regulatory files rather than testing three times.
The measurement is repeated whenever the device changes, since a new lens compound, a revised pulser, or a firmware change that lifts the drive can each move the output, and a maker that measured once at launch and never again is reporting a probe it no longer ships. The standard treats acoustic output as a living property of the device rather than a certificate framed once, so the test plan is written to be rerun every time the thing that makes the sound is altered. This is the quiet discipline behind a trustworthy index: not the single measurement that earned the first clearance, but the habit of measuring again each time the probe is touched in a way that could push more energy into a patient than the label promises.

A number on a screen is only half the disclosure. The standard also dictates what the accompanying documents have to state, so the buyer and the clinician can see what the probe does before they ever switch it on.
The instructions for use have to publish the acoustic output in defined tables, naming the indices, the modes they were measured in, the derating that was applied, and the uncertainty of the measurement, rather than a single tidy figure stripped of its conditions. They have to state the intended clinical applications, since the limits and the warnings differ between an abdominal scan and an examination of the eye, and they have to carry the precautions a user follows to keep the dose low. The principle the whole document serves is to keep exposure as low as reasonably achievable, the idea that an operator should use no more output than the image requires, and the on-screen indices exist precisely so that idea can be acted on rather than admired. A label that omits the output tables, or buries the indices, has met the letter of a checkbox and missed the point of the standard, since the disclosure is the safety feature as much as the limit is. The reviewer who reads the manual against the measured data is checking that the two tell the same story, and a manual that promises a gentler output than the tank measured is a manual that earns the file a rejection.
The manual is where the measured truth is handed to the people who use the probe.
A handheld probe seals its electronics millimetres from the lens, with no fan and no room to spread the heat.
A wheeled cart had a large body and active cooling to carry warmth away from the contact point, while a thumb-sized wireless probe has neither, so the acoustic heat and the electrical heat pile up against the one surface this standard watches the closest. The maker who adds beamforming channels for a sharper picture, or a larger cell for a longer scan, pours more heat into that same sealed space, and the surface-temperature limit is the wall every one of those choices runs into. When the probe can no longer shed heat fast enough, its firmware lowers the output to hold the lens under the cap, which keeps the patient safe and dims the picture at the same instant, so the thermal design quietly decides not only whether the probe is safe but how long it holds full performance once a real exam gets going.
On a cart that limit sat far away; on a handheld it is always close underfoot.
The test that proves all this is unforgiving by design. It runs the probe at the output and the mode that run the hottest, against a phantom whose attenuation is set on the conservative side, and it reads the surface only after the device has run long enough to stop warming rather than in the first cool minute. A probe tuned to look good through a short demonstration and to throttle quietly a few minutes later is the probe this test is built to catch.
The standard is the floor the probe stands on, and reading it well is reading the conditions beside every number rather than the number alone. A maker who treats it as a box to tick instead of a constraint to design around ships a probe that passes on the bench and disappoints in a long study, throttled and dim by the time the exam needs it the hardest. The probe a buyer should pick clears the limit with margin still in hand, so the clinician scans for as long as the case demands without ever feeling the wall the standard quietly built around the lens.