The polygraph as a truth detector

The B.C. Civil Liberties Association believes that there is convincing evidence that the use of the polygraph is arbitrary, subjective and biased toward accusations of guilt, and that claims of very high validity are scientifically indefensible. However, even if one is not persuaded by the evidence on these matters, one must admit, at the very least, that there is no scientific consensus whatsoever concerning the validity of polygraph testing. In fact, there is extremely wide divergence of opinion over the validity of the test.

In these circumstances, the onus is clearly on the proponents of the polygraph test to establish a convincing scientific case for the claims of high validity that are made by polygraph operators. In other words, the burden of proof rests with the lie detector industry to satisfy the scientific community and legislators that there is convincing evidence to support claims of ninety percent or greater accuracy that are commonly made by polygraph operators. Without such agreement, it seems utterly irresponsible to allow the use of such a device in situations where it may ultimately interfere with the liberty of innocent citizens.

The B.C. Civil Liberties Association urges the Government of British Columbia to follow the example of Ontario in banning the mandatory use of polygraphs by employers in the province. We would go further: since the evidence we have presented casts considerable doubt on the usefulness of the polygraph test per se, we see no useful purpose for the procedure either as a screening procedure for police candidates or in the court system generally. Ontario allows both of these uses, though there is no convincing evidence in support of the test in any situation. The information Ontario compiled for the 1983 amendment to its Employment Standards Act, including the Morand Report, leads one to conclusions very similar to the B.C. Civil Liberties Association’s: the polygraph test is a humbug of subjective, arbitrary and contradictory procedures that does not detect lies or guilt any more effectively than, and in many cases not as well as (because of procedural and machine bias), the interviews and cross-examination that are already common tools of psychology, police work and the courts. The compounded danger of the polygraph lies in the sanctioned role that untrained persons with crude devices play in harrying innocent persons in commercial and legal settings. To paraphrase an expert, it is the idiocy of idiocies. We urge its removal as an avenue of arbitrary persecution.

The background

The polygraph procedure and machine are an accretion of 1930s technology and popularized psychology, rooted not in the practices of modern science but in the traditions of polygraph testing itself. In that sense, a polygraph examination is a self-fulfilling process: it “measures” a series of physical signs with the machine and draws subjective, psychological-sounding conclusions that vary with the mental state, mental set and training of both the subject and the examiner, and with the rapport between them. The result of this exercise, reminiscent of the monitoring machines of Scientology, is a series of conclusions about the veracity of specific statements, or about guilt generally, that cannot be supported by consistent scientific logic or confirmed by other means. In fact, there is convincing evidence that the procedure is much more likely to create victims of false allegation than to detect purveyors of falsehood or paragons of guilt, however falsehood or guilt may be defined. The statistical illustration at the end of the paper demonstrates this phenomenon.

The Morand Commission findings have been summarized by a number of authors and associations, some of which are listed in this paper’s bibliography. A good deal of the Morand work has become a seminal part of an increasingly large body of research and findings that suggest the polygraph test is deficient and misleading.

The Commission report itself, as well as the other supporting material compiled for Ontario’s revision of its Employment Standards Act, clearly shows that Ontario had a strong case in 1983 for restricting use of this crude apparatus. Ontario chose to strongly curb the imposition of pseudo-scientism as a coercive technique in the workplace. We would go further: the device is harmful both in the workplace and in criminal investigation, and the argument that follows supports this contention.

This position paper will to some extent review the evidence and arguments of various published sources on the subject, but it will look at the issue from some new perspectives as well, particularly in light of both the persistent use of such equipment and the absence of legislative action in B.C., where the procedure is used on an increasing scale inside and outside the criminal justice system.

While the Association’s principal concern lies with polygraph use in employment or commercial situations and in criminal investigations, the objectionable and indefensible nature of the test and its assumptions applies in most instances equally well to any application of the polygraph. It may serve the discussion at hand to explain the machine and the rationale of its use, as in the next section.

The machine and procedures

The standard polygraph device has four “channels” whose functions are to provide graphic output on physical responses to interrogation. Two of the channels indicate breathing movements from tubes strapped on the chest and abdomen; a third channel, the cardio one, is connected to a blood pressure cuff on the upper arm, monitoring heart rate and pulse pressure in a fairly crude way in comparison with modern medical procedures and equipment. The fourth channel requires connection of a pair of metal electrodes to, say, two fingers of one hand; some indication of the changing electrical resistance of the skin is demonstrated by this channel, in a phenomenon often called galvanic skin response (GSR). A moving paper chart and a series of pens, like a seismic device, indicate the progress of the various channels’ responses over time, and, at least in theory, also indicate in some way the test subjects’ physiological responses to questions put to them.

These physiological responses are supposedly given meaning by a two- or three-stage process of interrogation, whereby the examiner first establishes a recognizable pattern of responses, unique to the subject, in the face of “truth” and “falsehood”. Using control questions, the examiner adduces how the subject will react to his or her utterance of truth or falsehood in question areas not apparently pertinent to the issues at hand. Often this procedure involves the use of marked cards to deceive the subject into believing that the polygraph can always tell when he or she is lying about a playing card he or she chose. Apparently, the machine cannot, since the cards are marked! This development of credibility for the machine in the eyes of the subject is absolutely critical: if the subject does not believe that the polygraph can detect his or her duplicity, the subject will, at least in the theory of the test, not react physically to his or her own lying or truthful statements. An equally important part of that first phase is the pre-test interview, in which the examiner typically takes extensive note of the subject’s physical quirks and, using guidelines about how supposedly to detect lies, draws preliminary conclusions as to the general guilt, innocence or trustworthiness of the test subject. Of course, these preliminary pre-test conclusions, along with the control questions, exert a critical and self-fulfilling influence on the analysis of the graph results of the second phase, as this paper will describe later. As early as the first entry into the office area of a testing centre, a subject may undergo analysis by the interviewer or receptionists, thereby setting up a subjective, arbitrary chain of data gathering and inference that is strung together to support itself as the test phases proceed.

Among the guidelines that this pre-test and first-phase questioning procedure might use, one typically finds the theoretical idea that a truthful person will be more concerned with the control questions than with the case-pertinent, or critical, questions, since he or she has been convinced by the examiner that an innocent person can fail the test only if he or she does not answer the control questions adequately.

In contrast, the guilty person will supposedly look to the critical questions as a source of anxiety, recordable as it supposedly is, on the attached needles. The pre-test phase will also typically look to find guilt in those who, say, rub their hands together, or criticize the test, squirm in their chairs, avoid eye contact, or perform other “typical” acts displaying guilt. Polygraph testers apparently are unable to substantiate why these subjective cues are more effective indications of duplicity, if in fact they are effective by themselves or as part of the interrogation process.

In the second phase, the subject is asked a series of questions that apparently relate to the case or situation at hand, be it a trial, an employment screening situation, or a personnel investigation. These questions are usually asked several times, sometimes in various orders, and the results are graphed by the four channels. The question pattern is meant to establish the autonomic and other physical responses to questions designed to reveal lying.

The examiner then scrutinizes the test results in light of the pre-test interview and the first-phase control questioning and establishes a “credibility imprint” for the procedure. And what data are physically available as results? Primarily a series of marked lines, complemented by observation and interpretation. The polygraph produces a series of lines of various degrees of smoothness, calibrated by time intervals that relate to apparent autonomic reactions to questions. But the problem of interpretation is immense. Justice Morand expressed considerable frustration at understanding how the graphical results can be interpreted:

At no time was Mr. Reid [a polygraph expert] able to explain to my satisfaction the difference between his perception of the lines on the graph and mine (pp 252, Morand Report).

After a series of examples of testing and results, the Justice came to the conclusion that the graphs are by themselves a small part of the examination process:

The choice of which response was decisive appeared to be based largely on the impression of the subject that the examiner had formed after the pre-test interview… the very facts that the charts are a mere approximation of response and that the analysis is made by eye and estimation alone negate the possibility that the polygraph test is a physiological procedure (pp 253, Morand Report).

As Justice Morand indicates, and as the polygraph experts generally will agree, the polygraph test attempts to physically monitor autonomic responses to questions with the purpose of detecting lying, but in fact the testers rely very heavily on subjective evaluation at the pre-test stage, at the first and second stages, and even at the post-test analysis stage to reach their conclusions. However defensible or not, the physiological results from the graphs are far from the sole basis of the evidence for establishing a case for lying. In fact, the whole procedure is very subjective, based as it is on physical data open to uncertain interpretation and on the particular interpretive skills of each examiner at all stages. Clearly the subjective evaluation and the physiological readings of the polygraph both have a number of practical and conceptual weaknesses that cast a pall over the polygraph’s capability to detect truth or falsehood. Some of these weaknesses are described below.

Major problems

It should be understood that the criticisms of the polygraph testing procedure made in this section refer to models for polygraph use that have been designed specifically for detecting deception in criminal investigations. Although these criticisms may apply to uses of the polygraph in other circumstances, no model that we are aware of has ever been specifically designed or scientifically tested for use in employment. After an exhaustive study of all available scientific literature on polygraph use, a recent Congressional study, “The Scientific Validity of Polygraph Testing: A Research Review and Evaluation” (November, 1983), confirmed this by concluding that there was no evidence whatsoever concerning the use of polygraphs for employment purposes. The study also found that the uses to which the polygraph is put in employment and in criminal matters are sufficiently different that the available evidence on the use of polygraphs in criminal investigations (most of which is of highly questionable scientific value, as we shall see) could not be used to support the use of polygraphs for employment purposes. For example, in using polygraphs for job screening or other employment situations, a polygraph operator may attempt to determine whether a job applicant or employee is honest, trustworthy, reliable or otherwise suitable. This is a more complex judgment than assessing whether a person is guilty or innocent of a crime, or at least requires different assessment techniques.

Since no evidence has been forthcoming regarding the validity of polygraph use in employment, polygraph operators who make their services available to employers have supported their work by using evidence gathered during polygraph use in criminal investigations. When informed by a British House of Commons Committee on Employment that his research was being used by polygraph companies to support the use of the polygraph in employment testing, Professor David Raskin, a major proponent of the use of polygraphs in criminal situations, responded:

For them to use my research in support of any commercial application of the polygraph is quite outrageous. In fact, I regard these tests as dangerous. I calculated in one program nine out of ten people found to be deceptive would actually be telling the truth… 18 states now prohibit the use of employee vetting, because there is no scientific evidence to support using them in the commercial sector, and also because the way in which polygraphs are used can be both abusive and counterproductive (Sunday Times, May 27, 1984).

Dr. Douglas Carroll from the psychology department of England’s Birmingham University also told the British Commons Committee that employee screening via polygraph testing was not good procedure and was likely to be unjust:

Polygraph screening is likely to identify remarkably few security or employment risks. What it will do, though, is to implicate as risks an exceptionally large number of honest individuals (Sunday Times, May 27, 1984).

Dr. Carroll cited evidence showing accuracy rates of only 63 to 76 percent for the polygraph; Carroll’s evidence also indicated that as many as 55 percent of innocent people could be found guilty by the polygraph procedure. Carroll went further:

On the scale of idiocies, I think its application in industry and commerce is one of the biggest, most dangerous idiocies of our time. It shatters people’s confidence and can ruin their career opportunities (Sunday Times, May 27, 1984).
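
The arithmetic behind these warnings can be sketched as follows. This is a minimal illustration under stated assumptions, not a calculation taken from Raskin or Carroll: the function name `screening_outcomes`, the workforce of 1,000 and the 5 percent share of deceptive employees are hypothetical. It further assumes that the upper end of Carroll’s accuracy range (76 percent) applies to detecting deceptive subjects, and that 55 percent of truthful subjects are wrongly flagged, his worst case.

```python
def screening_outcomes(population, base_rate, detection_rate, false_positive_rate):
    """Estimate who gets flagged when a whole workforce is screened.

    All inputs are illustrative assumptions, not measured values.
    Returns (deceptive_flagged, truthful_flagged, truthful_share_of_flagged).
    """
    deceptive = population * base_rate
    truthful = population - deceptive
    deceptive_flagged = deceptive * detection_rate       # correctly flagged
    truthful_flagged = truthful * false_positive_rate    # wrongly flagged
    flagged = deceptive_flagged + truthful_flagged
    share = truthful_flagged / flagged if flagged else 0.0
    return deceptive_flagged, truthful_flagged, share


# Hypothetical workforce: 1,000 employees, 5 percent of them deceptive.
caught, wronged, share = screening_outcomes(1000, 0.05, 0.76, 0.55)
print(f"deceptive employees flagged: {caught:.0f}")
print(f"truthful employees flagged:  {wronged:.0f}")
print(f"share of flagged employees who are truthful: {share:.0%}")
```

Under these assumed figures, the truthful employees who are flagged far outnumber the deceptive ones who are caught, which is the pattern both Raskin and Carroll describe. The rarer deception is in the screened population, the more lopsided the result becomes.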

The Commons Committee heard a good deal of evidence from both opponents and proponents of polygraph testing. Both sides agreed that claims about the accuracy of polygraph testing, even with the most highly trained technicians and in commonly agreed-upon conditions, were typically exaggerated by commercial firms in their advertising. Martin Seligson of Polygraph Security Services, for example, admitted that his firm’s accuracy claims were distortions, and conceded that from published reports on polygraphs, “I admit to having pulled out some favourable stuff”. Hardly reputable. Or credible.

In light of the comments of the Morand Commission, the recent Congressional study in the U.S., and major proponents and critics of polygraph use in criminal investigations—all of whom agree there is no scientific evidence supporting the validity of polygraph testing in employment and have recommended the abolition of its use for these purposes—the B.C. Civil Liberties Association urges the provincial government to abolish the use of the polygraph in employment in this province.

We now turn to the problems of the models for polygraph use that have been designed for lie detection in criminal investigations.

Modern medical research indicates that certain physical responses traditionally thought to be automatic or uncontrollable by mental effort are in fact not automatic at all, or at least are only partially so. Individuals vary in their innate abilities to control blood pressure, heart rate, sweating, breathing and other physical responses, but practice will teach all subjects to control responses to some extent. However, a fundamental tenet of polygraph testing is that such control is not possible; otherwise the physical part of the test could be fixed by the subject’s control of the four monitoring channels of the polygraph.
The ability to alter body functions is known as biofeedback. Reading guilt or innocence into stress-induced physiological reactions is a tenuous business when the reactions themselves can be controlled. In fact, the entire case for the value of polygraph testing can be dismissed on the basis of evidence that automatic responses are largely discretionary, and the degree of control can be improved. The test can be fixed by anyone, to varying degrees.
Humans react emotionally to their environment and to any stress in that environment in different ways depending on:
- individual mental traits (lability)
- the time of day
- what has happened immediately before
- their rapport with the interviewer
- how their physical characteristics react with the test apparatus (for example, they may be obese or a blood pressure cuff may be too tight, or the room is hot, or they have skin pigmentation or they sweat more, which affects the galvanic skin response readings)
- how they see the world around them, or
- how their body deals with stress.
Out of all this variation in human response comes a clear indication that human reaction in the polygraph testing situation, using the crude physical and psychological measures of the procedure, can lead to no conclusions about its effectiveness, since responses are simply not comparable to a defensible paradigm.

Humans react differently from one another, and they react differently at different times, even if the environment is apparently unchanged. As Justice Morand says:

Even if the environment can be so controlled that the only stimulus is the test question, the evidence overwhelmingly supports the conclusion that the mental processes resulting in physiological change are prompted by complex and often unconscious reactions to a stimulus that is, in part, the arousal value of the question, the test itself and, of course the tensions and fears concerning the situation that has required the subject to take the test in the first place. In order to determine whether a response is significant, the operator must perform a complicated psychological analysis of the many inferences at work and this, of course, includes the particular mental makeup of the individual subject (Morand Report, pp 222).

Certainly, the examiner is in the position of trying to discount or filter out so many extraneous factors from the physical evidence of the polygraph test that in the end, he or she seems to have little reliable physical or empirical evidence that is not warped by subjective factors. The examiner is then left to develop subjective and arbitrary value judgments on already compromised data. So much for scientific integrity!
Control questions, which are used to establish a pattern of reaction by the subject to truth and lies, must have arousal value equal to that of the critical questions in the last phase of the test. This is necessary for comparability of arousal level in the two machine test phases. The examiner relies on this comparison to determine whether the subject is lying in the face of questions pertinent to the case at hand. However, even the procedure for choosing the questions is subjective because it depends on the examiner’s ability to select appropriate questions.

There are serious questions about whether examiners have such skills, and whether it is possible to expect a person to react in the same way to preliminary questions as when the main part of the test is obviously under way. Further, it is obvious that these control questions have to be individually tailored to the psychological makeup of the subject. Yet it appears that the common practice is to use standard control questions that ignore different responses across individuals.

Yet, while individual variation is ignored in the use of standard questions, the tendency is to err on the side of guilt when individual traits emerge:

Of course, if the subject is not aroused, little or no reaction will show on the graph and it will be possible and even likely that a comparatively greater response will be indicated on the critical questions. Mr. Reid was asked how a lack of response to a control question would be an indication of deceptive behaviour (Morand Report, pp 224).

Yet polygraph operators insist that it is possible to establish an accurate enough reading of a subject’s mental profile so that subjects will react in a similar enough manner to control questions and critical questions that they can accurately identify lies. Justice Morand remained unconvinced:

In my opinion, there is a real possibility that many innocent persons would be unconcerned with what has been suggested to me are good control questions in comparison with the actual accusation. I have no doubt that some people do react as polygraph operators insist they must, but I am not convinced that this latter group of people would be an overwhelming proportion of our population (Morand Report, pp 225).

Without that reliable comparability between control questions and questions asked during the body of the test, any conclusions about falsehood, duplicity or guilt are spurious and indefensible.
Fundamental to effective testing is the credibility of the testing procedure in the eyes of the subject. The pre-test interview, the control questions, and such arcane procedures as the marked card are primarily meant to convince the person undergoing a polygraph that if he or she lies, the polygraph will detect the falsehood. Yet if the subject has knowledge of behaviour or of polygraph testing, it is unlikely that the person being tested will believe that is the case. Opinions vary among experts interviewed by the Morand Commission about whether intelligent or well-educated people, however one might define such terms, are more or less likely to believe in polygraph testing. But the evidence is that credibility with the subject is essential.

Clearly, subjects who distrust the polygraph, or do not care what the polygraph output reveals, will not be good subjects for the test and will discredit the evidence in their testing procedures, even in the eyes of polygraph supporters. As Morand says:

…the polygraph, which is usually presented as a physiological test of deception using various physiological inputs is, in fact, a psychological test, because psychological emphasis affects it radically (Morand Report, pp 227).

And of course that emphasis is not systematized in polygraph procedures generally, and so one polygraph test will not necessarily corroborate the results of another, and two tests may not in fact accurately detect the same sources of apparent anxiety.
So-called accuracy rates for the polygraph machine vary wildly. It appears that in polygraph testing, laboratory results are inferior in accuracy to field tests. This is, of course, the reverse of usual scientific experience. There are a myriad of other problems of statistical defences and claims (for an excellent explanation of these problems see Lykken’s A Tremor in the Blood, pp 63–82).
In citing accuracy levels, much is made of reliability and validity. The former describes the degree to which various polygraph results corroborate one another; the latter characterizes the ability of the polygraph to actually detect falsehood. As one would expect, the reliability of the test is better than the validity because testers tend to have prior knowledge of test outcome and wish to verify each other. The validity of the procedure ranges from no better than chance for exonerating an innocent person, up to 80 percent for detecting a guilty, or at least apparently deceptive, person in a given setting.
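
The difference between the two measures can be made concrete with a small sketch. The verdict lists below are invented purely for illustration and are not drawn from any study; `agreement_rate` is a hypothetical helper. Reliability is computed as the rate at which two examiners agree with each other, and validity as the rate at which an examiner agrees with the ground truth.

```python
def agreement_rate(a, b):
    """Fraction of cases in which two equal-length lists of verdicts match."""
    if len(a) != len(b) or not a:
        raise ValueError("verdict lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Invented data for ten subjects; True means "deceptive".
truth      = [False, False, True, False, True, False, False, True, False, False]
examiner_a = [True,  True,  True, False, True, True,  False, True, True,  False]
examiner_b = [True,  True,  True, False, True, True,  False, False, True, False]

reliability = agreement_rate(examiner_a, examiner_b)  # examiners vs. each other
validity_a = agreement_rate(examiner_a, truth)        # examiner A vs. the facts
validity_b = agreement_rate(examiner_b, truth)        # examiner B vs. the facts

print(f"reliability: {reliability:.0%}")
print(f"validity, examiner A: {validity_a:.0%}")
print(f"validity, examiner B: {validity_b:.0%}")
```

In this invented example the two examiners agree in nine cases out of ten, yet each matches the facts in only five or six, because they share the same bias toward a finding of deception. High agreement between examiners therefore says little about whether either is right.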

While there are many claims of very high accuracy levels, up to 99 percent, Morand and polygraph researchers such as Lykken could find no evidence to substantiate such high accuracy levels for detecting either guilty or innocent persons. Morand was at pains to point out that “decisions based on physiological data alone were no better than chance” (Morand Report, pp 237), and that any higher level of detection came from psychological observation by testers and by using prior information about the subject.

While in some cases test results show better results than chance, certainly any procedure that involves a relatively intelligent person questioning a subject about his actions over a protracted period would result in a degree of effective detection; in fact, that is what the court system or job interview does in any case, perhaps even more effectively than the polygraph—and without the danger of false accusation in a test that is presented as scientifically based and is therefore given undue credence.
The evidence that the polygraph machine transmits to paper is at best ambiguous. The polygraph device itself, which has been essentially unchanged since the 1930s, transmits physical evidence that is subject to many physiological idiosyncrasies of the subject and test environment.

The measurements themselves are apocryphal: the blood pressure monitor in the cardio channel is both crude and inaccurate as a measure of blood flow or heart activity. Morand expresses severe reservations about what the measurement itself is:

The channel measures some complex physiological resultant in the cardiovascular activity, consisting of a mixture from blood pressure, heart rate, pulse volume, blood flow to the muscles and flow to the skin, and is then displayed in a relatively crude but very graphic manner (Morand Report, pp 241).

While the GSR channel, used with modern, sensitive equipment and in the hands of well trained personnel, may be a useful indicator of origins of stress, scientists who are proponents of this avenue of channel research concede that GSR testing is very limited in its use, and certainly not useful to identify the source of such stress.

Justice Morand tended to be persuaded that while the principles that lay behind the physical testing channels may be sound, most polygraph equipment is a crude parody of such principles:

Nonetheless, the polygraph machine is clearly crude and its operators unsophisticated in using it as a scientific instrument. Even if polygraphy is based on valid principles, I am unimpressed with the primitive standards and lack of progress in the performance of what is called by its proponents a “science” (Morand Report, pp 243).

Since the physical evidence itself is apparently not coherent or unambiguous, its interpretation, based as it is on this quivering base, is bound to rely on very subjective information. It does.
The physical evidence leads to conclusions solely in the context of subjective polygraph testing sessions. Each polygraph operator finds new meaning in the marks on polygraph papers in relation to his or her impression of the subject. Apparently, polygraph operators cannot readily explain their interpretations, either:

Nonetheless, common sense tells me that if a line shoots upwards, then given some basic knowledge of the principles of the polygraph tests, even a layman such as myself could comprehend the explanation for its importance or unimportance. I was not impressed with the fact that any queries along this line were answered by the fact that it is simply self-evident to any trained polygraph operator that one line is significant and another is not… on many occasions, conflicting responses existed on different measures on a graph. The choice of which response was decisive appeared to be based largely on the impression of the subject that the examiner had formed after the pre-test interview (Morand Report, pp 253).

As suggested above, the physical measures and evidence from the polygraph machine are at best controversial; clearly the use to which the data are put has more relation to subjective impression and bias than to systematic deduction. Charitable terms for this procedure would be circular reasoning, or perhaps divine inspiration.

A possible avenue of mitigation for the subjectivity of machine results would be the blind analysis of the results by someone other than the pre-test interviewer or test administrator. Such blind analysis is rare, partly because the machine output is too crude for such analysis, and partly because the blind analysis that is possible calls results into question.

Attempts to provide a coherent explanation of the dubious physical evidence produce a controversy in their own right. There is no consistent way to find meaning in the physical evidence that is isolated from the subjective impressions of the tester, and it appears that the physical output from the machine is little more than a visual aid for the subjective conclusions drawn from amateur psychoanalysis and cross-examination undertaken before and during testing. The machine itself may, like a stage prop, add nothing more than effect to the entire procedure.

It is a commonly held truism among polygraph spokespersons, particularly those involved in pre- and post-employment screening, that the polygraph’s so-called success comes not from detecting guilt or innocence from readings or any other signs. Rather, the inquisition-like process often prompts an admission of guilt from test subjects. The admissions themselves may or may not be for serious transgressions, and are often confessions to acts not normally thought to be crimes or sins. People can be made to confess to anything. In any case, it is the interrogation, not the machine or the test itself, that has uncovered the supposed crime:

In these periodic tests [in a job situation] as in pre-employment screening, adverse reports are most often based on the respondent’s own admissions. If someone is fired as a result of a polygraph session, it is usually because he [or she] has admitted some dischargeable offence and not just because of the pattern of chart reactions (Lykken, pp 187).

So here again, one sees polygraph testing as window dressing for coercion, but not as a diviner of truth about guilt or innocence. Coercion may lead to confession, but there is no reason to expect that a confession extracted under the duress of a polygraph is truthful, nor is there any scientific basis for polygraph testing to be found in such confessions:

If all polygraphs were stage props it is likely that just as many admissions or confessions would be elicited. Certainly much of the popularity and ability of the polygraph derives from this incidental effect (Lykken, pp 215).

Lykken suggests that the polygraph industry’s retention of such crude equipment and of subjective, inconsistent and arbitrary methods is a result of the realization that the test is a charade:

This may be why so many polygraph operators show little interest in research on the actual validity of the various forms of polygraph test: even if its validity is no better than chance, so long as most people believe in the lie detector or voice stress analyser, these tools will continue to elicit admissions and confessions, as that is their principal purpose (Lykken, pp 215).

Of course, any form of coercion will extract confessions, whether based on fact and true guilt or not.

…these methods commonly inflict great stress and emotional disturbance on the innocent and guilty alike—that is why it works. And because it works so well, one should distrust any confession obtained by modern interrogation methods, whether the polygraph was employed or not, unless the confession can somehow be confirmed (Lykken, pp 215).

Surely fundamental principles of social justice protect a citizen from arbitrary interrogation and its penalties, particularly when there is no evidence that the guilty will be winnowed out of such a procedure anyway. The situation here is rather reminiscent of the crusader who told his armies to kill all whom they came into contact with, and let God sort out the sinners.

In the case of the use of polygraphs for employment purposes, Lykken could find no evidence that polygraph tests reduced theft in the workplace; those who confessed may have been innocent, and those who did not confess may well have stolen. The polygraph did not help; it merely caused stress and elicited confessions.

Board of Directors Minutes, October 22, 1984

BCCLA member Brian Buchanan presented a brief that he had prepared on the use of lie detector tests. The brief argued that the use of polygraphs was scientifically indefensible and that, therefore, they ought not to be used in employment or criminal investigations. The brief was approved by the Board; however, there was some suggestion that further information had to be argued to bolster the conceptual argument for preventing the use of polygraphs in these cases on the basis of lack of scientific consensus regarding their validity.
