Last Update: August 13, 2010
Henry Templeman
henry
Validation Study
"No amount of experimentation can ever prove me right; a single experiment can prove me wrong."
Any fingerprint match probability (FMP) model that anyone designs, proposes, or even applies to criminal casework should be considered nothing more than an "idea", and the sole test of the validity of any idea is experiment.
Experiments to test T-Model assumptions were performed carefully and are considered well controlled, reproducible, and honest. All results from experiment were reported, not just the information that leads to a judgment in one particular direction or another, to help others judge the value of the contribution.
Experimentation to refine and/or refute the discriminative values accorded to individual ridge features and clusters of ridge features, based on data gathered from past experiments, is ongoing. Results from experimentation will be posted on this page, and any changes or refinements made to the discriminating values for ridge features will be reflected on the Updates page.
"The true method of knowledge is experiment."
On Experiment
With regard to any information gathered as a result of experiment, it should be noted that regardless of how interesting, nice, or appealing any particular fingerprint model may appear, and regardless of the authority, credentials or educational background of the individuals responsible for its design, if the model does not agree with experiment, then it is wrong.
Readers unfamiliar with the importance and limits of experiment are encouraged to review the following quotes by Richard Feynman, PhD (awarded the Nobel Prize in Physics) on the Uncertainty of Science and to view the documentaries listed below:
"The rules that describe nature seem to be mathematical. This is not a result of the fact that observation is the judge, and it is not a characteristic necessity of science that it be mathematical. It just turns out that you can state mathematical laws, in physics at least, which work to make powerful predictions. Why nature is mathematical is, again, a mystery.
I come now to an important point. The old laws may be wrong. How can an observation be incorrect? If it has been carefully checked, how can it be wrong? Why are physicists always having to change the laws? The answer is, first, that the laws are not the observations and, second, that experiments are always inaccurate. The laws are guessed laws, extrapolations, not something that the observations insist upon. They are just good guesses that have gone through the sieve so far. And it turns out later that the sieve now has smaller holes than the sieves that were used before, and this time the law is caught. So the laws are guessed; they are extrapolations into the unknown. You do not know what is going to happen, so you take a guess.
For example, it was believed--it was discovered--that motion does not affect the weight of a thing--that if you spin a top and weigh it, and then weigh it when it has stopped, it weighs the same. That is the result of an observation. But you cannot weigh something to the infinitesimal number of decimal places, parts in a billion. But we now understand that a spinning top weighs more than a top which is not spinning by a few parts in less than a billion. If the top spins fast enough so that the speed of the edges approaches 186,000 miles a second, the weight increase is appreciable--but not until then. The first experiments were performed with tops that spun at speeds much lower than 186,000 miles a second. It seemed then that the mass of the top spinning and not spinning was exactly the same, and someone made a guess that the mass never changes.
How foolish! What a fool! It is only a guessed law, an extrapolation. Why did he do something so unscientific? There was nothing unscientific about it; it was only uncertain. It would have been unscientific not to guess. It has to be done because the extrapolations are the only things that have any real value. It is only the principle of what you think will happen in a case you have not tried that is worth knowing about. Knowledge is of no real value if all you can tell me is what happened yesterday. It is necessary to tell what will happen tomorrow if you do something--not necessary, but fun. Only you must be willing to stick your neck out."
Richard Feynman
The Meaning Of It All, Pg. 24 and 25
"It is necessary and true that all of the things we say in science, all of the conclusions, are uncertain, because they are only conclusions. They are guesses as to what is going to happen, and you cannot know what will happen, because you have not made the most complete experiments."
Richard Feynman
The Meaning Of It All, Pg. 26
"The sole test of the validity of any idea is experiment."
Richard P. Feynman, Six Easy Pieces
Video of Richard P. Feynman: Take the World from Another Point of View
Video of Richard P. Feynman: Key to Science
The Fingerprint Experiments
The author applied, in large part, the following suggestions by Glenn Langenburg and Christophe Champod on how to commence validation studies by experiment involving friction ridge close matches or "look-alikes":
"I would recommend that you do calculate some very simple probabilities for 3-5 minutiae in arrangements using your approach. Predict the value based on your model, and then go and search and see how often they appear."
"If you calculate a small 3 or 4 arrangement is not likely to appear 1/1000, then go and look at 1000 fingerprints. If you find it 20 times, you're off a bit. If you find it 0-3 times, you might be in the ballpark. Its tedious, but that's the only way to see how well this independence holds up. Personally I think you are overestimating by assuming this, BUT maybe...not by much. Empirically testing will help determine the soundness of the assumptions."
Glenn Langenburg 1/1/2008
‘What is required [for validation] is a clear definition of a “close match”. This can only be done in relation to distortion tolerances. My advice would be as a first health check: pick 10 configurations of 3-4 minutiae from marks of known donors. Define the tolerances associated with them (using multiple known impressions from these donors). The tolerances fix the search parameters. Look in a large collection (say 10,000) how many close matches as defined can be found. If you match probability is in the order of 1/1,000, on average you expect to find 10 close matches. If that sample test fails, then it is not a good sign for the modeling assumptions.’
Christophe Champod 1/6/2008
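The health check described in the two quotes above can be sketched in a few lines of Python. This is only an illustration, not the T-Model itself: it assumes each print in the collection is an independent trial with the same match probability (a simple binomial model, which is an assumption added here), and the function names are hypothetical. It uses the figures given in the quotes: a predicted probability of 1/1000 checked against 1000 prints (Langenburg) and against 10,000 prints (Champod).

```python
from math import exp, lgamma, log, log1p


def expected_close_matches(match_probability, population_size):
    """Expected number of close matches predicted by a model (p * N)."""
    return match_probability * population_size


def binomial_pmf(k, n, p):
    """P(exactly k close matches in n prints), assuming independent trials.

    Computed in log space so that large n does not overflow.
    Requires 0 < p < 1.
    """
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log1p(-p))
    return exp(log_pmf)


def prob_at_most(k, n, p):
    """P(k or fewer close matches are found)."""
    return sum(binomial_pmf(i, n, p) for i in range(0, k + 1))


def prob_at_least(k, n, p):
    """P(k or more close matches are found)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Langenburg's example: predicted 1/1000, searched against 1000 prints.
p = 1 / 1000
langenburg_expected = expected_close_matches(p, 1000)
langenburg_ballpark = prob_at_most(3, 1000, p)    # chance of finding 0-3
langenburg_off = prob_at_least(20, 1000, p)       # chance of finding 20+

# Champod's example: predicted 1/1000, searched against 10,000 prints.
champod_expected = expected_close_matches(p, 10_000)

print(f"Langenburg expected count: {langenburg_expected:.1f}")
print(f"P(0-3 found | model correct): {langenburg_ballpark:.3f}")
print(f"P(20 or more found | model correct): {langenburg_off:.2e}")
print(f"Champod expected count: {champod_expected:.1f}")
```

Under these assumptions, finding 0 to 3 close matches is what a correct 1/1000 prediction would usually produce, while finding 20 would be vanishingly unlikely, which is the sense in which the model would be "off".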
"Close match" experiments are considered extremely important because only when the examiner, or fingerprint model, can reliably predict there will be less than 1 close match or look-alike, and not more, in the relevant fingerprint population for the case at hand can there be valid basis to infer "identification to a single source".
It is significant to note that previously, the author performed 50 experiments using the following flat fingerprint population groups: 126, 37.5, 208, 263 and 113. The results of these experiments were for the most part quite encouraging and helped to develop T-Model v. 7.0. In February 2010, the author performed similar fingerprint experiments involving a fingerprint population size of 1000, i.e., a larger fingerprint sample and therefore better, more definitive experiments. The results of these experiments forced significant modification to the T-Model v. 7.0 formulae, which resulted in the present version: T-Model v. 8.0.
The author also tested the ability of latent print examiners to accurately estimate the numbers of fingerprint friction ridge look-alikes present in fixed fingerprint populations using solely professional judgment, i.e., training and experience. The results from these tests were compared with estimates made by T-Model v. 7.0 and T-Model v. 8.0. Finally, and most important, the author counted the actual number of close matches present in the 1000 flat fingerprint sample for each arrangement. The results from these tests are as follows:
Close Match Fingerprint Experiments
Fingerprint Experiments 1 - 3 View PDF
(3 Ending Ridges in a Funnel)
Fingerprint Experiments 4 - 6 View PDF
(3 Ending Ridges Not in a Funnel)
Fingerprint Experiments 7 - 9 View PDF
(3 Bifurcations in a Funnel)
Fingerprint Experiments 10 - 12 View PDF
(3 Bifurcations Not in a Funnel)
Fingerprint Experiments 13 - 20 View PDF 1 View PDF 2
(2 Single Dots, 2 Cluster Dots, 3 Cluster Dots, 4 Cluster Dots, 5 Cluster Dots and 6 Cluster Dots)
Summary Conclusions (Based on results from Fingerprint Experiments 1-12)
The number of close matches estimated by T-Model v. 8.0 is relatively and conservatively close to the actual number of close matches found by experiment (see below graph).
T-Model v. 8.0 is more accurate than latent print examiner professional judgment, i.e., training and experience, and more accurate than any other statistical model tested, at estimating the number of close matches for a given arrangement of fingerprint ridge features present in a given flat fingerprint population group.
The following inferences can be drawn from the above conclusion:
* Fingerprint Experiments 20 - 22 on "cores" are currently in progress *
Fingerprint Experiments Are Testable
All fingerprint experiments performed by the author are testable. The experiments can be independently performed by you, the reader, to see whether or not the actual numbers of close matches reported by the author are relatively accurate.
Readers are encouraged to find out for themselves whether or not the numbers of close matches found during experiment by the author are relatively accurate and whether or not the numbers of close matches estimated by latent print examiners are more accurate than the numbers of close matches estimated by the T-Model.
It is an acknowledged truth in philosophy that a just theory will always be confirmed by experiment.
Corroboration of the T-Model by Independent Experiment
The following experiment corroborates, in part, the use of the Osterburg study to define frequency values for ridge features. It also corroborates the idea that the ratios of the most common, significantly weighted friction ridge features used in fingerprint identification, i.e., the ending ridge, bifurcation and dot, are the same across all fingerprint regions, and the assumption that Osterburg consolidated immature, i.e., incipient, ridge features with mature, i.e., normal, ridge features in his frequency study:
The Michel–Tallerico–Verceluz (MTV) Experiment
Dawn Michel, Frances Tallerico and Cesar Verceluz, Latent Print Examiners at San Jose Police Department Central Identification Unit, performed an independent frequency study of 218 fingerprints composed largely of right flat thumbs from different individuals (the experiment was conducted as a classroom training exercise for new examiner trainees). The impressions were selected at random from criminal ten-print records. Only clear, reliable flat fingerprint impressions, i.e., those free of distortion markers, were used. Individual mature and incipient friction ridge features, i.e., ending ridge units, bifurcating ridge units and single ridge units (dots), were combined and counted in each sample. The results were then verified in the rolled fingerprint sample on the same ten-print record.
The results from the MTV study were compared to the extrapolated results from the Osterburg study (based on a "ridge unit" approach). The ratios of the frequency results for the different ridge feature types were as follows:
Osterburg Study
Ratio of bifurcations to ending ridges - 1 : 1.88
Ratio of bifurcations to dots - 1 : 3.89
Ratio of dots to ending ridges - 1 : 6.42
MTV Study
Ratio of bifurcations to ending ridges - 1 : 1.87
Ratio of bifurcations to dots - 1 : 3.84
Ratio of dots to ending ridges - 1 : 6.64
Conclusions
The MTV study corroborates the theory that the numerical ratios defined by the Osterburg study for the friction ridge features most frequently used in fingerprint identification are valid.
The Osterburg study utilized a surface area of roughly 221 mm², i.e., roughly 13 mm x 17 mm, to count the number of ridge features observed in each fingerprint sample. The area used to place the millimeter grid was the center portion of each fingerprint. The center portion of a fingerprint generally contains the core and not the periphery regions. The MTV study utilized predominantly right thumbs, which are known to contain significant amounts of the distal periphery region of the thumb, i.e., the tip area. Consequently, the MTV study corroborates the theory that the ratios of the most common, significantly weighted fingerprint ridge feature types, i.e., the ending ridge unit, the bifurcating ridge unit and the single ridge unit (dot), are the same across all fingerprint regions.
Osterburg did not distinguish between the immature, e.g., incipient, ridge unit type and the mature or normal ridge unit type in his frequency table (see Osterburg Frequency Table). The MTV study corroborates the theory that Osterburg likely consolidated these ridge feature types.
The T-Model is More Accurate than ACE-V
The following experiment was performed to test how accurately expert fingerprint examiners identify amounts of corresponding ridge features in a look-alike as insufficient to identify:
Experiment
The Chesapeake IAFIS Non-Match, the largest and best look-alike ever seen, was used for this experiment (see Chesapeake IAFIS Non-Match). The Chesapeake look-alike images were rotated 90 degrees clockwise and horizontally mirrored (to help guard against recognition). Nine (9) sections of the look-alike were cropped using Adobe Photoshop in a manner so that only corresponding ridge features were left showing. Each section displayed incrementally larger numbers of corresponding Level II ridge features, so that in each image there were from 4 to 12 matching Level II ridge features.
Then, nine (9) expert fingerprint examiners (including 6 CLPEs) were shown each image in succession and asked whether or not the amount of matching ridge detail present in the two images was enough to establish positive identification based on conventional "ACE" fingerprint methodology, i.e., utilizing only human intuition and subjective judgment to render an "expert opinion".
Note: The fingerprint examiners tested conformed to no pre-determined minimum standard needed to establish positive fingerprint identification. In addition, they were not given any prior knowledge regarding the latent vs. exemplar in terms of relevant population (the fact that the "match" was found as a result of an FBI IAFIS search of a fingerprint database containing roughly 530 million fingerprints was not revealed to the examiners).
The results for the above experiment were as follows:
At different stages during the experiment, three (3) of the nine (9) expert latent print examiners (including at least 1 CLPE) were prepared to establish "positive identification" based on the amount of matching ridge features displayed in the look-alike images. As a result, 3 out of 9 (1 out of 3, or 33.3%) of the expert fingerprint examiners tested were fooled by the look-alike.
Conclusion
The 33.3% error rate for "ACE" alone reflects a significant inability of expert latent print examiners to reliably and accurately identify amounts of matching ridge features in two impressions as insufficient to establish positive identification when faced with the largest and best look-alike ever recorded.
As a result, the error rate for "ACE-V", which represents the error rate for 2 expert examiners performing an independent examination when faced with the largest and best look-alike ever seen, is calculated as 1/3 x 1/3 = 1/9, or 11.1%.
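The arithmetic behind the two error rates above can be reproduced as follows. This is a minimal sketch that assumes, as the calculation in the text does, that the two examiners in "ACE-V" err independently and that each has the same error rate as the group tested; the variable names are illustrative only.

```python
from fractions import Fraction

# Figures reported in the experiment above.
examiners_tested = 9
examiners_fooled = 3

# Error rate for a single examiner working alone ("ACE").
ace_error_rate = Fraction(examiners_fooled, examiners_tested)

# Error rate for two examiners who both must be fooled ("ACE-V"),
# ASSUMING the two examinations are statistically independent.
ace_v_error_rate = ace_error_rate * ace_error_rate

print(f"ACE error rate:   {ace_error_rate} = {float(ace_error_rate):.1%}")
print(f"ACE-V error rate: {ace_v_error_rate} = {float(ace_v_error_rate):.1%}")
```

If the second examiner's judgment is influenced by the first, the independence assumption fails and the combined rate would lie somewhere between 1/9 and 1/3.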
Note: It is significant to note here that sufficiency to establish positive identification with a reasonable degree of scientific certainty depends on relevant population because as the population increases so does the number of look-alikes.
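The dependence on relevant population can be illustrated with a short sketch. The match probability below is a hypothetical value chosen only for illustration and is not a T-Model output; the population sizes are the 1000-print sample used in the experiments, an arbitrary intermediate size, and the roughly 530 million prints mentioned above for the IAFIS search. Treating "fewer than 1 expected look-alike" as the criterion is a simplification assumed here.

```python
def expected_look_alikes(match_probability, population_size):
    """Expected number of look-alikes in a population (p * N)."""
    return match_probability * population_size


def supports_single_source(match_probability, population_size):
    """True when fewer than 1 look-alike is expected in the population."""
    return expected_look_alikes(match_probability, population_size) < 1.0


# HYPOTHETICAL match probability for some arrangement of ridge features.
hypothetical_probability = 1e-6

for population in (1_000, 100_000, 530_000_000):
    count = expected_look_alikes(hypothetical_probability, population)
    verdict = supports_single_source(hypothetical_probability, population)
    print(f"population {population:>11,}: "
          f"expected look-alikes {count:.3f}, fewer than 1: {verdict}")
```

The same arrangement that looks sufficient against a small population stops being sufficient once the population is large enough for several look-alikes to be expected.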
In contrast to the above study, the T-Model was able to accurately and reliably identify each successive amount of matching ridge features in this look-alike as insufficient to establish positive identification. It did the same for every other largest and best look-alike ever recorded, as well as for the most notable erroneous fingerprint identifications ever recorded [Link].
As a result, when faced with the largest and best look-alikes ever recorded, the T-Model has a 0% error rate.
Prediction
The T-Model empirical probability approach to fingerprint identification is more accurate than conventional ACE-V fingerprint methodology, i.e., fingerprint expert decision-making based solely on human intuition and subjective judgment, at reliably identifying amounts of matching ridge features in two impressions as insufficient to establish positive identification when faced with the largest and best look-alike ever recorded.
Based on deductively testing both theories, the T-Model theory, while unfalsified, is better than conventional ACE-V fingerprint methodology, i.e., an ACE-V which fails to define the pre-determined minimum threshold probability estimates needed in two impressions in order to establish an inference of positive identification and also fails to consider the relevant population for the case at hand.
Based on the above results, T-Model theory has greater predictive power than conventional ACE-V fingerprint methodology. This idea is consistent with the fact that the T-Model is grounded in empirical content, i.e., simple experimentation. The following summary of the position of Karl Popper, one of the greatest philosophers of science of the 20th century, supports this idea:
"[For Popper] any theory X is better than a ‘rival’ theory Y if X has greater empirical content, and hence greater predictive power, than Y."
[Stanford Encyclopedia of Philosophy]
It is significant to note here that T-Model theory does not entirely replace conventional ACE-V but rather refines it with statistical probability theory, with the following significant exception:
The "V" for "verification" in "ACE-V" comes from the word "verify", which means "to prove true" or "authenticate". That is the root meaning and true implication of the word. Not only can no scientific theory be proved true or authenticated (this idea is supported by Karl Popper and Nobel Prize-winning physicist Richard Feynman), but also no latent fingerprint identification can be proved true or authenticated. There is always the theoretical chance that the "identification" is a look-alike and that you are wrong.
The National Academy of Sciences report makes it clear that fingerprint identification is probabilistic in nature and therefore fingerprint identification can never be verified or proved true with absolute certainty. As a result, the "verification" in ACE-V is misleading and exaggerates the weight of fingerprint evidence. Consequently, the idea that fingerprint examination requires examination by two independent examiners in order to be "scientific" is a fallacy. A second examiner should perform a technical review or check of the initial examiner's work for purposes of quality assurance, and for that reason only. The proper term to replace "verification" is either "corroboration" or "agreement".
Lastly, it is significant to note that unlike conventional ACE-V fingerprint methodology, the T-Model forbids conclusions of positive identification for amounts of corresponding ridge features in two impressions that fail to meet the minimum probabilistic threshold for the case at hand. All true scientific theories are prohibitive.
"It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
A validation study to determine the error rate of the T-Model in performing fingerprint analysis is presented in a format in line with the National Academy of Sciences (NAS) report (see Error Rate in Terms of Best Look-alikes).
Henry Templeman encourages the broad scientific community to test the results from the experiments described on this web site and to perform broader experiments in order to corroborate, falsify or refine the T-Model.
Validation studies should be documented in a manner to ensure that any qualified individual could evaluate what was done and replicate the validation process. Documentation should be in the form of hard copy fingerprint cards, photographic, or digital records of fingerprint samples used, with notes or reports of findings, which includes reference material. Documentation of external validation must identify the name and professional affiliation of the person(s) conducting the study, date, as well as the research question, procedures, results and conclusion(s).
Any independent experiments that corroborate or falsify results from experiments performed by the author should comply with SWGFAST guidelines established for Validation of Research and Technology [61].
* * *
It is significant to note that previously the author performed 50 experiments to test the ability of the T-Model to estimate numbers of fingerprint friction ridge close matches or look-alikes present in fixed flat fingerprint population groups. Flat fingerprint samples instead of rolled fingerprint samples were tested because they better reflect the size of an average latent print.
For the above experiments, it is significant to note that ridge edge contours and widths in each look-alike failed to agree precisely. However, the absence of such agreement is not a factor that reduces the weight of the ridge formations and/or ridge unit types in agreement. The absence of ridge edge contour and width agreement means only that no additional weight is factored into the aggregate value for the total ridge formations observed. Since the absence of Level III ridge detail agreement is not a negating or exclusionary factor, the values assigned to the above ridge formations found in agreement were interpreted as the same and therefore assessed the same aggregate value.
* * *
The reader is encouraged to perform the same experiments to find out for themselves how many close matches for a given arrangement of "ridge feature types in position" are present in a relatively clear 1000 flat fingerprint population group. The experiments are easy, fun to do, and a great way to learn about the extent to which friction ridge look-alikes exist in fingerprints.