Methodology is not a sexy topic of discussion. But it should be.
Nothing is more important to an accurate retelling of diplomatic history than good methods. What we believe happened, how it happened, and why it happened, all come from good methods.
Knowledge of methods is equally important for consumers of diplomatic history. If you cannot recognize poor methods, you are more likely to read something and believe it. But you should not read something and believe it. You should read something, analyze it, and then decide to believe it (or not).
Diplomatic history suffers from several methodological blind spots. These lead to an incomplete telling of the past and the propagation of myths. Some of these blind spots may well apply to other historical disciplines, but as a diplomatic historian, I’ll speak to the field I know best. Specifically, I study the Gilded Age and Progressive Era, so I’ll anchor most of my critiques in examples from the diplomatic history of that era. (I will limit those critiques to the work of senior scholars and famous past historians, whose careers are less vulnerable to criticism.)
Blind Spot #1: Causal Inference
“The study of history,” E. H. Carr once wrote, “is a study of causes.” The field of history explains what happened in the past and why. Historical works are not merely dry accounts of occurrences; they all make causal arguments as well.
A causal argument (or causal claim) can be boiled down to the following: some X causes some Y. It gets more complicated than that, but that’s the basic idea. A book usually has an overarching causal argument, but there are also hundreds of smaller causal arguments that serve as the scaffolding for the larger argument.
Understanding causal inference, the discipline of proving causal claims, is of great importance to every diplomatic historian, but it is neglected. Few, if any, top history departments teach graduate-level courses on causal inference. In the most recent annual meetings of three major professional organizations of historians, among the hundreds of panels and thousands of presentations, only two presentations centered on causal inference. Some scholars have made efforts to fill this lacuna, though they are typically social scientists who already study causal inference. The gap between international relations scholars and diplomatic historians (who once were quite close) only widens.
It is not that diplomatic historians do not ponder causation, as the Carr quote shows. Instead, the concept is treated informally in the practice of writing diplomatic history. I have encountered a couple of reasons why. One is that some diplomatic historians assume causal inference is for statisticians and social scientists, forgetting that statistical methods have underlying qualitative logic. Another is that some diplomatic historians believe works placing greater emphasis on causal arguments would lose the messiness and contingency of diplomatic history. I’d invite those scholars to read some social science. It’s very messy.
A third response is that identifying an “independent variable” is impossible, because all variables are dependent on other variables. It is a fair point. Life is complicated and we cannot cleanly separate every feature of existence when explaining history. But that objection betrays a misunderstanding about the value of studying methodology, as extremes are easy to write off. Of course, one can study causation while admitting complicating factors. Causal inference has an entire vocabulary for doing so. What is more, diplomatic historians do identify independent variables, all the time. Historian John Lewis Gaddis, for instance, often identifies independent variables to make causal arguments, despite arguing that independent variables do not exist. As one example, in his most famous work, Strategies of Containment, he writes: “It was this shift in the perception of power relationships that caused a sense of weakness in the West…” (emphasis original). That is nothing but a causal claim that identifies an independent and a dependent variable. Why not study these concepts of causal inference?
The consequences of this neglect are significant. Scholars’ methods of proof are inchoate, causal claims are less recognized and footnoted (and therefore, less proven), and together this leads to misleading diplomatic history. Readers can browse their favorite books and find causal claims on nearly every page. The issue is knowing how to spot them and how to differentiate them from statements of fact. Scholars are less likely to add evidence in support of unrecognized causal claims, and readers will be more likely to believe them. (That last sentence, for instance, contains two causal claims, neither of which I have yet proved.)
Here are some examples of historians making causal claims but not proving them. In 1898, U.S. President William McKinley decided to annex the Philippines after the War of 1898, when the United States defeated Spain, which held the Philippines as a colony. That is a fact. In his 2020 book, historian Christopher Capozzola explains why. He writes: “McKinley’s move was meant to counter the revolutionaries’ obvious political power and mask American weakness.” That is not a fact; it is a causal claim: McKinley’s recognition of Filipino power and the desire to mask U.S. weakness caused his decision to annex the Philippines. It is hard to exaggerate how significant an argument this is. U.S. annexation of the Philippines was one of the most consequential policies toward Asia in U.S. history.
Capozzola’s sentence is neither cited, nor preceded, nor followed by anything that discusses McKinley’s decision-making process. It is an unsupported causal claim. (It is also off the mark.) Capozzola’s book was published by a trade press (widely believed to encourage fewer footnotes), but academic presses are also less attentive to issues of causal inference. Witness Michael Green’s explanation of the same decision, where Green emphasizes the role of McKinley’s advisor John Hay in making the final decision to annex. While Green cites many facts surrounding the issue, he leaves McKinley out of the explanation of McKinley’s own decision. Green suggests that it was Hay who changed his mind and made the final demand to annex the Philippines. But Hay, though an important advisor, was wary about annexing the entire archipelago. McKinley used Hay to cable his demands to other advisors; it was not Hay’s idea. Green’s focus on Hay would only work if McKinley were ultimately brought into the fold of the argument, but he is not.
McKinley’s decision is not the central focus of either Capozzola’s or Green’s book. Their choice not to prove the cause of it is therefore understandable, but dangerous. A book’s overarching arguments are supported by hundreds of smaller arguments throughout. True, one should not reasonably expect every causal claim to be fully explained without sacrificing readership (or millions more trees), but we are quite far away from those realities. And McKinley’s decision was not a small decision.
A greater emphasis on causal inference in history would naturally encourage scholars to identify (and prove) the causal claims they already make. This in turn would produce greater clarity for the reader and better history.
Another benefit is exposure to concepts, techniques, and methods of proof. Colliders, confounders, and matching analysis are among a number of ideas that sharpen intuitions about causal analysis. All of them have underlying logic and insights applicable to qualitative historical analysis.
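To make the logic of a confounder concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and describes no real historical dataset; the variable names (X for a suspected cause, Y for an outcome, Z for a background condition) are assumptions introduced only for this example. It shows how X can appear to raise the likelihood of Y when cases are pooled, even though X makes no difference once cases are compared within the same background condition Z.

```python
# Hypothetical case counts, invented for illustration only.
# Key: (z, x) -> (number of cases, number of those cases where outcome Y occurred)
#   z = 1 if the background condition Z was present, else 0
#   x = 1 if the suspected cause X was present, else 0
cells = {
    (1, 1): (80, 64),
    (1, 0): (20, 16),
    (0, 1): (20, 4),
    (0, 0): (80, 16),
}


def difference_within(z):
    """Rate of Y with X minus rate of Y without X, holding Z fixed."""
    n_with, y_with = cells[(z, 1)]
    n_without, y_without = cells[(z, 0)]
    return y_with / n_with - y_without / n_without


def difference_pooled():
    """Rate of Y with X minus rate of Y without X, ignoring Z entirely."""
    rates = {}
    for x in (0, 1):
        n = sum(cells[(z, x)][0] for z in (0, 1))
        y = sum(cells[(z, x)][1] for z in (0, 1))
        rates[x] = y / n
    return rates[1] - rates[0]


pooled = difference_pooled()
within_z_present = difference_within(1)
within_z_absent = difference_within(0)

print(f"Pooled difference:        {pooled:.2f}")
print(f"Difference when Z is present: {within_z_present:.2f}")
print(f"Difference when Z is absent:  {within_z_absent:.2f}")
```

In this invented table, Z makes both X and Y more common, so the pooled comparison credits X with an effect that disappears inside each group. The qualitative analogue is asking, before accepting that X caused Y, whether some background condition made both more likely.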
Blind Spot #2: Forgetting the Denominator
An informal criticism among historians is that political scientists sometimes cherry-pick historical data and cases to support their theories. It can be a fair criticism, but historians can unwittingly cherry-pick too. They do so more subtly by “forgetting the denominator” (i.e., judging data without judging the representativeness of the data). A good example of this is the common use of historical newspaper articles as evidence of public opinion.
Walter LaFeber, one of the most influential diplomatic historians, forgot the denominator in his pathbreaking book, The New Empire. LaFeber argued that leading financial periodicals tacitly supported war against Spain in 1898, and that this support helped convince McKinley to pursue war. LaFeber emphasized a small number of newspaper excerpts. This leads the reader to believe that the attitude was widespread, reflecting the denominator of the financial community. Years later, Jonathan Kirshner assiduously proved that the financial community was actually generally and consistently opposed to war. LaFeber presented his evidence as a denominator, but he had really only found a small numerator.
LaFeber published his book in 1963, but forgetting the denominator is still an issue. Jill Lepore, in her major history of the United States, also forgot the denominator when implying that the sensationalist, so-called “yellow” press reflected American opinion and caused the War of 1898. (In fact, most papers in the United States were not yellow.)
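The arithmetic behind “forgetting the denominator” can be sketched in a few lines of Python. The corpus below is entirely hypothetical: the periodical names, stance labels, and counts are invented for illustration and do not represent LaFeber’s or Kirshner’s evidence. The point is only that a handful of quotable excerpts (the numerator) says little until it is divided by everything that was published (the denominator).

```python
from collections import Counter


def stance_shares(coded_items):
    """Return each stance's share of ALL coded items (the denominator)."""
    counts = Counter(stance for _, stance in coded_items)
    total = sum(counts.values())
    return {stance: n / total for stance, n in counts.items()}


# Hypothetical coding of editorials; every name and count here is invented.
corpus = (
    [("Periodical A", "supports_war")] * 3
    + [("Periodical B", "opposes_war")] * 21
    + [("Periodical C", "neutral")] * 6
)

shares = stance_shares(corpus)

# The numerator alone: excerpts a writer could quote in support of a claim.
quotable = [item for item in corpus if item[1] == "supports_war"]

print(f"Quotable pro-war excerpts: {len(quotable)}")
print(f"Share of all coded items:  {shares['supports_war']:.0%}")
print(f"Share opposing war:        {shares['opposes_war']:.0%}")
```

Three vivid excerpts can fill a paragraph, yet in this made-up corpus they amount to one item in ten. Reporting the share alongside the excerpts lets the reader judge how representative they are.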
Blind Spot #3: Neglecting “Negative History”
Diplomatic historians tend to disproportionately analyze “positive history,” or events that happened. We spend less time appreciating “negative history,” or events that did not happen but easily could have. Much as positive and negative space determine the contours of our visual environment, so too do positive and negative history determine the contours of experience.
For example, how often does one encounter analysis of the “not-war” with Chile in 1891? That year, the United States and Chile nearly went to war over a drunken street brawl in Valparaiso. Outside of the True Blue Saloon, an altercation between American and Chilean sailors eventually led to the deaths of two U.S. sailors and the imprisonment of several dozen more. Each country blamed the other.
Chile refused to apologize and pay reparations to bereaved U.S. families. After making a few unanswered threats, U.S. President Benjamin Harrison eventually asked Congress “for such action as may be deemed appropriate.” Historians consider it a threat, if not a formal request, to declare war. The Chileans certainly did. Within a day, the Chilean foreign minister apologized and eventually agreed to pay $75,000 to U.S. families.
War (or at least some use of force) nearly happened. U.S. warships were already stationed in Chilean ports as a precaution in response to political instability in the country. The U.S. military began moving more ships to the Pacific, purchased additional weapons and ammunition, and planned a blockade of Chile’s primary ports.
Some books, especially on U.S.-Chilean relations, do analyze this near-miss, but not many. Had war happened, U.S. history would have changed significantly in the event of either a Chilean or a U.S. victory. But implied in that judgment is the argument that the absence of war equally affected U.S. history. That is negative history. It is studying the effect of an absence of war (a factual non-event), not merely studying what happened in Valparaiso (positive history), or what would have happened if a war had broken out (counterfactual analysis).
Understanding negative history helps us understand underlying causes of future events. But significant non-events, like the “not-war” with Chile, can be difficult to identify. (Some scholars do study them, though.) They can be especially difficult to identify because the list of factual non-events is infinite. Where does one start? But that difficulty should not stop us from trying to understand significant non-events in U.S. history. Doing so would help us understand why the world looks the way it does.
Blind Spot #4: Falling Prey to “Academic Telephone”
“Telephone” is a children’s game where kids pass along a message, one by one, around a circle by whispering it into each other’s ears. When the message completes the circle, the originator compares the final iteration to the first. Despite good intentions, the message will often change over its journey around the circle. Laughter ensues.
In history, telephone is more a risk, less a game. A scholar wishing to cite an original quote might understandably quote a secondary source where they read the quote. That secondary source might also quote a secondary source, and so on. Of course, if each scholar perfectly captures the quote from the previous secondary source, there is no issue. But that does not always happen, much like in telephone. In the process, the message gets garbled. In some cases, it leads to a fabrication of truth.
As I have written about elsewhere, consider the case of Thomas Brackett Reed, Speaker of the House, and his opinion about annexing Hawaii in 1898. Reed opposed it. In 2017, Stephen Kinzer wrote that Reed “told a friend that the United States might as well ‘annex the moon.’”
But actually, Reed never said this. How did Kinzer make this error? The lens of “telephone” helps us understand why.
Kinzer’s error clearly was not intentional. He cites a 2011 book by James Grant. Grant, meanwhile, cites another book, from 1931, by Walter Millis. And Millis cites a 1914 biography, by Samuel McCall, and that is where it ends.
McCall did not quote Reed. Instead, he speculated that Reed probably felt it was no more necessary to annex Hawaii than it was to annex the moon. Millis quotes McCall but does not cite McCall, leading Grant (after reading Millis, but not McCall) to believe that Reed actually said those words. Kinzer took it one step further by imagining that Reed told his friend he felt this way.
In a century, therefore, McCall’s opinion became Reed’s words.
The stakes in Reed’s case were fairly low. But it draws attention to the casual rules around the footnoting of sources and the risk of propagating myths. Scholars often cite quotes from secondary sources. And so long as they can show they got their quote from somewhere, it does not matter if it is fake. There’s a low likelihood an editor will double-check.
An easy solution is to require scholars, if they cannot cite a primary source, to cite the first secondary source where the quote appears. Doing so would minimize the risk of academic telephone by shortening the line of interpreters.
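The citation chain in the Reed example can be written down as a simple lookup and walked back to its starting point. The sketch below is a minimal illustration in Python: the four works and their years come from the account above, while the dictionary representation and the function name `trace_to_origin` are assumptions made for this example.

```python
def trace_to_origin(work, cited_source):
    """Follow 'A cites B' links from `work` back to the earliest source in the chain."""
    chain = [work]
    seen = {work}
    while chain[-1] in cited_source:
        earlier = cited_source[chain[-1]]
        if earlier in seen:
            # Guard against two works citing each other in a loop.
            raise ValueError(f"citation loop at {earlier!r}")
        chain.append(earlier)
        seen.add(earlier)
    return chain


# Each key cites the value as its source for the "annex the moon" line.
cited_source = {
    "Kinzer (2017)": "Grant (2011)",
    "Grant (2011)": "Millis (1931)",
    "Millis (1931)": "McCall (1914)",
}

chain = trace_to_origin("Kinzer (2017)", cited_source)
first_appearance = chain[-1]
intermediaries = chain[1:-1]

print(" -> ".join(chain))
print(f"First secondary source: {first_appearance}")
print(f"Interpreters skipped by citing it directly: {len(intermediaries)}")
```

Under the proposed convention, a scholar in Kinzer’s position would cite the last link in the chain rather than the first, which removes the intermediate retellings where the wording drifted.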
Blind Spot #5: “Explanation Bias”
History is difficult to tell. It is especially difficult because of our knowledge of the past.
Hindsight bias is a well-known concept that refers to the exaggeration, in hindsight, of what foresight could have predicted. For instance, in hindsight, a person might say they had a feeling Donald Trump would win the 2016 election, when at the time they thought there was almost no chance he would win.
Two years ago, Richard Zeckhauser and I theorized another bias that emanates from hindsight, concerning historical explanation. We call it “explanation bias.” We argue that all people, including professional historians, are likely to exaggerate a specific explanation of history precisely because they know what happened. “Outcome knowledge,” as it is called in decision science, skews historical explanation by making it harder to imagine the past as it existed at the time, with all its attendant uncertainties. Based on the historical facts we learn, we cling to plausible explanations of historical events, mistakenly treating those explanations as probable.
For instance, specific actors who become prominent in historical memory might serve as plausible explanations for events they had nothing to do with. Because the “path of what happened is so brightly lit,” we overvalue the causal role of actors and events along it. For example, because we know that Theodore Roosevelt became president, we are more likely to exaggerate his influence before he had power. Eric Hobsbawm, a prolific 20th-century historian, fell prey to this trap in The Age of Empire. In it, Hobsbawm writes of a “ruling elite” led by Roosevelt that mobilized support for war against Spain in 1898. William McKinley, who was president at the time, would have been shocked to learn that his sub-cabinet official, Roosevelt (who had not yet even become governor of New York), led “a ruling elite” in 1898.
This kind of bias is everywhere in historical analysis. Many historians have overstated Roosevelt’s role, at the expense of McKinley’s.
What to Do: More Footnoting, More Causal Inference
Several of these blind spots can be resolved with better footnoting and greater awareness of causal inference. You might think: “Better footnoting?! Diplomatic historians footnote better than anyone!” That’s partly true. I believe diplomatic historians do footnote better than anyone, but they also footnote differently. They do not footnote causal claims as well as they footnote historical facts.
The discipline’s norms around citing historical facts are strong. We need the same norms for citing causal claims. Doing so would stem the spread of causal myths in the discipline, and it would also lead to greater transparency for readers. Because causal claims are treated informally, they can escape footnoting. A part of this process necessarily involves greater training in causal inference. Incorporating intuitions from causal inference would only strengthen methodology, given that causation is so central to diplomatic history’s guiding purpose. Doing so may additionally help scholars remember the denominator and consider negative history more often. Representative samples and populations are key concepts in causal inference, which also draws attention to the influence of unidentified and unmeasured variables.
Meanwhile, more rigorous footnoting would also help resolve issues of academic telephone and explanation bias. A simple change in footnoting convention (outlined above) for direct quotes would significantly minimize the risk of misinterpretation. It is on publishers, editors, and other gatekeepers to demand that change. (They should also, I hasten to add, end the preference for endnotes over footnotes, since evidence should be presented as close to claims as possible, not hidden 300 pages away.) Footnoting and training in causal inference can also mitigate explanation bias by encouraging a higher threshold of proof for explanations that may be distorted by hindsight.
Methods of diplomatic history have changed significantly over the last century. But there’s still room for improvement. Resolving the several issues explained above would not just benefit historians and readers today. By developing methodology, we’d also be helping future diplomatic historians tell the histories of pasts that have yet to transpire.
Aroop Mukharji, Ph.D., is a visiting scholar at the Center for Strategic Studies at Tufts University’s Fletcher School of Law and Diplomacy, a non-resident fellow at the Eurasia Group Foundation, and an associate of the Applied History Project at the Harvard Kennedy School, where he received his doctorate.