During the course of World War II, the Western Allies made sustained efforts to determine the production rate of German Panzer tanks. By analyzing just a small sample of serial numbers from the available captured and damaged Panzers, Allied mathematicians estimated the production at 246 tanks per month. This approximation contrasted sharply with traditional intelligence reports, which had placed the monthly production at a much higher 1,400 tanks. It turned out that the statisticians were far more accurate than the spies: German records recovered after the war revealed they had produced 245 tanks each month between June 1940 and September 1942. “The German Tank Problem” remains one of the most famous examples of applied mathematics in the 20th century, used in schools and universities to demonstrate how to gauge the size of an entire population based only on a limited sample.
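For readers unfamiliar with the technique, the following is a minimal sketch of the standard textbook estimator associated with the German Tank Problem. It assumes serial numbers run sequentially from 1 and that the sample is drawn at random without replacement. The function name and the sample values are hypothetical, and the sketch is not a reconstruction of the exact method the Allied analysts used.

```python
def estimate_total_from_serials(serials):
    """Textbook estimate of population size from observed serial numbers.

    Assumes (hypothetically) that items are numbered 1..N in sequence and
    that the observed serials are a random sample without replacement.
    Formula: m + m/k - 1, where m is the largest serial seen and k is the
    number of serials observed.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    return m + m / k - 1


# Illustrative sample only; these are not historical serial numbers.
sample = [19, 40, 42, 60]
estimate = estimate_total_from_serials(sample)
print(estimate)  # 60 + 60/4 - 1 = 74.0
```

The intuition is that the largest observed serial is a floor on the total, and the average gap between observed serials suggests how far above that floor the true total is likely to sit.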
Roughly 80 years later, the Israel Defense Forces (IDF) operating in Gaza have adopted a statistical approach to assessing civilian harm before conducting aerial strikes. This unprecedented methodology parallels the Allies’ strategy in the German Tank Problem but has severe limitations in the manner in which the IDF deploys it, highlighting the tension between intelligence-based decision-making and statistical inference in warfare while causing legal and ethical unease.
Reports indicate that the IDF relies heavily on a rudimentary statistical model to assess the presence of civilians in buildings designated for targeting. The military has long divided Gaza into 620 sectors, each the size of several city blocks. An automatic system estimates the number of active phones in each sector based on signals received by cellphone towers. By comparing cellphone and Wi-Fi usage to prewar volumes, the IDF approximates the proportion of residents remaining in each sector. Then, to estimate the number of civilians in a specific building, intelligence officers assume that its prewar residents left at the same rate as the rest of the neighborhood. For example, if the IDF estimates that half of a neighborhood’s population has evacuated, a building that typically housed ten residents would be presumed to contain five people. The IDF would typically not surveil the homes destined to be bombed to evaluate how many people are actually inside, instead relying on this rough estimation to conduct the proportionality analysis required by international law: that is, determining whether the expected collateral damage of a strike is excessive in relation to the direct military advantage anticipated. Although the IDF may have gathered intelligence from other sources in the early stages of establishing whether a particular individual is a target, it appears to be exceptionally rare that supplementary intelligence has been used to determine civilian presence at the actual time of a strike.
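The arithmetic of the reported estimate can be written out in a few lines. This is only an illustration of the calculation as described in the reports, not the IDF’s actual system; the function names and the phone counts are hypothetical.

```python
def remaining_share(current_active_phones, prewar_active_phones):
    """Share of a sector's prewar population presumed to remain.

    Hypothetical helper: approximates the share by the ratio of current
    to prewar phone activity, capped at 1.0.
    """
    if prewar_active_phones <= 0:
        raise ValueError("prewar phone count must be positive")
    return min(current_active_phones / prewar_active_phones, 1.0)


def presumed_occupants(prewar_residents, share_remaining):
    """Apply the sector-wide share uniformly to a single building."""
    return prewar_residents * share_remaining


# Illustrative numbers matching the example in the text: half of the
# sector is presumed to have left, and the building housed ten people.
share = remaining_share(current_active_phones=4_000, prewar_active_phones=8_000)
presumed = presumed_occupants(prewar_residents=10, share_remaining=share)
print(share, presumed)  # 0.5 5.0
```

Written this way, the key assumption is easy to see: nothing in the calculation uses any information about the specific building at the time of the strike.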
Even at its best, this methodology provides data that could be significantly outdated by the time an airstrike is carried out. In May 2024, Human Rights Watch (HRW) discovered data posted publicly online by the IDF (apparently erroneously, according to HRW) that included “what appears to be operational data related to the systems used for monitoring the evacuation and movement of people through Gaza.” The data contained population figures consistent with a 10-year-old census (particularly problematic given that over a quarter of Gaza’s existing population were not yet born a decade ago). Additionally, the accuracy of the system’s predictions necessarily depends on people consistently having enough electricity to power their phones, and on a working phone network. Frequent power and network outages in Gaza often made that an unlikely scenario. Furthermore, the method disregards how, especially during times of war, people often cluster together in large groups rather than conveniently dispersing evenly across the territory. Finally, it is unknown whether or how the model accounts for the presence of young children, elderly people, or persons with disabilities, who may not typically carry cellphones. And of course, a statistical model built in this fashion likely does not account for the particular vulnerability of these individuals in the way that live, human analysis might.
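The clustering problem can be made concrete with a toy example. All numbers below are invented for illustration: a sector of ten buildings that each housed ten people before the war, where half the population remains but has gathered in two buildings.

```python
# Hypothetical sector: ten buildings, ten prewar residents each.
prewar_per_building = [10] * 10

# Hypothetical reality: half the population remains, clustered in two buildings.
actual_now = [25, 25, 0, 0, 0, 0, 0, 0, 0, 0]

# The sector-wide share is correct in aggregate (50 of 100 remain).
share = sum(actual_now) / sum(prewar_per_building)

# The uniform assumption spreads that share evenly over every building.
presumed = [p * share for p in prewar_per_building]

# Per-building error: positive values mean more people than presumed.
errors = [a - p for a, p in zip(actual_now, presumed)]
worst_undercount = max(errors)

print(presumed[0], worst_undercount)  # 5.0 20.0
```

Even when the sector-level total is exactly right, the per-building figure in this example understates the occupants of the two crowded buildings by a factor of five.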
The Legal Duty to Take Feasible Precautions
International law prescribes requirements for targeting decisions that clearly apply in this context. The International Humanitarian Law (IHL) obligation to take “feasible” precautions requires States to take “constant care” to spare the civilian population and, when planning or approving an attack, to “do everything feasible to verify that the objectives to be attacked are neither civilians nor civilian objects. . .” Both the United States and Israel consider this obligation to be a part of the proportionality analysis required by international law, although Israel’s explanation of the precautions principle differs slightly from its conventional definition in Additional Protocol I to the Geneva Conventions (AP I). Notably, while AP I requires parties to take “all” feasible precautions, this word does not appear in the Israeli or American definitions. The obligation to take “constant care” to spare civilians does not feature either. The International Committee of the Red Cross (ICRC) and others view both of these AP I components as customary international law, and therefore as applicable even to nations not party to the Protocol, like Israel and the United States.
Regardless of the variances in these definitions, Israel’s practice of statistical proportionality, as described above, should be considered to violate the legal duty to take feasible precautions in attacks both ex-ante and ex-post: first, by not gathering further feasibly available intelligence concerning the presence of civilians before initiating the attacks; and second, by ignoring warnings and apparent evidence of the system’s shortcomings, and failing to conduct post-attack reviews that may discredit the method altogether (or at the least help refine it to achieve more accurate results).
The Duty to Gather More Intelligence
The International Criminal Tribunal for the Former Yugoslavia (ICTY) committee established in 2000 to review the NATO bombing campaign concluded in its report that the obligation to take all feasible precautions to spare civilians required military commanders to “set up an effective intelligence gathering system.” The effectiveness of the intelligence operation should be determined objectively, by examining the methodologies of data management the operation employs, and holding intelligence collectors to a standard of reasonableness. Therefore, the duty to effectively gather intelligence can be violated by inadvertent non-action, negligence, or willful ignorance, such as failing to obtain information crucial to the protection of civilians that was feasibly available.
Israel’s own legal history contains multiple indications of a legal obligation to thoroughly collect and verify information before carrying out attacks. In the famous targeted killings case, in which the Israeli Supreme Court authorized the practice of targeted killings in the context of ongoing armed conflict, it stated that “Information which has been most thoroughly verified is needed…” The Sehade Special Investigatory Commission, established by the government after the 2002 assassination of Hamas leader Salah Sehade that also killed 14 Palestinian bystanders, likewise highlighted in its report the importance of obtaining “positive” intelligence for the purpose of conducting a proportionality analysis, specifically and especially intelligence on the presence of civilians near the target. The commission concluded that had appropriate resources been allocated to the examination of the expected collateral damage, the attack might not have been undertaken.
While the IDF has not explicitly addressed any legal issues in the context of statistical proportionality, one can imagine that it would likely argue that, particularly in the early days of the Gaza war, there was no objectively feasible opportunity to obtain more precise intelligence. In such circumstances, the IDF may have deemed a statistical approach to be preferable to having no data at all before attacks. Additionally, the IDF would likely contend that Hamas’ use of human shields and its failure to fulfill its own obligation to protect Palestinian civilians from Israeli attacks significantly impacted the feasibility of precautionary measures the IDF could take.
While both of those arguments may hold true in some cases, perhaps even in many, the obligation to take feasible precautions before each and every strike, and the derived duty to maintain an effective intelligence system and continuously verify information gathered before attacks, mean that a blanket general authorization to attack without considering reasonably obtainable intelligence on civilian presence cannot be lawful. Even if Israel’s pre-existing (and already comparatively low) benchmark for precautions is lawful, this does not excuse the current failures to review the circumstances on a case-by-case basis before launching each attack, even by Israel’s own standards.
Furthermore, there is good reason to be skeptical of arguments regarding the non-feasibility of intelligence-gathering in the Gaza context. The IDF has the capability to surveil any building in Gaza within minutes or even seconds, particularly during active hostilities. Gaza is significantly smaller than the theaters in which the United States operates in its so-called War on Terror, measuring only about 25 miles in length and 3.7 to 7.5 miles in width from the Israeli border to the Mediterranean Sea. Its densely populated urban centers, such as Khan Younis, Rafah, and Gaza City, are under constant surveillance by dozens of drones and high-resolution cameras stationed along the border, as well as advanced facial recognition technology. Throughout the war, surveillance drones have covered all parts of the strip with ease, 24 hours a day. Given these capabilities, a systematic failure to even attempt visual confirmation of a building’s occupants before bombing (beyond isolated cases of emergency strikes) cannot be justified by a lack of feasible means to do so as a standard practice.
A Way Forward: Incorporating Post-Strike Civilian Harm Analysis
Media reports indicate that senior IDF officials were warned repeatedly both externally (by their American counterparts in the Joint Forces) and internally that the statistical method was inherently inaccurate, leading to catastrophic results. Naturally, testing the system could have potentially provided IDF chiefs with evidence confirming whether the method was or was not, in fact, effective.
However, post-strike analysis was scarce in the Gaza war. Even when conducting reviews of its attacks, the IDF rarely tried to count how many civilians had been killed in an attack, making it impossible for officers to assess the model’s accuracy. This is entirely inconsistent with the “constant care” standard, as verifying the system’s precision is surely a reasonably feasible precaution that could have been taken to ensure that information relied upon to verify the presence of civilians was in fact accurate, particularly when such verification could take place after the strike and under no urgency. Had such reviews shown the model to be inaccurate, the IDF could have stopped using it or modified the formula as necessary to correct inaccuracies. But the IDF failed to do this.
Employing statistical proportionality as the IDF did in Gaza, i.e., without exhausting available feasible precautions, including the gathering of obtainable information, may never be legally justifiable. Still, the methodology has operational advantages: primarily its speed, but also its potential to yield insights that could, in some cases, surpass traditional intelligence (think of the German Tank Problem) and thereby promote the safety of civilians, if employed for that purpose. These advantages are worth considering. However, they are difficult to realize without methodical, ongoing evaluations for accuracy. The method’s effectiveness could be enhanced by integrating post-strike civilian harm assessments to dynamically refine future estimates. The resulting feedback loop could strengthen and enrich proportionality assessments.
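As a rough illustration of what the first step of such a feedback loop might involve, the sketch below audits a model’s estimates against post-strike assessments. Everything here is hypothetical: the function name, the record format of (presumed, observed) pairs, and the figures are invented, and a real review process would be far more involved.

```python
def audit_estimates(records):
    """Compare pre-strike estimates with post-strike assessed counts.

    `records` is a hypothetical list of (presumed, observed) pairs, one per
    reviewed strike. Returns the mean absolute error and the ratio of total
    observed to total presumed; a ratio above 1 indicates that the model
    systematically undercounts.
    """
    if not records:
        raise ValueError("no reviewed strikes to audit")
    total_presumed = sum(p for p, _ in records)
    total_observed = sum(o for _, o in records)
    mean_abs_error = sum(abs(o - p) for p, o in records) / len(records)
    ratio = total_observed / total_presumed if total_presumed > 0 else float("inf")
    return {
        "mean_absolute_error": mean_abs_error,
        "observed_to_presumed_ratio": ratio,
    }


# Invented figures for illustration only.
records = [(5, 12), (3, 3), (4, 9)]
report = audit_estimates(records)
print(report)  # {'mean_absolute_error': 4.0, 'observed_to_presumed_ratio': 2.0}
```

A persistent ratio well above 1, as in this invented example, is the kind of evidence that would call for suspending or revising the model rather than continuing to rely on it.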
Many modern militaries, including the U.S. military, have adopted post-strike evaluations to assess civilian harm. Although under stress in the current U.S. domestic context, the U.S. DoD’s Civilian Harm Mitigation and Response Action Plan (CHMR-AP) mandated comprehensive post-strike data collection and review processes to improve future targeting decisions. The (non-binding) Political Declaration on Explosive Weapons in Populated Areas (EWIPA), endorsed by 88 States, acknowledges “the importance of efforts to record and track civilian casualties, and the use of all practicable measures to ensure appropriate data collection.” The purpose of these efforts, according to EWIPA, is to “help to inform policies designed to avoid, and in any event minimize, civilian harm […] and enhance lessons learned processes in armed forces.”
While there is no clear treaty provision overtly requiring post-attack reviews, and the above does not represent “opinio juris” reflecting a legal obligation to conduct post-attack reviews (as one necessary, but not sufficient, factor in establishing customary binding international law), this is clearly considered to be “good practice” by many States and their militaries. Such assessments are arguably even more important when engaging in statistical proportionality. Accumulated information about civilian harm in different types of strikes, utilizing different types of weaponry, and even varying circumstances like types of buildings or terrains, could be incorporated into the “loop,” complicating the formula but enhancing the possibility of good results.
There may be a place for statistical analysis in conducting proportionality assessments, but it is not (currently) one that is effective in isolation. Instead of replacing surveillance with predictive models, as the IDF has done, militaries could use reconnaissance missions, drone footage, and ground intelligence to validate statistical predictions before executing strikes, and, conversely, use post-strike intelligence to inform the statistical formula. A hybrid approach would preserve some of the potential efficiency of statistical modeling, while enhancing the likelihood of compliance with IHL. Data developed in this manner could even be shared among allies, and potentially provide a State with stronger moral and legal standing for its engagement in a war from a political perspective. Continuously updating the statistical formula on the basis of pre- and post-attack intelligence would in fact exemplify “constant care” to spare civilians from harm, and better inform commanders tasked with proportionality analysis before strikes.