By Adrián Fernández
September 24th, 2021
On September 16th, the World Bank (WB) announced the discontinuation of the Doing Business Report, together with the publication of an investigative report that confirmed instances of data manipulation. This was an inglorious end for an index that had been considered for decades a rock star among evaluations of countries’ business climate.
Some months ago, the WB commissioned WilmerHale, an international law firm, to investigate allegations of data manipulation in the Doing Business (DB) results. Their report names several senior WB leaders of the time, among them Ms. Kristalina Georgieva, whose brilliant career in development institutions has been tainted by these events.
Beyond individual responsibilities, in this article we argue that fertile ground for these inappropriate events was created by departures from good practice in various areas of the WB, particularly in the DB and Research (DEC) areas. Aspects of corporate culture that prioritize communication and the impact on public opinion over methodological rigor and the production of knowledge are at the root of the problem.
DB is a system of indicators that measures a country’s “business climate.” The principle is simple: each year, it estimates the hurdles, derived from procedures and regulations, that a typical company must face to carry out its operations; for example, the time, cost, and number of procedures needed to obtain a permit to construct a warehouse. In the end, from indicators that cover 12 different areas, a kind of Frankenstein is generated that synthesizes the country’s business climate in a single indicator (the “Ease of Doing Business,” EDB).
Published since 2003, the final result, especially the resulting ranking of countries, took on a life of its own: it was picked up by the media worldwide, credit rating agencies considered it, and other multilateral organizations (among them, the IMF) used it in their operations. DB results were increasingly used in WB loans and in strategic engagement with countries, as conditionalities to be met by governments. The report had important repercussions on public opinion in each country. Countries’ positions in the ranking drew significant attention from governments, which began pressuring both the DB team and Bank authorities to improve their position in the rankings. Some pressures, as we now know from the WilmerHale report, were “successful.”
Alongside the investigation, the WB also commissioned an External Review of the methodology. The group, led by Mr. Mauricio Cardenas, produced a report dated September 1st, 2021, which stated in its first paragraph: “The Doing Business project is a unique source of comparable global data, … and potentially of great value to inform decisions by governments and firms. However, to unleash that potential the current methodology should be significantly modified, implying a major overhaul of the project.” The panel also recommended removing the aggregate index and country rankings, and improving the transparency and oversight of Doing Business.
In 2013, another group of experts, led by Mr. Trevor Manuel, had presented a report warning of severe methodological and procedural deficiencies. That report also advocated removing the aggregate index and the ranking. Collecting indicators without including them in the EDB is nothing new, even in DB practice: indicators of labor regulations were collected for years but were never included in the EDB.
The WB implemented only some of the recommendations of Manuel’s report, and probably the least “conflicting” ones. Today, we can guess that full implementation of the panel’s recommendations in 2013 would have saved WB Management a lot of headaches.
In our opinion, the WB is right to discontinue the DB report, given the reputational damage that has already occurred. From the communiqué, it seems pretty clear that the aggregate index and rankings are cancelled. Nevertheless, the WB should not stop preparing and publishing business climate indicators. The communiqué announces the cancellation of the report. Maybe this is a semantic issue, but we would favor maintaining the calculation of the basic indicators, even if some of them need significant changes in methodology.
Fertile ground for pressures
In August 2020, WB Management announced the suspension of the publication of DB (due in October) after complaints of data manipulation. In December of that year, an internal investigation confirmed that irregular manipulations had occurred in the 2018 and 2020 editions (published in 2017 and 2019, respectively) for four countries. In one of the cases, Azerbaijan, WilmerHale found that the changes stemmed from arbitrary decisions by the head of DB, motivated by distrust (“skepticism”) of the reforms undertaken, and that they damaged the country’s results. In two other cases, Saudi Arabia and the United Arab Emirates, the manipulation would have occurred as a way to “reward” countries for significant reimbursable advisory programs (i.e., paid for by governments) with the WB. Finally, for the fourth country, China, the report suggests a quid pro quo linked to contributions to the recapitalization of the WB. The report’s wording is very cautious (see paragraphs 3 and 4) and presents Management’s concerns about China’s contribution only as “background” to the manipulation of data. In any case, the manipulation of the Chinese data allowed that country to move up seven positions in the ranking, compared to the situation without modifications.
It was not the first time that complaints of DB data manipulation had been filed. In January 2018, Paul Romer, then Chief Economist of the WB and, later that year, a Nobel laureate, denounced the manipulation of DB data to harm Chile while Ms. Bachelet was President. The WB commissioned an external audit, which ultimately found no significant evidence of manipulation. As a footnote, the differences between the external investigators’ conclusions (those of 2018 and the recent ones) should prompt us to treat these findings with caution.
At the WB, as in other multilateral organizations, governments are the owners (the “shareholders”) of the institution. In this matter, as in many others, tensions emerge between “pressures” from shareholders (about reports, projects, etc.) and the independence of Staff and Management in day-to-day operations. The problem is well known, and multilateral organizations follow “good practices” to cope with these pressures.
However, in the DB case, conditions were in place for the tension to be resolved in favor of government pressures. The confluence of several “bad practices” contributed decisively to the failures in the integrity of the process.
First, one of the most important weapons these organizations have to mitigate pressures is transparency. An indicator such as DB should have a transparent calculation procedure, publicly known primary data, etc. In other words, final results should be verifiable (or auditable). Discussions about the methodology cannot be prevented, but regular results should emerge automatically and be beyond dispute.
On the contrary, DB’s methodology was opaque, the raw data were not disclosed (not even with the sources anonymized), and managers had a high degree of discretion. These issues ultimately created the terrain for “abnormal” situations: as became clear later, there were manipulations of the data and “inappropriate” changes in the methodology to favor or harm specific countries.
Second, another crucial element for the institution’s independence is not to link the results of its studies with other activities, particularly those that generate income for the organization. In this sense, the Bank’s Reimbursable Advisory Services (RAS) played a negative role in these events (see paragraphs 45 and 46 of WilmerHale’s report).
Finally, part of the blame lies with the WB’s policy of boosting DB’s public profile by emphasizing the “sexiest” part of the Report, the rankings, and, in certain respects, fueling competition between countries. To name some of these amplification actions: each new edition was launched simultaneously in the Bank’s different regions with senior officials in attendance; and the figure of the Top Performer or Top Improver, champion of reforms, was created, so that belonging to that list each year became a distinction for a country.
A system of indicators, which in itself can represent a valuable contribution for countries to assess the status of their regulations, gradually became a monster, which ended up devouring its creators.
Weaknesses of the DB system
The original edition of DB, in 2003, included five areas for 130 countries. The report’s last edition, released in October 2019 (the 2020 Edition), comprised twelve areas for 190 economies. Ten of these areas are included in the Ease of Doing Business (EDB) score and corresponding ranking.
One of the main criticisms of the DB system concerns its summary indicator, the EDB. To understand the problem with the EDB, let us consider the first area of the indicator: starting a business. Four specific indicators are collected: number of procedures, time, cost, and minimum capital requirements. A country’s data for each indicator are measured and compared with certain maximums and minimums in the world, giving the distance to the best regulatory performance on a 0 to 100 scale, called the “score.” Finally, the scores for the four indicators of starting a business are averaged to obtain the summarized result (the area’s score). The operation is repeated in each area, and these values are then averaged once more to get the summary indicator, the EDB.
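The calculation described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear rescaling between a "worst" and a "best" bound, the function names, the bounds, and all the figures are hypothetical and are not taken from the DB report.

```python
# Illustrative sketch only: the rescaling rule, the bounds, and the data are assumptions.

def score(value, worst, best):
    """Rescale a raw indicator to 0-100 (100 = best regulatory performance).

    Assumes a linear rescaling between a 'worst' and a 'best' bound, with
    best < worst (fewer procedures, less time, and lower cost are better).
    """
    clipped = min(max(value, best), worst)  # keep the value inside [best, worst]
    return 100.0 * (worst - clipped) / (worst - best)


def simple_average(scores):
    """Unweighted mean of a collection of scores."""
    scores = list(scores)
    return sum(scores) / len(scores)


# Hypothetical raw data for one country in the "starting a business" area.
starting_a_business = [
    score(4, worst=10, best=0),    # number of procedures
    score(20, worst=100, best=0),  # time
    score(25, worst=100, best=0),  # cost
    score(15, worst=100, best=0),  # minimum capital requirement
]
area_score = simple_average(starting_a_business)

# Hypothetical scores for two other areas, averaged once more to get the summary index.
other_area_scores = [55.0, 65.0]
edb = simple_average([area_score] + other_area_scores)

print(area_score, edb)  # prints: 75.0 65.0
```

Every step is an unweighted mean, so each indicator and each area counts equally toward the final number, whatever its economic relevance.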
The averages computed at every step are simple averages. This is the main methodological weakness, known as the “aggregation problem.” As with Frankenstein’s creature, components that may be noble in origin end up creating a kind of monster. There is no theoretical basis for giving each indicator the same weight in its contribution to the result. This is questionable at the level of a specific area, but much more so as the procedure for calculating the EDB. Why should the area of “starting a business,” an action that a typical entrepreneur faces once or twice in her life, have the same weight as other areas that represent permanent operations, such as paying taxes, getting credit, etc.?
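To see why the choice of weights matters, consider the following minimal Python sketch. The two countries, their area scores, and the alternative weights are entirely hypothetical; the only point is that an equally defensible weighting can reverse a ranking.

```python
# Illustrative sketch only: countries, scores, and weights are hypothetical.

def weighted_index(scores, weights):
    """Weighted average of area scores; equal weights give the simple average."""
    total_weight = sum(weights[area] for area in scores)
    return sum(scores[area] * weights[area] for area in scores) / total_weight


country_a = {"starting_a_business": 90.0, "paying_taxes": 50.0}
country_b = {"starting_a_business": 60.0, "paying_taxes": 70.0}

# Equal weights, as in a simple average.
equal = {"starting_a_business": 1.0, "paying_taxes": 1.0}

# Alternative assumption: a recurring obligation counts three times as much
# as a procedure faced once or twice in an entrepreneur's life.
recurring = {"starting_a_business": 1.0, "paying_taxes": 3.0}

a_equal = weighted_index(country_a, equal)          # 70.0
b_equal = weighted_index(country_b, equal)          # 65.0
a_recurring = weighted_index(country_a, recurring)  # 60.0
b_recurring = weighted_index(country_b, recurring)  # 67.5

print(a_equal > b_equal)          # True: A ranks ahead with equal weights
print(a_recurring > b_recurring)  # False: B ranks ahead with the alternative weights
```

Neither set of weights has a stronger theoretical justification than the other, which is precisely the difficulty with any single aggregate ranking.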
Mr. Martin Ravallion, with a long and prestigious career in WB Research, recently wrote on his Twitter account: “Manipulation of the data is one thing, but DBI was flawed from the start. An arbitrary mashup of partial and deceptive indicators.” His 2011 working paper, entitled Mashup Indices of Development, is a must-read for delving into these methodological problems.
Another criticism of DB is its ideological bias. It has often been argued that DB’s philosophy is “less regulation is better.” It has been compared to the Washington Consensus that prevailed in multilateral institutions in the 80s. According to the 2020 report, “Research demonstrates a causal relationship between economic freedom and gross domestic product (GDP) growth …” This is a highly questionable premise, to say the least, but it makes DB’s core philosophy clear.
For example, in “paying taxes,” countries with lower corporate tax rates receive better scores, regardless of the public services (roads, security, workforce qualification, etc.) the government may provide. As another example, the DB system does not evaluate regulations to reduce greenhouse gas emissions. In fact, if such regulations were in force in any of DB’s areas, they would be evaluated negatively to the extent that they “delay” procedures or make them “more expensive.” The Cardenas report makes specific recommendations on this issue.
The criticisms of methodological aspects are numerous. DB has not reported sample sizes, variances of quantitative variables, etc., so it is impossible to assess the precision of its estimates. A relevant issue is who DB’s informants are. A rigorous approach would be a survey of a probabilistic sample of people or companies (from the target group) who have carried out these procedures. In contrast, DB consults a few informants (not a probabilistic sample), mainly people who work in large legal or accounting firms. The clients of these large firms are primarily large companies, whose regulatory requirements, tax payments, etc., are very different from those of DB’s target companies (small and medium-sized).
We have not yet analyzed methodological problems in specific areas; see the Cardenas report for an exhaustive discussion. To name just one more issue, the procedure for selecting the export product in the cross-border trade area is inappropriate in many cases. Consider Uruguay, where the selected product and destination is meat exported to Russia. The DB team collects estimates of time and cost for border and documentary compliance for exports and imports. Then, through the distance-to-frontier mechanism, these are compared with the time, cost, and other indicators of the 190 countries to obtain the area’s score. The problem: exporting meat is demanding in time and cost because of sanitary and health regulations in both the exporting and the destination country. The average score for the four export indicators in the 2020 Edition is 42.8 (in one indicator, cost to export: border compliance, Uruguay has one of the worst scores in the world: 2.8 out of 100), while the four import indicators perform much better, with a score of 74. Can we conclude that regulations and other “red tape” in exports are damaging the Uruguayan economy? We do not know. Perhaps Uruguay is one of the most efficient countries at exporting meat (to Russia or elsewhere), but it compares poorly with the indicators for much “simpler” (or unregulated) export products in the rest of the world. It is paradoxical that DB, with its obsession with comparing and ranking countries, does not allow us, in this case, to contrast Uruguay’s performance with that of others.
Given their prominence and careers, the high-ranking people involved in these events are likely to be the focus of the press and social networks. Their individual conduct should be investigated, but we want to convey that certain aspects of the WB’s corporate culture, in the DB and DEC areas and in the Bank as a whole, contributed to this outcome.
We have mentioned three failures that undermined the quality of the product and weakened the Bank against pressures: the lack of transparency; the lack of independence of knowledge production from income-generating activities; and the dominance of the impact on public opinion over the rigorous production of relevant knowledge.
Additionally, the DB saga should urgently prompt a thorough review of all “knowledge products” in the WB that share the same methodological problems, the mashup style that Ravallion criticizes. The success of the DB report over all these years encouraged several areas of the WB to produce DB-type indexes, with all their collateral damage. Perhaps the clearest example is the Enabling the Business of Agriculture index, but this is just one of many indicators developed “as a mirror” of DB.
Finally, a review of the governance of these indicators is also necessary. One of the few recommendations adopted from Manuel’s report at the time was the transfer of DB to the supervision of DEC, the WB Research area. If recent events are any guide, this change in governance was not enough.
Note on the “score”: in previous editions it was called “distance to frontier.” A higher score means a smaller distance to the frontier of “best” regulations.
Note on the Uruguayan trade example: for imports, the selected product is parts and accessories of motor vehicles imported from Brazil, a very reasonable choice.
Adrián Fernández is an economist, a professor at the Facultad de Ciencias Económicas (UdelaR – Uruguay), and a researcher at the Centro de Investigaciones Económicas (cinve). Until October 31, 2020, he was an Executive Director at the World Bank, nominated by the Government of Uruguay and five other South American countries (Argentina, Bolivia, Chile, Paraguay and Peru). The views expressed in this article are exclusively the author’s and do not commit the institutions mentioned above.