Economics Research in Europe and the US
For any topic, its intra-country shares are shown.
2-grams are supported, e.g. sovereign debt.
JEL codes are supported, e.g. D, D4, or D43.
Different word forms, such as singular and plural nouns, are treated as distinct terms.
Stop words and rare words are omitted.
If you research wine, look no further than Spain or Portugal, but if beer is your theme, consider Denmark (Germany is a candidate too). The Dutch have the most complicated pension system in the world, and the British were more concerned with Brexit than the rest of us.
The data for the map is compiled from titles and abstracts of economics papers, as found in the distributed RePEc dataset. See my previous blog post for a brief description of the dataset and the text processing involved. Each paper is then geolocated using the top-level domain names of the authors’ email addresses, if available. Otherwise, if it’s a discussion paper, geolocation is done using the top-level domain name of the website that hosts the paper (MPRA is excluded). The domains .edu and .gov are identified with the US.
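To make the geolocation rule concrete, here is a minimal sketch in Python. It is not the code behind the map. The function names, the small `COUNTRY_TLDS` set, and the MPRA host name are my illustrative assumptions. So is the way ambiguous cases are handled: the first author email with a recognised domain wins, and an email on an unrecognised domain such as .com falls through to the hosting website.

```python
from typing import Iterable, Optional

US_DOMAINS = {"edu", "gov"}                    # from the text: .edu and .gov count as the US
EXCLUDED_HOSTS = {"mpra.ub.uni-muenchen.de"}   # assumption: host name used to exclude MPRA
COUNTRY_TLDS = {"de", "nl", "uk", "es", "pt", "dk", "us"}  # hypothetical subset of map countries


def top_level_domain(host: str) -> str:
    """Return the last dot-separated label of a host name, lower-cased."""
    return host.strip().lower().rstrip(".").rsplit(".", 1)[-1]


def to_country(tld: str) -> Optional[str]:
    """Map a top-level domain to a country code, or None if it is not recognised."""
    if tld in US_DOMAINS:
        return "us"
    return tld if tld in COUNTRY_TLDS else None


def geolocate(emails: Iterable[str],
              host: Optional[str] = None,
              is_discussion_paper: bool = False) -> Optional[str]:
    """Geolocate a paper from author emails, falling back to the hosting website."""
    # First choice: the top-level domain of an author's email address.
    for email in emails:
        if "@" not in email:
            continue
        country = to_country(top_level_domain(email.rsplit("@", 1)[1]))
        if country is not None:
            return country
    # Fallback, for discussion papers only: the domain of the hosting website.
    if is_discussion_paper and host and host.strip().lower() not in EXCLUDED_HOSTS:
        return to_country(top_level_domain(host))
    return None


print(geolocate(["someone@uni-bonn.de"]))
print(geolocate([], host="www.cpb.nl", is_discussion_paper=True))
```

A paper that matches neither rule returns `None` and is simply not attributed to any country, which is consistent with only a subset of the papers ending up on the map.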
Papers from the past 5–6 years are used to prepare the data for the map. After the last dataset update, there were 192,221 papers with JEL codes written during this period, of which 62,925 could be attributed to the displayed European countries or the US.
Map colouring is involved, by necessity of displaying disparate numbers in a coherent manner. Roughly speaking, conservative difference-in-difference numbers are shown on the map. The procedure is as follows.
1) The number of papers on a specific topic in a given country is divided by the total number of papers attributed to that country.
2) The resulting ratio is treated as an estimate of a binomial proportion, and its 0.1% quantile is computed using the Wilson score. Employing 0.1% quantiles instead of 50% quantiles has little influence on the relative sizes of those quantiles across countries, unless few papers on the topic are found in some country or some country has few papers in total. In those latter cases, using 0.1% quantiles gives those countries a lower place in the ranking than they would have received with 50% quantiles. In effect, this adjustment makes the maps for narrower topics less noisy.
3) I’ll refer to the 0.1% quantiles as conservative shares. The largest conservative share is chosen among the countries which have at least 500 papers in total, and it is mapped to dark blue. If any of the remaining countries have higher conservative shares, those are also mapped to dark blue, i.e. those shares are censored. Zero is mapped to light blue. The censoring is done to better highlight the differences between the countries with more research.
4) Finally, the results are manually tuned so as to adhere to this author’s stereotypes about Europe. (Just kidding.)
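Steps 1 to 3 can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than the actual implementation: the function names and the example counts are made up, the 0.1% quantile is read as a one-sided Wilson score bound, and the colour scale between light blue (0.0) and dark blue (1.0) is assumed to be linear, which the description above does not specify.

```python
from math import sqrt
from statistics import NormalDist


def wilson_quantile(k: int, n: int, q: float = 0.001) -> float:
    """Wilson score bound at quantile q for k topic papers out of n papers.

    With q = 0.5 the bound reduces to the raw share k / n.
    """
    if n == 0:
        return 0.0
    z = NormalDist().inv_cdf(q)          # negative for q < 0.5, about -3.09 for q = 0.001
    p = k / n                            # step 1: raw share of the topic in the country
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, (centre + spread) / (1 + z * z / n))


def colour_intensities(counts: dict, min_total: int = 500, q: float = 0.001) -> dict:
    """Map each country to an intensity in [0, 1]; 0 is light blue, 1 is dark blue.

    counts maps a country to (papers on the topic, total papers).
    """
    shares = {c: wilson_quantile(k, n, q) for c, (k, n) in counts.items()}
    # Step 3: the cap is the largest conservative share among the larger countries.
    eligible = [s for c, s in shares.items() if counts[c][1] >= min_total]
    cap = max(eligible, default=0.0)
    if cap <= 0:
        return {c: 0.0 for c in shares}
    # Shares above the cap are censored to dark blue; linear scale is an assumption.
    return {c: min(s, cap) / cap for c, s in shares.items()}


# Made-up counts: A and B have the same raw share (5%), but B has far fewer papers.
example = {"A": (50, 1000), "B": (5, 100), "C": (0, 2000), "D": (30, 100)}
for country, intensity in colour_intensities(example).items():
    print(country, round(intensity, 3))
```

In this made-up example, A and B have the same raw share, yet B gets a lighter colour because its conservative share is pulled down by the small sample. D has fewer than 500 papers and a share above the cap, so it is censored to dark blue.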