Market Potential and the curse of distance in the European regions Fernando Bruna (f.bruna@udc.es) Andres Faina (fai@udc.es) Jesus Lopez-Rodriguez (jelopez@udc.es) Jean Monnet Research Group on Competition and Regional Development in the European Union (C+D). Economics and Business Department University of Coruna, Campus de Elvina s/n, 15071 - A Coruna, Spain Phone number: +34 981167000. Fax number: + 34 981167070 Subject Area: Methods of Regional Analysis (JEL Code: C33, F12, R12) Abstract: One debate in the trade literature is the “distance puzzle" or "missing globalization puzzle", the non-decreasing elasticity of trade to distance. On the other hand, there is a large literature about growth convergence. In a regional setting, we contribute to both literatures with a third methodological approach. In the context of the European regional center-periphery pattern of economic activity, and with the background of Harris (19954) Market Potential and the wage equation of the New, we study the dependence of per capita Gross Value Added to the spatial structure of GVA in Europe. We estimate an approximately stable elasticiy of the regional European per capita GVA to Harris Market Potential, with a magnitude around 0.2 in the long-run and 2.0 in the short-run. From this perspective, in spite of globalization and the tremendous effort of the European integration, with its regional policies, peripheral regions could not be escaping from the curse of distance. Key words: Market Potential, distance, EU regions, wage equation 1. Introduction After the influential books of Cairncross (1997) and Friedman (2005), the “death of distance” or “earth is getting flatter” hypothesis has stated a diminished importance of distance in trade relations (new technologies, declining transport costs…). But the empirical literature shows that the negative impact of distance on trade remains persistently. This is the so called "distance puzzle" (Disdier and Head, 2008) or "missing globalization puzzle" (Arribas et al., 2011). For instance, Boulhol and de Serres (2010) show that the impression derived from the work of Redding and Venables (2004) that the most remote OECD countries had escaped the "curse of distance" is due to sample biases. There are a number of competing explanations: a general and persistent concept of economic-cultural distances (Rodriguez-Pose, 2011), related with trade barriers, the world regionalization of trade (Carrère et al., 2009)... Other strand of the literature is about the convergence or poor regions to the economic standards of the most developed regions. We adopt a third view. We wonder about the effects of distance in a particular and relatively homogenous world region. The European process of market and political integration, with its transport and regional policies, creates a natural experiment to study the effects of distance. Given the centerperiphery spatial distribution of the regional economic activity in Europe, there are a close relation between being peripheral and being relatively poor. We focus in the quantification of the long-run degree of dependence of the European regional per capita income with respect to the European spatial structure of economic activity and the relation of that with distance. However, we too provide preliminary evidence in terms of economic growth. In the next section we set the theoretical framework and in section three the empirical framework. Sections 4 to 6 focus on the estimation of long-run cross-sectional equations. In section 7 we introduce the estimation of shor-run growth equation, to conclude in section 8. 1 2. Theoretical framework We start with the models developed in the New Economic Geography (NEG) literature (Krugman, 1991; Fujita et al., 1999). The most estimated prediction of the NEG is the wage equation. Different versions of it have been shown many times 1 and here we just write the final equation, following Brakman et al. (2009) and Combes et al. (2008). The NEG’s wage equation explains the equilibrium industrial nominal wages of each region 𝑖 (𝑊𝑖 ) as a function of the sum, for all the 𝑗 regions to which industrial goods are exported 2, of the product of two elements. On one hand, their respective volume of demand to region 𝑖 (𝜇𝑗 𝐸𝑗 , being 𝐸𝑖 the expenditure and 𝜇𝑗 the share of income spent on manufacturing goods in region 𝑗), weighed respectively by their prices index (𝑃𝑗 ) properly adjusted. On the other hand, it is the transport costs from region 𝑖 to the destiny 𝑗 (𝑇𝑖𝑗 ), to the power of one minus the elasticity of substitution among the varieties of industrial goods (𝜎) or range of product differentiation: 𝑁 1/𝜎 𝑊𝑖 = �� 𝜇𝑗 𝐸𝑗 𝑃𝑗𝜎−1 𝑇𝑖𝑗1−𝜎 � 𝑗=1 = [𝑅𝑀𝑃𝑖 ]1/𝜎 As discussed in Bruna et al. (2012a), in order to maintain continuity with prior work (from Harris, 1954, to Fujita et al., 1999), Head and Mayer (2006) employ the term Real Market Potential, 𝑅𝑀𝑃, for the right hand term. After a few assumptions (see Combes et al., 2008, page 305), it is possible to derive a rough approximation of 𝑅𝑀𝑃 with a long tradition in regional Economics, which is Harriz’s (1954) function of Market Potential. The market potential of a point (region 𝑖) is defined as the summation of markets (𝑀) accessible to 𝑖 divided by their “distances” to that point 𝑖. Harris’ Market Potential is the sum of both the internal and the external components: 𝑁 𝑁−1 𝑗=1 (𝑗≠𝑖)=1 𝑀𝑗 𝑀𝑖 𝑀𝑗 𝐻𝑀𝑃𝑖 = � = + � = 𝐼𝑀𝑃𝑖 + 𝐸𝑀𝑃𝑖 𝑑𝑖𝑗 𝑑𝑖𝑖 𝑑𝑖𝑗 Finally, we insert NEG’s prediction about the impact of Market Potential on wages in a more general setting in which wages and per capita income are a function of per capita 1 See, for instance: Head and Mayer (2011); Lopez-Rodriguez et al. (2007); Breinlich, (2006); Hanson (2005); Head and Mayer (2004); Redding and Venables (2004); Redding and Schott (2003). 2 In the nominal wage equation the whole set of regions is named 𝑅 while in the 𝐻𝑀𝑃 function we have used the notation 𝑁. With this we try to differentiate that the equilibrium wage equation is given by trade, but in Harris formulation the channel of influence is the potential demand affecting firm’s location decision. So 𝑅 and 𝑁 could not be the same. This is related with the sample problems of omitting zero trade relationships when estimating gravity equations (Baldwin and Harrigan, 2011). However, we will not address more the possible differences of 𝑅 and 𝑁. 2 capital stock and human capital, such as in the Makiw-Romer-Weil extension of Solow’s model (Mankiw et al., 1992) 3. Empirical framework and data In this empirical work we choses to use Harris’s Market Potential function instead of estimating through gravity equations a model derived wage-equation with Market Potential. There are three reasons for that . First, in a regional setting the model derived measure is about interregional trade but regional trade data usually is not available. Therefore researchers have to assume that interregional trade flows are governed by the similar forces as international ones. This could distort a lot the results for particular regions. A second reason is that, at least for the European regions, it is not clear the empirical advantage of the second measure if the research goal is not about policies. Two studies find that the more complex measure does not improve the regression diagnostics. Though Holger Breinlich estimate his wage equation controlling for capital stock and human capital, he presents the comparison of the two measures using just the model-derived Market Potential or Harris’s measures as explanatory variable, finding the same qualitative and quantitative results 3. Another example is the estimation by Keith Head and Thierry Mayer of an industry disaggregated human capital augmented version of the wage equation. They control for fixed effects for industry-years and countries, but they cannot control for capital stock. Comparing their NEG derived estimation with the one using Harris’s Market Potential they conclude (Head and Mayer, 2006, p. 590): “It is somewhat discouraging that results from a reduced form proposed by geographers 50 years ago are so similar to the ones from the structural model. However, the Harris form does not retain the structural interpretation of the coefficient (...) on log market potential.” Finally, a third reason to focus in Harris’s measure is its clear interpretation in terms of geography and comparability among subsamples. This measure has an exponent of 1 for the distance in the inverse distance weighting of the market sizes. However, the exponent of the model derived measures is derived with gravity trade equations, what is rigorous but makes harder the interpretation of the role of distance, as the distance puzzle debate shows. Our focus now is on the role of distance with a setting comparable 3 Too, he compares the results using travel times instead of geographical distances, finding similar results. Ahlfeldt and Feddersen (2008) sensitivity analysis of the wage equation arrives to the same conclusion about travel times. 3 across periods and subsamples, what it is not possible if the exponent of distance changes with the sample period or cross-sectional sample. In Bruna et al (2012a) we discuss why we chose per capita Gross Value Added (GVA) as our endogenous variable. We use Cambridge Econometrics data for GVA, population and capital stock, and Eurostat data for the human capital variables. We build our measures of distances starting with GISCO cartographic data. Distances are great circle distances among the regional centroids. Cambridge Econometrics provides data at NUTS 2 level for the regions in EU-27 plus Norway and Switzerland. Our broad sample has 274 regions after the exclusion of Cyprus, Malta, the Outermost French regions and the Spanish Ceuta and Melilla, in the African Coast. Unfortunately there are no available data for the capital stock of Norway and Switzerland in the Cambridge Econometrics databank, so our baseline regression is estimated for 260 regions. The restriction to the sample period 1995 to 1998 is motivated too by the availability of Cambridge Econometrics data for capital stocks. Given that Harris’s measure of Market Potential requires a measure of the market size of the neighbors of each region, edge effects occur when there are no available data for some regions which are neighbors of the regions included in the sample. Edge effects could be potentially high for some regions, as those in the Eastern border of the European Union. Edge effects might be potentially strong in regions of Western Balkans countries, as Bulgaria, Romania and Greece 4. We try to reduce edge effect with three strategies. First, we build the Market Potential variable of the regions in our broad sample including Norway and Switzerland, though those countries are excluded from the regression. Second, we test the results of our broad samples with a subsample excluding the (Central and) Eastern European Countries: Bulgaria, Czech Republic, Estonia, Hungary, Lithuania, Latvia, Poland, Romania, Slovenia and Slovakia. Third, we build a Market Potential variable using all the available data, even from the regions excluded of the broad sample, and test with it the results in our subsamples excluding the Eastern European countries or just considering those countries. Our baseline equation for region 𝑖 is the following (no time subscripts for the moment): ln � 4 𝐺𝑉𝐴 𝐾𝑆 𝐻𝐾 � = β0 + β1 ln � � + β2 ln 𝐻𝑀𝑃𝑖 + β3 � � + 𝑢𝑖 𝑃 𝑖 𝑃 𝑖 𝑃 𝑖 We lack data about regions in Turkey, North-African countries, Albania and the former Yugoslavia (except Slovenia). In spite of its particular geographic and economic characteristics (Bivand and Brunstad, 2006), we keep Greece in our broad sample given that this country is member of the European Union since 1981. 4 where 𝐺𝑉𝐴 stands for Gross Value Added, 𝑃 for population, 𝐻𝑀𝑃 for Harris Market Potential, 𝐾𝑆 for Capital Stock and 𝐻𝐾 for Human Capital (we consider it in levels instead of in logs, though this makes no relevant empirical difference). The construction of Internal Market Potential in order to calculate Harris’s Market Potential is discussed in detail in Bruna et al. (2012a). We use the following formulation with an approximation for the internal distances 𝑑𝑖𝑖 : 𝐼𝑀𝑃𝑖 = 𝑀𝑖 𝐺𝑉𝐴𝑖 𝐺𝑉𝐴𝑖 = = 𝑑𝑖𝑖 1� · 𝑟𝑖 0.188�𝑎𝑟𝑒𝑎𝑖 3 where 𝑟𝑖 is the radius of the region and a circular shape is assumed. Therefore, using Harris’s Market Potential 𝐻𝑀𝑃𝑖 = 𝐼𝑀𝑃𝑖 + 𝐸𝑀𝑃𝑖 to explain per capita GVA has the shortcoming that GVA is in both sides of the equation. This problem is not that severe as it could seem: for the average region in our broad sample, the internal market component is just a 7.5% of our measure of Market Potential in the year 2008. However, in Table 1 we will use Market Potential lagged one year. On the other hand, excluding the Internal Market Potential introduces measurement error by reducing the access measure of some economically larger locations (Breinlich, 2006; Head and Mayer, 2006), as the capital cities tend to be. We will work with the total Market Potential variable but mention some results using just the External Market Potential component. Though we will put more attention to the cross-sectional setting, in the final section we will estimate an equation with fixed regional and time effects: 𝐻𝐾 𝐺𝑉𝐴 𝐾𝑆 � = β1 ln � � + β2 ln 𝐻𝑀𝑃𝑖 + β3 � � + 𝑢𝑖 + 𝑢𝑖𝑡 + 𝑢𝑖𝑡 ln � 𝑃 𝑖𝑡 𝑃 𝑖𝑡 𝑃 𝑖𝑡 4. Pooled cross-sectional estimation: the long-run effects of Market Potential We start our cross-sectional study with the pooled estimation in Table 1. Columns (1) and (2) show the pooled estimation of per capita Gross Value Added (GVA) on just one variable, per capita Capital Stock or Market Potential, respectively, all in logs. The coefficient of determination in (1) shows that 88.7% of the average cross-sectional variance of the European per capita GVA is “explained” by per Capita Stock (lKSp). Market Potential (lMP2GVA) explains by itself a 40% of that variance. This percentage is pretty high, given that our measure of human capital (or technological capital), hrst_pop explains by itself less than 30% of the variance (not reported regression). This 5 last variable is the share of population who has successfully completed education at the third level in science and technology (S&T) fields and is employed in a S&T occupation. Bellow we discuss it more in detail. When both per capita Capital Stock and Market Potential are considered together a Lagrange Multiplier test recommends the estimation with time effects, as in column (3). In this case, the adjusted R2 is 0.894, just a little higher than the 0.886 in column (1) just with Capital Stock. The adjusted R2 increases to 0.900 when adding the human capital variable in column (4). Therefore, among our explanatory variables, per capita Capital Stock has by large the main descriptive capacity of per capita GVA cross-sectional. Table 1 - Pooled estimation (1995-2008). Broad sample. (Intercept) lKSp lMP2GVA (1) (2) (3) (4) (5) (6) (7) -0.565*** (0.060) 0.932*** (0.006) 0.427* (0.183) -1.538*** (0.079) 0.868*** (0.007) 0.184*** (0.010) -0.987*** (0.087) 0.851*** (0.007) 0.130*** (0.010) 2.647*** (0.148) -1.122*** (0.080) 0.834*** (0.007) 0.159*** (0.010) -1.013*** (0.089) 0.853*** (0.007) 0.130*** (0.010) 2.682*** (0.152) -0.559*** (0.113) 0.788*** (0.010) 0.161*** (0.008) 1.323*** (0.105) 0.955*** (0.019) hrstc_pop 2.323*** (0.137) Yes No 0.906 0.901 2171 3640 hrstc_pop_IMP No No Yes Yes Yes Yes Year dummies? No No No No No Yes Country dummies? 0.887 0.404 0.898 0.905 0.906 0.983 R-squared 0.886 0.404 0.894 0.900 0.901 0.970 adj. R-squared 28510 2464 2129 1877 1932 4377 F 3640 3640 3640 3171 3038 3038 N Notes: Pooled OLS estimation 1995-2008. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and hrstc_pop is the share of population who has successfully completed education at the third level in S&T fields and is employed in a S&T occupation. hrstc_pop_IMP is hrstc_pop after an imputation of missing data using the time trend specific to each region with a polynomial of degree 2 (hrstc_popit = ß0 + ß1t + ß1t2). In columns (6) and (7) lMP2GVA is replaced by its values lagged one year for each region. However, in order to approximate the size of the elasticity of per capita GVA to Market Potential is important to control properly for some other factors. We will make different test along the next tables. Table 1 show that this elasticity changes from 0.96 (column 2) to 0.18 (3) and 0.13 (4) 5. The variable of human resources in S&T has missing data, creating a sample selection bias. Therefore, in column 4 we show the estimation using a version of this variable imputing the missing data for each region with a polynomial of 5 In the estimation of Breinlich (2006, p. 612) with data of 193 European regions averaged for the period 1992-1997, the elasticity of GVA per worker to a model derived measure of Market Potential goes from 0.26 when Market Potential is the only explanatory variable to 0.08 when controlling for capital stock and human capital. 6 degree 2 on time (hrstc_popit = ß0 + ß1t + ß1t2) 6. The argument for this procedure is that this variable tends to have smooth trends in each region so a sensitive procedure of imputation can take advantage of the time trend specific to each region. The procedure has limitations, particularly because the missing data tend to concentrate in the initial years. But it allow us to recover the full sample and obtain an elasticity to Market Potential of 0.16 (column 5) As we said before, one source of endogeneity when using a Harris’s measure of Market Potential in a wage equation type of regression is that the own regional GVA is present in both sides of the equation, through the Internal Market Potential component of Market Potential. Therefore in columns (6) and (7) of Table 1 we replace Market Potential by its value lagged one year, losing the data for of the year 1995 for the other variables. The elasticity of per capita GVA to Market Potential goes back to 0.13 and again to 0.16 when adding country dummies in column (7). Comparing with the 0.18 elasticity to Market Potential in column (3) without human capital, columns (4) to (7) show that controlling for human capital and the endogeneity of Internal Market Potential reduces this elasticity to a range between 0.13 and 0.16. We take as our baseline regressions those in columns (4) and (5). A first check of these results is to see if they are similar by subsamples of regions. Our purpose here is to work with a broad sample of European regions but a clear source of possible heterogeneity could be the inclusion of the Eastern European countries, which joined the European Union recently (2004 and 2007) and come from regimes of central planning. Therefore in Table 2 we repeat the previous estimations but excluding the regions of the Eastern European countries, what will be relevant later in Table 3. With both Table 1 and 2 at hand we can conclude that the estimated averaged elasticity of per capita GVA to Market Potential during this period is around 0.15 7. This is exactly the same number that we get estimating the equation in column (4) for the 190 regions in the subsample of 14 developed countries used by Brakman et al. (2009), with Norway 6 We thank Honaker et al. (2011) for their Amelia II package for R. As it was said in the data section, we make some control for edge effects estimating equation (4) with Market Potential built for all the available regions. The estimated elasticity to Market Potential is identical to those presented in the column (4) of Tables 1 and 2. We repeated this exercise just for the 54 regions in a subsample of the Eastern European regions. In this case, the elasticity to Market Potential changes from 0.29 when Market Potential is built using only the regions in the sample, to 0.79 when it is built using all the available data. Even in this last estimation, the potential edge effects could be high for some regions in this subsample and the sample size makes the results more sensitive to particular cases. But it is not surprising that geography could matter more for this subsample, resulting in a higher elasticity to Market Potential. 7 7 and Switzerland excluded from the regression, both building Market Potential with data of the regions included in the subsample and building it with all the available regions. Table 2 - Pooled estimation (1995-2008). Regions of Easter European countries excluded. (Intercept) lKSp (1) (2) (3) (4) (5) (6) (7) 0.759*** (0.103) 0.818*** (0.009) 5.834*** (0.087) 0.284** (0.095) 0.709*** (0.010) 0.183*** (0.006) 0.913*** (0.100) 0.653*** (0.011) 0.161*** (0.006) 2.197*** (0.102) 1.085*** (0.096) 0.633*** (0.010) 0.169*** (0.006) 0.935*** (0.106) 0.654*** (0.011) 0.157*** (0.007) 2.250*** (0.105) -0.257* (0.114) 0.754*** (0.011) 0.171*** (0.008) 1.124*** (0.107) Yes No 0.831 0.825 802 2472 Yes Yes 0.930 0.919 1126 2472 0.420*** (0.009) lMP2GVA hrstc_pop hrstc_pop_IMP Year dummies? Country dummies? R-squared adj. R-squared F N Notes: See Table 1. No No 0.730 0.730 7803 2884 No No 0.425 0.424 2128 2884 Yes No 0.798 0.793 753 2884 Yes No 0.837 0.832 832 2605 1.920*** (0.092) Yes No 0.824 0.819 840 2884 Now we present another robustness analysis around the human capital variable. Given that human capital has attracted a great deal in both in the growth and the economic geography literatures 8, we make two additional exercises. First, we checked our results using different measures of human capital. Human resources in science and technology (S&T) is not a variable frequently used in the literature 9. At least conceptually, this variable might collect information of both human capital as a Mankiew-Romer-Weil’s productive factor and technological total factor productivity under the Tinbergian assumption of skill-biased technical change or under Romer’s endogenous technical change. Eurostat provides three measures of this variable, in two versions each of them. Eurostat provides other measures of human capital too. Therefore, in the Annex we compare our baseline regression using different Eurostat measures of human capital (technological capita) in two different sample periods for the broad sample. Taking together the results in the Annex and the previous tables, our conclusion is reinforced: the elasticity of per capita GVA to Market Potential keeps around the value of 0.15. 8 From the theoretical point of view or from the empirical perspective at country level we just mention Lucas (1988), Mankiw et al. (1992) and Barro and Lee (2010). The measurement and distribution of human capital in the European regions is studied by Dreger et al. (2011) or Rodriguez-Pose and Tselios (2011). For the European regional human capital in the context of the New Economic Geography see Breinlich (2006), Head and Mayer (2006), Lopez-Rodriguez et al. (2007) or Karahasan and Lopez-Bazo (2011). 9 Bivand and Brunstad (2006) uses our same measure but in percentage of active population. 8 5. Peripherality and long-run effects of Market Potential and The estimated elasticity of 0.15 implies that a huge increase such as doubling Market Potential increases per capita GVA in just a 15%. In order to assess the importance of this figure it is necessary to remind that we are evaluating just the direct impact of Market Potential, without controlling for the potential endogeneity of per capita capital stock and human capital to geography and Market Potential neither for sectorial composition or other determinants 10. It is not possible to imagine how Market Potential (including its internal component) could strongly increase without a change in capital and human capital stocks. But the estimated magnitude of the elasticity point to relevant effects of geography. To make things easy we have estimated equation (4) of Table 1 using the (log of) External Market Potential instead of including too the internal market component. The regression diagnostics are almost the same, given that the main explanatory power comes from per capita stock and the main component of Market Potential is the external one. The estimated elasticity of per capita GVA to External Market Potential is 0.10. Now imagine that each region has only an aggregate neighbor. If we reduce to half the distance between that region and its aggregate neighbor its External Market Potential would double and the regional per capita GVA would increase 10 %. So this is the direct impact of distance keeping the same neighborhood structure and the market size of the neighbors. For instance, in our broad sample (without Norway and Switzerland) the maximum mean distance of each regional centroid to the rest of the regional centroids is 2,250 kms., corresponding to Pohjois-Suomi, the northernmost region of Finland. Half of that distance is close to the median of the variable, 1,067 kms., which corresponds to regions such as Toscana in Italy or Hovedstaden in Denmark. So the impact of moving the finish region to Toscana without any other change in the weighting scheme of GVA of Pohjois-Suomi would be a 10% increase in its per capita GVA. Of course, that is not possible, regional GVA is changing as we move in the map. Now think that the European distribution of regional GVA has a center-periphery structure, as discussed in Bruna et al. (2012a) and shown in Figure 1: the higher the distance of each region to all the other regions, the lower its Market Potential (External or total) 11. Pohjois-Suomi was a good example to illustrate the effect of distance, but is not when talking about the 10 For a discussion about the potential endogeneity of human capital to Market Potential see LopezRodriguez et al. (2007), Hering and Poncet (2009) or Karahasan and Lopez-Bazo (2011). 11 The two outliers correspond to Inner and Outer London and are motivated by the procedure of measuring distances among centroids. 9 center-periphery structure of the European economic activity because the Nordic countries are not representative of that. Now some examples of economic and geographic peripherality would be the Southern European regions. In this sense, on average, reducing the mean distance of a region to the other regions (to move that region towards the geographical center of Europe) is equivalent to increase the GVA of its neighbors (and its own GVA). The estimated elasticity puts together both effects getting the direct impact of the spatial distribution of regional GVA in Europe when per capita capital stock and human capital are kept constant. Therefore, it is not negligible a direct impact of geography in terms of a value of 0.1 for the elasticity of per capita GVA to the European spatial structure of the economic activity. Figure 1 – The relation between Market Potential and peripherality in the broad sample Now we focus on the distance variable in another exercise of robustness. On one hand, we check the implications of our measure of human capital, and our procedure of imputing its missing data, estimating the baseline regression for one year with complete data of the core human resources in S&T. This is the case for the year 2008. In Table 3 we additionally control for the endogeneity of Market Potential showing the second and first stage regressions of estimations by instrumental variables IV. We present the results of the IV estimation using two instruments for the Market Potential in the year 2008: the Market Potential in the year 1991 and the mean distance of each region to the other regions. 10 In Table 1 and 2 we have used Market Potential lagged one year. Now we instrument current Market Potential with the variable lagged 17 years. Therefore, now the explanatory variable is completely free of the endogeneity of Internal Market Potential, and the endogeneity of short term interactions between the per capita GVA of a region and the GVA of its neighbors. In this case, the estimated elasticity to Market Potential, in column 1 of Table 3, is again 0.15, and the Wu-Hausman test points that the OLS estimation was consistent. Using the Market Potential of 1991 as instrument makes the estimation robust with respect to the endogeneity of Internal Market Potential and to short term interactions. But it is not an exogenous instrument. We are in a cross-sectional setting reflecting long-run economic relations, which are largely given by historical-geographical factors. Therefore we expect the cross-sectional per capita GVA and Market Potential of year 2008 to be correlated with those in 1991. In order to find an exogenous instrument the literature has go to geographical sources. The most typical instrument has been the distance to Luxembourg (Breinlich, 2006), as a proxy of the geographical center of Europe or the distance to Brussels (Brakman et al., 2009). However, given that the economic centre of Europe matches its geographical centre, around the so called “blue banana” or “hot banana”, this is not just a geographical measure but a measure of the distance to the economic centre of Europe. Considering the sum of distances does not solve the problem, as pointed by a referee to Head and Mayer (2006), because the restriction to European regions implicitly determines a European reference point (of high income) with a particular spatial distribution of economic activiy. In order to avoid this, Head and Mayer (2006) and Bouhol et al. (2008) build alternative measures of global centrality. In a different work (Bruna et al, 2012b), we focus in a exogenous instrument such as the area of each region finding again an elasticity of 0.15 to the instrumented Market Potential. However, in spite of its shortcomings as instrument of Market Potential, in Table 3 as instrument and in Table 4 as explanatory variable we focus in the variable mean sum of distances to the other regions (ldSm). The NUTS 2 region with minimum mean distance to the other regions (812 kms.) is Darmstadt, German region in Hesse state. This region is the geographical centre of EU-27. But too it can be said that this index of peripherality measures the distance to the seats of the European Central Bank and the German Federal Bank, which are in one city of Darmstadt, Frankfurt, the financial centre of continental Europe. Indeed, it is not a good exogenous instrument. 11 But it is worthy to pay especial attention to this variable for several reasons. First, instrumenting Market Potential with the imperfect instrument of distances allows comparability with similar variables widely used in the literature. Second, given the weighting scheme of Harris’s Market Potential, there are a close relation between this variable and the peripherality index mean distance to the other regions. Tables 3 and 4 and figure 1 stress their similarities and differences. Third, under both interpretations of the peripherality index, distance to the geographical centre of Europe or distance to the economic centre of Europe, it has an interest by itself to check our results against this variable. The analysis of distances in this section is the introduction to the following section, where the baseline equation is estimated by subsamples: central/peripheral and rich/poor. Table 3 – Instrumental variables estimation for the year 2008. Broad sample and subsamples. (Intercept) lKSp hrstc_pop lMP2GVA (1) (2) (3) (4) (5) (6) (7) (8) -1.061*** (0.245) 0.820*** (0.021) 2.135*** (0.396) 0.154*** (0.028) 0.674*** (0.053) -0.011* (0.005) 0.699*** (0.089) -0.354 (0.281) 0.865*** (0.023) 2.343*** (0.413) 0.028 (0.036) 0.417 (0.432) 0.736*** (0.045) 1.953*** (0.360) 0.105*** (0.028) -1.776* (0.692) 0.309*** (0.048) 6.875*** (0.740) 0.718*** (0.108) 0.073 (0.418) 0.792*** (0.036) 1.540*** (0.313) 0.081* (0.039) 1.419*** (0.306) 0.598*** (0.029) 2.778*** (0.336) 0.153*** (0.029) 21.170*** (0.806) -0.057 (0.043) 2.967*** (0.489) -0.462*** (0.044) -0.431*** (0.069) -1.579*** (0.066) No 0.798 0.970*** (0.006) lMP2GVA91 EastEU ldSm Country dummies? R-squared First-stage F-stat Anderson CC p-val Wu-Hausman F p-val No 0.929 24322 0.000 0.331 260 No 0.993 No 0.924 482 0.000 0.000 260 No 0.799 412 0.000 0.550 206 No 0.931 484 0.000 0.023 54 Yes 0.983 199 0.000 0.023 260 No 0.951 581 0.000 0.023 260 260 260 N Notes: Cross-sectional OLS estimation for the year 2008. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and the share of population who has successfully completed education at the third level in science and technology (S&T) fields and is employed in a S&T occupation (hrstc_pop). Columns (1) shows the instrumental variables estimation using the log of Market Potential of year 1991 (lMP2GVA91) as the instrument of the 2008 Market Potential. Columns (3) to (7) show the instrumental variables estimation using the log of the mean distance of each region to all the other regions in the sample (ldSm) as instrument of Market Potential. Columns (3) to (5) present the second stage estimation for the broad sample, for the sample excluding the Eastern European Regions and for this last subsample, respectively. For the broad sample, in column (6) country dummies are introduced in the baseline regression, and in column (7) a dummy variable for Eastern European countries (EastEU). Column (2) has the first stage of the estimation in column (1) while column (8) has the analogous for column (7). The Stock and Yogo (2005) critical value for the first-stage F-statistic weak identification 5 % Wald test for 1 endogenous regressor, 1 instrumental variable and 10% of desired maximal size is 16.38. 12 The first question to make is the following: if Market Potential is partially collecting peripherality, ¿is our baseline equation just reflecting the effect of distance on per capita GVA? The answer is not. As we will see in Table 4 with a different setting, mean distance is not statistically meaningful when Market Potential is replaced by this variable in a cross-section for one year (not reported in Table 3). Columns (3) to (8) in Table 3 focus in the follow-up question about the role of the mean distance as instrument of Market Potential and Table 4 will continue the analysis of distances. The Anderson canonical correlation test points to the relevance of the two instruments presented in Table 3, which are not considered weak instruments by the first-stage F test. The Wu-Hausman test comparing the OLS regression with the instrumental variables estimation rejects the endogeneity of Market Potential in column (1), with the 1991 Market Potential, and in column (4), with distance as instrument in the sample excluding the Eastern European countries. This test compares two estimations but does not say anything about the endogeneity of the instrument or the quality of the estimations. So let´s focus in the role of distance studied in columns (3) to (8). Columns (4) and (5) show respectively the separate instrumental variables estimation for the subsample without the Eastern countries and for this last subsample. The Wu-Hausman test in column (4) shows that the OLS estimation with Market Potential was not consistent, while the instrumented Market Potential is meaningful and presents an elasticity of 0.1. In column (6) we test if country dummies could make the instrumented Market Potential meaningful in the broad sample, with a negative answer. In the broad sample, to get a meaningful Market Potential instrumented by mean distances we need to introduce a dummy variable for the regions in Eastern countries (column 7). The estimated elasticity to Market Potential is again 0.15 and the Wu-Hausman test suggest that the OLS estimation was consistent. However, we show in in the first stage regression of column (8) that more peripheral regions (higher values of mean distance, ldSm) tend to have lower Market Potential. We get this negative effect of distance on Market Potential too without the Eastern European regions. But, as discussed in Bruna et al. (2012a), the Eastern European countries has lower per capita GVA that they “should” according with the general center-periphery pattern of the per capita GVA, so the dummy is necessary in column (7), or excluding those regions as in column (4). To summarize this exercise, we have estimated our equation for a year with complete data of the human capital variable and obtained an elasticity of 0.15 for the instrumented Market Potential (columns 1, 4 and 7). Columns (1) and (4) present test 13 pointing to the consistency of the baseline OLS estimation of Market. Though our instruments here have shortcomings, the results reinforce the previous conclusions about the long term elasticity of the European per capita Gross Value Added to Market Potential. Table 4 – Between panel estimation (1995-2008). Broad sample. (Intercept) lMP2GVA lKSp hrstc_pop (1) (2) (3) (4) (5) (6) (7) (8) -1.191*** (0.287) 0.140*** (0.034) 0.848*** (0.024) 2.453*** (0.517) -1.177*** (0.289) 0.151*** (0.035) 0.836*** (0.024) -0.806* (0.356) 0.147*** (0.025) 0.815*** (0.032) 1.144** (0.350) 1.804*** (0.275) 0.194*** (0.024) 0.532*** (0.026) 2.987*** (0.366) -0.878 (0.559) 0.902*** (0.023) 2.790*** (0.525) -1.472 (1.068) 0.361*** (0.048) 0.535*** (0.028) 1.751*** (0.451) 1.982*** (0.286) 0.154*** (0.029) 0.549*** (0.027) 3.078*** (0.370) 1.252*** (0.285) 0.259*** (0.028) 0.536*** (0.027) 1.891*** (0.423) -0.655*** (0.042) -0.579*** (0.047) 0.169*** (0.050) 0.057 (0.061) -0.607*** (0.047) 0.184*** (0.051) 0.247** (0.083) 2.527*** (0.526) bet.hrstc_pop -0.664*** (0.041) EastEU Nordic ldSm dCL Country dummies? R-squared adj. R-squared F N No 0.914 0.900 906 260 No 0.911 0.897 878 260 Yes 0.989 0.882 768 260 No 0.957 0.939 1427 260 No 0.909 0.895 8487 260 No 0.960 0.934 1020 260 No 0.957 0.938 1412 260 -0.325*** (0.082) No 0.961 0.935 1047 260 Notes: Between panel estimation (averaged time series 1995-2008). Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and hrstc_pop is the share of population who has successfully completed education at the third level in S&T fields and is employed in a S&T occupation. In order to calculate the mean of the time series, the observations with missing data in hrstc_pop are omitted in columns (1) and (3) to (8), while in column (2) hrstc_pop_IMP avoids the loss of information imputing the missing data of hrstc_pop. In columns (6) and (7) lMP2GVA is replaced by its values lagged one year for each region. Columns (4) and (6) to (8) include a dummy variable for Eastern European countries (EastEU), while columns (6) and (8) include a dummy for the Nordic countries (Nordic), here Denmark, Finland and Sweden. Columns (5), (6) and (8) include two geographical variables in different metrics: the log of the mean distance of each region to all the other regions in the sample (ldSm) and the log of the distance of the regional centroid to the nearest coast line (ldCL) built with data from the National Geophysical Data Center. In column (7) Market Potential is instrumented by the mean distance. We finish this robustness analysis of a cross-sectional wage-equation type of estimation adopting a different strategy of estimation. Table 4 show the estimation of our baseline equation with the averages of the time series during the sample period, focusing in national and geographical control variables. Column (1) shows an elasticity of per capita GVA to Market Potential of 0.14, which changes to 0.15 when controlling for country specific factors (column 3) or when it is calculated with the imputed data of human capital (column 2). The adjusted R2 improves when a dummy variable for 14 Eastern European countries (column 4) replaces the country effects and the elasticity increases to 0.19. Continuing what it was said about Table 3, column (5) of Table 4 show that Market Potential does not just collects information of the aggregate inverse distances. The variable of mean distances (ldSm) is not meaningful when it replaces Market Potential (column 6). Column (7) controls the estimation with Market Potential by the fact that Nordic countries are richer and Eastern countries poorer than what a pure centerperiphery pattern of per capita GVA would reflect. In this case, the mean distance is significant at the 5% level but we get an unexpected positive sing for its coeffficient: peripheral regions would tend to be richer than central ones. As in Table 3, the elasticity estimated in column (4) is reduced again to 0.15 in column (7) when the dummy for Eastern countries is included and Market Potential is instrumented by mean distances. Finally, column (8) of Table 4 adds a new control variable to the baseline regression, the distance from the regional centroid to the nearest coast line (ldCL). The hypothesis is that regions closer to the sea have lower transport costs and higher per capita GVA. The estimated elasticity to Market Potential jumps to 0.29. A non-reported estimation for the subsample without the Eastern Countries shows that ldCL is partially collecting the Nordic effect and it significant just at the 10% level when both variables are introduced. The estimation in this subsample with just the Nordic dummy results in an elasticity to Market Potential of 0.23, which could be considered an upper limit for the elasticity in terms of all the previous exercises. But this estimation uses a dummy for a group of countries and we could test a variety of other possible country group effects. The distance to the coast is not meaningful if we estimate the baseline equation without any dummy in the subsample omitting the Eastern European countries and its elimination results in an elasticity of 0.16. 6. Regimes in the cross-sectional estimation After the previous analysis about distance, we focus now in the issue of the “curse of distance”, from the point of view of the elasticities of per capita GVA to the variables included in our baseline regression, particularly to Market Potential. For that we estimate the baseline equation for several subsamples. The results are shown in Table 5 together with the baseline estimation for the whole broad sample (column 1). We define four “regimes” conceived as meaningful groups of regions under two differentiated criteria. On one hand, regions are classified as “rich” (column 2) or 15 “poor” (column 3) depending of having a per capita GVA in 1995 over or under the sample median that year. On the other hand, regions are classified as “center” (column 4) or “periphery” (column 5) depending to their mean distance to the other regions being under or over the median. We estimate make a pooled estimation with time effects in the four subsamples. In order to reduce the sample selection bias and for reasons that later will become clear, the human capital variable is used with imputed missing value, as explained in section 4, though the qualitative results are the same without imputing. The variable Market Potential in all subsamples is built considering all the regions of the broad sample. However, by definition of the regime “periphery”, edge effects could be more severe in this group. The comparison of intercepts is relevant, that it why we have presented intercepts in the previous tables too. Comparing across regimes the estimated elasticity of per capita GVA to Market Potential it is possible to see that the geographical regimes in the last two columns of Table 5 double the 0.16 estimated in column 1 for the whole sample. Contrary, per capita GVA in the “rich”/“poor” regimes seems to be less dependent on Market Potential. Table 5 – Pooled estimation of the broad sample by regimes (1995-2008) (1) (2) (3) (4) (5) -1.122*** 3.449*** 0.015 -4.097*** -1.852*** (0.080) (0.140) (0.163) (0.146) (0.136) lKSp 0.834*** 0.500*** 0.809*** 0.886*** 0.808*** (0.007) (0.012) (0.009) (0.011) (0.008) lMP2GVA 0.159*** 0.078*** 0.051** 0.420*** 0.268*** (0.010) (0.007) (0.019) (0.017) (0.016) hrstc_pop_IMP 2.323*** 2.031*** 2.824*** 0.144 3.160*** (0.137) (0.092) (0.267) (0.159) (0.208) R-squared 0.906 0.690 0.864 0.902 0.914 adj. R-squared 0.901 0.683 0.856 0.894 0.906 F 2172 251 718 1043 1200 N 3640 1820 1820 1820 1820 Notes: Pooled OLS estimation 1995-2008 with year dummies. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and the share of population, with imputed missing values, who has successfully completed education at the third level in S&T fields and is employed in a S&T occupation (hrstc_pop_IMP). For easy comparison, column (1) just repeats column (5) of Table 1. Columns (2) and (3) present the estimation for the subsamples "rich" and "poor", respectively defined as regions with a log of per capita GVA over or under the median in 1995. Columns (4) and (5) present the estimation for the subsamples "center" and "periphery", respectively defined as regions with a log of mean distance to the other regions under or over the median. (Intercept) To make clearer this result, we have repeated Table 5 for the sample without the regions of the Eastern European countries (not reproduced here). In this case, the coefficient of per capita Capital Stock decreases to around 0.5-0.6 in all the columns. Again human 16 capital is especially relevant for the peripheral regions (column 5 of Table 5). The coefficients of Market Potential are similar to those in Table 5, in spite of the changes in the list of regions included in each regime. The exception is the group of "poor" regions. For this regime, the exclusion of Eastern countries makes the elasticity to Market Potential to jump from 0.05 in column (3) of Table 5 to 0.21, and to have significance at 1% level. But the elasticity in the regime "rich" continues to be 0.08. In this estimation without Eastern countries the elasticity in the regime "center" is 0.31 and in the regime "periphery" is 0.25. Taking these comments together with the results of Table 5 we conclude that a figure around 0.25 could be a better estimate than our previous one of 0.15 for the elasticity of the European per capita GVA to Market Potential. Too this number is a good approximation of this elasticity for the regions close to the geographical "center" of Europe (column 4). Contrary, a Nordic dummy does not increase the estimated coefficient of Market Potential for the "rich" regime. It is the heterogeneity of this group, reflected in the lower coefficient of determination in column (2), what reduces the average elasticity for the total sample. But there are more to it. An elasticity of per capita GVA to Market Potential higher than what we were considering until now means higher dependence on the spatial structure of economic activity in Europe. A “peripheral” region surrounded by “poor” regions will have strong difficulties to escape from the curse of distance. Contrary, the lower elasticity to Market Potential for the “rich” regions might indicate that they have found ways of escaping from spatial dependence. In Bruna et al. (2012a) we show that both per European regional per capita GVA and Market Potential have a center-periphery spatial pattern, more accused in the case of Market Potential. Table 6 offers additional evidence of it with the pooled mean of each variable in each regime, for both the broad sample and the sample excluding the regions in Eastern European countries. Peripheral regions are poorer than central ones and have half of their Market Potential. “Rich” regime regions are richer than the central ones, but have the same or lower Market Potential. That is why the elasticity of per capita GVA to Market Potential was much lower for “rich” regions in column (2) of Table, though still significant at 1% level. 17 Table 6 – Per capita GVA and Market Potential: pooled sample means by regimes (1995-2008) Sample Broad sample Excluding Eastern countries Variable All Rich Poor Center Periphery GVAp 16939 23737 10141 19869 14010 MP2GVA 15629 20580 10678 20692 10567 GVAp 20295 24957 15632 22593 17997 MP2GVA 16814 20658 12969 22502 11125 Notes: Per capita Gross Value Added (GVAp) is in year 2000 euros per person. Market Potential (MP2GVA) is in millions of year 2000 euros. In this research about geography we do not go further than this with respect to the “rich” regions. Our purpose is to study how previous estimated elasticities to Market Potential have been changing in time. Therefore, we have estimated the equations in Table 5 for each of the years in our sample period 12. To do this we need to work with the time series of human capital with imputed missing values. That is why we have presented the results in Table 5 using this variable. Otherwise the estimated coefficient of Market Potential presents jumps during the first years, when missing data is more frequent for the share of population with education at the third level in science and technology (S&T) fields and employed in a S&T occupation. However, the patterns described in next figures using the imputed variable are similar to those since year 1999 when using the variable with missing data and similar to those found excluding the human capital variable. Figure 2 presents the estimated cross-sectional elasticity of per capita GVA to Market Potential by regime and year. Note that the estimation by year makes all coefficients time-varying, so the estimated coefficient for the broad panel (the “All” regions” line in both panels of Figure 2) is not necessary an average of each “rich”/“poor” or “center”/“periphery” regime. In fact, that is not the case. 12 We thank the possibilities of the R’s plm package, by Croissant and Millo (2008). 18 Figure 2 – Time-varying elasticity by regime (cross-sectional estimations by year): broad sample Figure 2 show that there are a slightly reduction of the elasticity of per capita. But the main conclusion is the decreasing time trend in the elasticity to Market Potential in the “center” regime. Comparing the cross-sectional estimations with the pooled estimation in Table 5 column (4) for this regime, a test of poolability rejects the hypothesis of instability of coefficients. But this test is for the whole equation. The evidence shown in the bottom panel of Figure 2 would point to a reduction of the elasticity to Market Potential in the central regions during the sample period, even if a proper estimation of the whole equation can be done pooling the years 1995 to 2008. However, this conclusion disappears when excluding the Eastern European countries from the sample, as in Figure 3. The 54 regions of these countries, over a total of 260 regions in the broad sample, change completely the conclusions about the magnitude of the elasticities and 19 their evolution in the geographical regimes. Somewhat surprisingly, no trends appear to exist in the long-run (cross-sectional) elasticities to Market Potential. Figure 3 – Time-varying elasticity by regime (cross-sectional estimations by year): Sample without regions of Eastern European countries We can consider the Eastern European countries as the most special cases in our sample. Therefore, with Figure 3 in mind we come back to the conclusions after Table 5. A stable long run (cross-sectional) elasticity of per capita GVA to Market Potential of around 0.25 appears, with the exception of the “rich” regime, defined by having a per capita GVA over the median in 1995. Considering this last group and the 0.15 estimated in the previous sections we can say that this elasticity is around 2.0 for the European NUTS 2 regions. In spite of globalization, the European process of integration and the 20 efforts of the European regional policy, with this methodology we conclude that there is no evidence that the peripheral regions are getting rid of their dependence from the European spatial structure of economic activity. This is not the same as claiming that peripheral regions are not escaping from the curse of distance but both aspects are related. 7. Panel estimation with fixed effects: the short-run effects of Market Potential We finish the analysis with an initial study of the short-run effects of Market Potential. The argument is as follows. The previous estimations were based in the cross-sectional dispersion of the data. From that point of view, the levels of the variables reflect the accumulation of long-run factors, historical, geographical... changes in per capita GVA and Market Potential accumulated during decades but during centuries too. Therefore, those estimations do not control for factors that simultaneously affect both variables. Acemoglu et al. (2008, p. 810) explain it in a few words in their critic to previous works about the cross-country relation between income and democracy 13: “The United States is both richer and more democratic, so a simple cross-country comparison, as well as the existing empirical strategies in the literature, which do not control for fixed country effects, would suggest that higher per capita income causes democracy. The idea of fixed effects is to move beyond this comparison and investigate the ‘within-country variation’, that is, to ask whether Colombia is more likely to become (relatively) democratic as it becomes (relatively) richer”. In our case, the between cross-regional estimation inform us that regions with higher Market Potential tends to be richer, but probably the more interesting question is about the within (time-demeaning) regional regression: how the per capita GVA of a region changes when its Market Potential changes. Fixed regional effects remove the influence of long-run determinants of both Market Potential and per capita GVA. But given that the (“short-run”) change of these two variables is related with the change of the GVA of each region and of its neighbors, endogeneity becomes even more relevant than before. For the moment we do not try instrumental variables estimations in these preliminary estimations. Our purpose now is to check if our previous conclusions are challenged by direct fixed effects estimations. This time we focus in the sample without the regions in Eastern European countries. 13 The whole debate is being very insightful. See Gundlach and Paldam (2008), Fayad et al. (2011) or Bonhomme and Manresa (2012) 21 We tried different specifications both with annual and with panels of 3 and 4 years. We do not find significant the human capital variable, a similar result to the country panel estimation by Bouhold et al. (2008), probably due to the smooth change of human capital variables. Table 7 shows the estimation without human capital and with annual data. To summarize, the F test of effects based on the comparison of the within and the pooling model accepts both significant individual and time effects. After a Hausman test confirms that the random effects model is inconsistent, our final baseline fixed effect equation includes per capita Capital Stock and Market Potential as explanatory variables together with region and year specific effects. That is shown in column (2) of Table 7. Time effects are excluded in column (1) in order to illustrate the dramatic change of the Market Potential coefficient when in column (2) we control for common shocks in each year. Columns (3) to (6) are the estimation of this baseline equation for the regimes “rich”/ “poor” and “center”/“periphery”. Column (6) points that the growth rate of per capita GVA in peripheral regions is the most sensitive to the growth rate of Market Potential. Table 7 – Panel estimation by regimes with region specific effects (1995-2008). Eastern European countries excluded (1) (2) (3) (4) (5) (6) lKSp 0.038** 0.134*** 0.043 0.222*** 0.156*** 0.113*** (0.014) (0.014) (0.024) (0.020) (0.018) (0.020) lMP2GVA 0.813*** 1.723*** 1.654*** 1.930*** 1.195*** 2.130*** (0.021) (0.042) (0.058) (0.064) (0.056) (0.063) Year dummies? No Yes Yes Yes Yes Yes R-squared 0.838 0.868 0.881 0.865 0.868 0.879 adj. R-squared 0.778 0.802 0.809 0.794 0.797 0.807 F 6935 1172 653 564 580 638 N 2884 2884 1442 1442 1442 1442 Notes: Panel fixed effects estimation (1995-2008). Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA) and the log of per Capital Stock (lKSp). Columns (3) and (4) present the estimation for the subsamples "rich" and "poor", respectively defined as regions with a log of per capita GVA over or under the median in 1995. Columns (5) and (6) present the estimation for the subsamples "center" and "periphery", respectively defined as regions with a log of mean distance to the other regions under or over the median. In order to estimate time-varying coefficients with unobserved regional effects, the interaction of Market Potential (and per capita Capital Stock) with year dummies is not useful. The reason is that time demeaning in the fixed effects estimation makes loses the particular information of the year in each interaction term. Therefore, we have introduced the interaction terms in a pooled estimation of first differences. With this strategy there are many possibilities of time-varying estimation, with a common 22 intercept or time effects, and with time-varying coefficients of Market Potential with or without time-varying coefficients of per capita Capital Stock. Figure 4 shows the analogous of figure 3, with all the coefficients being time-varying. No trend appears in the coefficients of Market Potential. Figure 4 – Time-varying estimation in first differences with annual data 1995-2008 Sample without regions of Eastern European countries Of course, this result could be due to the fact of working with annual data. Therefore we have dropped the data of the year 1995 from our sample period of 14 years in order to build two panels of first differences in 3 or 4 years intervals. Just for illustrative purposes Table 8 present the estimation with all time-varying coefficients. 23 Table 8 – Time varying estimation in first differences of 4 years, by regimes Eastern European countries excluded (1) (2) (3) (4) (5) (Intercept) -0.164*** -0.115*** -0.248*** -0.103*** -0.229*** (0.021) (0.023) (0.039) (0.026) (0.033) year2004 0.084*** 0.046 0.154*** 0.047 0.129** (0.024) (0.027) (0.045) (0.031) (0.039) year2008 -0.005 -0.077* 0.088 -0.160** 0.097* (0.033) (0.039) (0.056) (0.048) (0.046) year2000xlKSp 0.062 -0.108 0.195** 0.069 0.015 (0.049) (0.088) (0.070) (0.059) (0.080) year2004xlKSp 0.195** 0.050 0.270** 0.185 0.211* (0.069) (0.111) (0.101) (0.108) (0.090) year2008xlKSp 0.212** 0.099 0.321** -0.012 0.293*** (0.068) (0.089) (0.102) (0.123) (0.087) year2000xlMP2GVA 2.123*** 1.948*** 2.580*** 1.607*** 2.716*** (0.153) (0.173) (0.271) (0.191) (0.241) year2004xlMP2GVA 1.728*** 1.767*** 1.821*** 1.320*** 2.021*** (0.186) (0.265) (0.277) (0.275) (0.287) year2008xlMP2GVA 2.651*** 3.110*** 2.369*** 4.082*** 2.068*** (0.331) (0.422) (0.506) (0.558) (0.425) R-squared 0.517 0.615 0.461 0.539 0.528 adj. R-squared 0.509 0.597 0.448 0.523 0.512 F 82 60 32 44 42 N 618 309 309 309 309 Notes: First differences estimations for a panel of the years 1996, 2000, 2004 and 2008. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA) and the log of per Capital Stock (lKSp). Columns (3) and (4) present the estimation for the subsamples "rich" and "poor", respectively defined as regions with a log of per capita GVA over or under the median in 1995. Columns (5) and (6) present the estimation for the subsamples "center" and "periphery", respectively defined as regions with a log of mean distance to the other regions under or over the median. Surprisingly, the difference of the log of per capita Capital Stock loses significance. For instance, in a non-reported estimation with the 3years panel, a common intercept and a constant coefficient of per capita Capital Stock, this coefficient is only significant at 1% level for the total sample (column 1) and the “poor” regime (column 3). But the time varying coefficients of Market Potential are always significant at the 1% level. On the other hand, if the results in column 5 of Table 8 could point to a decreasing coefficient of Market Potential for the peripheral regions, the estimated coefficients in this case are 1.573, 1.732, 2.099 and 1.921, i.e, they would be stable or approximately increasing. Taking together the results in Table 7 and Table 8 and other non-reported estimations, we can estimate a kind of “short-run” impact of Market Potential. The short-run coefficient would be around 2.0, ten times more than the 0.2 estimated in section 6 for the long-run. The estimation for the long-run is preliminary and the time series are not long, but in all these exercises no consistent trend in this coefficient is found for the peripheral regions. Maybe it is the curse of distance. 24 8. Conclusions The present work has several shortcomings. Currently, we are studying heterokedasticity and serial correlation in our estimates and the endogeneity of the growth rate of Market Potential. Additionally, a Moran on the residuals of our baseline cross-sectional estimation shows that they are spatially autocorrelated. Therefore, we are implementing spatial econometrics techniques to take account of this. If the relevant model to correct for spatial autocorrelation is a spatial error model, the estimations in this paper are inefficient but not bias, so the discussion about the magnitude of the elasticity of per capita GVA to Market Potential remains valid. If the relevant model includes the spatial lag of the endogenous variable, our previous estimates are biased. And things would get more complex because of the joint endogeneity of Market Potential and the spatial lag of per capita GVA. However, the results presented here provide arguments for the discussion in at least two aspects. On one hand we utilize a particular methodology to evaluate the dependence of income per capita with respect to the spatial economic structure of economic activity, what it is related with the so called “curse of distance”. On the other hand, we provide evidence of an approximately stable elasticiy of the regional European per capita GVA to Harris Market Potential, with a magnitude around 0.2 in the long-run and a more preliminary estimation of 2.0 in the short-run. Bibliography Acemoglu, D., S. Johnson; J. A. Robinson and P. Yared (2008): Income and Democracy; American Economic Review, 98(3), 808-842. Ahlfeldt, Gabriel M. and Arne Feddersen (2008): Determinants of spatial weights in spatial wage equations: A sensitivity analysis; Hamburg contemporary economic discussions 22. Arribas, I., F. Pérez and E. Tortosa-Ausina (2011). A New Interpretation of the Distance Puzzle Based on Geographic Neutrality. Economic Geography 87 (3), 335-362. Barro, Robert and Jong-Wha Lee (2010): A New Data Set of Educational Attainment in the World, 1950-2010; NBER Working Paper 15902. Bivand, Roger and Rolf J. Brunstad (2006): Regional growth in Western Europe: detecting spatial misspecification using the R environment; Papers in Regional Science 85:2, 277-297. Bonhomme, Stéphane and Elena Manresa (2012): Grouped Patterns of Heterogeneity in Panel Data, CEMFI, Madrid. 25 Boulhol, H., A. de Serres and M. Molnar (2008): The Contribution of Economic Geography to GDP per Capita: OECD Journal Economic Studies 9:1, 1–37. Boulhol,Hervé and Alain de Serres (2010): Have developed countries escaped the curse of distance?; Journal of Economic Geography 10 (1), 113-139. Brakman, S., H. Garretsen and C. Van Marrewijk (2009): Economic geography within and between European nations: the role of market potential and density across space and time; Journal of Regional Science 49:4, 777-800. Breinlich, Holger (2006): The Spatial Income Structure in the European Union. What Role for Economic Geography? ; Journal of Economic Geography 6:5, 593-617. Bruna, F., A. Faina and J. Lopez-Rodriguez (2012a): Market Potential and spatial autocorrelation in the European regions; University of A Coruna, Spain. Paper presented at the VI World Conference of the Spatial Econometrics Association (SEA), Salvador de Bahia, Brazil. Bruna, F., J. Lopez-Rodriguez and A. Faina (2012b): Market Potential, MAUP, NUTS and other spatial mysteries; University of A Coruna, Spain. Cairncross, F. (1997). The Death of Distance. How the Communications Revolution Will Change Our Lives. Harvard Business School Press, Boston, Massachusetts, USA. Combes, P.-P., T. Mayer and J.-F. Thiesse (2008): Economic Geography; Princeton University Press, New Jersey. Croissant, Yves and Giovanni Millo (2008): Panel Data Econometrics in R: The plm Package. Journal of Statistical Software 27(2). URL http://www.jstatsoft.org/v27/i02/. Disdier, A.-C. and K. Head (2008). The Puzzling Persistence of the Distance Effect on Bilateral Trade. The Review of Economics and Statistics 90 (1), 37-48. Dreger, C., G. Erber and D. Glocker (2011): Regional measures of human capital in the European Union; ERSA conference papers ersa10p337, European Regional Science Association. Fayad, G., R. H. Bates and A. Hoeffler (2011): Income and Democracy: Lipset's Law Inverted. Research Paper 61. Oxford Centre for the Analysis of Resource Rich Economies, University of Oxford Friedman, Thomas (2005) The World is Flat, New York: Farrar, Strauss and Giroux Fujita M., P. Krugman and A. Venables (1999): The Spatial Economy; MIT Press, Cambridge MA. Gundlach, Erich and Martin Paldam (2008): Income and Democracy: A Comment on Acemoglu, Johnson, Robinson, and Yared (2008); Economics Working Papers 200813, School of Economics and Management, University of Aarhus. Hanson, Gordon H. (2005): Market potential, increasing returns and geographic concentration; Journal of International Economics 67:1, 1 -24. Head and Mayer Head, Keith and Thierry Mayer (2006): Regional wage and employment responses to market potential in the EU; Regional Science and Urban Economics 36:5, 573-594. Head, Keith and Thierry Mayer (2006): Regional wage and employment responses to market potential in the EU; Regional Science and Urban Economics 36:5, 573-594. 26 Head, Keith and Thierry Mayer (2011): Gravity, market potential and economic development; Journal of Economic Geography 11:2, 281-294. Hering, Laura and Sandra Poncet (2009): The impact of economic geography on wages: Disentangling the channels of influence; China Economic Review 20:1, 1-14. Honaker, J., G. King and M. Blackwell (2011): Amelia II: A Program for Missing Data. Journal of Statistical Software, 45(7), 1-47. URL http://www.jstatsoft.org/v45/i07/. Karahasan, Burhan Can and Enrique Lopez-Bazo (2011): The Spatial Distribution of Human Capital: Can It Really Be Explained by Regional Differences in Market Access?; IREA Working Papers 02, University of Barcelona, Research Institute of Applied Economics. Krugman, Paul (1991): Increasing returns and economic geography; Journal of Political Economy 99:3, 483-99. Lopez-Rodriguez, J., A. Faina and J. Lopez Rodriguez J. (2007): Human Capital Accumulation and Geography: Empirical Evidence from the European Union; Regional Studies, 42, 217-234. Lucas, Robert Jr. (1988): On the mechanics of economic development; Journal of Monetary Economics 22(1), 3-42. Mankiw, N.G, D. Romer and D. N. Weil (1992); A contribution to the Empirics of Economic Growth; The Quarterly Journal of Economics 107(2), 407-437. Redding, Stephen and Anthony J. Venables (2004): Economic geography and international inequality; Journal of International Economics 62:1, 53-82. Redding, Stephen and Peter K. Schott (2003): Distance, Skill Deepening and Development: Will Peripheral Countries Ever Get Rich? ; Journal of Development Economics 72:2, 515-41. Rodriguez-Pose, Andres (2011): Economists as geographers and geographers as something else: on the changing conception of distance in geography and economics; Journal of Economic Geography 11:2, 347-356. Rodriguez-Pose, Andres and Vassilis Tselios (2011): Mapping the European regional educational distribution; European urban and regional studies 18(4), 358-37. Stock, James H. and Motohiro Yogo (2005): Testing for Weak Instruments in Linear IV Regression, in Donald W.K. Andrews (ed.), Identification and Inference for Econometric Models; Cambridge University Press, ch. 5, 80-108. 27 Annex - Comparison of human capital variables in the baseline long run regression Table 1 compares different measures of human capital in science and technology in our broad sample of European regions. Columns (3) and (4) show that the people who are employed in a S&T occupation (hrsto) is not statistically meaningful because it does not control for their educational levels. The people who have successfully completed education at the third level in S&T fields (hrste) is a variable competing with ours in terms of regression diagnostics, and it has a few less missing data in during the sample period 1995-2008. But we have preferred to use hrstc_pop because of its more restricted meaning: people who have successfully completed education at the third level in S&T fields of study and are employed in a S&T occupation. This variable is called by Eurostat the "core" variable of human resources in science and technology". The choice of share of population instead of share of the active population is because our endogenous variable is per capita GVA, so the share of population controls for different shares of active population in total population. A table 1 - Human resources in science and technology (S&T) lKSp lMP2GVA hrste_pop hrste_act hrsto_pop (1) (2) (3) (4) (5) (6) (7) 0.860*** (0.007) 0.134*** (0.009) 1.098*** (0.072) 0.864*** (0.007) 0.135*** (0.009) 0.891*** (0.008) 0.166*** (0.011) 0.903*** (0.008) 0.187*** (0.011) 0.851*** (0.007) 0.130*** (0.010) 0.860*** (0.007) 0.137*** (0.010) 0.834*** (0.007) 0.159*** (0.010) 0.888*** (0.057) 0.086 (0.113) -0.329*** (0.083) hrsto_act 2.647*** (0.148) hrstc_pop 1.626*** (0.104) hrstc_act 2.323*** (0.137) hrstc_pop_IMP R-squared 0.900 0.900 0.893 0.894 0.903 0.901 0.904 adj. R-squared 0.895 0.895 0.889 0.889 0.898 0.896 0.900 F 9556 9581 9089 9137 9786 9558 11412 N 3207 3207 3274 3274 3171 3171 3640 Notes: Pooled OLS estimation 1995-2008 with (omitted) year dummies. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and variables of human resources in science and technology (S&T). Variables _pop are shares of population and variables _act are shares of active population. hrste is the people who have successfully completed education at the third level in S&T fields. hrsto is the people who are employed in a S&T occupation (hrstc_pop). hrstc (core) is the people who have successfully completed education at the third level in S&T fields (hrste) and are employed in a S&T occupation (hrsto). hrstc_pop_IMP is hrstc_pop after an imputation of missing data using the time trend specific to each region with a polynomial of degree 2 (hrstc_popit = ß0 + ß1t + ß1t2). 28 Table 2 shows an additional comparison among our measure of core human resources in S&T and other human capital measures in a common sample period 2000-2008. The competing variables in this case are the two variables collecting tertiary education attainment (edat2564N2_ter and edLFSs2564_ter). The first of them improves a little bit the regression diagnostics, but it has more missing data even in this restricted sample of years. The main conclusion of columns (3) and (6) is that these two variables keep the elasticity of per capita GVA to Market Potential in the range 0.11-0.16. Alternatively, the main advantage of the variables of human resources in S&T is that there are data available since 1995. A table 2 – Core variable of human resources in S&T, education attainment of population and highest level of education attained by workers (1) (2) (3) (4) (5) (6) (7) 0.837*** 0.848*** 0.848*** 0.842*** 0.883*** 0.870*** 0.807*** lKSp lMP2GVA hrstc_pop_IMP hrstc_pop edat2564N2_ter (0.008) 0.145*** (0.011) 2.532*** (0.161) (0.008) 0.136*** (0.011) (0.008) 0.114*** (0.011) (0.008) 0.210*** (0.011) (0.008) 0.190*** (0.012) (0.009) 0.162*** (0.011) (0.009) 0.283*** (0.011) 2.740*** (0.169) 1.217*** (0.065) -0.554*** (0.032) edat2564N2_sec -0.269*** (0.033) edat2064N2_secter 0.851*** (0.060) edLFSs2564_ter -0.746*** (0.030) 0.918 0.913 8014.368 2148 edLFSs2564_sec 0.920 0.916 0.922 0.921 0.912 0.904 R-squared 0.915 0.911 0.917 0.916 0.907 0.899 adj. R-squared 8921.626 8199.800 8711.682 8557.023 7666.195 6733.989 F 2340 2264 2227 2227 2227 2148 N Notes: Pooled OLS estimation 2000-2008 with (omitted) year dummies. Table display coefficients: * significant at 10% level; ** at 5% level; *** at 1% level. Standard errors are in brackets. The dependent variable is the logarithm of per capita Gross Value Added. The independent variables are the log of Market Potential (lMP2GVA), the log of per Capital Stock (lKSp) and human capital variables. hrstc (core) is the share of population who has successfully completed education at the third level in S&T fields (hrste) and is employed in a S&T occupation (hrsto). hrstc_pop_IMP is hrstc_pop after an imputation of missing data using the time trend specific to each region with a polynomial of degree 2 (hrstc_popit = ß0 + ß1t + ß1t2). edat2564N2 and edat2564N2_sec are shares of persons aged 25-64 with upper secondary education attainment and with tertiary education attainment respectively. edat2064N2_secter is the share of persons aged 20-64 with upper secondary or tertiary education attainment. edLFSs2564_ter and edLFSs2564_sec are shares of employment aged 25-64 with first and second stage of tertiary education and with upper secondary and post-secondary non-tertiary education respectively. 29