Abstract
The association between diversity and development, both negative and positive, has been empirically tested for only a limited set of diversity variables despite its centrality to the political economy discourse. Using a unique census-scale micro dataset from rural India containing detailed caste, religion, language, and landholding data (n ~ 13.25 million households), in combination with administrative data on human development and satellite measurements of luminosity as a proxy for sub-national economic development, we show that the association between social heterogeneity and economic development is tenuous at best and is likely an artifact of the geographic, political, and ethnic units of analysis. We formally define the "ethnic-geographic continuum" and develop a cogent theoretical framework for testing the validity of theories across varying levels of ethnic and geographic aggregation. We show how our ethnic-geographic continuum framework accounts for the intersections between the Modifiable Ethnic Unit Problem (MEUP) and the Modifiable Areal Unit Problem (MAUP). We use seventeen different diversity metrics across multiple combinations of ethnic and geographic aggregations to empirically validate this framework, including the first-ever census-scale enumeration and coding of elementary Indian caste categories (jatis) since 1931.