Intra-urban epidemiological risk model: Intra-urban malaria infection risk models will be constructed for the four cities at very high spatial resolution. The epidemiological, demographic, socioeconomic, climatic and environmental data described above will be combined in a machine learning algorithm in order to (i) identify the factors that influence malaria risk within each urban area and (ii) predict malaria risk across the cities. The geographic coordinates of each community survey will be used as unique identifiers to extract covariate values within a 1 km radius. Boosted Regression Tree (BRT) modelling will then be used to examine the relationship between parasite prevalence and the demographic, socioeconomic, climatic and environmental covariates. BRT is a machine learning technique increasingly used for modelling event distributions in ecology and epidemiology (Elith et al. 2006, Leathwick et al. 2006, Martin et al. 2011). Predictive performance will be evaluated by cross-validation: the dataset will be randomly split into a modelling dataset used to fit the model and a testing dataset used to assess its predictions. The fitted BRT model will then be used to predict malaria parasite prevalence on a 100 m grid for the four cities.
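
The fit-and-test procedure described above can be illustrated with a minimal sketch. It is not the project's implementation: it uses only the Python standard library, boosts single-split trees (stumps) under squared-error loss, and runs on synthetic data. The covariate names (ndvi, dist_water_km, pop_density), the 70/30 split, the number of trees and the learning rate are all assumptions made for illustration. A real analysis would more likely rely on an established BRT package, deeper trees and a binomial loss for prevalence data.

```python
import math
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) for the best single split."""
    n = len(X)
    total = sum(residuals)
    mean = total / n
    best_gain, best = -1.0, (0, math.inf, mean, mean)  # fallback: no split
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            lo, hi = order[pos - 1], order[pos]
            left_sum += residuals[lo]
            if X[lo][j] == X[hi][j]:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Maximising this quantity minimises the squared error of the split.
            gain = left_sum ** 2 / pos + right_sum ** 2 / (n - pos)
            if gain > best_gain:
                best_gain = gain
                best = (j, (X[lo][j] + X[hi][j]) / 2,
                        left_sum / pos, right_sum / (n - pos))
    return best


def stump_predict(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_brt(X, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_brt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def rmse(predictions, observed):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predictions, observed))
                     / len(observed))


# Synthetic survey clusters (hypothetical covariates, not project data).
random.seed(0)
X, y = [], []
for _ in range(300):
    ndvi = random.random()
    dist_water_km = random.uniform(0, 5)
    pop_density = random.random()  # deliberately unrelated to the outcome
    prevalence = (0.05 + 0.3 * ndvi + 0.2 * math.exp(-dist_water_km)
                  + random.gauss(0, 0.03))
    X.append([ndvi, dist_water_km, pop_density])
    y.append(min(max(prevalence, 0.0), 1.0))

# Random split into a modelling (training) set and a testing set.
idx = list(range(len(X)))
random.shuffle(idx)
train, test = idx[:210], idx[210:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

model = fit_brt(X_train, y_train)
test_rmse = rmse([predict_brt(model, row) for row in X_test], y_test)

# Baseline: always predict the mean prevalence of the modelling set.
mean_train = sum(y_train) / len(y_train)
baseline_rmse = rmse([mean_train] * len(y_test), y_test)

print(f"BRT test RMSE: {test_rmse:.3f}, mean-only baseline: {baseline_rmse:.3f}")
```

Under the same assumptions, a gridded prediction would call predict_brt once per 100 m cell, using the covariate values extracted for that cell. Repeating the random split several times, or using k folds, would give a more stable estimate of predictive performance than a single split.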

Inter-urban epidemiological risk model: Multi-level models will be used to characterize inter-urban variations in malaria infection risk using the child-level, household-level and cluster-level indicators described above. The objective of the inter-urban model is to better understand the influence of environmental and climatic variables on inter-city variability in malaria risk while controlling for the effects of demographic and socioeconomic factors. Epidemiological, demographic, socioeconomic, climatic and environmental variables will be combined in a spatial statistical algorithm in order to identify the main factors driving malaria risk across a large set of 20 cities. In addition, variables that quantify urban and peri-urban change (e.g. rate of urban growth, city densification, deforestation, desertification, extension of irrigated cropland, wetland drainage) will be tested. In a second step, we will assess the feasibility of predicting malaria risk in every African city from a limited set of covariates extracted from remote sensing data. This could lead to a better classification of urbanization in the context of malaria epidemiology and provide guidelines for improving the rules used to predict the intensity of urban malaria transmission in continental models.
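
As an illustration only, one possible form of such a multi-level model is a logistic regression with random intercepts for households and clusters. The notation below is an assumption and not a specification taken from the project: p_{ijk} is the infection probability of child i in household j of cluster k, x_{ijk}, h_{jk} and z_k are the child-level, household-level and cluster-level covariates (the latter including environmental and climatic variables), and u_{jk} and v_k are household and cluster random effects.

```latex
\operatorname{logit}(p_{ijk})
  = \beta_0
  + \boldsymbol{\beta}_1^{\top}\mathbf{x}_{ijk}
  + \boldsymbol{\beta}_2^{\top}\mathbf{h}_{jk}
  + \boldsymbol{\beta}_3^{\top}\mathbf{z}_{k}
  + u_{jk} + v_{k},
\qquad
u_{jk} \sim \mathcal{N}(0, \sigma_u^2),
\quad
v_{k} \sim \mathcal{N}(0, \sigma_v^2)
```

In this sketch, the coefficients \boldsymbol{\beta}_3 capture the environmental and climatic effects after adjustment for the demographic and socioeconomic covariates, while \sigma_v^2 measures the between-cluster variation left unexplained. A city-level random effect or a spatially structured term could be added in the same way.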