Abstract: The questions of where and how species are distributed under current and changing environmental conditions have inspired many biogeographers, ecologists, and managers to predict the potential distributions of plants and animals by quantifying species-environment relationships. The species distribution model (SDM) has been developed as an essential modeling tool for this purpose. A key challenge in using real species data (presence-absence and/or presence-only data) for SDMs is uncertainty about where and how the thousands of species distribution records were obtained. Most species distribution data sets are derived from herbaria, university databases, museums, or even amateur field workers. Therefore, reaching a reasonable explanation of species distributions in the wild is often hindered by problems inherent in these large data sets, including species-specific properties (e.g., species prevalence, dispersal barriers, interspecific competition, distribution pattern), biased sampling (e.g., accessibility of observation sites, visibility or detectability of the observed organisms), variability among observation methods (e.g., time interval and spatial range), and differences among habitat types, particularly for data collected over a long time interval and a large spatial range. Virtual species could provide a suitable unifying framework for selecting the most appropriate model in such evaluations, by comparing the predictive accuracy of models against virtual distributions simulated in a geographic information system model of a real landscape. In recent years, virtual species distribution models have become increasingly important tools for studying problems in conservation biology, ecology, biogeography, climate change research, and evolution.
Virtual species have many advantages, including the ease of obtaining a large number of data sets for each scenario, full control over data quality, avoidance of the overfitting to which SDMs are prone, and the ability to evaluate the predictive power of SDMs independently of other factors. There are three common methods for generating virtual species: the additive method, the multiplicative method, and the comprehensive method. Here, we provide an overview of recent advances in the development of virtual species distribution models, which use spatially explicit simulated distribution data to represent the 'true' species distributions. We highlight the four main applications of these models in evaluating model performance, namely species-specific characteristics, sampling bias, geographic information, and threshold criteria for species occurrence. Considering the current limitations, we propose future directions for the development of virtual species, including avoiding excessive assumptions that do not reflect reality, optimizing the generation of virtual species to avoid the compensatory effect and to reflect true species dynamics and biological characteristics, and generating a virtual model organism, population, community, and ecosystem. To help researchers generate virtual species easily and quickly, our research team has developed an R software package, SDMvspecies. The package provides four methods for creating virtual species: the niche synthesis method, the pick mean method, the pick median method, and the artificial bell-shaped response method. SDMvspecies is freely available for download from http://cran.r-project.org/web/packages/sdmvspecies/. We further address the need for better integration of virtual species with ecological theory, which is expected to lead to new questions, new theories, and an improved mechanistic understanding of ecological systems.
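As a rough illustration of the additive and multiplicative methods named above, the following Python sketch combines bell-shaped responses to two environmental variables into a habitat suitability surface. It is not the SDMvspecies implementation and does not cover the comprehensive method. The function names, the Gaussian response form, the optima and tolerances, the random 50 x 50 landscape, and the probabilistic conversion to presence-absence are all assumptions made for the example.

```python
import numpy as np


def bell_response(x, optimum, tolerance):
    """Gaussian (bell-shaped) response in [0, 1], peaking at `optimum`."""
    return np.exp(-0.5 * ((x - optimum) / tolerance) ** 2)


def additive_suitability(responses):
    """Average the per-variable responses (additive combination)."""
    return np.mean(responses, axis=0)


def multiplicative_suitability(responses):
    """Multiply the per-variable responses (multiplicative combination)."""
    return np.prod(responses, axis=0)


def sample_presence(suitability, rng):
    """Treat suitability as an occurrence probability (one Bernoulli draw per cell)."""
    return rng.random(suitability.shape) < suitability


# Hypothetical landscape: two random environmental layers (assumed values).
rng = np.random.default_rng(42)
temperature = rng.uniform(0.0, 30.0, size=(50, 50))        # degrees C
precipitation = rng.uniform(200.0, 2000.0, size=(50, 50))  # mm per year

# Assumed niche of the virtual species: one response layer per variable.
responses = np.stack([
    bell_response(temperature, optimum=18.0, tolerance=4.0),
    bell_response(precipitation, optimum=900.0, tolerance=250.0),
])

additive = additive_suitability(responses)
multiplicative = multiplicative_suitability(responses)
presence = sample_presence(multiplicative, rng)

# A cell that is ideal for one variable and unsuitable for the other:
# averaging lets the good variable offset the bad one, the product does not.
extreme_cell = np.array([[1.0], [0.0]])
compensated = additive_suitability(extreme_cell)
uncompensated = multiplicative_suitability(extreme_cell)
```

In this sketch the additive surface is never lower than the multiplicative one, because a favourable variable can offset an unfavourable one when responses are averaged. This is one plausible reading of the compensatory effect mentioned above, and it is why the presence-absence map here is drawn from the multiplicative surface.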