Abstract: Understanding the eutrophication mechanism and quantifying the relationship between key nutrient loadings and the resulting algal blooms have critical scientific and practical significance for effectively improving water quality conditions in Lake Chenghai. Modeling approaches that are typically applied to explore the relationship between nutrients and algal blooms can be grouped into two general categories: (1) data-driven approaches, and (2) mechanistic modeling approaches. Although both are potentially applicable to Lake Chenghai, only the data-driven approach was found to be viable in this case, because severe data limitations precluded the development of a mechanistic water quality model. Given the data availability and the need for a universal functional mapping capability, a Neural Network (NN) methodology was selected as the data-driven modeling platform for constructing the Lake Chenghai water quality model. NN models can yield potentially deceptive results when insensitive parameters are included among the input nodes. With this in mind, the modeling analysis first used a nonlinear curve-fitting and correlation analysis method to screen all the monitored physical and chemical parameters and identify the key ones. Among all the parameters, only total phosphorus (TP) and total nitrogen (TN) qualified for inclusion in the Lake Chenghai model, because they not only showed very high correlations with the chlorophyll-a concentration but are also immune to the data-time coupling issues experienced by other parameters such as inorganic nitrogen and phosphorus. Following the identification of the key input parameters, a series of NN models with various architectures were developed to explore the quantitative relationship between chlorophyll-a and nutrients in the lake.
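The screening step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the power-law form, the 0.8 threshold, the function names (`pearson`, `power_law_fit`, `screen`), and all numeric values are assumptions made up for illustration, and none of the numbers are Lake Chenghai observations.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def power_law_fit(xs, ys):
    """Fit y = a * x**b by least squares in log-log space; return (a, b, r)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)
    return a, b, pearson(lx, ly)

def screen(candidates, chl_a, threshold=0.8):
    """Keep candidates whose log-log correlation with chl-a meets the threshold."""
    kept = {}
    for name, values in candidates.items():
        a, b, r = power_law_fit(values, chl_a)
        if abs(r) >= threshold:  # assumed cutoff, not taken from the study
            kept[name] = (a, b, r)
    return kept

# Made-up illustrative values, NOT Lake Chenghai observations.
chl_a = [2.0, 4.0, 8.0, 16.0, 32.0]
candidates = {
    "TP": [0.01, 0.02, 0.04, 0.08, 0.16],    # exactly proportional to chl_a
    "TN": [0.5, 0.7, 1.0, 1.4, 2.0],         # close to a power law in chl_a
    "temp": [18.0, 12.0, 20.0, 11.0, 17.0],  # no clear relation
}
selected = screen(candidates, chl_a)
print(sorted(selected))
```

With these invented inputs, the two nutrient series pass the screen and the unrelated series is dropped, which mirrors the kind of outcome the abstract reports without reproducing its analysis.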
Extensive evaluations showed that when the complexity of the NN model increased to the point where the number of hidden nodes was greater than or equal to 3, the NN models started to show the trait of memorization dominance, suggesting that they mimicked the observed data pattern through memorization rather than reasoning, thereby degrading the generalization capacity of the model. An NN model that cannot generalize is less useful for practical application; therefore, the more complex network structures (those with 3 or more hidden nodes) were discarded. As a result, two simple networks, with one and two hidden nodes respectively, were adopted as the final water quality models for Lake Chenghai. Both network structures were applied together as the basis of further analysis in this study because of the predictive uncertainty associated with network configuration. The two NN models were used to run a number of scenarios for analyzing the eutrophication mechanisms in Lake Chenghai. The modeling results showed that eutrophication in Lake Chenghai is controlled by a nested two-level limiting structure, where the dominant level is a nitrogen-phosphorus co-limiting structure, followed by a secondary limiting structure of nitrogen only. Based on the simulation results of the NN-based water quality models, a series of nonlinear functions relating chlorophyll-a concentration to water quality concentration control in Lake Chenghai were derived as a quick reference for future eutrophication control decision making.
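One way to compare network complexities, as discussed above, is to train small two-input networks with different numbers of hidden nodes and compare training error against error on held-out data. The sketch below is only an illustration under stated assumptions: the tanh activation, plain stochastic gradient descent, the learning rate, the synthetic target `tn * tp` plus noise, and every function name are hypothetical choices, not the study's actual configuration or data.

```python
import math
import random

def init_network(n_hidden, rng, n_inputs=2):
    """Random weights for an n_inputs -> n_hidden -> 1 network."""
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def n_params(n_hidden, n_inputs=2):
    """Number of trainable weights and biases."""
    return n_hidden * (n_inputs + 1) + (n_hidden + 1)

def forward(net, x):
    """Return (hidden activations, scalar output) for one input vector."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    y = sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]
    return hidden, y

def train(net, data, epochs=500, lr=0.05):
    """Plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, target in data:
            hidden, y = forward(net, x)
            err = y - target
            for j, h in enumerate(hidden):
                # Gradient through tanh, using the output weight before its update.
                grad_h = err * net["w_out"][j] * (1.0 - h * h)
                net["w_out"][j] -= lr * err * h
                net["b_hidden"][j] -= lr * grad_h
                for i, xi in enumerate(x):
                    net["w_hidden"][j][i] -= lr * grad_h * xi
            net["b_out"] -= lr * err
    return net

def mse(net, data):
    """Mean squared error of the network on a data set."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_samples(n):
    """Synthetic (TN, TP) -> chl-a samples; made up, NOT lake data."""
    samples = []
    for _ in range(n):
        tn, tp = rng.random(), rng.random()
        chl = tn * tp + rng.gauss(0.0, 0.05)  # assumed co-limitation-like target
        samples.append(((tn, tp), chl))
    return samples

train_set, valid_set = make_samples(20), make_samples(20)

results = {}
for n_hidden in (1, 2, 3, 4):
    net = train(init_network(n_hidden, rng), train_set)
    results[n_hidden] = (mse(net, train_set), mse(net, valid_set))

for n_hidden, (train_err, valid_err) in results.items():
    print(n_hidden, n_params(n_hidden), round(train_err, 4), round(valid_err, 4))
```

A validation error that stops improving, or grows, while training error keeps falling as hidden nodes are added is one common symptom of memorization. The abstract does not state which diagnostic the authors used, so this comparison should be read as a generic check rather than their criterion.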