Abstract: Accurate forecasting of bus passenger flow supports informed decision-making and better use of available transportation resources. Data from multiple sources can reveal a wide range of factors in the travel environment that affect passenger flow. A good prediction model should both exploit the latent information in multi-source data and resolve the multicollinearity that comes with it. Based on this idea, we propose a novel scaled stacking gradient boosting decision tree (SS-GBDT) model for predicting bus passenger flow. SS-GBDT consists of a feature generation module followed by a GBDT prediction module. In the feature generation module, stacking is used to produce several enhanced multi-source data features from a small number of base models with comparable performance. We specifically design a scaled stacking method with a quasi-attention-based mechanism, comprising precision-based scaling and time-based scaling. The prediction module takes the newly generated features as input and predicts passenger flow with a GBDT model on the layered data, which improves prediction performance. The model is evaluated on two real bus routes in Guangzhou, China. The results indicate that SS-GBDT is superior in prediction accuracy and stability, and that it is better suited to handling multicollinearity in multi-source data. It can also rank the factors that influence passenger flow prediction. With large volumes of data, the prediction model remains flexible and scalable, which allows many influencing factors to be integrated.
DOI: 10.17148/IJARCCE.2022.11749
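The abstract does not give the scaling formula, so the following is only a minimal sketch of what a precision-based, attention-like scaling step could look like. It assumes that each base model's validation error is turned into a weight by a softmax over negative errors, and that each base model's prediction column is multiplied by its weight before being passed on. The function names, the `temperature` parameter, and the sample numbers are all hypothetical and are not taken from the paper; time-based scaling is not shown.

```python
import math


def precision_weights(errors, temperature=1.0):
    """Assumed scaling rule: softmax over negative validation errors.

    A base model with a lower error receives a larger weight.
    """
    scores = [-e / temperature for e in errors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def scale_stacked_features(base_predictions, errors, temperature=1.0):
    """Multiply each base model's prediction column by its weight.

    base_predictions: list of rows, one prediction per base model in each row.
    errors: one validation error per base model, in the same column order.
    """
    weights = precision_weights(errors, temperature)
    return [[w * p for w, p in zip(weights, row)] for row in base_predictions]


# Hypothetical numbers: three base models, two time intervals.
base_predictions = [[120.0, 118.0, 131.0], [85.0, 90.0, 70.0]]
validation_mae = [4.0, 5.0, 12.0]

weights = precision_weights(validation_mae)
scaled = scale_stacked_features(base_predictions, validation_mae)
```

Under these assumptions, the rows of `scaled` would play the role of the generated features. They could then be used as input to a GBDT regressor, for example scikit-learn's `GradientBoostingRegressor`, as the prediction module.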