.. geoxgboost documentation master file, created by
   sphinx-quickstart on Mon Jan 27 17:34:57 2025.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Geographical-XGBoost documentation
==================================

An implementation of XGBoost for Geographical Analysis.

Geoxgboost is a Python library that implements the Geographical-XGBoost (G-XGBoost) algorithm for spatially local regression. G-XGBoost belongs to the family of Spatial Machine Learning algorithms and modifies the standard XGBoost algorithm (extreme gradient boosting trees) to handle spatial data and spatial heterogeneity.

G-XGBoost:

• Applies the concept of geographically varying models in XGBoost: it creates local models that analyze data within a specified neighborhood using spatial weights.
• Creates an ensemble of the global and local models: it uses both global and local models for training, validation, and prediction, leading to improved model accuracy.
• Calculates local feature importance using spatial weights through the gain function.

Beyond being a predictive tool, G-XGBoost is also a valuable exploratory tool for identifying spatial heterogeneity. It evaluates how spatially weighted feature importance varies across different locations, enhancing the model's interpretability.

The theoretical presentation, mathematical formulation, and experimental results of G-XGBoost, across six regression models and six benchmark datasets, can be found in Grekousis (2025), listed under "How to cite".

.. image:: documentation.png
   :width: 850

Figure: G-XGBoost ensemble for spatially local regression. A different sub-model is built for every spatial unit (i), including only its neighboring units. The optimal bandwidth value (either distance or number of nearest neighbors) is defined by minimizing the cross-validation criterion. Hyperparameters are selected using grid search through nested cross-validation of the global model.
G-XGBoost results from the ensemble (y_ens) of global (y_gl) and local (y_loc) models using the alpha weight (α) regularization hyperparameter. Feature importance is produced for the local models.

Installation
------------

::

    pip install geoxgboost

Tutorial
========

A comprehensive tutorial is available on GitHub, guiding users through the entire process, from project setup in PyCharm to running the demo. No prior Python knowledge is required.

The tutorial provides step-by-step instructions on how to:

• Download and install PyCharm to create the Demo project.
• Download the necessary data and install the geoxgboost library.
• Run the geoxgboost algorithm.
• Extract outputs in Excel format.
• Understand the content of each output file.

Like any machine learning algorithm, G-XGBoost requires hyperparameter tuning. Guidelines for tuning are provided in the accompanying paper. The demo uses predefined hyperparameter values for convenience. However, users are encouraged to experiment with different hyperparameter combinations to optimize model performance. Geoxgboost facilitates hyperparameter tuning through a built-in function called ``create_param_grid``, enabling efficient grid search.

The functions, parameters, and examples of the geoxgboost package are available at https://geoxgboost.readthedocs.io/en/latest/

Demo data
---------

Boston housing dataset

Download data from: https://github.com/geogreko/DemoGXGBoost/tree/main

The following files are included in the GitHub repository:

1. Coords.csv: Coordinates of the spatial units.
2. Data.csv: Dependent and independent variables.
3. DataDescription.xlsx: Data description.
4. GXGB_call_demo.py: Python script to analyze the Boston housing dataset.
5. PredictCoords.csv: Coordinates of the spatial units for prediction.
6. PredictData.csv: Values of the independent variables for the spatial units where predictions will be made.
7. Tutorial_geoxgboost.pdf: A guide for using the demo.
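
Before running the demo script, it can help to confirm that the downloaded tables are readable and consistent. The sketch below is a minimal sanity check using only the Python standard library; it does not call geoxgboost. The helper names ``read_table`` and ``load_demo_tables`` are hypothetical, and the sketch assumes that both CSV files have a header row and that Coords.csv and Data.csv contain one row per spatial unit in the same order. Neither assumption is stated in the repository listing above, so verify them against DataDescription.xlsx.

```python
import csv
from pathlib import Path


def read_table(path):
    """Read a CSV file with a header row into a list of dicts (one per row)."""
    # utf-8-sig also accepts files saved with a byte-order mark.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


def load_demo_tables(folder, coords_name="Coords.csv", data_name="Data.csv"):
    """Load the coordinate and variable tables from a local copy of the demo.

    Hypothetical helper: assumes row k of both files describes the same
    spatial unit, so the two tables must have the same number of rows.
    """
    folder = Path(folder)
    coords = read_table(folder / coords_name)
    data = read_table(folder / data_name)
    if len(coords) != len(data):
        raise ValueError(
            f"{coords_name} has {len(coords)} rows but "
            f"{data_name} has {len(data)} rows"
        )
    return coords, data
```

Calling ``load_demo_tables("path/to/demo")`` on a folder holding the downloaded files returns the two tables as lists of dictionaries keyed by column name. The same check applies to PredictCoords.csv and PredictData.csv by passing their names as ``coords_name`` and ``data_name``.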

How to cite
-----------

Grekousis, G. (2025). Geographical-XGBoost: A new ensemble model for spatially local regression based on gradient-boosted trees. Journal of Geographical Systems. https://doi.org/10.1007/s10109-025-00465-4

Geoxgboost is freely available provided the above paper is cited.

.. include:: geoxgboost.rst